Using Subqueries to Define Unknown Data 225

Một phần của tài liệu 780 sams teach yourself SQL in 24 hours, 5th edition (Trang 242 - 272)

Using Subqueries to Define Unknown Data

What You’ll Learn in This Hour:

. What a subquery is

. The justifications of using subqueries

. Examples of subqueries in regular database queries . Using subqueries with data manipulation commands . Embedded subqueries

In this hour, you are introduced to the concept of subqueries. Using sub- queries enables you to more easily preform complex queries.

What Is a Subquery?

Asubquery, also known as a nested query, is a query embedded within the WHEREclause of another query to further restrict data returned by the query.

A subquery returns data that is used in the main query as a condition to further restrict the data to be retrieved. Subqueries are employed with the SELECT,INSERT,UPDATE, and DELETEstatements.

You can use a subquery in some cases in place of a join operation by indi- rectly linking data between the tables based on one or more conditions.

When you have a subquery in a query, the subquery is resolved first, and then the main query is resolved according to the condition(s) resolved by the subquery. The results of the subquery process expressions in the WHERE clause of the main query. You can use the subquery either in the WHERE clause or the HAVINGclause of the main query. You can use logical and rela- tional operators, such as =,>,<,<>,!=,IN,NOT IN,AND,OR, and so on, within the subquery as well as to evaluate a subquery in the WHEREorHAVING

Did You Know?

The Rules of Using Subqueries

The same rules that apply to standard queries also apply to subqueries. You can use join operations, functions, conversions, and other options within a subquery.

Use Indentation for Neater Statement Syntax

Notice the use of indentation in our examples. The use of indentation is merely for readability. The neater your statements are, the easier it is to read and find syntax errors.

Subqueries must follow a few rules:

. Subqueries must be enclosed within parentheses.

. A subquery can have only one column in the SELECTclause, unless multiple columns are in the main query for the subquery to com- pare its selected columns.

. You cannot use an ORDER BYclause in a subquery, although the main query can use an ORDER BYclause. You can use the GROUP BY clause to perform the same function as the ORDER BYclause in a subquery.

. You can only use subqueries that return more than one row with multiple value operators, such as the INoperator.

. TheSELECTlist cannot include references to values that evaluate to aBLOB,ARRAY,CLOB, or NCLOB.

. You cannot immediately enclose a subquery in a SETfunction.

. You cannot use the BETWEENoperator with a subquery; however, you can use the BETWEENoperator within the subquery.

The basic syntax for a subquery is as follows:

SELECT COLUMN_NAME FROM TABLE

WHERE COLUMN_NAME = (SELECT COLUMN_NAME FROM TABLE

WHERE CONDITIONS);

The following examples show how you can and cannot use the BETWEEN operator with a subquery. Here is an example of a correct use of BETWEENin the subquery:

SELECT COLUMN_NAME FROM TABLE_A

By the Way

What Is a Subquery? 227

WHERE COLUMN_NAME OPERATOR (SELECT COLUMN_NAME FROM TABLE_B)

WHERE VALUE BETWEEN VALUE)

You cannot use BETWEENas an operator outside the subquery. The following is an example of an illegal use of BETWEENwith a subquery:

SELECT COLUMN_NAME FROM TABLE_A

WHERE COLUMN_NAME BETWEEN VALUE AND (SELECT COLUMN_NAME FROM TABLE_B)

Subqueries with the SELECT Statement

Subqueries are most frequently used with the SELECTstatement, although you can use them within a data manipulation statement as well. The sub- query, when employed with the SELECTstatement, retrieves data for the main query to use.

The basic syntax is as follows:

SELECT COLUMN_NAME [, COLUMN_NAME ] FROM TABLE1 [, TABLE2 ]

WHERE COLUMN_NAME OPERATOR

(SELECT COLUMN_NAME [, COLUMN_NAME ] FROM TABLE1 [, TABLE2 ]

[ WHERE ])

The following is an example:

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP

WHERE E.EMP_ID = EP.EMP_ID

AND EP.PAY_RATE < (SELECT PAY_RATE FROM EMPLOYEE_PAY_TBL WHERE EMP_ID = ‘443679012’);

The preceding SQL statement returns the employee identification, last name, first name, and pay rate for all employees who have a pay rate greater than that of the employee with the identification 443679012. In this case, you do not necessarily know (or care) what the exact pay rate is for this particular employee; you only care about the pay rate for the purpose of getting a list of employees who bring home more than the employee specified in the subquery.

Using Subqueries for Unknown Values

Subqueries are frequently used to place conditions on a query when the exact

Did You Know?

The next query selects the pay rate for a particular employee. This query is used as the subquery in the following example.

SELECT PAY_RATE FROM EMPLOYEE_PAY_TBL WHERE EMP_ID = ‘220984332’;

PAY_RATE --- 11

1 row selected.

The previous query is used as a subquery in the WHEREclause of the follow- ing query:

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP

WHERE E.EMP_ID = EP.EMP_ID

AND EP.PAY_RATE > (SELECT PAY_RATE FROM EMPLOYEE_PAY_TBL WHERE EMP_ID = ‘220984332’);

EMP_ID LAST_NAME FIRST_NAME PAY_RATE --- --- --- --- 442346889 PLEW LINDA 14.75 443679012 SPURGEON TIFFANY 15 2 rows selected.

The result of the subquery is 11(shown in the last example), so the last con- dition of the WHEREclause is evaluated as

AND EP.PAY_RATE > 11

You did not know the value of the pay rate for the given individual when you executed the query. However, the main query was able to compare each individual’s pay rate to the subquery results.

Subqueries with the INSERT Statement

Always Remember to COMMIT Your DML

Remember to use the COMMITandROLLBACKcommands when using DML com- mands such as the INSERTstatement.

By the Way

You can also use subqueries in conjunction with Data Manipulation Language (DML)statements. The INSERTstatement is the first instance you examine. It uses the data returned from the subquery to insert into another table. You

What Is a Subquery? 229

can modify the selected data in the subquery with any of the character, date, or number functions.

The basic syntax is as follows:

INSERT INTO TABLE_NAME [ (COLUMN1 [, COLUMN2 ]) ] SELECT [ *|COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ] [ WHERE VALUE OPERATOR ]

The following is an example of the INSERTstatement with a subquery:

INSERT INTO RICH_EMPLOYEES

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP

WHERE E.EMP_ID = EP.EMP_ID

AND EP.PAY_RATE > (SELECT PAY_RATE FROM EMPLOYEE_PAY_TBL WHERE EMP_ID = ‘220984332’);

2 rows created.

ThisINSERTstatement inserts the EMP_ID,LAST_NAME,FIRST_NAME, and

PAY_RATEinto a table called RICH_EMPLOYEESfor all records of employees who have a pay rate greater than the pay rate of the employee with identifica- tion220984332.

Subqueries with the UPDATE Statement

You can use subqueries in conjunction with the UPDATEstatement to update single or multiple columns in a table. The basic syntax is as follows:

UPDATE TABLE

SET COLUMN_NAME [, COLUMN_NAME) ] = (SELECT ]COLUMN_NAME [, COLUMN_NAME) ] FROM TABLE

[ WHERE ]

Examples showing the use of the UPDATEstatement with a subquery follow.

The first query returns the employee identification of all employees who reside in Indianapolis. You can see that four individuals meet this criterion.

SELECT EMP_ID FROM EMPLOYEE_TBL

WHERE CITY = ‘INDIANAPOLIS’;

EMP_ID --- 442346889 313782439

443679012 4 rows selected.

The first query is used as the subquery in the following statement; it proves how many employee identifications are returned by the subquery. The fol- lowing is the UPDATEwith the subquery:

UPDATE EMPLOYEE_PAY_TBL SET PAY_RATE = PAY_RATE * 1.1 WHERE EMP_ID IN (SELECT EMP_ID

FROM EMPLOYEE_TBL

WHERE CITY = ‘INDIANAPOLIS’);

4 rows updated.

As expected, four rows are updated. One important thing to notice is that, unlike the example in the first section, this subquery returns multiple rows of data. Because you expect multiple rows to be returned, you use the IN operator instead of the equal sign. Remember that INcompares an expres- sion to values in a list. If you had used the equal sign, an error would have been returned.

Subqueries with the DELETE Statement

You can also use subqueries in conjunction with the DELETEstatement. The basic syntax is as follows:

DELETE FROM TABLE_NAME [ WHERE OPERATOR [ VALUE ]

(SELECT COLUMN_NAME FROM TABLE_NAME) [ WHERE) ]

In the following example, you delete the BRANDON GLASSrecord from

EMPLOYEE_PAY_TBL. You do not know Brandon’s employee identification num- ber, but you can use a subquery to get his identification number from EMPLOYEE_TBL, which contains the FIRST_NAMEandLAST_NAMEcolumns.

DELETE FROM EMPLOYEE_PAY_TBL WHERE EMP_ID = (SELECT EMP_ID

FROM EMPLOYEE_TBL

WHERE LAST_NAME = ‘GLASS’

AND FIRST_NAME = ‘BRANDON’);

1 row deleted.

Embedded Subqueries 231

Embedded Subqueries

Check the Limits of Your System

You must check your particular implementation for limits on the number of sub- queries, if any, that you can use in a single statement. It might differ between vendors.

By the Way

You can embed a subquery within another subquery, just as you can embed the subquery within a regular query. When a subquery is used, that sub- query is resolved before the main query. Likewise, the lowest level subquery is resolved first in embedded or nested subqueries, working out to the main query.

The basic syntax for embedded subqueries is as follows:

SELECT COLUMN_NAME [, COLUMN_NAME ] FROM TABLE1 [, TABLE2 ]

WHERE COLUMN_NAME OPERATOR (SELECT COLUMN_NAME FROM TABLE

WHERE COLUMN_NAME OPERATOR (SELECT COLUMN_NAME FROM TABLE

[ WHERE COLUMN_NAME OPERATOR VALUE ]))

The following example uses two subqueries, one embedded within the other.

You want to find out what customers have placed orders in which the quan- tity multiplied by the cost of a single order is greater than the sum of the cost of all products.

SELECT CUST_ID, CUST_NAME FROM CUSTOMER_TBL

WHERE CUST_ID IN (SELECT O.CUST_ID

FROM ORDERS_TBL O, PRODUCTS_TBL P WHERE O.PROD_ID = P.PROD_ID

AND O.QTY + P.COST < (SELECT SUM(COST) FROM

PRODUCTS_TBL));

CUST_ID CUST_NAME

--- --- 090 WENDY WOLF

232 LESLIE GLEASON 287 GAVINS PLACE 43 SCHYLERS NOVELTIES 432 SCOTTYS MARKET 560 ANDYS CANDIES

Always Use a WHERE Clause

Do not forget the use of the WHEREclause with the UPDATEandDELETEstate- ments. All rows are updated or deleted from the target table if the WHEREclause is not used. You can utilize a SELECTstatement with the WHEREclause first to ensure that you are modifying the correct rows. See Hour 5, “Manipulating Data.”

Watch Out!

Six rows that meet the criteria of both subqueries were selected.

The following two examples show the results of each of the subqueries to aid your understanding of how the main query was resolved:

SELECT SUM(COST) FROM PRODUCTS_TBL;

SUM(COST) ---

138.08

1 row selected.

SELECT O.CUST_ID

FROM ORDERS_TBL O, PRODUCTS_TBL P WHERE O.PROD_ID = P.PROD_ID

AND O.QTY + P.COST > 138.08;

CUST_ID --- 43 287

2 rows selected.

In essence, the main query, after the substitution of the second subquery, is evaluated as shown in the following example:

SELECT CUST_ID, CUST_NAME FROM CUSTOMER_TBL

WHERE CUST_ID IN (SELECT O.CUST_ID

FROM ORDERS_TBL O, PRODUCTS_TBL P WHERE O.PROD_ID = P.PROD_ID

AND O.QTY + P.COST > 138.08);

The following shows how the main query is evaluated after the substitution of the first subquery:

SELECT CUST_ID, CUST_NAME FROM CUSTOMER_TBL

WHERE CUST_ID IN (287,43);

The following is the final result:

Correlated Subqueries 233

CUST_ID CUST_NAME

--- --- 43 SCHYLERS NOVELTIES 287 GAVINS PLACE 2 rows selected.

Multiple Subqueries Can Cause Problems

The use of multiple subqueries results in slower response time and might result in reduced accuracy of the results due to possible mistakes in the statement coding.

Watch Out!

Correlated Subqueries

Correlated subqueriesare common in many SQL implementations. The con- cept of correlated subqueries is discussed as an ANSI-standard SQL topic and is covered briefly in this hour. A correlated subquery is a subquery that is dependent upon information in the main query. This means that tables in a subquery can be related to tables in the main query.

In the following example, the table join betweenCUSTOMER_TBLand ORDERS_TBLin the subquery is dependent on the alias forCUSTOMER_TBL (C) in the main query. This query returns the name of all customers who have ordered more than 10 units of one or more items.

SELECT C.CUST_NAME FROM CUSTOMER_TBL C

WHERE 10 < (SELECT SUM(O.QTY) FROM ORDERS_TBL O

WHERE O.CUST_ID = C.CUST_ID);

CUST_NAME

--- SCOTTYS MARKET SCHYLERS NOVELTIES MARYS GIFT SHOP 3 rows selected.

You can extract and slightly modify the subquery from the previous state- ment in the next statement to show you the total quantity of units ordered for each customer, allowing the previous results to be verified:

SELECT C.CUST_NAME, SUM(O.QTY) FROM CUSTOMER_TBL C,

ORDERS_TBL O

CUST_NAME SUM(O.QTY) --- --- ANDYS CANDIES 1

GAVINS PLACE 10 LESLIE GLEASON 1 MARYS GIFT SHOP 100 SCHYLERS NOVELTIES 25 SCOTTYS MARKET 20 WENDY WOLF 2 7 rows selected.

TheGROUP BYclause in this example is required because another column is being selected with the aggregate function SUM. This gives you a sum for each customer. In the original subquery, a GROUP BYclause is not required becauseSUMachieves a total for the entire query, which is run against the record for each customer.

Subquery Performance

Subqueries do have performance implications when used within a query.

You must consider those implications prior to implementing them in a pro- duction environment. Consider that a subquery must be evaluated prior to the main part of the query, so the time that it takes to execute the subquery has a direct effect on the time it takes for the main query to execute. Let’s look at our previous example:

Proper Use of Correlated Subqueries

In the case of a correlated subquery, you must reference the table in the main query before you can resolve the subquery.

By the Way

SELECT CUST_ID, CUST_NAME FROM CUSTOMER_TBL

WHERE CUST_ID IN (SELECT O.CUST_ID

FROM ORDERS_TBL O, PRODUCTS_TBL P WHERE O.PROD_ID = P.PROD_ID

AND O.QTY + P.COST < (SELECT SUM(COST) FROM

PRODUCTS_TBL));

Imagine what would happen if PRODUCTS_TBLcontained a couple thousand product lines and ORDERS_TBLcontained a few million lines of customer orders. The resulting effect of having to do a SUMacrossPRODUCTS_TBLand then join it with ORDERS_TBLcould slow the query down quite considerably.

So always remember to evaluate the effect that using a subquery has on

Q&A 235

performance when deciding on a course of action to take for getting infor- mation out of the database.

Summary

By simple definition and general concept, a subquery is a query that is per- formed within another query to place further conditions on a query. You can use a subquery in an SQL statement’s WHEREclause or HAVINGclause.

Queries are typically used within other queries (Data Query Language), but you can also use them in the resolution of DML statements such as INSERT, UPDATE, and DELETE. All basic rules for DML apply when using subqueries with DML commands.

The subquery’s syntax is virtually the same as that of a standalone query, with a few minor restrictions. One of these restrictions is that you cannot use the ORDER BYclause within a subquery; you can use a GROUP BYclause, however, which renders virtually the same effect. Subqueries are used to place conditions that are not necessarily known for a query, providing more power and flexibility with SQL.

Q&A

Q. In the examples of subqueries, I noticed quite a bit of indentation. Is this nec- essary in the syntax of a subquery?

A. Absolutely not. The indentation is used merely to break the statement into separate parts, making the statement more readable and easier to follow.

Q. Is there a limit on the number of embedded subqueries that can be used in a single query?

A. Limitations such as the number of embedded subqueries allowed and the number of tables joined in a query are specific to each implemen- tation. Some implementations might not have limits, although the use of too many embedded subqueries could drastically hinder SQL statement performance. Most limitations are affected by the actual hardware, CPU speed, and system memory available, although there are many other considerations.

Q. It seems that debugging a query with subqueries can prove to be confusing, especially with embedded subqueries. What is the best way to debug a query

A. The best way to debug a query with subqueries is to evaluate the query in sections. First evaluate the lowest-level subquery, and then work your way to the main query (the same way the database evaluates the query). When you evaluate each subquery individually, you can sub- stitute the returned values for each subquery to check your main query’s logic. An error with a subquery often results from the use of the operator that evaluates the subquery, such as (=),IN,>,<, and so on.

Workshop

The following workshop is composed of a series of quiz questions and practi- cal exercises. The quiz questions are designed to test your overall under- standing of the current material. The practical exercises are intended to afford you the opportunity to apply the concepts discussed during the cur- rent hour, as well as build upon the knowledge acquired in previous hours of study. Please take time to complete the quiz questions and exercises before continuing. Refer to Appendix C, “Answers to Quizzes and Exercises,”

for answers.

Quiz

1. What is the function of a subquery when used with a SELECT statement?

2. Can you update more than one column when using the UPDATE statement in conjunction with a subquery?

3. Do the following have the correct syntax? If not, what is the correct syntax?

a.

SELECT CUST_ID, CUST_NAME FROM CUSTOMER_TBL WHERE CUST_ID =

(SELECT CUST_ID

FROM ORDERS_TBL

WHERE ORD_NUM = ‘16C17’);

b.

SELECT EMP_ID, SALARY FROM EMPLOYEE_PAY_TBL WHERE SALARY BETWEEN ‘20000’

AND (SELECT SALARY FROM EMPLOYEE_ID

WHERE SALARY = ‘40000’);

Workshop 237

c.

UPDATE PRODUCTS_TBL SET COST = 1.15 WHERE CUST_ID =

(SELECT CUST_ID FROM ORDERS_TBL

WHERE ORD_NUM = ‘32A132’);

4. What would happen if you ran the following statement?

DELETE FROM EMPLOYEE_TBL WHERE EMP_ID IN

(SELECT EMP_ID

FROM EMPLOYEE_PAY_TBL);

Exercises

1. Write the SQL code for the requested subqueries, and compare your results to ours. Use the following tables to complete the exercises:

EMPLOYEE_TBL

EMP_ID VARCHAR(9) NOT NULL primary key LAST_NAME VARCHAR(15) NOT NULL

FIRST_NAME VARCHAR(15) NOT NULL MIDDLE_NAME VARCHAR(15)

ADDRESS VARCHAR(30) NOT NULL CITY VARCHAR(15) NOT NULL STATE VARCHAR(2) NOT NULL ZIP INTEGER(5) NOT NULL PHONE VARCHAR(10)

PAGER VARCHAR(10) EMPLOYEE_PAY_TBL

EMP_ID VARCHAR(9) NOT NULL primary key POSITION VARCHAR(15) NOT NULL

DATE_HIRE DATETIME

PAY_RATE DECIMAL(4,2) NOT NULL DATE_LAST_RAISE DATETIME

CONSTRAINT EMP_FK FOREIGN KEY (EMP_ID_ REFERENCES EMPLOYEE_TBL (EMP_ID)

CUSTOMER_TBL

CUST_ID VARCHAR(10) NOT NULL primary key CUST_NAME VARCHAR(30) NOT NULL

CUST_ADDRESS VARCHAR(20) NOT NULL CUST_CITY VARCHAR(15) NOT NULL CUST_STATE VARCHAR(2) NOT NULL CUST_ZIP INTEGER(5) NOT NULL CUST_PHONE INTEGER(10)

Một phần của tài liệu 780 sams teach yourself SQL in 24 hours, 5th edition (Trang 242 - 272)

Tải bản đầy đủ (PDF)

(497 trang)