Microsoft SQL Server 2008 R2 Unleashed - P134

CHAPTER 35 Understanding Query Optimization

FIGURE 35.29 A graphical execution plan of a query using parallel query techniques.

Common Query Optimization Problems

So you’ve written a query and examined the query plan, and performance isn’t what you expected. It might appear that SQL Server isn’t choosing the query plan that you expect. Is something wrong with the query or with the Query Optimizer? Before delving into a detailed discussion about how to debug and analyze query plans (covered in detail in Chapter 36), the following sections look at some of the most common problems and SQL coding issues that can lead to poor query plan selection.

Out-of-Date or Insufficient Statistics

Admittedly, having out-of-date or unavailable statistics is not as big a problem as it was in SQL Server releases prior to 7.0. Back in those days, the first question asked when someone was complaining of poor performance was, “When did you last update statistics?” If the answer was “Huh?” we usually found the culprit.

With the Auto-Update Statistics and Auto-Create Statistics features in SQL Server 2008, this problem is not as prevalent as it used to be. If a query detects that statistics are out of date or missing, it causes them to be updated or created and then optimizes the query plan based on the new statistics.

NOTE
If statistics are missing or out of date, the first running query that detects this condition might run a bit more slowly as it updates or creates the statistics first, especially if the table is relatively large, and also if it has been configured for FULLSCAN when indexes are updated. However, SQL Server 2008 provides the AUTO_UPDATE_STATISTICS_ASYNC database option. When this option is set to ON, queries do not wait for the statistics to be updated before compiling.
Instead, the out-of-date statistics are put on a queue for updating by a worker thread in a background process, and the query and any other concurrent queries compile immediately, using the existing out-of-date statistics. Although there is no delay for updated statistics, the out-of-date statistics may cause the Query Optimizer to choose a less efficient query plan, but the response times are more predictable. Any queries invoked after the updated statistics are ready will use the updated statistics in generating a query plan. This may cause the recompilation of any cached plans that depend on the older statistics.

You should consider setting the AUTO_UPDATE_STATISTICS_ASYNC option to ON when any of your applications have experienced client request timeouts caused by queries waiting for updated statistics or when it is acceptable for your application to run queries with less efficient query plans due to outdated statistics so that you can maintain predictable query response times.

You could have insufficient statistics to properly optimize a query if the sample size used when the statistics were generated wasn’t large enough. Depending on the nature of your data and size of the table, the statistics might not accurately reflect the actual data distribution and cardinality. If you suspect that this is the case, you can update statistics by specifying the FULLSCAN option or a larger sample size, so SQL Server examines more records to derive the statistics. For more information on understanding and managing index statistics, see Chapter 34.

Poor Index Design

Poor index design is another reason, often a primary reason, why queries might not optimize as you expect them to. If no supporting indexes exist for a query, or if a query contains SARGs that cannot be optimized effectively to use the available indexes, SQL Server ends up performing either a table scan, an index scan, or another hash or merge join strategy that is less efficient.
If this appears to be the problem, you need to reevaluate your indexing decisions or rewrite the query so it can take advantage of an available index. For more information on designing useful indexes, see Chapter 34.

Search Argument Problems

It’s the curse of SQL that there are a number of ways to write a query and get the same result set. Some queries, however, might not be as efficient as others. A good understanding of the Query Optimizer can help you avoid writing search arguments that SQL Server can’t optimize effectively. The following sections highlight some of the common “gotchas” encountered in SQL Server SARGs that can lead to poor or unexpected query performance.

Using Optimizable SARGs

As mentioned previously, in the section “Identifying Search Arguments,” the Query Optimizer uses search arguments to help it narrow down the set of rows to evaluate. The search argument is in the form of a WHERE clause that equates a column to a constant. The SARGs that optimize most effectively are those that compare a column with a constant value that is not an expression or a variable, and with no operation performed against the column itself. The following is an example:

    SELECT column1 FROM table1 WHERE column1 = 123

You should try to avoid using any negative logic in your SARGs (for example, !=, <>, not in) or performing operations on, or applying functions to, the columns in the SARG.

No SARGs

You need to watch out for queries in which the SARG might have been left out inadvertently, such as this:

    select title_id from titles

A SQL query with no search argument (that is, no WHERE clause) always performs a table or clustered index scan unless a nonclustered index can be used to cover the query. (See Chapter 34 for a discussion of index covering.)
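As a rough illustration of the guidance under Using Optimizable SARGs, the following sketch contrasts predicates that can be matched directly against an index with equivalent forms that generally cannot. It reuses the placeholder names table1 and column1 from the earlier example and assumes, purely for illustration, that column1 is an integer column with an index on it.

```sql
-- Assumption: table1.column1 is an indexed integer column (hypothetical schema).

-- Optimizable: the bare column is compared with a constant.
SELECT column1 FROM table1 WHERE column1 = 123;

-- Harder to optimize: an operation is performed on the column itself.
SELECT column1 FROM table1 WHERE column1 + 1 = 124;

-- Harder to optimize: a function is applied to the column.
SELECT column1 FROM table1 WHERE ABS(column1) = 123;

-- An equivalent form of the ABS() query that leaves the column untouched.
SELECT column1 FROM table1 WHERE column1 IN (123, -123);

-- Negative logic: typically matches most of the table, so a scan is likely.
SELECT column1 FROM table1 WHERE column1 <> 123;
```

Comparing the execution plans of the first and second queries is one way to check whether the rewrite matters for a particular table.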
If you don’t want the query to affect the entire table, you need to be sure to specify a valid SARG that matches an index on the table to avoid table scans.

Unknown Values in WHERE Clauses

You need to watch out for expressions in which the search value in the SARG cannot be evaluated until runtime. In these expressions, often the search value is a local variable or subquery that can be materialized to a single value. SQL Server treats these expressions as SARGs but can’t use the statistics histogram to estimate the number of matching rows because it doesn’t have a value to compare against the histogram values during query optimization. The values for the expressions aren’t known until the query is actually executed. In this situation, the Query Optimizer uses the index density information.

The Query Optimizer is generally able to better estimate the number of rows affected by a query when it can compare a known value against the statistics histogram than when it has to use the index density to estimate the average number of rows that match an unknown value. This is especially true if the data in a table isn’t distributed evenly. When you can, you should try to avoid using constant expressions that can’t be evaluated until runtime so that the statistics histogram can be used rather than the density value.

To avoid using constant expressions in WHERE clauses that can’t be evaluated until runtime, you should consider putting the queries into stored procedures and passing in the constant expression as a parameter. Because the Query Optimizer evaluates the value of a parameter prior to optimization, SQL Server evaluates the expression prior to optimizing the stored procedure.

For best results when writing queries inside stored procedures, you should use stored procedure parameters rather than local variables in your SARGs whenever possible.
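The difference between the two approaches can be sketched as follows. The procedure names are hypothetical, and the example assumes that the sales table has a qty column of type smallint with an index on it; adjust the names and data type to match your own schema.

```sql
-- Hypothetical procedure: the parameter is used directly in the SARG,
-- so its value is available when the plan is compiled.
CREATE PROCEDURE dbo.sales_by_qty_param
    @qty smallint
AS
    SELECT stor_id, ord_date, qty
    FROM sales
    WHERE qty = @qty;
GO

-- Hypothetical procedure: the parameter is copied into a local variable,
-- so the value in the SARG is unknown at optimization time.
CREATE PROCEDURE dbo.sales_by_qty_local
    @qty smallint
AS
    DECLARE @q smallint;
    SET @q = @qty;

    SELECT stor_id, ord_date, qty
    FROM sales
    WHERE qty = @q;
GO
```

Running both procedures with the same argument and comparing the estimated row counts in their execution plans should show whether the estimate came from the histogram or from a density-based average.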
This strategy allows the Query Optimizer to optimize the query by using the statistics histogram, comparing the parameter value against the statistics histogram to estimate the number of matching rows. If you use local variables as SARGs in stored procedures, the Query Optimizer is restricted to using index density, even if the local variable is assigned the value of a parameter.

Other types of constructs for which it is difficult for the Query Optimizer to accurately estimate the number of qualifying rows or the data distribution using the statistics histogram include aggregations in subqueries, scalar expressions, user-defined functions, and noninline table-valued functions.

Data Type Mismatches

Another common problem is data type mismatches. If you attempt to join tables on columns of different data types, the Query Optimizer might not be able to effectively use indexes to evaluate the join. This can result in a less efficient join strategy because SQL Server has to convert all values first before it can process the query. You should avoid this situation by maintaining data type consistency across the join key columns in your database.

Large Complex Queries

For complex queries with a large number of tables and join conditions, the number of possible execution plans can be enormous. The full optimization phase of the Query Optimizer has a time limit to restrict how long it spends analyzing all the possible query plans. There is no known general and effective shortcut to arrive at the optimal plan. To deal with such a large selection of plans, SQL Server 2008 implements a number of heuristics to handle very large queries and attempt to come up with an efficient query plan within the time available. When it is not possible to analyze the entire set of plan alternatives and the heuristics are applied, it is not uncommon to encounter suboptimal query plans being chosen.
When is your query large enough to be a concern? Answering this question is difficult because the answer depends on the number of tables involved, the form of filter and join predicates, and the operations performed. If a query involves more than 12 tables, it is likely that the Query Optimizer is having to rely on heuristics and shortcuts to generate a query plan and may miss some optimal strategies. In general, you get more optimal query plans if you can simplify your queries as much as possible.

Triggers

If you are using triggers on INSERT, UPDATE, or DELETE, it is possible that your triggers can cause performance problems. You might think that INSERT, UPDATE, or DELETE is performing poorly when actually it is the trigger that needs to be tuned. In addition, you might have triggers that fire other triggers. If you suspect that you are having performance problems with the triggers, you can monitor the SQL they are executing and the response time, as well as execution plans generated for statements within triggers using SQL Server Profiler. For more information on monitoring performance with SQL Server Profiler, see Chapter 6, “SQL Server Profiler.” You can also see the query plans for statements executed in triggers by using SSMS if you enable the Include Actual Execution Plan option. For more information on using SSMS to view and analyze query plans, see Chapter 36.

Managing the Optimizer

Because the Query Optimizer might sometimes make poor decisions as to how to best process a query, you need to know how and when you may need to override the Query Optimizer and force SQL Server to process a query in a specific manner. How often does SQL Server require manual intervention to execute a query optimally?
Considering the overwhelming number of query types and circumstances in which those queries are run, SQL Server does a surprisingly effective job of query optimization in most instances. For all but the most grueling, complex query operations, experience has shown that SQL Server’s Query Optimizer is quite clever, and very, very good at wringing the best performance out of any hardware platform. For this reason, you should treat the material covered in this chapter as a collection of techniques to be used only where other methods of getting optimal query performance have already failed.

Before indiscriminately applying the techniques discussed in this section, remember one very important point: use of these features can effectively hide serious fundamental design or coding flaws in your database, application, or queries. In fact, if you’re tempted to use these features (with a few moderate exceptions), it should serve as an indicator that the problems might lie elsewhere in the application or queries.

If you are satisfied that no such flaws exist and that SQL Server is choosing the wrong plan to optimize your query, you can use the methods discussed in this section to override two of the three most important decisions the Query Optimizer makes:

- Choosing which index, if any, to resolve the query
- Choosing the join strategy to apply in a multitable query

The other decision made by the Query Optimizer is the locking strategy to apply. Using table hints to override locking strategies is discussed in Chapter 37, “Locking and Performance.”

Throughout this and following sections, one point must remain clear in your mind: these options should be used only in exception cases to cope with specific optimization problems in specific queries in specific applications.
There are therefore no standard or global rules to follow because the application of these features, by definition, means that normal SQL Server behavior isn’t taking place.

The practical result of this idea is that you should test every option in your environment, with your data and your queries, and use the techniques and methods discussed in this chapter and the other performance-related chapters to optimize and fine-tune the performance of your queries. The fastest-performing query wins, so you shouldn’t be afraid to experiment with different alternatives, but you shouldn’t think that these statements and features are globally applicable or fit general categories of problems, either! There are, in fact, only three rules: Test, test, and test!

TIP
As a general rule, Query Optimizer and table hints should be used only as a last resort, when all other methods to get the Query Optimizer to generate a more efficient query plan have failed. Always try to find other ways to rewrite the queries to encourage the Query Optimizer to choose a better plan. This includes adding additional SARGs, replacing unknown values with known values in SARGs, breaking up queries, converting subqueries to joins or joins to subqueries, and so on. Essentially, you should try other coding variations on the query itself to get the same result in a different way and see whether one of the variations ends up using the more efficient query plan that you expect.

In reality, about the only time you should use these hints is when you’re testing the performance of a query and want to see if the Query Optimizer is actually choosing the best execution plan. You can enable the various query analysis options, such as STATISTICS PROFILE and STATISTICS IO, and then see how the query plan and statistics change as you apply various hints to the query.
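One way to run such a comparison is sketched below. It assumes the stores and sales tables from the bigpubs2008 sample database, with the column names used in this chapter’s join example; the choice of a LOOP join hint for the second query is arbitrary and serves only as something to compare against.

```sql
-- Turn on the analysis options for this session.
SET STATISTICS IO ON;
SET STATISTICS PROFILE ON;
GO

-- Baseline: let the Query Optimizer choose the join strategy.
SELECT st.stor_name, ord_date, qty
FROM stores st INNER JOIN sales s ON st.stor_id = s.stor_id
WHERE st.stor_id BETWEEN 'B100' AND 'B599';
GO

-- Same query with a forced join strategy (arbitrary choice, for comparison).
SELECT st.stor_name, ord_date, qty
FROM stores st INNER LOOP JOIN sales s ON st.stor_id = s.stor_id
WHERE st.stor_id BETWEEN 'B100' AND 'B599';
GO

-- Turn the options back off when the test is complete.
SET STATISTICS PROFILE OFF;
SET STATISTICS IO OFF;
GO
```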
You can examine the output to determine whether the I/O cost and/or runtime improves or gets worse if you force one index over another or if you force a specific join strategy or join order.

The problem with hard-coding table and Query Optimizer hints into application queries is that the hints prevent the Query Optimizer from modifying the query plan as the data in the tables changes over time. Also, if subsequent service packs or releases of SQL Server incorporate improved optimization algorithms or strategies, the queries with hard-coded hints will not be able to take advantage of them.

If you find that you must incorporate any of these hints to solve query performance problems, you should be sure to document which queries and stored procedures contain Query Optimizer and table hints. It’s a good idea to periodically go back and test the queries to determine whether the hints are still appropriate. You might find that, over time, as the data values in the table change, the forced query plan generated because of the hints is no longer the most efficient query plan, and the Query Optimizer now generates a more efficient query plan on its own.

Optimizer Hints

You can specify three types of hints in a query to override the decisions made by the Query Optimizer:

- Table hints
- Join hints
- Query hints

The following sections examine and describe each type of hint.

Forcing Index Selection with Table Hints

In addition to locking hints that can be specified for each table in a query, SQL Server 2008 allows you to provide table-level hints that enable you to specify the index SQL Server should use for accessing the table. The syntax for specifying an index hint is as follows:

    SELECT column_list FROM tablename WITH (INDEX (indid | index_name [, ...]))

This syntax allows you to specify multiple indexes. You can specify an index by name or by ID.
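The forms this syntax allows can be sketched as follows. The index names and IDs are the ones reported for the sales table in the sys.indexes output shown in Listing 35.6; which columns those indexes are built on is an assumption inferred from their names, and the filter values are arbitrary.

```sql
-- Force a nonclustered index by name.
SELECT stor_id, ord_date, qty
FROM sales WITH (INDEX (ord_date_idx))
WHERE ord_date >= '20080101';

-- Force the same index by ID (7 is the ID listed for ord_date_idx on sales).
SELECT stor_id, ord_date, qty
FROM sales WITH (INDEX (7))
WHERE ord_date >= '20080101';

-- List more than one index in the same hint.
SELECT ord_date, qty
FROM sales WITH (INDEX (ord_date_idx, qty_idx))
WHERE ord_date >= '20080101'
  AND qty > 100;

-- Index ID 0 requests a scan of the table.
SELECT stor_id, ord_date, qty
FROM sales WITH (INDEX (0))
WHERE qty > 100;
```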
It is recommended that you specify indexes by name because the IDs for nonclustered indexes can change if they are dropped and re-created in a different order than that in which they were created originally. You can specify an index ID of 0, or the table name itself, to force a table scan.

When you specify multiple indexes in the hint list, all the indexes listed are used to retrieve the rows from the table, forcing an index intersection or index covering via an index join. If the collection of indexes listed does not cover the query, a regular row fetch is performed after all the indexed columns are retrieved.

To get a list of indexes on a table, you can use sp_helpindex. However, the stored procedure doesn’t display the index ID. To get a list of all user-defined tables and the names of the indexes defined on them, you can execute a query against the sys.indexes catalog view similar to the one shown in Listing 35.6, which was run against the bigpubs2008 database.

LISTING 35.6 Query Against sys.indexes Catalog View to Get Index Names and IDs

    select 'Table name' = convert(char(20), object_name(object_id)),
           'Index name' = convert(char(30), name),
           'Index ID' = index_id,
           'Index Type' = convert(char(15), type_desc)
    from sys.indexes
    where object_id > 99 -- only system tables have id less than 99
      and index_id between 1 and 254
      /* do not include rows for text columns or tables without a clustered index */
      /* do not include auto statistics */
      and is_hypothetical = 0
      and objectproperty(object_id, 'IsUserTable') = 1
    order by 1, 3
    go

    Table name     Index name                  Index ID  Index Type
    authors        UPKCL_auidind               1         CLUSTERED
    authors        aunmind                     2         NONCLUSTERED
    employee       employee_ind                1         CLUSTERED
    employee       PK_emp_id                   2         NONCLUSTERED
    jobs           PK__jobs__job_id__25319086  1         CLUSTERED
    PARTS          PK__PARTS__09746778         1         CLUSTERED
    PARTS          UQ__PARTS__0A688BB1         2         NONCLUSTERED
    pub_info       UPKCL_pubinfo               1         CLUSTERED
    publishers     UPKCL_pubind                1         CLUSTERED
    roysched       titleidind                  2         NONCLUSTERED
    sales          UPKCL_sales                 1         CLUSTERED
    sales          titleidind                  2         NONCLUSTERED
    sales          ord_date_idx                7         NONCLUSTERED
    sales          qty_idx                     8         NONCLUSTERED
    sales_big      ci_sales_big                1         CLUSTERED
    sales_big      idx1                        2         NONCLUSTERED
    sales_noclust  idx1                        2         NONCLUSTERED
    sales_noclust  ord_date_idx                3         NONCLUSTERED
    sales_noclust  qty_idx                     4         NONCLUSTERED
    stores         UPK_storeid                 1         CLUSTERED
    stores         nc1_stores                  2         NONCLUSTERED
    titleauthor    UPKCL_taind                 1         CLUSTERED
    titleauthor    auidind                     2         NONCLUSTERED
    titleauthor    titleidind                  3         NONCLUSTERED
    titles         UPKCL_titleidind            1         CLUSTERED
    titles         titleind                    2         NONCLUSTERED
    titles         ytd_sales_filtered          11        NONCLUSTERED

SQL Server 2008 introduces the new FORCESEEK table hint, which provides an additional query optimization option. This hint specifies that the Query Optimizer use only an index seek operation as the access path to the data in the table or view referenced in the query rather than an index or table scan. If a query plan contains table or index scan operators, forcing an index seek operation may yield better query performance. This is especially true when inaccurate cardinality or cost estimations cause the optimizer to favor scan operations at plan compilation time.

Before using the FORCESEEK table hint, you should make sure that statistics on the table are current and accurate. Also, you should evaluate the query for items that can cause poor cardinality or cost estimates and remove these items if possible. For example, replace local variables with parameters or literals and limit the use of multistatement table-valued functions and table variables in the query. Also, be aware that if you specify the FORCESEEK hint in addition to an index hint, the FORCESEEK hint can cause the optimizer to use an index other than one specified in the index hint.

Forcing Join Strategies with Join Hints

Join hints let you force the type of join that should be used between two tables.
The join hints correspond with the three types of join strategies:

- LOOP
- MERGE
- HASH

You can specify join hints only when you use the ANSI-style join syntax, that is, when you actually use the keyword JOIN in the query. The hint is specified between the type of join and the keyword JOIN, which means you can’t leave out the keyword INNER for an inner join. Thus, the syntax for the FROM clause when using join hints is as follows:

    FROM table1 {INNER | OUTER} [LOOP | MERGE | HASH] JOIN table2

The following example forces SQL Server to use a hash join:

    select st.stor_name, ord_date, qty
    from stores st INNER HASH JOIN sales s on st.stor_id = s.stor_id
    where st.stor_id between 'B100' and 'B599'

You can also specify a global join hint for all joins in a query by using a query processing hint.

Specifying Query Processing Hints

SQL Server 2008 enables you to specify additional query hints to control how your queries are optimized and processed. You specify query hints at the end of a query by using the OPTION keyword. There can be only one OPTION clause per query, but you can specify multiple hints in an OPTION clause, as shown in the following syntax:

    OPTION (hint1 [, hintn])

Query hints are grouped into four categories: GROUP BY, UNION, join, and miscellaneous.

GROUP BY Hints

GROUP BY hints specify how GROUP BY or COMPUTE operations should be performed. The following GROUP BY hints can be specified:

- HASH GROUP: This option forces the Query Optimizer to use a hashing function to perform the GROUP BY operation.
- ORDER GROUP: This option forces the Query Optimizer to use a sorting operation to perform the GROUP BY operation.

Only one GROUP BY hint can be specified at a time.

UNION Hints

The UNION hints specify how UNION operations should be performed. The following UNION hints can be specified:

- MERGE UNION: This option forces the Query Optimizer to use a merge operation to perform the UNION operation.
- HASH UNION: This option forces the Query Optimizer to use a hash operation to perform the UNION operation.
- CONCAT UNION: This option forces the Query Optimizer to use the concatenation method to perform the UNION operation.

Only one UNION hint can be specified at a time, and it must come after the last query in the UNION. The following is an example of forcing concatenation for a UNION:

    select stor_id from sales where stor_id like 'B19%'
    UNION
    select title_id from titles where title_id like 'C19%'
    OPTION (CONCAT UNION)

Join Hints

The join hint specified in the OPTION clause specifies that all join operations in the query are performed as the type of join specified in the hint. The join hints that can be specified as query hints correspond to the join hints that can be specified in the FROM clause:

- LOOP JOIN
- MERGE JOIN
- HASH JOIN

If you also specify a join hint for a specific pair of tables, the table-level hints specified must be compatible with the query-level join hint.

Miscellaneous Hints

The following miscellaneous hints can be used to override various query operations:

- FORCE ORDER: This option tells the Query Optimizer to join the tables in the order in which they are listed in the FROM clause and not to determine the optimal join order.
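The query hints described in this section can be sketched as follows, reusing the stores and sales query from the join hint example. Treat these as starting points for testing rather than recommended settings; the aggregate query is an illustrative assumption and is not taken from the chapter’s listings.

```sql
-- Query-level join hint: every join in the query is performed as a hash join.
select st.stor_name, ord_date, qty
from stores st INNER JOIN sales s on st.stor_id = s.stor_id
where st.stor_id between 'B100' and 'B599'
OPTION (HASH JOIN);

-- FORCE ORDER: join the tables in the order listed in the FROM clause.
select st.stor_name, ord_date, qty
from stores st INNER JOIN sales s on st.stor_id = s.stor_id
where st.stor_id between 'B100' and 'B599'
OPTION (FORCE ORDER);

-- Several hints combined in the single OPTION clause allowed per query.
select st.stor_name, ord_date, qty
from stores st INNER JOIN sales s on st.stor_id = s.stor_id
where st.stor_id between 'B100' and 'B599'
OPTION (HASH JOIN, FORCE ORDER);

-- GROUP BY hint: use hashing to perform the aggregation (illustrative query).
select stor_id, sum(qty) as total_qty
from sales
group by stor_id
OPTION (HASH GROUP);
```

If a combination of hints cannot produce a valid plan, SQL Server raises an error instead of ignoring the hints, which is one more reason to test each variation before it reaches application code.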
