CHAPTER 35 Understanding Query Optimization

Query Plan Caching

SQL Server 2008 has a pool of memory used to store both execution plans and data. The amount of memory allocated to execution plans or data changes dynamically, depending on the needs of the system. The portion of memory used to store execution plans is often referred to as the plan cache.

The first time a cacheable query is submitted to SQL Server, the query plan is compiled and put into the plan cache. Query plans are read-only, re-entrant structures shared by multiple users. At most, there are two instances of a query plan in the plan cache at any time: a serial execution plan and a parallel execution plan. The same parallel execution plan is used for all parallel executions, regardless of the degree of parallelism.

When you execute subsequent SQL statements, the Database Engine first checks to see whether an existing execution plan for the same SQL statement already resides in the plan cache. If it finds one, SQL Server attempts to reuse the matching execution plan, thereby saving the overhead of having to recompile an execution plan for each ad hoc SQL statement issued. If no matching execution plan is found, SQL Server is forced to generate a new execution plan for the query.

The ability to reuse query plans for ad hoc queries, in addition to caching query plans for stored procedures, can help improve performance for complex queries that are executed frequently, because SQL Server can avoid compiling a query plan on every execution when a matching plan is already in memory.

Query Plan Reuse

Query plan reuse for stored procedures is straightforward. The whole idea behind stored procedures is to promote plan reuse. For stored procedures and triggers, plan reuse is simply based on the procedure or trigger name. The first time a stored procedure is executed, the query plan is generated based on the initial parameters.
On subsequent executions, SQL Server checks the plan cache to see whether a query plan exists for a procedure with the same name, and if one is found, it simply substitutes the new parameter values into the existing query plan for execution.

Another method that promotes query plan reuse is using the sp_executesql stored procedure for executing dynamic SQL statements. When using sp_executesql, you typically specify a dynamic query with explicitly identified parameters for SARGs (search arguments). Here’s an example:

    sp_executesql N'select t.title, pubdate
        from bigpubs2008.dbo.authors a
        join bigpubs2008.dbo.titleauthor ta on a.au_id = ta.au_id
        join bigpubs2008.dbo.titles t on ta.title_id = t.title_id
        where a.au_lname = @name',
        N'@name varchar(30)', 'Smith'

When the same query is executed again via sp_executesql, SQL Server reuses the existing query plan (if it is still in the plan cache) and simply substitutes the different parameter values.

Although SQL Server can also match query plans for ad hoc SQL statements, there are some limitations as to when a plan can be reused. For SQL Server to match SQL statements to existing execution plans in the plan cache for ad hoc queries, all object references in the query must be qualified with at least the schema name, and fully qualified object names (database plus schema name) provide increased likelihood of plan reuse. In addition, plan caching for ad hoc queries requires an exact text match between the queries. The text match is both case sensitive and space sensitive.
For example, the following two queries are logically identical, but because they are not textually identical (here they differ only in white space and line breaks), they would not share the same query plan:

    select a.au_lname, t.title, pubdate
    from authors a
    join titleauthor ta on a.au_id = ta.au_id
    join titles t on ta.title_id = t.title_id

    select a.au_lname, t.title, pubdate
    from authors a join titleauthor ta on a.au_id = ta.au_id
    join titles t on ta.title_id = t.title_id

Another factor that can prevent query plan reuse by matching queries is differences in certain SET options, database options, or configuration options in effect for the user session when the query is invoked. For example, a query might optimize differently for one session if the ANSI_NULLS option is turned on than it would if it were turned off. The following SET options must match for a query plan to be reused by a session:

- ANSI_PADDING
- FORCEPLAN
- CONCAT_NULL_YIELDS_NULL
- ANSI_WARNINGS
- ANSI_NULLS
- QUOTED_IDENTIFIER
- ANSI_NULL_DFLT_ON
- ANSI_NULL_DFLT_OFF

If any one of these settings does not match the setting for a cached plan, the session generates a new query plan. Likewise, if the session is using a different language or DATEFORMAT setting than that used by a cached plan, it needs to generate a new execution plan. As you can see, sometimes fairly subtle differences can prevent plan reuse.

Simple Query Parameterization

For certain simple queries executed without parameters, SQL Server 2008 automatically replaces constant literal values with parameters and compiles the query plan. This simple parameterization of the query plan increases the possibility of query plan matching for subsequent queries. If a subsequent query differs in only the values of the constants, it matches with the parameterized query plan and reuses the query plan.
Consider this query:

    SELECT * FROM AdventureWorks.Production.Product
    WHERE ProductSubcategoryID = 1

The search value 1 at the end of the statement can be treated like a parameter. When the query plan is generated for this query, the Query Optimizer replaces the search value with a placeholder parameter, such as @p1. This process is called simple parameterization. Using simple parameterization, SQL Server 2008 recognizes that the following statement is identical to the first except for the search value of 9 and would generate essentially the same execution plan:

    SELECT * FROM AdventureWorks.Production.Product
    WHERE ProductSubcategoryID = 9

This query will reuse the query plan generated by the first query.

NOTE
You can determine whether simple parameterization has been used for a query by examining the query plan information for the query. If the query plan information contains such placeholders as @p1 and @p2 in the search predicates when literal values were specified in the actual query, you know simple parameterization has been applied for the query. You can see an example of this in Figure 35.13, where parameters were substituted in the query plan for the search arguments against qty and ord_date.

Query Plan Aging

A query plan is saved in cache along with a cost factor that reflects the cost of actually creating the plan when compiling the query. For ad hoc query plans, SQL Server sets the cost to 0, which indicates that the plan can be removed from the plan cache immediately if space is needed for other plans. For other query plans, such as for a stored procedure, the query plan cost is a measure of the amount of resources required to produce it. This cost is calculated in “number of ticks.” The maximum plan cost is 31.
The plan cost is determined as follows:

- Every 2 I/Os required by the plan = 1 tick (with a maximum of 19 ticks)
- Every 2 context switches in the plan = 1 tick (with a maximum of 8 ticks)
- Every 16 pages (128KB) of memory required for the plan = 1 tick (with a maximum of 4 ticks)

These three maximums (19 + 8 + 4) add up to the overall maximum plan cost of 31.

All reusable query plans remain in cache until space is needed in the plan cache for a new plan. When space is needed, SQL Server removes the oldest unused execution plan from the plan cache that has the lowest plan cost.

As plans age in cache, the plan cost is not decremented until the size of the plan cache reaches 50% of the buffer pool size. When this occurs, the next access of the plan cache results in the plan cost for all query plans being decremented by 1. As plans reside in the plan cache over a period of time and are not reused, they eventually reach a plan cost of 0 and thus become eligible to be removed from cache the next time plan cache space is needed. However, when a query plan is reused, its plan cost is reset back to its initial value. This helps ensure that frequently accessed query plans remain in the plan cache.

Recompiling Query Plans

Certain changes in a database over time can cause existing execution plans to become either inefficient or invalid, based on the new state of the database. SQL Server detects the changes that invalidate an execution plan and marks the plan as not valid. A new plan is then automatically compiled the next time the query that uses that plan is invoked. Most query plan recompilations are required either for statement correctness or to obtain potentially faster query execution plans. The types of conditions that can invalidate a query plan include the following:

- Modifications made to the definition of a table or view referenced by the query using ALTER TABLE and ALTER VIEW
- Changes made to any indexes used by the execution plan
- Updates to the statistics used by the execution plan, via either the UPDATE STATISTICS command or automatically
- Dropping of an index or indexed view used by the execution plan
- Execution of sp_recompile on a table referenced by the query plan
- Large numbers of changes to keys (generated by INSERT or DELETE statements from other users that modify a table referenced by the query)
- Adding or dropping a trigger on a table
- Significant growth in the number of rows in the inserted or deleted tables within a trigger defined on a table referenced in the query plan
- Execution of a stored procedure with the WITH RECOMPILE option specified

To avoid the unnecessary recompilation of statements that do not require it, SQL Server 2008 performs statement-level recompilation: only the statement inside the batch or stored procedure that requires recompilation is recompiled. Statement-level recompilation helps improve query performance because, in most cases, only a small number of statements within a batch or stored procedure cause recompilations and their associated penalties, in terms of CPU time and locks. These penalties are therefore avoided for the other statements in the batch that do not have to be recompiled.

Forcing Query Plan Recompiles

If you suspect that a query plan that is being reused is not appropriate for the current execution of a query, you can also manually force the query plan to be recompiled for the query. This capability can be especially useful for parameterized queries. Query parameterization provides a performance benefit by minimizing compilation overhead, but a parameterized query often provides less specific costing information to the Query Optimizer and can result in the creation of a more general plan, which can be less efficient than a more specific plan created for a specific set of literal values.
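As a minimal sketch of forcing a recompile for a single execution, the statement below attaches the RECOMPILE query hint so that SQL Server compiles a fresh plan for this execution rather than reusing a cached one. The variable name and its value are hypothetical; the table is the AdventureWorks table used in the simple parameterization example.

```sql
-- Hypothetical search value; any representative or nonrepresentative value could be used.
DECLARE @SubcategoryID int;
SET @SubcategoryID = 9;

SELECT ProductID, Name
FROM AdventureWorks.Production.Product
WHERE ProductSubcategoryID = @SubcategoryID
OPTION (RECOMPILE);  -- compile a new plan for this execution only
```

The hint trades extra compilation cost on every execution for a plan built for the current execution, so it is generally worth considering only for queries whose best plan varies noticeably with the search values.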
If the initial parameterized query plan generated for the query was not based on a representative set of parameters, or if you are invoking an instance of the query with a nonrepresentative set of search values, you might find it necessary to force the Query Optimizer to generate a new query plan. You can force query recompilation for a specific execution of a query by specifying the RECOMPILE query hint. For more information on specifying the RECOMPILE query hint, see the “Managing the Optimizer” section, later in this chapter.

Monitoring the Plan Cache

You can view information about the query plans currently in plan cache memory by using some of the DMVs available in SQL Server 2008. Following are some of the useful ones related to monitoring the plan cache:

- sys.dm_exec_cached_plans: Returns general information about the query execution plans currently in the plan cache.
- sys.dm_exec_query_stats: Returns aggregate performance statistics for cached query plans.
- sys.dm_exec_sql_text: Returns the text of the SQL statement for a specified plan handle.
- sys.dm_exec_cached_plan_dependent_objects: Returns one row for every dependent object of a compiled plan.
- sys.dm_exec_plan_attributes: Returns one row per attribute associated with the plan for a specified plan handle.

sys.dm_exec_cached_plans

The sys.dm_exec_cached_plans DMV provides information on all the execution plans currently in the plan cache. Because the cache can have a large number of plans, you usually want to limit the results returned from sys.dm_exec_cached_plans by using a filter on the cacheobjtype column and also using the TOP clause. For example, the query shown in Listing 35.1 returns the top 10 compiled plans currently in the plan cache, sorted in descending order by the number of times the plan has been reused (usecounts).
LISTING 35.1 Returning the Top 10 Compiled Plans, by Usage Count

    select top 10 objtype, usecounts, size_in_bytes, plan_handle
    from sys.dm_exec_cached_plans
    where cacheobjtype = 'Compiled Plan'
    order by usecounts desc
    go

    objtype   usecounts  size_in_bytes  plan_handle
    Prepared  127        65536          0x06000100962E9C11B820A207000000000000000000000000
    Adhoc     110        49152          0x06000100804AD300B8E02D0C000000000000000000000000
    Adhoc     40         16384          0x060001006CC40F18B860D80A000000000000000000000000
    Adhoc     26         8192           0x0600040023900901B820A106000000000000000000000000
    Adhoc     26         8192           0x060004003E77102CB8E0A306000000000000000000000000
    Proc      17         8192           0x05000400F578A275B8405F07000000000000000000000000
    Adhoc     17         8192           0x06000400EBC44D2AB880A006000000000000000000000000
    Adhoc     15         8192           0x060001001AF2320BB8801A08000000000000000000000000
    Proc      12         212992         0x05000400744F1F67B8604F0E000000000000000000000000
    Proc      12         49152          0x050004006A934A11B8C0550E000000000000000000000000

The types of plans in the plan cache are listed under the cacheobjtype column and can be any of the following:

- Compiled Plan: The actual compiled plan generated that can be shared by sessions running the same procedure or query.
- Compiled Plan Stub: A small, compiled plan stub generated when a batch is compiled for the first time and the Optimize for Ad Hoc Workloads option is enabled. It helps to relieve memory pressure by not allowing the plan cache to become filled with compiled plans that are not reused.
- Executable Plan: The actual execution plan and the environment settings for the session that ran the compiled plan. Caching the environment settings for an execution plan makes subsequent executions more efficient. Each concurrent execution of the same compiled plan will have its own executable plan. All executable plans are associated with a compiled plan having the same plan_handle, but not all compiled plans have an associated executable plan.
- Parse Tree: The internal parsed form of a query generated before compilation and optimization.
- CLR Compiled Func: Execution plan for a CLR-based function.
- CLR Compiled Proc: Execution plan for a CLR-based procedure.
- Extended proc: The cached information for an extended stored procedure.

The type of object or query for which a plan is cached is stored in the objtype column. This column can contain one of the following values:

- Proc: The cached plan is for a stored procedure or inline function.
- Prepared: The cached plan is for queries submitted using sp_executesql or for queries using the prepare and execute method.
- Adhoc: The cached plan is for queries that don’t fall into any other category.
- ReplProc: The cached plan is for replication agents.
- Trigger: The cached plan is for a trigger.
- View: The cached plan is for a view or a noninline function. You typically see a parse tree only for a view or noninline function, not a compiled plan. The view or function typically does not have its own separate plan because it is expanded as part of another query.
- UsrTab or SysTab: The cached plan is for a user or system table that has computed columns. This is typically associated with a parse tree.
- Default, Check, or Rule: The cached plan is simply a parse tree for these types of objects because they are expanded as part of another query in which they are applied.

To determine how often a plan is being reused, you can examine the value in the usecounts column. The usecounts value is incremented each time the cached plan is looked up and reused.

sys.dm_exec_sql_text

Overall, the information returned by sys.dm_exec_cached_plans is not overly useful unless you know what queries or stored procedures these plans refer to. You can view the SQL text of these query plans by writing a query that joins sys.dm_exec_cached_plans with the sys.dm_exec_sql_text DMV.
For example, you can use the query shown in Listing 35.2 to return the SQL text for the top 10 largest ad hoc query plans currently in the plan cache.

LISTING 35.2 Returning the Top 10 Largest Ad Hoc Query Plans

    select top 10 objtype, usecounts, size_in_bytes, plan_handle,
           -- the following removes newline and carriage return from the sql text
           replace(replace(text, char(13), ' '), char(10), ' ') as sqltext
    from sys.dm_exec_cached_plans as p
    cross apply sys.dm_exec_sql_text(p.plan_handle)
    where cacheobjtype = 'Compiled Plan'
      and objtype = 'Adhoc'
    order by size_in_bytes desc, usecounts desc

sys.dm_exec_query_stats

The plan cache also keeps track of useful statistics about each cached plan, such as the amount of CPU or the number of reads and writes performed by the query plan since it was placed into the plan cache. This information can be examined using the sys.dm_exec_query_stats DMV, which returns statistics for each statement in a stored procedure or a SQL batch. To provide statistics for the procedure or batch as a whole, you need to summarize the data. Listing 35.3 provides a sample query that returns the I/O, CPU, and elapsed time statistics for the 10 most recently executed stored procedures.
LISTING 35.3 Returning Query Plan Stats for the 10 Most Recently Executed Procedures

    select top 10 usecounts, size_in_bytes,
           max(last_execution_time) as last_execution_time,
           sum(total_logical_reads) as total_logical_reads,
           sum(total_physical_reads) as total_physical_reads,
           -- worker and elapsed times are stored in microseconds; divide by 1000 for milliseconds
           sum(total_worker_time/1000) as total_CPU_time,
           sum(total_elapsed_time/1000) as total_elapsed_time,
           -- strip everything up to and including 'create procedure' to leave the procedure name and body
           replace(substring(text, patindex('%create procedure%', text), datalength(text)),
                   'create procedure', '') as procname
    from sys.dm_exec_query_stats s
    join sys.dm_exec_cached_plans p on s.plan_handle = p.plan_handle
    cross apply sys.dm_exec_sql_text(p.plan_handle) as st
    where p.objtype = 'Proc'
      and p.cacheobjtype = 'Compiled Plan'
    group by usecounts, size_in_bytes, text
    order by max(last_execution_time) desc

Table 35.1 describes some of the most useful columns returned by the sys.dm_exec_query_stats DMV.

TABLE 35.1 Description of Columns for sys.dm_exec_query_stats

    Column Name             Description
    statement_start_offset  The starting position of the query that the row describes within the text of its batch or stored procedure, indicated in bytes, beginning with 0.
    statement_end_offset    The ending position of the query that the row describes within the text of its batch or stored procedure. A value of -1 indicates the end of the batch.
    plan_generation_num     The number of times the plan has been recompiled while it has remained in the cache.
    plan_handle             A pointer to the plan. This value can be passed to the sys.dm_exec_query_plan dynamic management function.
    creation_time           The time the plan was compiled.
    last_execution_time     The last time the plan was executed.
    execution_count         The number of times the plan has been executed since it was last compiled.
    total_worker_time       The total amount of CPU time, in microseconds, consumed by executions of this plan for the statement.
    last_worker_time        The CPU time, in microseconds, consumed the last time the plan was executed.
    min_worker_time         The minimum CPU time, in microseconds, this plan has ever consumed during a single execution.
    max_worker_time         The maximum CPU time, in microseconds, this plan has ever consumed during a single execution.
    total_physical_reads    The total number of physical reads performed by executions of this plan since it was compiled.
    last_physical_reads     The number of physical reads performed the last time the plan was executed.
    min_physical_reads      The minimum number of physical reads this plan has ever performed during a single execution.
    max_physical_reads      The maximum number of physical reads this plan has ever performed during a single execution.
    total_logical_writes    The total number of logical writes performed by executions of this plan since it was compiled.
    last_logical_writes     The number of logical writes performed the last time the plan was executed.
    min_logical_writes      The minimum number of logical writes this plan has ever performed during a single execution.
    max_logical_writes      The maximum number of logical writes this plan has ever performed during a single execution.
    total_logical_reads     The total number of logical reads performed by executions of this plan since it was compiled.
    last_logical_reads      The number of logical reads performed the last time the plan was executed.
    min_logical_reads       The minimum number of logical reads this plan has ever performed during a single execution.
    max_logical_reads       The maximum number of logical reads this plan has ever performed during a single execution.
    total_elapsed_time      The total elapsed time, in microseconds, for completed executions of this plan.
    last_elapsed_time       The elapsed time, in microseconds, for the most recently completed execution of this plan.
    min_elapsed_time        The minimum elapsed time, in microseconds, for any completed execution of this plan.
    max_elapsed_time        The maximum elapsed time, in microseconds, for any completed execution of this plan.
    query_hash              The binary hash value calculated on the query and used to identify queries with similar logic.
    query_plan_hash         The binary hash value calculated on the query execution plan and used to identify similar query execution plans.

The query_hash and query_plan_hash values are new for SQL Server 2008. You can use these values to write queries that determine the aggregate resource usage for queries that differ only by literal values (the same query_hash) or for queries with similar execution plans (the same query_plan_hash).

For example, Listing 35.4 provides a query to find the query_hash and query_plan_hash values for queries that select from the titles table searching by ytd_sales. Looking at the results, you can see that even with different search arguments, each of the matching queries generates the same query hash value, but queries that use different query plans have different query plan hash values.

LISTING 35.4 Returning Query and Query Plan Hash Values for a Query

    SELECT convert(varchar(41), substring(st.text, 1, 42)) AS 'Query Text',
           qs.query_hash AS 'Query Hash',
           qs.query_plan_hash AS 'Query Plan Hash'
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
    WHERE st.text like 'SELECT * from titles where ytd_sales%'
    go
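Building on the query_hash and query_plan_hash columns, the following is a minimal sketch of how aggregate resource usage might be rolled up across queries that differ only by literal values. The column aliases, the TOP 10 limit, and the ordering by CPU time are illustrative assumptions and are not taken from the chapter's listings.

```sql
-- Roll up cached statement statistics by query_hash so that statements that
-- differ only in their literal values are reported as one logical query.
SELECT TOP 10
       qs.query_hash,
       COUNT(*)                           AS cached_entries,
       COUNT(DISTINCT qs.query_plan_hash) AS distinct_plans,
       SUM(qs.execution_count)            AS total_executions,
       SUM(qs.total_worker_time) / 1000   AS total_cpu_ms,      -- microseconds to milliseconds
       SUM(qs.total_logical_reads)        AS total_logical_reads
FROM sys.dm_exec_query_stats AS qs
GROUP BY qs.query_hash
ORDER BY SUM(qs.total_worker_time) DESC;
```

A distinct_plans value greater than 1 for a given query_hash would suggest that logically similar statements are being compiled into different plans, which may be worth investigating as a candidate for explicit parameterization.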