Nielsen c21.tex V4 - 07/23/2009 4:48pm Page 572

Part IV Developing with SQL Server

Examining SQL Server with Code

One of the benefits of using SQL Server is the cool interface it offers to develop and administer the database. Management Studio is great for graphically exploring a database; T-SQL code, while more complex, exposes even more detail within a programmer's environment.

Dynamic Management Views

Introduced in SQL Server 2005, dynamic management views (DMVs) offer a powerful view into the structure of SQL Server and the databases, as well as the current SQL Server status (memory, IO, etc.).

As an example of using DMVs, the next query looks at three DMVs concerning objects and primary keys:

    SELECT s.NAME + '.' + o2.NAME AS 'Table', pk.NAME AS 'Primary Key'
      FROM sys.key_constraints AS pk
        JOIN sys.objects AS o
          ON pk.OBJECT_ID = o.OBJECT_ID
        JOIN sys.objects AS o2
          ON o.parent_object_id = o2.OBJECT_ID
        JOIN sys.schemas AS s
          ON o2.schema_id = s.schema_id;

Result:

    Table                         Primary Key
    dbo.ErrorLog                  PK_ErrorLog_ErrorLogID
    Person.Address                PK_Address_AddressID
    Person.AddressType            PK_AddressType_AddressTypeID
    dbo.AWBuildVersion            PK_AWBuildVersion_SystemInformationID
    Production.BillOfMaterials    PK_BillOfMaterials_BillOfMaterialsID
    Production.Document           UQ__Document__F73921F730F848ED

A complete listing of all the DMVs and sample queries can be found on www.SQLServerBible.com.

The Microsoft SQL Server 2008 System Views Map (a 36" x 36" .pdf file) can be downloaded from http://tinyurl.com/dbbw78. If it changes, I'll keep a link on my website, www.SQLServerBible.com. I keep a copy of this document on my desktop.

sp_help

Sp_help, and its variations, return information regarding the server, the database, objects, connections, and more. The basic sp_help lists the available objects in the current database; the other variations provide detailed information about the various objects or settings.
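As a rough sketch of those variations, the following batch calls a few members of the sp_help family of system stored procedures. It assumes the OBXKites sample database and its Price table used in this chapter; substitute any database and table you have at hand.

```sql
-- Sketch only: a few of the sp_help variations.
-- Assumes the OBXKites sample database and its Price table.
USE OBXKites;

EXEC sp_help;                       -- objects in the current database
EXEC sp_helpdb 'OBXKites';          -- size, owner, and files for one database
EXEC sp_helpindex 'Price';          -- indexes on one table
EXEC sp_helpconstraint 'Price';     -- constraints on one table
EXEC sp_who;                        -- current connections
```

Each call returns one or more result sets, so it is usually easier to run them one at a time than as a single batch.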
Adding an object name as a parameter to sp_help returns additional appropriate information about the object:

    USE OBXKites;
    EXEC sp_help Price;

The result here is seven data sets of information about the Price table:

■ Name, creation date, and owner
■ Columns
■ Identity columns
■ Row GUID columns
■ FileGroup location
■ Indexes
■ Constraints

System functions

A system function, sometimes called a global variable, returns information about the current system or connection status. System functions can't be created. There's a fixed set of system functions, all beginning with two @ signs (the more significant ones are listed in Table 21-1). The most commonly used global variables are @@NestLevel, @@Rowcount, @@ServerName, and @@Version.

The system functions are slowly being replaced by DMV information.

TABLE 21-1 System Functions

System Function | Returns | Scope
----------------|---------|------
@@DateFirst | The day of the week currently set as the first day of the week; 1 represents Monday, 2 represents Tuesday, and so on. For example, if Sunday is the first day of the week, @@DateFirst returns a 7. | Connection
@@Error | The error value for the last T-SQL statement executed | Connection
@@Fetch_Status | The row status from the last cursor fetch command | Connection
@@LangID | The language ID used by the current connection | Connection
@@Language | The language, by name, used by the current connection | Connection
@@Lock_TimeOut | The lock timeout setting for the current connection | Connection
@@Nestlevel | Current number of nested stored procedures | Connection
@@ProcID | The stored procedure identifier for the current stored procedure. This can be used with sys.objects to determine the name of the current stored procedure, as follows: SELECT name FROM sys.objects WHERE object_id = @@ProcID; | Connection
@@RemServer | Name of the login server when running remote stored procedures | Connection
@@RowCount | Number of rows affected by the last T-SQL statement | Connection
@@ServerName | Name of the current server | Server
@@ServiceName | SQL Server's Windows service name | Server
@@SPID | The current connection's server-process identifier (the ID for the connection) | Connection
@@TranCount | Number of active transactions for the current connection | Connection
@@Version | SQL Server edition, version, and service pack | Server

Temporary Tables and Table Variables

Temporary tables and table variables play a different role from standard user tables. By their temporary nature, these objects are useful as a vehicle for passing data between objects or as a short-term scratchpad table intended for very temporary work.

Local temporary tables

A temporary table is created the same way as a standard user-defined table, except the temporary table must have a pound, or hash, sign (#) preceding its name. Temporary tables are actually created on the disk in tempdb:

    CREATE TABLE #ProductTemp (
      ProductID INT PRIMARY KEY
      );

A temporary table has a short life. When the batch or stored procedure that created it ends, the temporary table is deleted. If the table is created during an interactive session (such as a Query Editor window), it survives only until the end of that session. Of course, a temporary table can also be dropped explicitly within the batch.

The scope of a temporary table is also limited. Only the connection that created the local temporary table can see it. Even if a thousand users all create temporary tables with the same name, each user will only see his or her own temporary table.
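To illustrate the life and scope just described, here is a minimal sketch that checks for a leftover local temporary table, creates it, uses it, and drops it again. The table name #ScratchDemo and its column are hypothetical, chosen only for this illustration.

```sql
-- Sketch only: #ScratchDemo and ItemID are hypothetical names.
-- Drop a leftover copy from an earlier run in this same session, if any.
IF OBJECT_ID('tempdb..#ScratchDemo') IS NOT NULL
  DROP TABLE #ScratchDemo;

CREATE TABLE #ScratchDemo (
  ItemID INT PRIMARY KEY
  );

INSERT INTO #ScratchDemo (ItemID) VALUES (1);

-- Visible only to the connection that created the table.
SELECT ItemID FROM #ScratchDemo;

-- Explicit cleanup; otherwise the table lasts until the session ends.
DROP TABLE #ScratchDemo;
```

Running the SELECT from a second Query Editor window while the table still exists should fail with an invalid object name error, because that window is a different connection.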
The temporary table is created in tempdb with a unique name that combines the assigned table name and the connection identifier. Most objects can have names up to 128 characters in length, but temporary tables are limited to 116 so that the last 12 characters can make the name unique. To demonstrate the unique name, the following code examines the name stored in sys.objects for the temporary table created earlier:

    SELECT name
      FROM tempdb.sys.objects
      WHERE name LIKE '#Pro%';

Result (shortened to save space; the real value is 128 characters wide):

    name
    #ProductTemp___...___00000000002D

Despite the long name in sys.objects, SQL queries still reference any temporary tables with the original name.

Global temporary tables

Global temporary tables are similar to local temporary tables but they have a broader scope. All users can reference a global temporary table, and the life of the table extends until the last session accessing the table disconnects.

To create a global temporary table, begin the table name with two pound signs, e.g., ##TableName. The following code sample tests to determine whether the global temporary table exists, and creates one if it doesn't:

    IF NOT EXISTS(
      SELECT * FROM tempdb.sys.objects
        WHERE name = '##TempWork')
      CREATE TABLE ##TempWork(
        PK INT PRIMARY KEY,
        Col1 INT
        );

When a temporary table is required, it's likely being used for a work in progress. Another alternative is to simply create a standard user table in tempdb. Every time the SQL Server is restarted, it dumps and rebuilds tempdb, effectively clearing the alternative temporary worktable.

Table variables

Table variables are similar to temporary tables. The main difference, besides syntax, is that table variables have the same scope and life as a local variable. They are only seen by the batch, procedure, or function that creates them.
To be seen by called stored procedures, the table variables must be passed in as table-valued parameters, and then they are read-only in the called routine.

The life span of a table variable is also much shorter than that of a temp table. Table variables cease to exist when the batch, procedure, or function concludes.

Table variables differ from temporary tables in a few additional ways:

■ Table variables may not be created by means of the SELECT * INTO syntax. (INSERT INTO @tablename EXEC is allowed in SQL Server 2005 and later.)
■ Unlike temporary tables, table variables may be declared within user-defined functions.
■ Table variables are limited in their allowable constraints: no foreign keys are allowed. Primary keys, defaults, nulls, check constraints, and unique constraints are OK.
■ Table variables may not have any dependent objects, such as triggers or foreign keys.

Table variables are declared as variables, rather than created with SQL DDL statements. When a table variable is being referenced with a SQL query, the table is used as a normal table but named as a variable. The following script must be executed as a single batch or it will fail:

    DECLARE @WorkTable TABLE (
      PK INT PRIMARY KEY,
      Col1 INT NOT NULL);

    INSERT INTO @WorkTable (PK, Col1)
      VALUES ( 1, 101);

    SELECT PK, Col1
      FROM @WorkTable;

Result:

    PK    Col1
    1     101

Memory vs. Disk; Temp Tables vs. Table Variables

A common SQL myth is that table variables are stored in memory. They're not. They exist in tempdb just like a temporary table. However, the life span of a table variable (as well as that of most temporary tables) is such that it's extremely unlikely that it would ever actually be written to disk. The truth is that the table variable lives in tempdb pages in memory.

So if the difference isn't memory vs. disk, how do you choose between using a temp table or a table variable? Size and scope.
Rule of thumb: If the temp space will hold more than about 250 rows, then go with a temp table; otherwise, choose a table variable. The reason is that temp tables have the overhead of statistics, whereas table variables do not. This means that for more data, the temp table's statistics can help the Query Optimizer choose the best plan. Of course, one always has to consider the overhead of maintaining the statistics. Table variables don't have statistics, so they save on the overhead; but without statistics, the Query Optimizer always assumes the table variable will result in one row, and may therefore choose a poor plan if the table variable contains a lot of data.

Scope is the other consideration. If the temp space must be visible and updatable by called routines, then you'll have to choose a temp table.

Summary

T-SQL extends the SQL query with a set of procedural commands. While it's not the most advanced programming language, T-SQL gets the job done. T-SQL batch commands can be used in expressions, or packaged as stored procedures, user-defined functions, or triggers.

A few key points to remember from this chapter:

■ The batch terminator, GO, is only a Query Editor command, and it can send the batch multiple times when followed by a number.
■ DDL commands (CREATE, ALTER, DROP) must be the only command in the batch.
■ Ctrl+K+C converts the current lines to comments, and Ctrl+K+U uncomments the lines.
■ IF only controls execution of the next line, unless it is followed by a BEGIN/END block.
■ Variables can now be incremented with +=.
■ If the temporary space needs to hold more than 250 rows, then use a temp table; otherwise, use a table variable.

The next chapter moves into a technology fraught with negative passion. "Cursors are evil!" is a rallying call for most of the SQL community. But rather than just echo the common sentiment, I'll show you when cursors should be used, and how to refactor cursors into set-based operations.
Nielsen c22.tex V4 - 07/23/2009 4:52pm Page 579

Kill the Cursor!

IN THIS CHAPTER

Iterating through data
Strategically avoiding cursors
Refactoring cursors to a high-performance set-based solution
Measuring cursor performance

SQL excels at handling sets of rows. However, the current database world grew out of the old ISAM file structures, and the vestige of looping through data one row at a time remains in the form of the painfully slow SQL cursor.

The second tier of Smart Database Design (a framework for designing high-performance systems, covered in Chapter 2) is developing set-based code, rather than iterative code.

How slow are cursors? In my consulting practice, the most dramatic cursor-to-set-based refactoring that I've worked on involved three nested cursors and about a couple hundred nested stored procedures that ran nightly, taking seven hours. Reverse engineering the cursors and stored procedures, and then developing the query, took me about three weeks. The query was three pages long and involved several case subqueries, but it ran in 3–5 seconds.

When testing a well-written cursor against a well-written set-based solution, I have found that the set-based solutions usually range from three to ten times faster than the cursors.

Why are cursors slow? Very hypothetically, let's say I make a cool million from book royalties. A cursor is like depositing the funds at the bank one dollar at a time, with a million separate transactions. A set-based transaction deposits the entire million in one transaction. OK, that's not a perfect analogy, but if you view cursors with that type of mindset, you'll be a better database developer.
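To make the analogy a little more concrete, here is a small sketch, not a benchmark. The @Account table variable, its columns, and the cAccount cursor are hypothetical names invented for this illustration, and the multi-row VALUES clause assumes SQL Server 2008 or later. The cursor syntax itself is explained later in this chapter.

```sql
-- Sketch only: @Account, its columns, and cAccount are hypothetical.
DECLARE @Account TABLE (
  AccountID INT PRIMARY KEY,
  Balance MONEY NOT NULL
  );

INSERT INTO @Account (AccountID, Balance)
  VALUES (1, 0), (2, 0), (3, 0);   -- row constructors need SQL Server 2008+

-- Iterative approach: one UPDATE statement per row.
DECLARE @AccountID INT;

DECLARE cAccount CURSOR LOCAL STATIC
  FOR SELECT AccountID FROM @Account;

OPEN cAccount;
FETCH NEXT FROM cAccount INTO @AccountID;

WHILE @@FETCH_STATUS = 0
  BEGIN
    UPDATE @Account
      SET Balance = Balance + 100
      WHERE AccountID = @AccountID;

    FETCH NEXT FROM cAccount INTO @AccountID;
  END;

CLOSE cAccount;
DEALLOCATE cAccount;

-- Set-based approach: the same deposit for every row in one statement.
UPDATE @Account
  SET Balance = Balance + 100;

SELECT AccountID, Balance
  FROM @Account;
```

On three rows the difference is invisible. The point is that the cursor version issues one UPDATE for every row it visits, while the set-based version issues a single statement no matter how many rows qualify.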
While there are legitimate reasons to use a cursor (and I'll get to those), the most common reason is that programmers with a procedural background feel more comfortable thinking in terms of loops and pointers than set-based relational algebra.

SQL cursors also appear deceptively tunable. Programmers see the long list of cursor options and assume it means the cursor can be tweaked for high performance. The types of cursors have names such as fast forward, dynamic, and scrollable.

SQL Server cursors are server-side cursors, which are different from client-side ADO cursors. The SQL Server cursor occurs inside the server before any data is ever sent to the client. Client-side cursors are frequently used to scroll through the rows in an ADO record set within the application to populate a grid or combo box. ADO cursors are covered in Chapter 32, "Programming with ADO.NET."

While this chapter explains how to iterate through data using a cursor, my emphasis is clearly on strategically identifying the few appropriate uses of a cursor, and exterminating unnecessary cursors by refactoring them with set-based code.

Anatomy of a Cursor

A cursor is essentially a pointer to a single row of data. A WHILE loop is used to cycle through the data until the cursor reaches the end of the data set. SQL Server supports the standard ANSI SQL-92 syntax and an enhanced T-SQL cursor syntax, which offers additional options.

The five steps to cursoring

A cursor creates a result set from a SELECT statement and then fetches a single row at a time. The five steps in the life of a cursor are as follows:

1. Declaring the cursor establishes the type and behavior of the cursor and the SELECT statement from which the cursor will pull data. Declaring the cursor doesn't retrieve any data; it only sets up the SELECT statement. This is the one case in which DECLARE doesn't require an at sign (@) as a prefix for the name being declared. A SQL-92 cursor is declared using CURSOR FOR (this is only the basic syntax):

       DECLARE CursorName [CursorOptions] CURSOR
         FOR Select Statement;

   The enhanced T-SQL cursor is very similar:

       DECLARE CursorName CURSOR [CursorOptions]
         FOR Select Statement;

2. Opening the cursor retrieves the data and fills the cursor:

       OPEN CursorName;

3. Fetching moves to the next row and assigns the values from each column returned by the cursor into a local variable, or to the client. The variables must have been previously declared:

       FETCH [Direction] CursorName [INTO @Variable1, @Variable2, ...];

   By default, FETCH moves to the next row; however, FETCH can optionally move to the prior, first, or last row in the data set. FETCH can even move to an absolute row position in the result set, or move forward or backward a relative n number of rows. The problem with these options is that row position is supposed to be meaningless in a relational database. If the code must move to specific positions to obtain a correct logical result, then there's a major flaw in the database design.

4. Closing the cursor releases the data locks but retains the SELECT statement. The cursor can be opened again at this point (CLOSE is the counterpart to OPEN):

       CLOSE CursorName;

5. Deallocating the cursor releases the memory and removes the definitions of the cursor (DEALLOCATE is the counterpart to DECLARE):

       DEALLOCATE CursorName;

These are the five basic commands required to construct a cursor. Wrap the FETCH command with a method to manage the iterative loops and the cursor code is a basic, but working, cursor.

Managing the cursor

Because a cursor fetches a single row, T-SQL code is required to repeatedly fetch the next row.
To manage the looping process, T-SQL offers two cursor-related system functions that provide cursor status information.

The @@cursor_rows system function returns the number of rows in the cursor. If the cursor is populated asynchronously, then @@cursor_rows returns a negative number.

Essential to developing a cursor is the @@fetch_status system function, which reports the state of the cursor after the last FETCH command. This information is useful to control the flow of the cursor as it reaches the end of the result set. The possible @@fetch_status values indicate the following:

■ 0: The last FETCH successfully retrieved a row.
■ -1: The last FETCH failed by reaching the end of the result set, by trying to fetch a row before the beginning of the result set, or the fetch simply failed.
■ -2: The last row fetched was not available; the row has been deleted.

Combining @@fetch_status with the WHILE command builds a useful loop for moving through the rows. Typically, the batch will include step 3, the FETCH, twice. The first FETCH primes the cursor with two important results. Priming the cursor places data from the first row into the variables so that the action part of the cursor loop can be placed early in the WHILE loop. Priming the cursor also sets up the @@fetch_status system function for the first iteration of the WHILE command.

The second FETCH command occurs within the WHILE loop, fetching the second row and every following row through to the end of the cursor.

The WHILE loop examines the @@Fetch_Status global variable to determine whether the cursor is done.
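Putting the five steps and @@fetch_status together, a minimal cursor loop might look like the following sketch. The cursor name cTables and the variable @Name are hypothetical; the query simply lists user tables from sys.objects so that the batch can run in any database.

```sql
-- Sketch only: cTables and @Name are hypothetical names.
DECLARE @Name SYSNAME;

-- Step 1: declare the cursor (no data is retrieved yet).
DECLARE cTables CURSOR LOCAL FAST_FORWARD
  FOR SELECT name
        FROM sys.objects
        WHERE type = 'U'
        ORDER BY name;

-- Step 2: open the cursor.
OPEN cTables;

-- Step 3, first FETCH: prime the cursor and set @@FETCH_STATUS.
FETCH NEXT FROM cTables INTO @Name;

WHILE @@FETCH_STATUS = 0
  BEGIN
    PRINT @Name;                          -- action for the current row

    -- Step 3, second FETCH: move to the next row.
    FETCH NEXT FROM cTables INTO @Name;
  END;

-- Steps 4 and 5: release the locks, then the cursor definition.
CLOSE cTables;
DEALLOCATE cTables;
```

Placing the second FETCH as the last statement inside the loop means the WHILE condition always tests the status of the most recent fetch, so the loop ends as soon as a fetch runs past the last row.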