CHAPTER 42  What's New for Transact-SQL in SQL Server 2008

The CHANGE_TRACKING_IS_COLUMN_IN_MASK function interprets the SYS_CHANGE_COLUMNS bitmap value returned by the CHANGETABLE(CHANGES ...) function and returns 1 if the column was modified or 0 if it was not:

declare @last_synchronization_version bigint
set @last_synchronization_version = 0
SELECT CT.CustomerID as CustID,
       TerritoryChanged = CHANGE_TRACKING_IS_COLUMN_IN_MASK
           (COLUMNPROPERTY(OBJECT_ID('MyCustomer'), 'TerritoryID', 'ColumnId'),
            CT.SYS_CHANGE_COLUMNS),
       CT.SYS_CHANGE_OPERATION,
       CT.SYS_CHANGE_COLUMNS
FROM CHANGETABLE(CHANGES MyCustomer, @last_synchronization_version) AS CT
go

CustID      TerritoryChanged SYS_CHANGE_OPERATION SYS_CHANGE_COLUMNS
4           1                U                    0x0000000004000000
5           1                U                    0x0000000004000000

In the query results, you can see that both update operations (SYS_CHANGE_OPERATION = 'U') modified the TerritoryID column (TerritoryChanged = 1).

Change Tracking Overhead

Although Change Tracking has been optimized to minimize the performance overhead on DML operations, it is important to know that there is some performance overhead, as well as additional space requirements, within the application databases when implementing Change Tracking.

The performance overhead associated with using Change Tracking on a table is similar to the index maintenance overhead incurred for insert, update, and delete operations. For each row changed by a DML operation, a row is added to the internal Change Tracking table. The amount of overhead incurred depends on various factors, such as

. The number of primary key columns
. The amount of data being changed in the user table row
. The number of operations being performed in a transaction
. Whether column Change Tracking is enabled

Change Tracking also consumes some space in the databases where it is enabled. Change Tracking data is stored in the following types of internal tables:

. Internal change tables—There is one internal change table for each user table that has Change Tracking enabled.
. Internal transaction table—There is one internal transaction table for the database.

These internal tables affect storage requirements in the following ways:

. For each change to each row in the user table, a row is added to the internal change table. This row has a small fixed overhead plus a variable overhead equal to the size of the primary key columns. The row can contain optional context information set by an application. In addition, if column tracking is enabled, each changed column requires an additional 4 bytes per row in the tracking table.
. For each committed transaction, a row is added to an internal transaction table.

If you are concerned about the space usage requirements of the internal Change Tracking tables, you can determine the space they use by executing the sp_spaceused stored procedure. The internal transaction table is called sys.syscommittab. The names of the internal change tables for each table are in the form change_tracking_object_id. The following example returns the size of the internal transaction table and the internal change table for the MyCustomer table:

exec sp_spaceused 'sys.syscommittab'

declare @tablename varchar(128)
set @tablename = 'sys.change_tracking_'
    + CONVERT(varchar(16), object_id('MyCustomer'))
exec sp_spaceused @tablename

Summary

Transact-SQL has always been a powerful data access and data modification language, providing additional features, such as functions, variables, and commands, to control execution flow. SQL Server 2008 further expands the power and capabilities of T-SQL with the addition of a number of new features. These new T-SQL features can be incorporated into the building blocks for creating even more powerful SQL Server database components, such as views, stored procedures, triggers, and user-defined functions.
In addition to the powerful features available in T-SQL for developing SQL code and stored procedures, triggers, and user-defined functions, SQL Server 2008 also enables you to define custom managed database objects such as stored procedures, triggers, functions, data types, and custom aggregates using .NET code. The next chapter, "Creating .NET CLR Objects in SQL Server 2008," provides an overview of using the .NET common language runtime (CLR) to develop these custom managed objects.

CHAPTER 43  Transact-SQL Programming Guidelines, Tips, and Tricks

IN THIS CHAPTER
. General T-SQL Coding Recommendations
. General T-SQL Performance Recommendations
. T-SQL Tips and Tricks
. In Case You Missed It: New T-SQL Features in SQL Server 2005
. The xml Data Type
. The max Specifier
. TOP Enhancements
. The OUTPUT Clause
. Common Table Expressions
. Ranking Functions
. PIVOT and UNPIVOT
. The APPLY Operator
. TRY CATCH Logic for Error Handling
. The TABLESAMPLE Clause

One of the things you'll discover with Transact-SQL (T-SQL) is that there are a number of different ways to write queries to get the same results, but some approaches are more efficient than others. This chapter provides some general guidelines and best practices for programming in the T-SQL language to ensure robust code and optimum performance. Along the way, it provides tips and tricks to help you solve various T-SQL problems as well.

NOTE
This chapter is not intended to be a comprehensive list of guidelines, tips, and tricks. The intent of this chapter is to provide a collection of some of our favorite guidelines, tips, and tricks that are not presented elsewhere in this book. A number of other T-SQL guidelines, tips, and tricks are provided throughout many of the other chapters in this book.
For example, a number of performance-related T-SQL coding guidelines and tips are presented in Chapter 35, "Understanding Query Optimization," and additional T-SQL coding guidelines can be found in Chapter 28, "Creating and Managing Stored Procedures," and Chapter 44, "Advanced Stored Procedure Programming and Optimization."

General T-SQL Coding Recommendations

Writing good T-SQL code involves establishing and following T-SQL best practices and guidelines. The following sections provide some common recommendations for general T-SQL coding guidelines to help ensure reliable, robust SQL code.

Provide Explicit Column Lists

When writing SELECT or INSERT statements in application code or stored procedures, you should always provide the full column lists for the SELECT or INSERT statement. If you use SELECT * in your code or in a stored procedure, the column list is resolved each time the SELECT statement is executed. If the table is altered to add or remove columns, the SELECT statement returns a different set of columns. This can cause your application or SQL code to generate an error if the number of columns returned is different than expected.

For example, consider the following sample table:

create table dbo.explicit_cols (a int, b int)
insert explicit_cols (a, b) values (10, 20)

Now, suppose there is a stored procedure with a cursor that references the explicit_cols table, using SELECT * in the cursor declaration, similar to the following:

create proc dbo.p_fetch_explicit_cols
as
declare @a int, @b int
declare c1 cursor for
    select * from explicit_cols
open c1
fetch c1 into @a, @b
while @@fetch_status = 0
begin
    print 'the proc works!!'
    fetch c1 into @a, @b
end
close c1
deallocate c1
return

If you run the p_fetch_explicit_cols procedure, it runs successfully:

exec dbo.p_fetch_explicit_cols
go

the proc works!!
Now, if you add a column to the explicit_cols table and rerun the procedure, it fails:

alter table explicit_cols add c int null
go
exec dbo.p_fetch_explicit_cols
go

Msg 16924, Level 16, State 1, Procedure p_fetch_explicit_cols, Line 7
Cursorfetch: The number of variables declared in the INTO list must match that of selected columns.

The p_fetch_explicit_cols procedure fails this time because the cursor is now returning three columns of data, and the FETCH statement is set up to handle only two columns. If the cursor in the p_fetch_explicit_cols procedure were declared with an explicit list of columns a and b instead of SELECT *, this error would not occur.

Not providing an explicit column list for INSERT statements can lead to similar problems. Consider the following stored procedure:

create proc p_insert_explicit_cols (@a int, @b int, @c int)
as
insert explicit_cols
    output inserted.*
    values (@a, @b, @c)
return
go

exec dbo.p_insert_explicit_cols 10, 20, 30
go

a           b           c
10          20          30

With three columns currently on the explicit_cols table, this procedure works fine. However, if you alter the explicit_cols table to add another column, the procedure fails:

alter table explicit_cols add d int null
go
exec dbo.p_insert_explicit_cols 10, 20, 30
go

Msg 213, Level 16, State 1, Procedure p_insert_explicit_cols, Line 4
Insert Error: Column name or number of supplied values does not match table definition.
If the procedure were defined with an explicit column list for the INSERT statement, it would still execute successfully:

alter proc p_insert_explicit_cols (@a int, @b int, @c int)
as
insert explicit_cols (a, b, c)
    output inserted.*
    values (@a, @b, @c)
return
go

exec dbo.p_insert_explicit_cols 11, 12, 13
go

a           b           c           d
11          12          13          NULL

NOTE
If a procedure specifies fewer columns for an INSERT statement than exist in the table, the INSERT statement succeeds only if the columns not specified allow NULL values or have default values associated with them. Notice in the previous example that column d allows NULLs, and the OUTPUT clause used in the procedure shows that the INSERT statement inserted a NULL value into that column.

Qualify Object Names with a Schema Name

In SQL Server 2005, the behavior of schemas was changed from earlier versions of SQL Server. SQL Server 2005 introduced definable schemas, which means schemas are no longer limited to database usernames only. Each schema is now a distinct namespace that exists independently of the database user who created it. Essentially, a schema is now simply a container for objects. A schema can be owned by any user, and the ownership of a schema is transferable to another user.

This feature provides greater flexibility in creating schemas and assigning objects to schemas that are not tied simply to a specific database user. At the same time, it can create confusion if objects with the same name exist in multiple schemas.

By default, if a user has CREATE permission (or has the db_ddladmin role) in a database and that user creates an object without explicitly qualifying it with a schema name, the object is created in that user's default schema. If a user is added to a database with the CREATE USER command and a specific default schema is not specified, the default schema is the dbo schema.
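As a supplementary example (not part of the original chapter text), you can see which default schema each user in the current database has been assigned by querying the sys.database_principals catalog view:

```sql
-- List SQL users and their assigned default schemas
-- in the current database ('S' = SQL user)
SELECT name, default_schema_name
FROM sys.database_principals
WHERE type = 'S'
ORDER BY name;
```

Users created with CREATE USER and no DEFAULT_SCHEMA option show dbo in the default_schema_name column.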
NOTE
To further complicate matters, if you use the old sp_adduser system procedure to add a user to a database, sp_adduser also creates a schema that has the name of the user and makes that the user's default schema. However, sp_adduser is a deprecated feature that will be dropped in a future release of SQL Server. You should therefore use the CREATE USER command instead.

For example, consider the following SQL commands, which create a user called testuser43 in the bigpubs2008 database:

use bigpubs2008
go
sp_addlogin testuser43, 'TestUser#43', bigpubs2008
go
create user testuser43
go
exec sp_addrolemember 'db_ddladmin', testuser43
exec sp_addrolemember 'db_datareader', testuser43
exec sp_addrolemember 'db_datawriter', testuser43
go

If you then log in under the testuser43 account and create a table without qualifying it with a schema name, it is created in the default dbo schema:

-- Verify name of current schema
select schema_name()
go

dbo

create table test43 (a varchar(10) default schema_name() null)
go
insert test43 (a) values (DEFAULT)
go
select * from test43
go

a
dbo

From these commands, you can see that the default schema for the testuser43 user is dbo, and the test43 table gets created in the dbo schema. Now, if you create a schema43 schema and want to create the test43 table in the schema43 schema, you need to fully qualify it or change the default schema for testuser43 to schema43.
To demonstrate this, you run the following commands while logged in as the testuser43 user:

create schema schema43
go
alter user testuser43 with default_schema = schema43
go
create table test43 (a varchar(10) default schema_name() null)
go
insert test43 (a) values (DEFAULT)
go
select * from test43
go

a
schema43

As you can see from this example, the same CREATE TABLE and INSERT commands as entered before now create and insert into a table in the schema43 schema.

When the user testuser43 runs a query against the test43 table, which table the SELECT statement runs against depends on the current default schema for testuser43. The first query runs in the schema43 schema:

alter user testuser43 with default_schema = schema43
go
select * from test43
go

a
schema43

The next query runs from the dbo schema:

alter user testuser43 with default_schema = dbo
go
select * from test43
go

a
dbo

You can see that the current schema determines which table is queried when the table name is not fully qualified in the query. There are two ways to avoid this ambiguity. The first is to create objects only in the dbo schema and not to have additional schemas defined in the database. If you are working with a database that has multiple schemas, the only other way to avoid ambiguity is to always fully qualify your object names with the explicit schema name.
In the following example, because you fully qualify the table name, it doesn't matter what the current schema is for user testuser43; the query always retrieves from the dbo.test43 table:

alter user testuser43 with default_schema = dbo
go
select * from dbo.test43
go

a
dbo

alter user testuser43 with default_schema = schema43
go
select * from dbo.test43
go

a
dbo

Along these same lines, when you are creating objects in a database, it is recommended that you specify the schema name in the CREATE statement to ensure that the object is created in the desired schema, regardless of the user's current schema.

Avoid SQL Injection Attacks When Using Dynamic SQL

The EXEC() (or EXECUTE()) command in SQL Server enables you to execute queries built dynamically into a character string. This is a great feature for building queries on the fly in your T-SQL code when it may not be possible to account for all possible search criteria in a stored procedure or when static queries may not optimize effectively. However, when coding dynamic SQL, it's important to make sure your code is protected from possible SQL injection attacks.

A SQL injection attack is, as its name suggests, an attempt by a hacker to inject T-SQL code into the database without permission. Typically, the hacker's goal is to retrieve confidential data such as Social Security or credit card numbers or to possibly vandalize or destroy data in the database. SQL injection is usually the result of faulty application design—usually an unvalidated entry field in the application user interface. For example, this could be a text box where the user enters a search value. A hacker may attempt to inject SQL statements into this entry field to try to gain access to information in the database.
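To make the exposure concrete, here is a hypothetical example of the vulnerable pattern; the p_find_customer_unsafe procedure and the LastName column are illustrative and do not appear in the original text:

```sql
-- WARNING: deliberately vulnerable example, for illustration only
create proc dbo.p_find_customer_unsafe (@name varchar(50))
as
declare @sql nvarchar(500)
-- User input is concatenated directly into the query string
set @sql = N'select * from dbo.MyCustomer where LastName = '''
           + @name + N''''
exec (@sql)
```

Given an input such as x' OR 1=1 --, the generated WHERE clause becomes a tautology and the trailing comment marker discards the closing quote, so the query returns every row in the table rather than the single customer the application intended.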
Although SQL injection is essentially an application flaw, you can minimize the possibility of SQL injection attacks by following some coding practices in your stored procedures that make use of the EXEC() statement to dynamically build and execute a query. For example, consider the stored procedure shown in Listing 43.1, which might support a search page in a web application where the user is able to enter one or more optional search parameters.
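Listing 43.1 itself is not included in this excerpt. As a hedged sketch of the safer pattern, assuming a hypothetical search column (CustomerID and LastName are illustrative, not taken from the listing), the parameterized approach passes user input to sp_executesql as a parameter rather than concatenating it into the string, so injected text is treated as data, not as executable code:

```sql
create proc dbo.p_find_customer_safe (@name varchar(50) = NULL)
as
declare @sql nvarchar(500)
set @sql = N'select CustomerID from dbo.MyCustomer where 1 = 1'
-- Append a parameter marker, never the raw input itself
if @name is not null
    set @sql = @sql + N' and LastName = @name'
exec sp_executesql @sql,
                   N'@name varchar(50)',
                   @name = @name
```

Because @name travels through a typed parameter marker, a value such as x' OR 1=1 -- is simply compared against LastName as a literal string, and the dynamic query can still add or omit predicates based on which optional search parameters were supplied.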