Managing Transactions, Locking, and Blocking

IN THIS CHAPTER
■ Transactional integrity theory
■ The transaction log and why it's important
■ SQL Server locks and performance
■ Handling and preventing deadlocks
■ Implementing optimistic and pessimistic locking

Any route is fast at 4:00 A.M. with no traffic and all green lights. The trick is designing a route that works during rush hour.

Concurrency is about contention. Contention is "a struggling together in opposition" in which two users are trying to get hold of the same thing, and it can apply to system resources or data. Managing resource contention is about writing code that uses as few resources as possible, to enable as many users as possible to execute code at the same time. With database contention, more than one user is trying to access the same resource within the database. For any complex system, as the number of users increases, performance decreases as each user competes for the same resources and the contention increases. This chapter focuses on the contention for data; most people focus on system resource contention, not on contention for database resources.

Chapter 2 defined six database architecture design goals: usability, integrity, scalability, extensibility, availability, and security. Scalability is all about concurrency — multiple users simultaneously attempting to retrieve and modify data. Here's why: To ensure transactional integrity, SQL Server (by default) uses locks to protect transactions from affecting other transactions. Specifically, transactions that are reading data will lock that data, which prevents, or blocks, other transactions from writing the same data. Similarly, a transaction that's writing will prevent other transactions from writing or reading the same data. SQL Server maintains locks for the duration of a transaction, so as more transactions occur simultaneously, especially long transactions, more resources are locked, which results in more transactions being blocked. This can create a locking and blocking domino effect that bogs down the whole system.

To reduce contention, you need to reduce the amount of resources being locked and reduce the length of time those locks are held. The principles of relational database design (described in Chapter 3) hold that a well-defined schema, set-based queries, and a sound indexing strategy all work together to reduce the aggregate workload of the database and thus reduce transaction duration. Reducing the aggregate workload of the database addresses both of those requirements, and thus increases concurrency (the number of users who can share a resource) and sets up the database for the more advanced high-scalability features such as table partitioning. That's why concurrency is relational database design's fourth level, following schema, set-based queries, and indexing. I can't stress enough that if you have a transaction locking and blocking problem, the solution isn't found at the concurrency level, but in the deeper layers of Smart Database Design — that is, in modifying either the schema, query, or indexing.
This chapter has four goals:

■ Detail how transactions affect other transactions
■ Explain the database theory behind transactions
■ Illustrate how SQL Server maintains transactional integrity
■ Explain how to get the best performance from a high-concurrency system

What's New with Transactions?

SQL Server has always had transactional integrity, but Microsoft has improved it over the versions. SQL Server 7 saw row locking, which eliminated the "insert last page hot spot" issue and dramatically improved scalability. SQL Server 2000 improved how deadlocks are detected and rolled back. SQL Server 2005 introduced an entirely rewritten lock manager, which simplified lock escalation and improved performance. Beyond the ANSI standard isolation levels, SQL Server 2005 added snapshot isolation, which makes a copy of the data being updated in its own physical space, completely isolated from any other transactions, enabling readers not to block writers. Try-catch error handling, introduced in SQL Server 2005, can catch a 1205 deadlock error. SQL Server 2008 continues the performance advances with the new capability to restrict lock escalation on a table, which forces row locks and can improve scalability.

Transactions and locking in SQL Server can be complicated, so this chapter explains the foundation of ACID transactions (described in the next section) and SQL Server's default behavior first, followed by potential problems and variations. If you want the very short version of a long story, I believe that the READ COMMITTED transaction isolation level (SQL Server's default) is the best practice for most OLTP databases. The exceptions are explained in the section "Transaction Isolation Levels."

The ACID Properties

Transactions are defined by the ACID properties. ACID is an acronym for four interdependent properties: atomicity, consistency, isolation, and durability. Much of the architecture of any modern relational database is founded on these properties. Understanding the ACID properties of a transaction is a prerequisite for understanding SQL Server.

Atomicity

A transaction must be atomic, meaning all or nothing. At the end of the transaction, either all of the transaction is successful, or all of the transaction fails. If a partial transaction is written to disk, the atomic property is violated. The ability to commit or roll back transactions is required for atomicity.

Consistency

A transaction must preserve database consistency, which means that the database must begin the transaction in a state of consistency and return to a state of consistency once the transaction is complete. For the purposes of ACID, consistency means that every row and value must agree with the reality being modeled, and every constraint must be enforced. For example, if the order rows were written to disk but the order detail rows were not, the consistency between the Order and OrderDetail tables, or more specifically, the OrderDetail table's OrderID foreign key constraint, would have been violated, and the database would be in an inconsistent state. This is not allowed. Consistency does allow the database to be in an inconsistent state during the transaction; the key is that the database is consistent at the completion of the transaction. Like atomicity, this means the database must be able to commit the whole transaction, or roll back the whole transaction if the modifications would leave the database inconsistent.
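To make the order example concrete, here's a minimal sketch of consistency and atomicity working together. The Order and OrderDetail tables below are hypothetical stand-ins (not from AdventureWorks), and the transaction syntax is covered in detail later in this chapter:

CREATE TABLE dbo.[Order] (
    OrderID INT PRIMARY KEY
    );
CREATE TABLE dbo.OrderDetail (
    OrderDetailID INT IDENTITY PRIMARY KEY,
    OrderID INT NOT NULL
        REFERENCES dbo.[Order](OrderID)  -- the constraint consistency must honor
    );

BEGIN TRY
    BEGIN TRANSACTION;
        INSERT dbo.[Order] (OrderID) VALUES (1);
        INSERT dbo.OrderDetail (OrderID) VALUES (1);
    COMMIT TRANSACTION;  -- both rows are written together
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;  -- neither row is written
END CATCH;

If either INSERT fails, the CATCH block rolls back the pair, so at the completion of the transaction the OrderID foreign key is never left violated.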
Isolation

Each transaction must be isolated, or separated, from the effects of other transactions. Regardless of what any other transaction is doing, a transaction must be able to continue with the exact same data sets it started with. Isolation is the fence between two transactions. A proof of isolation is the ability to replay a serialized set of transactions on the original set of data and always receive the same result.

For example, assume Joe is updating 100 rows. While Joe's transaction is under way, Sue tries to read one of the rows Joe is working on. If Sue's read takes place, then Joe's transaction is affecting Sue's transaction, and their two transactions are not fully isolated from each other. This property is less critical in a read-only database or a database with only a few users. SQL Server enforces isolation with locks and row versioning.

Durability

The durability of a transaction refers to its permanence regardless of system failure. Once a transaction is committed, it stays committed in the state it was committed. Another transaction that does not modify the data from the first transaction should not affect the data from the first transaction. In addition, the Database Engine must be designed so that even if the data drive melts, the database can be restored up to the last transaction that was committed a split second before the hard drive died. SQL Server ensures durability with the write-ahead transaction log.

Programming Transactions

A transaction is a sequence of tasks that together constitute a logical unit of work. All the tasks must complete or fail as a single unit. For example, in the case of an inventory movement transaction that reduces the inventory count in one bin and increases the inventory count in another bin, both updates to the bins must be written to the disk, or neither should be written to the disk. If this didn't happen, then your total inventory count would be wrong.

In SQL Server, every DML operation (SELECT, INSERT, UPDATE, DELETE, MERGE) is a transaction, whether or not it has been executed within a BEGIN TRANSACTION. For example, an INSERT command that inserts 25 rows is a logical unit of work: each and every one of the 25 rows must be inserted. An UPDATE to even a single row operates within a transaction so that the row in the clustered index (or heap) and the row's data in every non-clustered index are all updated. Even SELECT commands are transactions; a SELECT that should return 1,000 rows must return all 1,000 rows. Any partially completed transaction would violate transactional integrity.

In ancient Hebrew poetry (a passion of mine), an inclusio is a line or phrase that begins a poem and is repeated at the close of the poem, providing a theme or wrapper for the poem. In the same way, you can think of a transaction as a wrapper around a unit of work.

Logical transactions

If the logical unit of work involves multiple operations, some code is needed to define the perimeter of a transaction: two markers — one at the beginning of the transaction, and the other at its completion, at which time the transaction is committed to disk. If the code detects an error, then the entire transaction can be rolled back, or undone.
The following three commands appear simple, but a volume of sophistication lies behind them:

■ BEGIN TRANSACTION
■ COMMIT TRANSACTION
■ ROLLBACK TRANSACTION

(Only part of each command is required; the minimum syntax is noted at the end of this section.)

A transaction, once begun, should be either committed to disk or rolled back. A transaction left hanging will eventually cause an error — either a real error or a logical data error, as data is never committed.

Putting T-SQL code to the inventory movement example, if Michael Ray, Production Supervisor at Adventure Works, moves 100 bicycle wheel spokes from miscellaneous storage to the subassembly area, the next code example records the move in the database. The two updates that constitute the logical unit of work (the update to LocationID = 6 and the update to LocationID = 50) are wrapped inside a BEGIN TRANSACTION and a COMMIT TRANSACTION. The transaction is then wrapped in a TRY block for error handling:

BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE Production.ProductInventory
        SET Quantity -= 100
        WHERE ProductID = 527
          AND LocationID = 6  -- misc storage
          AND Shelf = 'B'
          AND Bin = 4;

    UPDATE Production.ProductInventory
        SET Quantity += 100
        WHERE ProductID = 527
          AND LocationID = 50  -- subassembly area
          AND Shelf = 'F'
          AND Bin = 11;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;
    RAISERROR('Inventory Transaction Error', 16, 1);
    RETURN;
END CATCH;

If you're not familiar with Try-Catch, the improved error-handling code introduced in SQL Server 2005, it's covered in Chapter 23, "T-SQL Error Handling."

If all goes as expected, both updates are executed, the transaction is committed, and the TRY block completes execution. However, if either UPDATE operation fails, execution immediately transfers down to the CATCH block, the COMMIT is never executed, and the CATCH block's ROLLBACK TRANSACTION will undo any work that was done within the transaction.

SQL Server 2008 Books Online and some other sources refer to transactions using BEGIN TRANSACTION as explicit transactions. I prefer to call these logical transactions instead, because the name makes more sense to me and helps avoid any confusion with implicit transactions (covered soon).

When coding transactions, the minimum required syntax is only BEGIN TRAN, COMMIT, and ROLLBACK, so you'll often see these commands abbreviated as such in production code.

Xact_State()

Every user connection is in one of three possible transaction states, which may be queried using the Xact_State() function, introduced in SQL Server 2005:

■ 1: Active, healthy transaction
■ 0: No transaction
■ -1: Uncommittable transaction

It's possible to begin a transaction, experience an error, and not be able to commit that transaction (consider the consistency part of ACID). In prior versions of SQL Server these were called doomed transactions. Typically, the error-handling catch block will test the Xact_State() function to determine whether the transaction can be committed or must be rolled back.
The next CATCH block checks Xact_State() and determines whether it can COMMIT or ROLLBACK the transaction:

BEGIN CATCH
    IF Xact_State() = 1  -- there's an active, committable transaction
        COMMIT TRAN;
    IF Xact_State() = -1  -- there's an uncommittable transaction
        BEGIN
            ROLLBACK TRANSACTION;
            RAISERROR('Inventory Transaction Error', 16, 1);
        END
END CATCH;

Use Xact_State() for single DML transactions, but avoid it if the transaction includes multiple DML commands that must be committed or rolled back as an atomic unit. If one of the DML commands failed and the transaction were still committable, committing the logical transaction would write a partial transaction. Horrors!

Although the Xact_State() function is normally used within the error-handling catch block, it's not restricted to the catch block and may be called at any time to determine whether the code is in a transaction.

Xact_Abort

A common SQL Server myth is that an error condition will roll back the transaction. In fact, unless there's try-catch error handling in place, many error conditions only abort the statement; the batch continues, and the transaction is completed even though an error occurred. Turning on Xact_Abort solves some of these problems by doing two things to the error. First, it promotes statement-level errors into batch-level errors, solving the single-statement error issue. Second, Xact_Abort automatically rolls back any pending transaction. Therefore, Xact_Abort is a very good thing and should often be set in code. Xact_Abort also triggers the try-catch code and sends execution into the catch block.

Heads up: There's significant overlap between transaction error handling in this chapter and T-SQL error handling in general in Chapter 23, "T-SQL Error Handling."
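Before moving on, here's a minimal sketch of Xact_Abort at work, using a throwaway temp table of my own (not from the chapter's examples). By default, the duplicate-key error below would abort only the offending statement and the rest of the batch would continue; with XACT_ABORT ON, the error aborts the whole batch and rolls back the pending transaction:

CREATE TABLE #XactAbortDemo (ID INT PRIMARY KEY);
GO

SET XACT_ABORT ON;

BEGIN TRANSACTION;
    INSERT #XactAbortDemo (ID) VALUES (1);
    INSERT #XactAbortDemo (ID) VALUES (1);  -- primary key violation aborts the batch
    INSERT #XactAbortDemo (ID) VALUES (2);  -- never executes
COMMIT TRANSACTION;                         -- never executes
GO

SELECT @@TRANCOUNT;  -- 0: the pending transaction was rolled back automatically
SELECT COUNT(*)
    FROM #XactAbortDemo;  -- 0 rows: even the first INSERT was undone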
Nested transactions

Multiple transactions can be nested, although they are rarely nested within a single stored procedure. Typically, nested transactions occur because a stored procedure with a logical transaction calls another stored procedure that also has a logical transaction. These nested transactions behave as one large transaction: Changes made in one transaction can be read in a nested transaction. They do not behave as isolated transactions, where actions of the nested transaction could be committed independently of a parent transaction.

When transactions are nested, a COMMIT only marks the current nested transaction level as complete. It does not commit anything to disk, but a rollback undoes all pending transactions. At first this sounds inconsistent, but it actually makes sense, because an error within a nested transaction is also an error in the outer transaction. The @@TranCount global variable indicates the current nesting level. A COMMIT when @@TranCount is greater than 1 has no effect except to reduce @@TranCount by 1. Only when @@TranCount is 1 are the actions within all levels of the nested transaction committed to disk.

To prove this behavior, the next code sample examines @@TranCount, which returns the current transaction nesting level:

SELECT @@TRANCOUNT;  -- 0
BEGIN TRAN;
SELECT @@TRANCOUNT;  -- 1
BEGIN TRAN;
SELECT @@TRANCOUNT;  -- 2
BEGIN TRAN;
SELECT @@TRANCOUNT;  -- 3
ROLLBACK;  -- undoes all nested transactions
SELECT @@TRANCOUNT;  -- 0

If the code might have nested transactions, then it's a good idea to examine @@TranCount (or Xact_State()), because attempting to COMMIT or ROLLBACK a transaction when no pending transaction exists will raise a 3902 or 3903 error with a 16 severity code to the client.

Implicit transactions

While SQL Server requires an explicit BEGIN TRANSACTION to initiate a logical transaction, this behavior can be modified so that every DML statement starts a logical transaction if one is not already started (so you don't end up with numerous nested transactions). It's as if there were a hidden BEGIN TRANSACTION before every DML statement. This means that once a SQL DML command is issued, a COMMIT or ROLLBACK is required. To demonstrate implicit transactions, the following code alone will not commit the UPDATE:

USE AdventureWorks2008;
SET Implicit_Transactions ON;

UPDATE HumanResources.Department
    SET Name = 'Department of Redundant Departments'
    WHERE DepartmentID = 2;

Viewing the @@TranCount global variable does indeed show that there's one pending transaction level awaiting a COMMIT or ROLLBACK:

SELECT @@TRANCOUNT;

Result:

1

Adding a COMMIT TRANSACTION to the end of the batch commits the transaction, and the update is finalized:

COMMIT TRANSACTION;

Multiple DML commands or batches will occur within a single logical transaction, so it doesn't create a bunch of nested transactions — what a mess that would be. Turning off implicit transactions, as shown here, only affects future batches. It does not commit any pending transactions:

SET Implicit_Transactions OFF;

Implicit_Transactions ON is the default behavior for Oracle, and adjusting to explicit transactions takes getting used to for Oracle developers moving up to SQL Server. On the other hand, setting your buddy's connection (to the development server) to Implicit_Transactions ON might make for a good April Fool's Day joke!

Save points

It is also possible to declare a save point within the sequence of tasks and then roll back to that save point only. However, I believe that this mixes programmatic flow of control (IF, ELSE, WHILE) with transaction handling. If an error makes it necessary to redo a task within the transaction, it's cleaner to handle the error with standard error handling than to jury-rig the transaction handling.
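For completeness, here's a minimal sketch of the save point syntax (my illustration; the save point name is arbitrary). Note that rolling back to a save point does not end the transaction or change @@TranCount:

BEGIN TRANSACTION;

    UPDATE HumanResources.Department
        SET Name = 'First Change'
        WHERE DepartmentID = 1;

    SAVE TRANSACTION BeforeSecondChange;  -- declare the save point

    UPDATE HumanResources.Department
        SET Name = 'Second Change'
        WHERE DepartmentID = 1;

    -- Undo only the work performed since the save point;
    -- the outer transaction remains open (@@TranCount is still 1)
    ROLLBACK TRANSACTION BeforeSecondChange;

COMMIT TRANSACTION;  -- commits the first change only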
Default Locking and Blocking Behavior

When two transactions both need the same resource, SQL Server uses locks to provide transactional integrity between the two transactions. Locking and blocking isn't necessarily a bad thing — in fact, I think it's a good thing, because it ensures transactional integrity. There are different types of locks, including shared (reading), update (getting ready to write), exclusive (writing), and more. Some of these locks work well together (e.g., two people can have shared locks on a resource); however, once someone has an exclusive lock on a resource, no one else can get a shared lock on that resource — this is blocking. The different types of locks, and how compatible they are with each other, are documented in BOL.

SQL Server's default transaction isolation is read committed, meaning that SQL Server ensures that only committed data is read. While a writer is updating a row, and the data is still uncommitted, SQL Server makes other transactions that want to read that data wait until the data is committed.

To demonstrate SQL Server's default locking and blocking behavior, the following code walks through two transactions accessing the same row. Transaction 1 will update the row, while transaction 2 will attempt to select the row. The best way to see these two transactions is with two Query Editor windows, as shown in Figure 66-1.

FIGURE 66-1: Opening multiple Query Editor windows and sending the second tab into a New Vertical Tab Group (using the tab's context menu) is the best way to experiment with transactions.

Transaction 1 opens a logical transaction and updates the Department table:

-- Transaction 1
USE AdventureWorks2008;

BEGIN TRANSACTION;

UPDATE HumanResources.Department
    SET Name = 'New Name'
    WHERE DepartmentID = 1;

Transaction 1 (on my machine it's on connection, or SPID, 54) now has an exclusive (X) write lock on the row being updated, locking the key of the record being updated. The locks can be viewed using the DMV sys.dm_tran_locks (the full query and more details about locks appear later in this chapter).
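Transaction 2 then attempts to read the same row. Here's a minimal sketch of that read, reconstructed from the setup above, to run in the second Query Editor window:

-- Transaction 2 (run in the second Query Editor window)
USE AdventureWorks2008;

SELECT Name
    FROM HumanResources.Department
    WHERE DepartmentID = 1;
-- Under the default READ COMMITTED isolation level, this SELECT blocks,
-- waiting on transaction 1's exclusive (X) lock. It completes only after
-- transaction 1 issues COMMIT TRANSACTION or ROLLBACK TRANSACTION.

While the SELECT is blocked, a simplified query against the lock DMV shows the waiting request (the full query appears later in this chapter):

SELECT request_session_id, resource_type, request_mode, request_status
    FROM sys.dm_tran_locks
    WHERE resource_database_id = DB_ID('AdventureWorks2008');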