Detection and Locking


Chapter 18. Detection and Locking

As I pointed out in Chapter 17 when discussing Oracle's implicit locking mechanism during a transaction, locking a resource before updating it does not prevent someone else from corrupting your update with his or her own update. As a matter of fact, database locks by themselves do not solve the problem of multiuser data access integrity. Instead, you as the programmer are responsible for employing a methodology that will prevent application users from overwriting each other's data. In this chapter, we'll look into the problem of multiuser update integrity and at how you can use locks with detection (a pessimistic approach) or update detection (an optimistic approach) to ensure the integrity of data in a multiuser application.

First, we'll examine the locking options available when utilizing an Oracle database. Then we'll review the reasons why locks alone don't solve the update integrity problem. We'll continue by exploring detection techniques, that is, detecting that a change has taken place outside the current session and transaction. Next, we'll discuss several pessimistic, high-contention approaches to solving the problem of maintaining data integrity. Finally, we'll discuss an optimistic approach.

Since there's a popular notion that locking alone ensures data integrity, let's start by examining Oracle's locking mechanisms in order to debunk this notion.

18.1 Oracle's Locking Mechanisms

Oracle provides three locking mechanisms. The first is the implicit locking that automatically takes place when you execute an INSERT, UPDATE, or DELETE statement. The second is the ability to lock rows for an update by first selecting the desired rows using the FOR UPDATE clause in a SELECT statement. The third is the LOCK TABLE command. Let's review implicit locking first.
18.1.1 Implicit Locking

As we discussed in Chapter 17, if you execute an INSERT, UPDATE, or DELETE statement for a particular row, then the database implicitly locks that row until you commit or roll back the current transaction. This means that if you perform DML on a table with a primary key constraint or unique index, and you are not in auto-commit mode, another user in another session with its own transaction can see the database as it existed before you started your transaction. You may be thinking to yourself, "Well, that's good, then they can't step all over my data." But you're wrong. All implicit locking does is prevent the second user from updating the row in question until your transaction ends. At that point, her update can overwrite any changes you made without her ever knowing that you've made them.

If you, as the first user, insert a new row, a second user inserting, updating, or deleting a row with the same primary key or unique index value will wait indefinitely until you end your transaction. At that time, if the second user is inserting, her insert will fail with a primary key constraint violation. However, if the second user is updating or deleting, her update or delete will be successful. If you update a particular row instead of inserting a row, then a second user updating or deleting the same row will once again wait indefinitely until your transaction ends. And finally, if you delete a row, then a second user updating or deleting the same row will wait until your transaction ends, at which point, her update or delete will succeed, but it will succeed without affecting any rows.

In none of these instances will the second user have any indication that your actions had changed the row between the time when your SQL statement's implicit lock took place and the time when the second user's statement executed. No detection at all! This lack of detection is the data integrity problem we're concerned about.
Now, let's look at an example of explicit locking.

18.1.2 Row Locking

Using Oracle's FOR UPDATE clause in a SELECT statement, you can prelock any desired rows before updating them. The rows you lock with your SELECT statement will remain locked until you end your transaction with a commit or rollback. For example, to lock all the rows in the person table with a last name of "O'Reilly", use a SELECT statement such as the following:

    select * from person where last_name = 'O''Reilly' for update;

Of course, using FOR UPDATE means you have an extra step to perform in your program; you'll have to add a SELECT statement before each UPDATE or DELETE to explicitly lock the rows you intend to affect. This will allow you to detect if another user has already locked the desired rows by making your program wait to acquire the lock. However, this will still not solve the integrity problem completely, because you won't see if rows you are about to INSERT have already been inserted. In addition, your SELECT statement will wait indefinitely until it can acquire a lock. So, for the FOR UPDATE clause to be useful as a means of detecting if another session has a lock on something for which you wish to acquire a lock, you'll have to use it with the NOWAIT modifier. If you execute a SELECT statement with FOR UPDATE NOWAIT, and it can't acquire a lock immediately, it will generate an SQLException with the Oracle error "ORA-00054: resource busy and acquire with NOWAIT specified." You can then use the generation of this error to control how you respond when a row you want is already locked by another session.

Using explicit locks to maintain update integrity requires a great deal of additional programming effort and works only if every application that updates the database uses the technique. And that's unlikely!

18.1.3 Table Locking

Now that you know how to explicitly lock rows, let's look at how to lock an entire table. Locking an entire table is a very high-contention action.
Regardless of locking mode, no other user on the system will be able to modify the table until you end your transaction, so use this approach only as a last resort. In 15 years of using Oracle and building applications, I have had only one instance in which it was necessary to lock an entire table. The LOCK TABLE command syntax is:

    lock table table_name in lock_mode mode [nowait]

which breaks down as:

table_name
    The name of the table you wish to lock

lock_mode
    One of five possible lock modes: share, share update, exclusive, row share, row exclusive

nowait
    An optionally specified modifier that makes the LOCK TABLE command return an error if it cannot acquire the lock immediately

Although you can lock an entire table, another user can still queue an update that will wait until your transaction is finished with no knowledge of the changes you are making while the table is locked, and hence, no update detection. By now you must be coming to the realization that locking alone does not ensure data integrity. But just in case you aren't fully convinced, let's take a look at an example that proves my point.

18.1.4 Locks Alone Don't Solve the Problem

The easiest way to demonstrate that locks alone don't solve the data integrity problem is to open two SQL*Plus sessions and issue some SQL statements to show how one user can overwrite another user's updates. For this experiment, we'll use the TEST_TRANS table created in Chapter 17. If that doesn't exist, and you want to follow along, you'll need to create it now. Recall that TEST_TRANS has two columns: COL1, which is the table's primary key, and COL2. Both are VARCHAR2 columns.

Let's start our experiment by inserting a new row into TEST_TRANS from session one:

    SQL> insert into test_trans values ( '1', 'X' );
    1 row created.

Then let's commit:

    SQL> commit;
    Commit complete.

Next, still from session one, let's update the row we just inserted:

    SQL> update test_trans set col2 = 'Y' where col1 = '1';
    1 row updated.
Now, from session two, let's select the row from TEST_TRANS where COL2 is equal to X:

    SQL> select * from test_trans where col2 = 'X';
    C C
    - -
    1 X

At this point, session two knows that the row with a primary key equal to 1 has a value of X. Let's now say that session two wants the row to have a value of Z in COL2. Unbeknownst to session two, however, the row no longer has a value of X in COL2. Session one has changed the value of COL2 to Y, but session two can't see that change because session one has not committed the change. As far as session two is concerned, the row has the value X. To change that value to Z, session two executes the following UPDATE statement:

    SQL> update test_trans set col2 = 'Z' where col1 = '1';

After issuing this statement, session two waits indefinitely until session one commits, thus releasing its lock. So let's proceed by committing session one's transaction:

    SQL> commit;
    Commit complete.

Now session two's update succeeds, and session two also commits its changes:

    1 row updated.
    SQL> commit;
    Commit complete.

If you requery the row from session one, you'll see that the value is not Y, which was just set in session one, but is instead Z. While this value seems legitimate to session two, it's probably not the result the session one user expected to see after having just changed the value to Y. What went wrong? Session two had no opportunity to detect that the row with a value of 1 in COL1 and a value of X in COL2 no longer had a value of X in COL2, because the original value of COL2 was not used in the WHERE clause of the UPDATE statement it issued to change the value to Z. To solve this problem, we need to include the original value of COL2 in the WHERE clause as a form of update detection.

18.2 Detection

Detection, in our current discussion, is the ability to detect if data you are about to modify has changed since the point when you selected it to be updated. There are several tactics you can employ for detection.
Let me clarify that we are no longer discussing locking, but detection. Detection is independent of locking. The first two detection tactics we will discuss are pessimistic. By pessimistic, I mean it is assumed that a user in another session will most likely modify all the columns of a row of data you just selected to be updated by your program. One pessimistic detection approach is to use an updatestamp. As an alternative to using an updatestamp, you can compare all the columns of a table, or attributes of an object, to their original values in the WHERE clause of any UPDATE statement that you issue. The third detection tactic is optimistic. It operates under the premise that a user in another session is not likely to modify the same data that you intend to modify. It entails comparing only modified columns or attributes in a WHERE clause. Let's examine each tactic in detail, beginning with an updatestamp.

18.2.1 Using an Updatestamp

An updatestamp is a number column in a table or a number attribute in an object. The database increments its value by 1 each time you modify a given row or object. That way, you can compare the updatestamp you retrieved from the database to its current value to detect a modification of the row or object by another session.

The benefit of using an updatestamp is that it makes formulating a WHERE clause for an UPDATE statement fairly simple. Typically, you need to include only the primary key and the updatestamp in the WHERE clause. The drawback is that you have to add code to your application, or even to the database in the form of triggers, to increment the updatestamp every time a row or object is updated. Adding an updatestamp to a row or object also means that you have to add an additional column or attribute to every table or object type you use in your database.
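To make the mechanics concrete, here is a minimal in-memory sketch of updatestamp detection in Java. It is not Oracle or JDBC code: the UpdatestampDemo class, its Row type, and its methods are hypothetical stand-ins for a table with a key column, a COL2 column, and an updatestamp column. The update method imitates an UPDATE statement whose WHERE clause names only the primary key and the updatestamp.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a table with an updatestamp column (not Oracle code). */
final class UpdatestampDemo {
    /** One row: a value plus the counter that is bumped on every change. */
    static final class Row {
        final String col2;
        final long updatestamp;

        Row(String col2, long updatestamp) {
            this.col2 = col2;
            this.updatestamp = updatestamp;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    synchronized void insert(String key, String col2) {
        rows.put(key, new Row(col2, 0));
    }

    synchronized Row select(String key) {
        return rows.get(key);
    }

    /**
     * Imitates:
     *   update t set col2 = ?, updatestamp = updatestamp + 1
     *   where col1 = ? and updatestamp = ?
     * Returns the number of rows affected (0 or 1).
     */
    synchronized int update(String key, long expectedStamp, String newCol2) {
        Row current = rows.get(key);
        if (current == null || current.updatestamp != expectedStamp) {
            return 0; // someone else changed (or removed) the row since it was read
        }
        rows.put(key, new Row(newCol2, current.updatestamp + 1));
        return 1;
    }

    public static void main(String[] args) {
        UpdatestampDemo table = new UpdatestampDemo();
        table.insert("1", "X");
        Row seenByOne = table.select("1"); // both sessions read updatestamp 0
        Row seenByTwo = table.select("1");
        System.out.println(table.update("1", seenByOne.updatestamp, "Y")); // 1
        System.out.println(table.update("1", seenByTwo.updatestamp, "Z")); // 0
        System.out.println(table.select("1").col2);                        // Y
    }
}
```

Both sessions read an updatestamp of 0. The first update bumps it to 1, so the second session's update matches no row and reports 0 rows affected, which is the signal that the row changed after it was selected.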
If you find adding an updatestamp column undesirable, you can use the second detection method, in which you compare all columns and attributes of the table (or object) that you are updating to their original values in the WHERE clause of your UPDATE statement.

An Updatestamp Versus a Timestamp

In databases other than Oracle, you might be able to add an additional column or attribute of a timestamp data type and then have the database update the timestamp every time a row or object is updated. Then, you can compare the value of the timestamp in the database to the original value you retrieved prior to your update in the SQL statement's WHERE clause. However, this does not work with Oracle because Oracle's timestamp data type, DATE, holds values only down to the second. In addition, if you create a custom data type, such as a number, to hold the time value down to milliseconds, you'll find that Oracle can still perform several hundred updates within that time frame. So a feasible approach is to use an updatestamp.

18.2.2 Comparing All Columns or Attributes to Their Original Values

A second detection tactic is to compare all the columns (or attributes) in the table (or object) that you are updating to their original values. You do this as part of your UPDATE statement in the WHERE clause, so the UPDATE statement affects no rows if someone else has modified the row in question. With reference to my earlier example, when session two retrieved the row in which COL2 contained an X, it found that COL1 was equal to 1. To rewrite session two's UPDATE statement to include detection, add a comparison of COL2 to its original value. For example:

    update test_trans set col2 = 'Z' where col1 = '1' and col2 = 'X';

Execute this UPDATE statement from session two in the earlier example, and no rows will be affected, because COL2 has been modified by session one and is no longer equal to X.
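In JDBC, the same detecting update might look like the following sketch. It assumes an open java.sql.Connection to the schema that holds TEST_TRANS. The DetectingUpdate class and its method names are hypothetical and are not taken from the chapter's own code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper; assumes the TEST_TRANS table described in the text. */
final class DetectingUpdate {
    // The statement from the text, with bind variables in place of literals.
    static final String UPDATE_SQL =
        "update test_trans set col2 = ? where col1 = ? and col2 = ?";

    /**
     * Tries to change COL2 from originalCol2 to newCol2 for the given key.
     * Returns true if the row was updated, or false if no row matched,
     * meaning the row changed (or was deleted) after it was selected.
     * Assumes originalCol2 is not null: in SQL, "col2 = NULL" never matches.
     */
    static boolean updateCol2(Connection conn, String col1,
                              String originalCol2, String newCol2)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPDATE_SQL)) {
            ps.setString(1, newCol2);
            ps.setString(2, col1);
            ps.setString(3, originalCol2);
            return wasApplied(ps.executeUpdate());
        }
    }

    /** Interprets the row count reported by executeUpdate(). */
    static boolean wasApplied(int rowsAffected) {
        return rowsAffected > 0;
    }
}
```

With this helper, session two would call something like updateCol2(conn, "1", "X", "Z") and treat a false result as a cue to reselect the row before trying again.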
You can use the fact that the executeUpdate( ) method returns 0 for the number of rows affected to determine that the update was not successful. You then know that the row was changed between the time you selected it and the time you attempted your update.

The benefit of using all columns or attributes in a WHERE clause is that you don't have to add an additional updatestamp column or attribute to your tables and objects. The drawback is that you have to formulate a more complex, and larger, WHERE clause. If a table has 20 columns, you have to compare all 20 columns to their original values.

One problem with both pessimistic methods of detection is that they prevent more than one user from updating a row in a table at any point in time without an update failure, in the sense that a second updater will always fail. Accordingly, we call these low-concurrency methods. The fact that they essentially check on an entire row or object is what makes them pessimistic.

18.2.3 Comparing Modified Columns or Attributes to Their Original Values

The third method, which provides a high level of concurrency, is to compare only modified columns to their original values. To facilitate a high amount of concurrency, that is, the ability for multiple users to update the same row or object without update failure, you can detect changes that are relevant only to your UPDATE statement by including only the primary key and the modified columns in the WHERE clause of the UPDATE statement. For example, let's use the person table we created in Chapter 8. Let's assume that we insert a row:

    insert into person
    ( person_id, last_name, first_name, middle_name, birth_date, mothers_maiden_name )
    values
    ( 1, 'O''Reilly', 'Tom', null, to_date( '19800315', 'YYYYMMDD' ), 'Unknown' );

Let's further assume that two sessions select the values of the row just inserted.
After both sessions select the row, session one executes the following UPDATE statement and commits:

    update person set first_name = 'Tim'
    where person_id = 1 and first_name = 'Tom';

Session two then executes the following UPDATE statement and commits:

    update person set birth_date = to_date( '19800317', 'YYYYMMDD' )
    where person_id = 1 and birth_date = to_date( '19800315', 'YYYYMMDD' );

Both statements will execute successfully. The trickery here is that while these two UPDATE statements both update the same row, they each modify different columns in that row. As a result, they don't overwrite each other's changes during the update.

This technique of using the primary key and modified columns provides the highest amount of concurrency, the least amount of update failure, and prevents unintended corruption of data due to lack of detection. Since it allows more than one user to update a row, I consider it an optimistic approach. The downside of this tactic is that the formulation of WHERE clauses becomes quite complex to code.

Now that you've seen what locking and detection can do individually, let's examine how to combine them to effectively protect the integrity of updates in a multiuser environment.

18.3 Data Integrity Solutions

Whenever more than one user accesses a database, there is the possibility that one user will inadvertently overwrite another user's data. As we have seen from this chapter's earlier discussions, locks alone do not guarantee data integrity. Indeed, some form of change detection is also needed. In this section, we'll take what we've learned about locking and detection and formulate two pessimistic solutions and one optimistic solution to maintaining data integrity.

18.3.1 Pessimistic Data Integrity Solutions

Let's start our discussion of maintaining data integrity by taking a look at two pessimistic approaches. The first is to use row locking by selecting a row FOR UPDATE NOWAIT before updating it.
The second is to use implicit locking and detection.

18.3.1.1 SELECT FOR UPDATE NOWAIT

If every user of a database uses the technique of selecting a row FOR UPDATE with NOWAIT, then data integrity will be maintained, because a second user will not be able to acquire a lock on the data until the first has committed his changes. But what about detection? Detection is implicit in the fact that an application will get an SQLException with Oracle error "ORA-00054" if it cannot immediately lock the desired row. A row that can't be locked is one that is already being modified by someone else. Hence, the contention between updates is detected before it even exists.

Although the SELECT FOR UPDATE NOWAIT approach works well, it has major concurrency and coding drawbacks. First, as soon as one user locks a row FOR UPDATE, it remains locked until that person commits or rolls back. This means that if he selects a row, then goes on vacation for two weeks leaving his computer running, the row remains locked until he returns and commits or rolls back. Now that's a great deal of contention for the same row of data! A workaround to this problem is to retrieve a row to present its data to a user and then re-retrieve the row FOR UPDATE NOWAIT after the user has made changes. Then, compare the original values returned by the first query to those returned by the reselect just before updating the row. This brings us to another drawback: this method can require a great deal of extra coding. Finally, there is also the fact that every user of the database must play by the rules in order for this technique to work.

A less contentious tactic is to use pessimistic detection with implicit locking. Let's examine that tactic next.

18.3.1.2 Pessimistic detection and implicit locking

Pessimistic detection is the use of an updatestamp or of all columns of a table (or all attributes of an object) in a SQL statement's WHERE clause to detect if changes have occurred in a target row or object.
Implicit locking happens automatically when any INSERT, UPDATE, or DELETE statement is executed. If you combine implicit locking with the use of an updatestamp or with the checking of all columns, you will maintain data integrity. The drawback to this approach is straightforward: any change to a row by another session after the row has been selected in your session will result in an update failure. You'll then have to notify your software user that he will need to start over, essentially reselecting the data and updating once again.

With the technique described in this section, you get lower contention, because locks occur only when a change has taken place, but you trade contention for update failures. Since you'll create a simple WHERE clause, the construction of which is predictable, the coding burden is minimized, but you trade WHERE clause complexity for the need to notify the user that an update has failed. Overall, this is a better technique than using explicit locking when you have a large application with many users. An even better tactic for a large application, however, is to use an optimistic data integrity solution.

18.3.2 An Optimistic Data Integrity Solution

An optimistic solution to the multiuser data integrity problem is to combine optimistic detection with implicit locking. In this scenario, you store the original values of columns or attributes when you retrieve a row or an object and use them in a dynamically constructed WHERE clause when you later update that row or object. In the WHERE clause, you compare the primary key, and any modified columns or attributes, to their original values. If the original values all match the values currently in the database, you know that no other user has updated the same columns that you yourself are updating.
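One way such a dynamically constructed statement might be assembled is sketched below in Java. The OptimisticUpdateBuilder class, its BuiltUpdate type, and the build method are hypothetical names introduced for illustration. The sketch only produces the SQL text and the bind values; actually executing it through a PreparedStatement is left out. It also tests a null original value with "is null", since an equality comparison against NULL never matches in SQL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Hypothetical builder for an UPDATE that checks only the modified columns. */
final class OptimisticUpdateBuilder {
    /** SQL text plus its bind values, in placeholder order. */
    static final class BuiltUpdate {
        final String sql;
        final List<Object> binds;

        BuiltUpdate(String sql, List<Object> binds) {
            this.sql = sql;
            this.binds = binds;
        }
    }

    /**
     * Builds "update ... set <modified> where <key> and <modified = original>".
     * Table and column names are concatenated into the SQL, so they must come
     * from trusted code, never from user input. Column order follows the
     * iteration order of the current map. Returns empty if nothing changed.
     */
    static Optional<BuiltUpdate> build(String table, String keyColumn, Object keyValue,
                                       Map<String, Object> original,
                                       Map<String, Object> current) {
        List<String> sets = new ArrayList<>();
        List<String> checks = new ArrayList<>();
        List<Object> setBinds = new ArrayList<>();
        List<Object> whereBinds = new ArrayList<>();

        checks.add(keyColumn + " = ?");
        whereBinds.add(keyValue);

        for (Map.Entry<String, Object> entry : current.entrySet()) {
            String column = entry.getKey();
            Object oldValue = original.get(column);
            Object newValue = entry.getValue();
            if (Objects.equals(oldValue, newValue)) {
                continue; // unchanged columns are neither set nor checked
            }
            sets.add(column + " = ?");
            setBinds.add(newValue);
            if (oldValue == null) {
                // "column = NULL" never matches in SQL, so test for null explicitly
                checks.add(column + " is null");
            } else {
                checks.add(column + " = ?");
                whereBinds.add(oldValue);
            }
        }
        if (sets.isEmpty()) {
            return Optional.empty();
        }
        List<Object> binds = new ArrayList<>(setBinds);
        binds.addAll(whereBinds);
        String sql = "update " + table
            + " set " + String.join(", ", sets)
            + " where " + String.join(" and ", checks);
        return Optional.of(new BuiltUpdate(sql, binds));
    }
}
```

For the person row used earlier, changing only first_name from Tom to Tim would yield a statement of the form "update person set first_name = ? where person_id = ? and first_name = ?", which has the same shape as session one's UPDATE in section 18.2.3.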
While dynamic coding of the WHERE clause requires a good deal of programming, the fact that you employ implicit locking along with optimistic detection provides the highest level of concurrency while reducing the number of update failures. This technique works regardless of how other applications use the database. You still have to code update failure notification, but your program's users will not see many such messages.

18.3.3 Which Approach to Use?

So which data integrity approach is the right tactic for you? Many development tools employ the pessimistic approaches but suffer when a second or third tool is used to develop against the same database. With an optimistic approach you can't get into data integrity trouble regardless of what other tools are used. Ultimately, you have to decide based on how differing technologies access the same database. Personally, I always choose an optimistic approach. Then I don't have to be concerned about scalability.

Now that you're an expert at detection and locking and have several tactics you can employ to maintain data integrity, let's move on to Chapter 19, in which we'll examine myths and facts of JDBC performance.

Date posted: 29/09/2013, 09:20
