CHAPTER 32 Database Snapshots

the primary transactional database, without any data loss impact whatsoever. This is a very powerful reporting and availability configuration.

What’s New with Database Snapshots

With SQL Server 2005, everything about database snapshots was new because this was a completely new feature for SQL Server. With SQL Server 2008, there is little new to this feature other than under-the-cover improvements to the copy-on-write mechanism and three more years of production implementations under its belt. All the SQL code you have set up for creating and managing snapshots will work unchanged with SQL Server 2008. No upgrade pain here.

Database snapshots have solved many companies’ reporting, data safeguarding, and performance issues and have directly contributed to higher availability across the board. Be aware, though, that database snapshots come with plenty of restrictions. In fact, these restrictions may prohibit you from using snapshots at all. We talk about these restrictions, and about when you can safely use database snapshots, in a bit.

NOTE
The examples in this chapter are based on the SQL Server 2005 version of the AdventureWorks database rather than the newer AdventureWorks2008 or AdventureWorks2008R2 sample databases used for many of the examples in the other chapters in this book. The reason is that some of the examples create a snapshot from a database mirror. Database mirroring cannot be implemented on a database that is also configured for FILESTREAM storage, and the 2008 and 2008R2 versions of the AdventureWorks database use FILESTREAM storage. Fortunately, the 2005 version of the AdventureWorks database can be installed using the same installer that installs the AdventureWorks2008 or AdventureWorks2008R2 database.
If you didn’t install AdventureWorks when you installed either of these sample databases, simply relaunch the installer and choose to install the AdventureWorks OLTP database. For more information on downloading and installing the AdventureWorks sample databases, see the Introduction chapter.

What Are Database Snapshots?

Microsoft has kept up its commitment to providing a database engine foundation that can be highly available 7 days a week, 365 days a year.

FIGURE 32.1 Basic database snapshot concept: a source database and its database snapshot, all on a single SQL Server instance.

Database snapshots contribute to this goal in several ways:

- They decrease the recovery time of a database because you can restore a troubled database from a database snapshot (referred to as reverting).
- They create a security blanket (safeguard) prior to running mass updates on a critical database. If something goes wrong with the update, the database can be reverted in a very short amount of time.
- They quickly provide a read-only, point-in-time reporting database for ad hoc or canned reporting needs (thus increasing reporting environment availability).
- They quickly create a read-only, point-in-time, off-loaded reporting database from a database mirror for ad hoc or canned reporting needs (again, increasing reporting environment availability and also moving the reporting impact away from your production server/principal database server).
- As a bonus, database snapshots can be used to create testing or QA synchronization points that improve critical testing (thus reducing the amount of bad code that goes into production and directly affects the stability and availability of that production implementation).

A database snapshot is simply a point-in-time, full-database view. It’s not a copy, or at least not a full copy when it is originally created. We talk about this more in a moment. Figure 32.1 shows conceptually how a database snapshot can be created from a source database on a single SQL Server instance.

This point-in-time view of a database’s data never changes, even though the data (data pages) in the primary database (the source of the database snapshot) may change. It is truly a snapshot at a point in time. A snapshot always simply points to the data pages that were present at the time the snapshot was created. If a data page is updated in the source database, a copy of the original source data page is moved to a new page chain termed the sparse file. This uses copy-on-write technology. Figure 32.2 shows the sparse file that is created, alongside the source database itself.

FIGURE 32.2 Source database data pages and the sparse file data pages that comprise the database snapshot. (The sparse file of a snapshot that has just been created is empty because no updates to the original data pages have occurred yet.)

A database snapshot really uses the primary database’s data pages up until the point that one of these data pages is updated (changed in any way). As already mentioned, if a data page is updated in the source database, the original copy of the data page (which is referenced by the database snapshot) is written to the sparse file page chain as part of the update operation, using the copy-on-write technology.
It is this new data page in the sparse file that still provides the correct point-in-time data to the database snapshot that it serves. Figure 32.3 illustrates that as more data changes (updates) occur in the source database, the sparse file gets larger and larger with the old original data pages. Eventually, a sparse file could contain the entire original database if all data pages in the primary database were changed. As you can also see in Figure 32.3, which data pages the database snapshot uses from the original (source) database and which data pages it uses from the sparse file are all managed by references in the system catalog for the database snapshot. This setup is incredibly efficient and represents a major breakthrough in providing data to others.

FIGURE 32.3 Data pages being copied to the sparse file for a database snapshot as pages are being updated in the source database.

Because SQL Server uses copy-on-write technology, a certain amount of overhead is incurred during write operations. This is one of the critical factors you must sort through if you plan on using database snapshots. Nothing is free. The overhead includes the copying of the original data page, the writing of this copied data page to the sparse file, and the subsequent update of the metadata in the system catalog that manages the database snapshot’s data page list. Because of this sharing of data pages, it should also be clear why a database snapshot must reside on the same SQL Server instance as its source database: both the source database and the snapshot start out as the same data pages and then diverge as source data pages are updated.
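The sparse file growth described above can be observed directly. The following is a minimal sketch, assuming a snapshot named AdventureWorks_snap_0600 already exists (a hypothetical name, not one used in this chapter). It relies on the sys.dm_io_virtual_file_stats dynamic management function, whose size_on_disk_bytes column reports the space a sparse file actually occupies on disk.

```sql
-- Report the on-disk size of each sparse file that belongs to a snapshot.
-- Assumption: AdventureWorks_snap_0600 is a hypothetical snapshot name.
-- If the name does not exist, DB_ID returns NULL and the function
-- returns rows for every database on the instance.
SELECT DB_NAME(vfs.database_id)           AS snapshot_name,
       vfs.file_id,
       vfs.size_on_disk_bytes / 1048576.0 AS sparse_file_mb
FROM sys.dm_io_virtual_file_stats(DB_ID(N'AdventureWorks_snap_0600'), NULL) AS vfs;
```

Comparing this figure with the size of the source data files gives a rough sense of how much of the source database has changed since the snapshot was created.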
In addition, when a database snapshot is created, SQL Server rolls back any uncommitted transactions for that database snapshot; only the committed transactions are part of a newly created database snapshot. And, as you might expect of something that shares data pages, database snapshots become unavailable if the source database becomes unavailable (for example, if it is damaged or goes offline).

NOTE
You might plan to create a new snapshot after about 30% of the source database has changed to keep overhead and sparse file sizes to a minimum. The most frequent problem that occurs with database snapshots is related to sparse file sizes and available space. Remember, the sparse file has the potential of being as big as the source database itself (if all data pages in the source database eventually get updated). Plan ahead for this situation!

There are, of course, alternatives to database snapshots, such as data replication, log shipping, and even materialized views, but none are as easy to manage and use as database snapshots.

The most common terms associated with database snapshots are the following:

- Source database: This is the database on which the database snapshot is based. A database is a collection of data pages. It is the fundamental data storage mechanism that SQL Server uses.
- Snapshot databases: There can be one or more database snapshots defined against any one source database. All snapshots must reside in the same SQL Server instance.
- Database snapshot sparse file: This new data page allocation contains the original source database data pages when updates occur to the source database data pages. One sparse file is associated with each database data file. If you have a source database allocated with one or more separate data files, you have a corresponding sparse file for each of them.
- Reverting to a database snapshot: If you restore a source database based on a particular database snapshot that was taken at a point in time, you are reverting. You are actually doing a database RESTORE operation with a FROM DATABASE_SNAPSHOT clause.
- Copy-on-write technology: As part of an update transaction in the source database, a copy of the source database data page is written to a sparse file so that the database snapshot can be served correctly (that is, still see the data page as of the snapshot point in time).

As Figure 32.4 illustrates, any data query using the database snapshot looks at both the source database data pages and the sparse file data pages at the same time. And these data pages always reflect the unchanged data pages at the point in time the snapshot was created.

FIGURE 32.4 A query using the database snapshot touches both source database data pages and sparse file data pages to satisfy a query.

Limitations and Restrictions of Database Snapshots

Many restrictions or limitations are involved with using database snapshots in SQL Server. Some of them are pretty restrictive and may determine whether you can consider using snapshots. With the current release of SQL Server Management Studio, you cannot even set up database snapshots with this GUI or a wizard; it must all be done using T-SQL statements (which is not that bad a deal). The following are some of the other restrictions:

- You must drop all other database snapshots when using a database snapshot to revert a source database.
- You lose visibility into the source database’s uncommitted transactions in the database snapshot when it is created.
- The more updates to pages in the source database, the bigger your database snapshot sparse files become.
- A database snapshot can be created only for an entire database, not for a subset of the database.
- No additional changes can be made to a database snapshot. It is read-only and can’t even have additional indexes created for it to make reporting queries run faster.
- Additional overhead is incurred on update operations on the source database due to the copy-on-write technique (not with SELECT statements).
- While a database snapshot is being used to revert (restore) a source database, neither the snapshot nor the source database is available.
- The source database cannot be dropped, detached, or restored until the database snapshot is dropped first.
- Files on the source database or the snapshot cannot be dropped.
- For the database snapshot to be used, the source database must also be online (unless the source database is a mirrored database).
- The database snapshot must be on the same SQL Server instance as the source database.
- Snapshots are read-only.
- Database snapshot files must be on NTFS only (not FAT32 or RAW partitions).
- Full-text indexing is not supported.
- If a source database ever goes into a RECOVERY_PENDING status, the database snapshot also becomes unavailable.
- If a database snapshot ever runs out of disk space, it must be dropped; it is actually marked as SUSPECT.

This may seem like a lot of restrictions, and it is. But look to Microsoft to address many of these restrictions in future releases. These current restrictions may disqualify many folks from getting into the database snapshot business. Others will thrive in its use out of the box.

Copy-on-Write Technology

The copy-on-write technology that Microsoft first introduced with SQL Server 2005 is at the core of both database mirroring and database snapshot capabilities.
How it is used in database mirroring is explained in Chapter 20. In this section, we walk through a typical transactional user’s update of data in a source database. As you can see in Figure 32.5, an update transaction is initiated against the AdventureWorks database (labeled A). As the data is being updated in the source database’s data page and the change is written to the transaction log (labeled B), the copy-on-write technology also copies the original source database data page in its unchanged state to the sparse data file (also labeled B) and updates the metadata page references in the system catalog (also labeled B) to reflect this movement. The original source data page is still available to the database snapshot. This adds extra overhead to any transaction that updates, inserts, or deletes data in the source database. After the copy-on-write technology finishes its write to the sparse file, the original update transaction is properly committed, and an acknowledgment is sent back to the user (labeled C).

FIGURE 32.5 Using the copy-on-write technology with database snapshots.

NOTE
Database snapshots cannot be used for any of SQL Server’s internal databases: tempdb, master, msdb, or model. Also, database snapshots are supported only in the Enterprise Edition of SQL Server 2008.

When to Use Database Snapshots

As mentioned previously, there are a few basic ways you can use database snapshots effectively. Each use is for a particular purpose, and each has its own benefits. After you have factored in the limitations and restrictions mentioned earlier, you can consider these uses.
Let’s look at each of them separately.

Reverting to a Snapshot for Recovery Purposes

Probably the most basic usage of database snapshots is decreasing the recovery time of a database by restoring a troubled database from a database snapshot (referred to as reverting). As Figure 32.6 shows, one or more regularly scheduled snapshots can be generated during a 24-hour period, effectively providing you with data recovery milestones that can be rapidly used. As you can see in this example, four database snapshots are six hours apart (6:00 a.m., 12:00 p.m., 6:00 p.m., and 12:00 a.m.). Each is dropped and re-created once per day, using the same snapshot name. Any one of these snapshots can be used to recover the source database rapidly in the event of a logical data error (such as rows deleted or a table being dropped). This technique is not supposed to take the place of a good maintenance plan that includes full database backups and incremental transaction log backups. However, it can be extremely fast to get a database back to a particular milestone.

FIGURE 32.6 Basic database snapshot configuration: a source database and one or more database snapshots at different time intervals. (The figure also shows a targeted statement of the form UPDATE AWSource.tableX SET xyz = … FROM AWSnapshot6:00AM.tableX.)

To revert to a particular snapshot interval, you simply use the RESTORE DATABASE command with the FROM DATABASE_SNAPSHOT clause. This is a complete database restore; you cannot limit it to just a single database object. In addition, you must drop all other database snapshots before you can use one of them to restore a database.
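The scheduled snapshot and revert steps described above can be sketched in T-SQL as follows. This is a minimal sketch rather than the chapter's own script: the snapshot names, the logical file name AdventureWorks_Data, and the C:\Snapshots path are assumptions to adjust for your system. The logical name of each source data file can be read from sys.database_files in the source database.

```sql
USE master;
GO

-- Re-create the 6:00 a.m. snapshot under the same name.
-- Assumption: the snapshot name and the file path are hypothetical.
IF DB_ID(N'AdventureWorks_snap_0600') IS NOT NULL
    DROP DATABASE AdventureWorks_snap_0600;
GO

CREATE DATABASE AdventureWorks_snap_0600
ON ( NAME = AdventureWorks_Data,   -- logical name of the source data file (assumed)
     FILENAME = 'C:\Snapshots\AdventureWorks_snap_0600.ss' )  -- sparse file on an NTFS volume
AS SNAPSHOT OF AdventureWorks;
GO

-- Before reverting, list every snapshot that exists for the source database.
SELECT name
FROM sys.databases
WHERE source_database_id = DB_ID(N'AdventureWorks');
GO

-- Drop every snapshot except the one you are reverting to, for example:
IF DB_ID(N'AdventureWorks_snap_1200') IS NOT NULL
    DROP DATABASE AdventureWorks_snap_1200;
GO

-- Revert the source database to the 6:00 a.m. point in time.
RESTORE DATABASE AdventureWorks
FROM DATABASE_SNAPSHOT = 'AdventureWorks_snap_0600';
GO
```

If the source database has more than one data file, the ON clause needs one NAME/FILENAME pair per data file. The revert itself needs exclusive access to the source database, so it is best run when no other sessions are connected to it.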
As you can also see in Figure 32.6, instead of a complete database restore from a snapshot, you could use a targeted SQL statement if you knew exactly what you wanted to restore at the table and row level. You could simply use SQL statements (such as an UPDATE or INSERT statement) that read from one of the snapshots to selectively apply only the fixes you are sure need to be recovered (reverted). In other words, you don’t restore the whole database from the snapshot; you use only some of the snapshot’s data with SQL statements to bring the messed-up data row values back in line with the original values in the snapshot. This is at the row and column level and usually requires quite a bit of detailed analysis before it can be applied to a production database.

It is also possible to use a snapshot to recover a table that someone accidentally dropped. Any changes made to the table since the last snapshot are lost, but the recovery is a simple INSERT INTO statement from the latest snapshot taken before the table was dropped. So be careful here, but consider the value as well.

Safeguarding a Database Prior to Making Mass Changes

Often, you plan regular events against your database tables that result in some type of mass update being applied to big portions of the database. If you create a quick database snapshot before any of these types of changes, you are essentially creating a nice safety net for rapid recovery in the event you are not satisfied with the mass update results. Figure 32.7 illustrates this type of safeguarding technique. If you are not satisfied with the entire update operation, you can use RESTORE DATABASE from the snapshot and revert the database to this point. Or, if you are happy with some updates but not others, you can use the SQL statement UPDATE to selectively update (restore) particular values back to their original values using the snapshot.
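The safeguarding pattern above can be sketched as follows. The snapshot name AdventureWorks_safeguard, the file path, the logical file name, and the choice of Production.Product.ListPrice as the column touched by the mass update are illustrative assumptions, not taken from the chapter.

```sql
USE master;
GO

-- 1. Take a "before" snapshot ahead of the mass update.
--    Assumption: the snapshot name, logical file name, and path are hypothetical.
CREATE DATABASE AdventureWorks_safeguard
ON ( NAME = AdventureWorks_Data,
     FILENAME = 'C:\Snapshots\AdventureWorks_safeguard.ss' )
AS SNAPSHOT OF AdventureWorks;
GO

-- 2. Run the mass update against the source database (illustrative only).
UPDATE AdventureWorks.Production.Product
SET ListPrice = ListPrice * 1.10;
GO

-- 3a. Selectively put back the original values for a subset of rows
--     by reading them from the snapshot.
UPDATE src
SET src.ListPrice = snap.ListPrice
FROM AdventureWorks.Production.Product AS src
JOIN AdventureWorks_safeguard.Production.Product AS snap
    ON snap.ProductID = src.ProductID
WHERE src.ProductSubcategoryID = 1     -- hypothetical subset to restore
  AND src.ListPrice <> snap.ListPrice;
GO

-- 3b. Or undo the entire mass update by reverting
--     (all other snapshots of AdventureWorks must be dropped first).
-- RESTORE DATABASE AdventureWorks
-- FROM DATABASE_SNAPSHOT = 'AdventureWorks_safeguard';

-- 4. Drop the safeguard snapshot once the results are accepted.
DROP DATABASE AdventureWorks_safeguard;
GO
```

Dropping the safeguard snapshot promptly keeps its sparse file from growing and removes the copy-on-write overhead from later updates.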
FIGURE 32.7 Creating a before database snapshot prior to scheduled mass updates to a database. (The figure also shows a targeted statement of the form UPDATE AWSource.tableX SET xyz = … FROM AWSafeguard6:00AM.tableX.)