2. Note the logical name of the data files listed under the name column and their sizes. In the preceding example, there is only one data file, with the logical name AdventureWorks_Data and a size of 200640 KB (about 195 MB).

3. For each data file in the source database, there is a data file in the database snapshot. The data files in the database snapshot are different from the source database data files: they are NTFS sparse files. When the database snapshot is created, the sparse files are empty and do not contain any user data. Because the sparse files can potentially grow up to the size of the source database's data files at the time the database snapshot was created, verify that the volume where you want to place the database snapshot has enough free space. Even though you can create the database snapshot on a volume with very little space, it is recommended that the volume have at least as much free space as the source database occupies when the snapshot is created. If the volume runs out of space, the database snapshot is marked suspect, becomes unusable, and must be dropped.

4. Execute the following Transact-SQL command to create the database snapshot of the AdventureWorks database:

CREATE DATABASE AdventureWorks_Snapshot
ON (NAME = AdventureWorks_Data,
    FILENAME = 'J:\MSSQL10.INST1\MSSQL\DATABASE SNAPSHOTS\AdventureWorks_Snapshot.snap')
AS SNAPSHOT OF AdventureWorks;
GO

The FILENAME path must be specified on a single line. If the path wraps and a line break is introduced into it, you will get the error shown here:

Msg 5133, Level 16, State 1, Line 1
Directory lookup for the file "J:\MSSQL10.INST1\MSSQL\DATABASE SNAPSHOTS\AdventureWorks_Snapshot.snap" failed with the operating system error 123(The filename, directory name, or volume label syntax is incorrect.).

Best Practice
To make it easier to use the database snapshot, think about how you want to name it before you start. One method is to include the source database name, some indication that it is a snapshot, the time it was created, and a meaningful extension. The preceding example uses the name AdventureWorks_Snapshot and .snap as the extension to differentiate the database snapshot files from regular database files.

You might have observed that the preceding example creates the database snapshot on a different volume than the source database. Placing the database snapshot on a physically separate volume is a best practice because it avoids disk contention and provides better performance.

5. Once the database snapshot is created, you can view it using Management Studio. Connect to the SQL Server instance using Object Explorer in Management Studio. Expand Databases and then Database Snapshots to see all the database snapshots, as shown in Figure 45-2.

FIGURE 45-2 Viewing database snapshots in Object Explorer

You can also query the sys.databases catalog view and review the source_database_id column. If this column is NULL, the row represents a regular database; if it is not NULL, it holds the database ID of the source database for the database snapshot.
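For example, a query along these lines — a minimal sketch using only the sys.databases columns mentioned above — lists each database snapshot on the instance together with its source database:

-- List database snapshots and the source database each one was created from.
-- source_database_id is NULL for regular databases and populated for snapshots.
SELECT snap.name AS snapshot_name,
       src.name  AS source_database_name,
       snap.create_date
FROM sys.databases AS snap
JOIN sys.databases AS src
  ON snap.source_database_id = src.database_id;
GO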
6. To find out the space used by the database snapshot, open Windows Explorer, right-click the data file of the database snapshot, and select Properties, as shown in Figure 45-3. The Size value (195 MB in Figure 45-3) is not the actual size of the file: it is the maximum size of the file, and it should be about the same as the size of the source database when the database snapshot was created. The Size on disk value (16,384 bytes in Figure 45-3) is the actual size of the database snapshot data file.

FIGURE 45-3 Viewing the size of the database snapshot data file

Alternatively, you can find the actual size using the dynamic management function sys.dm_io_virtual_file_stats, as shown here:

SELECT size_on_disk_bytes
FROM sys.dm_io_virtual_file_stats(DB_ID(N'AdventureWorks_Snapshot'), 1);
GO

Result:
16384

Using Your Database Snapshots

Once a database snapshot is created, users can query it as if it were a regular database. For example, the following query retrieves the Name, ProductNumber, and ListPrice columns from the Product table in the AdventureWorks_Snapshot database snapshot:

USE AdventureWorks_Snapshot;
GO
SELECT Name, ProductNumber, ListPrice
FROM Production.Product
ORDER BY Name ASC;
GO

Users cannot make any updates to the database snapshot, however, as it is read-only. If a user tries to update the database snapshot, he or she will receive error 3906, as shown here:

Msg 3906, Level 16, State 1, Line 1
Failed to update database "AdventureWorks_Snapshot" because the database is read-only.
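For example, an update attempt along the following lines is expected to fail with that error; the specific row and price change are purely illustrative:

USE AdventureWorks_Snapshot;
GO
-- Any data modification against a database snapshot is rejected because the
-- snapshot is read-only; this statement is expected to fail with error 3906.
UPDATE Production.Product
SET ListPrice = ListPrice * 1.10
WHERE ProductID = 1;
GO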
When a user reads from the database snapshot, SQL Server accesses the in-memory bitmap and determines whether the data it needs exists in the source database or in the database snapshot. If a data page was not updated, it still exists in the source database and SQL Server reads it from there. Figure 45-4 shows a read operation accessing the updated page from the database snapshot and the remaining pages from the source database.

FIGURE 45-4 Users querying the database snapshot access updated pages from the database snapshot and unchanged pages from the source database

Some common uses of database snapshots include the following:

■ Generate reports without blocking the production/source database: Database snapshots can be used to run reports based on the data as it was at the time the snapshot was created. When reads are executed on the database snapshot, no locks are held and hence there is no blocking.

■ Maintain historical data: You can create a database snapshot at a particular time, such as the end of the financial year, and run end-of-financial-year reports against the database snapshot.

■ Perform reporting on a mirror database: The mirror database cannot be queried by default, as it is in NORECOVERY mode. If you want to use the mirror database for reporting purposes, you can create a database snapshot on the mirror database and read from the database snapshot.

■ Recover from user or administrator errors: A database snapshot can be recreated periodically and refreshed by a SQL Server Agent job, whereby the previous database snapshot is dropped and recycled. If a user accidentally drops an important table or commits an update that wipes out critical data, use the database snapshot to recover that data directly from the snapshot without having to restore the database. For example, if a user accidentally drops a table and the table existed when the database snapshot was created, there are several ways to recover the data easily (a sketch of the last approach appears after this list of common uses):
  ■ Use Management Studio to script the table from the database snapshot.
  ■ Recreate the table by running the generated script against the source database.
  ■ Repopulate the data using an INSERT INTO ... SELECT statement, selecting the rows from the table in the database snapshot and inserting them into the newly created table in the source database.

■ Revert the source database to an earlier time: You can use the database snapshot to recover the source database by reverting it to the way it was when you created the snapshot, using the Transact-SQL RESTORE command. Reverting to a database snapshot rebuilds the transaction log and breaks the log backup chain. This means that you cannot perform point-in-time restores for the period from the last log backup to the time when you reverted to the database snapshot. If you want to perform point-in-time restores in the future, take a full or differential backup and then start taking log backups again. The following example reverts the AdventureWorks database to the AdventureWorks_Snapshot database snapshot:

USE master;
GO
RESTORE DATABASE AdventureWorks
FROM DATABASE_SNAPSHOT = 'AdventureWorks_Snapshot';
GO

If the source database has multiple database snapshots and you attempt to revert the database to one of them, you will receive error 3137, as shown here:

Msg 3137, Level 16, State 4, Line 1
Database cannot be reverted. Either the primary or the snapshot names are improperly specified, all other snapshots have not been dropped, or there are missing files.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.

As the error message indicates, you will need to drop all the database snapshots except the one to which you want to revert.

A database snapshot is not a replacement for your regular backup and restore strategy. For example, if a disk failure results in the loss of the source database, you cannot use the database snapshot to recover it; you will need good backups. Use database snapshots to supplement your restore strategy only for quickly restoring a table that has been accidentally dropped or rows that have been deleted. I recommend continuing to take regular backups, and restoring them periodically to test them, to protect your data and minimize data loss when a disaster occurs.

■ Manage a test database: Database snapshots can be used effectively in a test environment. Before running the tests, create a database snapshot of the test database. After the tests are completed, you can quickly return the test database to its original state by reverting to the database snapshot. Before the database snapshot feature was introduced, developers and testers used a backup and restore process to maintain the test database, but the restore process typically takes longer than reverting to a database snapshot.
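Returning to the "Recover from user or administrator errors" item above, the following is a minimal sketch of the last recovery approach. The dbo.PriceList table, its columns, and their definitions are illustrative, not part of the AdventureWorks sample:

USE AdventureWorks;
GO
-- dbo.PriceList is a hypothetical table that was accidentally dropped from the
-- source database but still exists in the AdventureWorks_Snapshot snapshot.
-- Step 1: recreate the table (in practice, script it from the snapshot in
-- Management Studio to preserve the exact original definition).
CREATE TABLE dbo.PriceList
(
    ProductID int   NOT NULL PRIMARY KEY,
    ListPrice money NOT NULL
);
GO
-- Step 2: copy the rows back from the database snapshot using three-part names.
INSERT INTO AdventureWorks.dbo.PriceList (ProductID, ListPrice)
SELECT ProductID, ListPrice
FROM AdventureWorks_Snapshot.dbo.PriceList;
GO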
Starting with SQL Server 2005, DBCC CHECKDB uses an internal database snapshot to run online. This database snapshot provides the transactionally consistent view that DBCC CHECKDB requires, and it prevents blocking and concurrency problems while DBCC CHECKDB runs. The internal database snapshot is created in the same location where the database being checked is stored, and you cannot control that location. Depending on the transaction load running concurrently with DBCC CHECKDB, the internal database snapshot can grow, which means it is possible for the disk to run out of space on a SQL Server instance under heavy load. If this happens, the database snapshot files cannot grow, which stops the progress of both your workload and DBCC CHECKDB. To proactively avoid this problem, create your own database snapshot on a volume with sufficient space and run DBCC CHECKDB against it. This is exactly the same as running DBCC CHECKDB normally and letting it create its own internal database snapshot, except that you control where the sparse files are placed.

As discussed earlier, you cannot back up or restore database snapshots, nor can you attach or detach them. You can run reports against a database snapshot and drop it when you no longer need it, when it becomes too big, or when the disk on which it is located runs out of space and the database snapshot becomes suspect.

Dropping a database snapshot is similar to dropping a regular database. The only difference is that users do not have to be terminated before the database snapshot is dropped; they are terminated automatically when it is dropped. The following example drops the database snapshot named AdventureWorks_Snapshot:

DROP DATABASE AdventureWorks_Snapshot;

The preceding Transact-SQL command drops the database snapshot and deletes its files. You can also drop the database snapshot from Management Studio by right-clicking it and selecting Delete.

Performance Considerations and Best Practices

Although database snapshots are an excellent and very useful feature, if there is heavy I/O activity on the source database they can negatively affect performance, reducing the throughput and response time of the applications running against the source database. This section discusses why database snapshots affect performance and some best practices to minimize the impact.

Creating a database snapshot does not take long if there is little write activity on the source database. Most of the time taken to create a database snapshot is spent in the recovery phase that SQL Server performs to make the database snapshot transactionally consistent. If write activity on the source database is high when the database snapshot is being created, recovery takes longer and thereby increases the time required to create the snapshot.

After the database snapshot is created, every first-time update to a source database page must be written to the database snapshot. If there are multiple database snapshots, every first-time update to a source database page must be written to all of them. This write activity increases the I/O load on the system and affects the throughput and response time of the applications running against the source database. To monitor the waits caused by database snapshots, query the sys.dm_os_wait_stats dynamic management view. The REPLICA_WRITES wait type occurs when a task is waiting for the completion of page writes to database snapshots.
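As a rough sketch, a query such as the following reports how often tasks have waited on that wait type; a steadily growing waiting_tasks_count or wait_time_ms suggests that snapshot page writes are delaying the workload:

-- Check how long tasks have waited on page writes to database snapshots.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'REPLICA_WRITES';
GO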
Apart from the increased I/O load on the system, it is important to note that index maintenance operations take longer when there are database snapshots on the source database. Even though data is not updated during index creation or rebuild operations, rows are moved between pages and the original pages need to be written to the database snapshots. The impact is most visible during clustered index creation and rebuild operations.

Best practices to minimize the performance impact of database snapshots include the following:

■ To reduce disk contention, isolate the source data files, the transaction log file, and the database snapshot files by placing them on independent disk volumes.
■ Use Performance Monitor to see the impact of the database snapshots on the system. If there is a disk bottleneck, consider reducing the number of database snapshots and/or using more or faster spindles to support your I/O requirements with an acceptable latency.
■ Do not create database snapshots during, or just before, index maintenance operations.
■ Ensure that there is enough free disk space for the database snapshot to grow. If the source database is updated frequently, the database snapshot will grow and can become as large as the source database was at the time of the snapshot's creation.
■ Schedule a SQL Server Agent job to drop older versions of database snapshots on the source database.

Summary

Database snapshots bring new reporting, recovery, and comparison functionality to SQL Server. Some key points from the chapter are as follows:

■ Database snapshots enable maintaining previous images of a production database for reporting and recovery purposes.
■ Database snapshots enable reporting on a mirror database, which is not possible otherwise.
■ Although database snapshots are powerful and flexible, they are I/O intensive. If multiple database snapshots are created on a source database that is heavily updated, the I/O load will increase and may impact the system's performance.

As with any other feature, it is highly recommended that you take a performance baseline of your environment before and after creating database snapshots, and use your performance data to help you determine whether you should use database snapshots or not.

Log Shipping

IN THIS CHAPTER
Configuring a warm standby server using Management Studio and Transact-SQL
Monitoring log shipping
Modifying or removing log shipping
Switching roles
Returning to the original primary server

The availability of a database refers to the overall reliability of the system. The Information Architecture Principle, discussed in Chapter 2, ‘‘Data Architecture,’’ lays the foundation for availability in the phrase readily available. The definition of readily available varies by the organization and the data. A database that's highly available is one that rarely goes down. For some databases, being down for an hour is not a problem; for others, 30 seconds of downtime is a catastrophe. Organization requirements, budget constraints, and other resources dictate the proper solution. Of course, availability involves more than just the database, as there are several technologies involved outside of the database: the instance, the server OS, the physical server, the organization's infrastructure, and so on.
The quality and redundancy of the hardware, the quality of the electrical power, preventive maintenance of the machines and replacement of the hard drives, the security of the server room — all of these contribute to the availability of the primary database. An IT organization that intends to reach any level of high availability must also have the right people, training, policies, and service-level agreements (SLAs) in place.

This chapter is the first of a trilogy of chapters dealing with high-availability technologies: log shipping, database mirroring, and clustering. Backup and recovery, along with replication, and even SQL Data Services (SQL in the cloud) are also part of the availability options. A well-planned availability solution will consider every option and then implement the technologies that best fit the organization's budget and availability requirements. A complete plan for high availability will also include a plan to handle true disasters. If the entire data center is suddenly gone, is another off-site disaster recovery site prepared to come online?

Best Practice
Before implementing an advanced availability solution, ensure that the primary server is well thought out and provides sufficient redundancy. The most common issue won't be the data center melting, but a hard drive failure or a bad NIC card.

Log shipping is perhaps the most common method of providing high availability. The basic idea is that the transaction log, with its record of the most recent transactions, is regularly backed up and then sent, or shipped, to another server where the log is applied, so that server has a pretty fresh copy of all the data from the primary server. Log shipping doesn't require any special hardware or magic, and it's relatively easy to set up and administer.

There are three obstacles to getting log shipping to work smoothly. First, the policies and procedures must be established, implemented, and then regularly tested. The second is a bit trickier: The client applications need a way to detect that the primary server is down and then switch over to the standby server. The third obstacle is a procedure to switch back to the primary server once it's repaired and ready to step back into the spotlight.

What's New in SQL Server Log Shipping?

SQL Server 2008 log shipping performance can be increased by taking advantage of the new backup compression feature in Enterprise Edition. Backup compression reduces the backup, copy, and restore times. This speeds up log shipping, as it has less data to transfer. It also reduces the disk space required by the backups, enabling disk cost savings and retaining more backups on the same disks. Although backup compression is supported only in SQL Server 2008 Enterprise Edition, every SQL Server 2008 edition can restore a compressed backup. This means that you can configure log shipping from a primary SQL Server 2008 Enterprise Edition instance to a secondary SQL Server 2008 Standard Edition instance while taking advantage of backup compression.

Starting with SQL Server 2008, log shipping jobs can be scheduled to run as frequently as every 10 seconds, both through SQL Server Management Studio and through stored procedures. SQL Server 2005 allowed the log shipping jobs to be scheduled no more frequently than every minute.
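To illustrate the backup compression point, the following standalone statement writes a compressed log backup that any SQL Server 2008 edition can restore. The database name and backup path are illustrative, and in a real log shipping configuration the backup job created by the wizard takes these backups for you:

-- Take a compressed transaction log backup (backup compression requires
-- Enterprise Edition, but any SQL Server 2008 edition can restore it).
BACKUP LOG AdventureWorks
TO DISK = N'J:\LogShipping\AdventureWorks_log.trn'
WITH COMPRESSION, INIT;
GO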
Availability Testing

A database that's unavailable isn't very useful. The availability test is a simulation of the database restore process assuming the worst. The measurement is the time required to restore the most current production data to a test server and prove that the client applications work.

Warm Standby Availability

Warm standby refers to a database that has a copy set up on separate hardware. A warm standby solution can be achieved with log shipping. Log shipping involves periodically restoring a transaction log backup from the primary server to a warm standby server, making that server ready to recover at a moment's notice. In case of a failure, the warm standby server and the most recent transaction log backups are ready to go. Apart from this, log shipping has the following benefits:

■ It can be implemented without exotic hardware and may be significantly cheaper.
■ It has been used for many years and is a very robust and reliable technology.
■ It can be used for disaster recovery, high availability, and reporting scenarios.
■ Implementing log shipping is very simple because Microsoft provides a user-friendly wizard; and once implemented, it is easy to maintain and troubleshoot.
■ The primary server and the warm standby server do not have to be in the same domain or subnet. As long as they can talk to each other, log shipping will work.
■ There is no real distance limitation between the primary and warm standby servers, and log shipping can be done over the Internet.
■ Log shipping allows shipping the transaction log from one primary server to multiple warm standby servers. It also allows different copy and restore schedules for each warm standby server. One of my clients log ships their production databases to two secondary servers: one secondary server restores the transaction logs immediately as they arrive and is used for disaster recovery; the other restores the transaction logs nightly and is used for reporting during the day.
■ Log shipping can be implemented between different editions of SQL Server (for example, Enterprise Edition to Standard Edition) and between different hardware platforms (x86, x64, or IA64-based SQL Server instances).

However, log shipping has a few drawbacks:

■ Only user databases in the full or bulk-logged recovery model can be log shipped. The simple recovery model cannot be used because it does not allow transaction log backups. This also means that log shipping will break if the recovery model of a log shipping database is changed from full or bulk-logged to simple (a query to check the recovery model appears after this list).
■ System databases (master, model, msdb, tempdb) cannot be log shipped.
■ Log shipping provides redundancy at the database level, not at the SQL Server instance level like SQL Server failover clustering. Log shipping only applies changes that are captured in the transaction log or in the initial full backup of the log shipping database. Any database objects that reside outside the log shipping database — such as logins, jobs, maintenance plans, SSIS packages, and linked servers — need to be manually created on the warm standby server.
■ When the primary server fails, any transactions made since the last transaction log backup was shipped to the warm standby server may be lost, resulting in data loss. For this reason, log shipping is usually set to occur every few minutes.
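As a quick check related to the first drawback above, a query along these lines (the database name is illustrative) confirms whether a database is in a recovery model that supports log shipping:

-- Log shipping requires the FULL or BULK_LOGGED recovery model;
-- SIMPLE does not allow transaction log backups.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'AdventureWorks';
GO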