CHAPTER 56 SQL Server Disaster Recovery Planning

. exec sp_linkedservers: Returns the list of linked servers defined in the local server.

. exec sp_helplinkedsrvlogin: Provides information about login mappings defined against a specific linked server used for distributed queries and remote stored procedures.

. exec sp_server_info: Returns a list of attribute names and matching values for Microsoft SQL Server.

. exec sp_helpdb dbnamexyz: Provides information about a specified database or all databases. This includes the database allocation names, sizes, and locations.

. exec sp_spaceused: Reports the actual space used by both data and indexes in the current database (dbnamexyz in this example).

    use dbnamexyz
    go
    exec sp_spaceused
    go

. exec sp_configure: Gets the current SQL Server configuration values. Turn on "show advanced options" first so that every option is listed:

    USE master
    EXEC sp_configure 'show advanced options', '1'
    RECONFIGURE
    go
    EXEC sp_configure
    go

    name                                minimum      maximum     config_value run_value
    ----------------------------------- ------------ ----------- ------------ -----------
    access check cache bucket count     0            65536       0            0
    access check cache quota            0            2147483647  0            0
    Ad Hoc Distributed Queries          0            1           0            0
    affinity I/O mask                   -2147483648  2147483647  0            0
    affinity mask                       -2147483648  2147483647  0            0
    affinity64 I/O mask                 -2147483648  2147483647  0            0
    affinity64 mask                     -2147483648  2147483647  0            0
    Agent XPs                           0            1           1            1
    allow updates                       0            1           0            0
    awe enabled                         0            1           0            0
    backup compression default          0            1           0            0
    blocked process threshold (s)       0            86400       0            0
    c2 audit mode                       0            1           0            0
    clr enabled                         0            1           0            0
    common criteria compliance enabled  0            1           0            0
    cost threshold for parallelism      0            32767       5            5
    cross db ownership chaining         0            1           0            0
    cursor threshold                    -1           2147483647  -1           -1
    Database Mail XPs                   0            1           0            0
    default full-text language          0            2147483647  1033         1033
    default language                    0            9999        0            0
    default trace enabled               0            1           1            1
    disallow results from triggers      0            1           0            0
    EKM provider enabled                0            1           0            0
    filestream access level             0            2           2            2
    fill factor (%)                     0            100         0            0
    ft crawl bandwidth (max)            0            32767       100          100
    ft crawl bandwidth (min)            0            32767       0            0
    ft notify bandwidth (max)           0            32767       100          100
    ft notify bandwidth (min)           0            32767       0            0
    index create memory (KB)            704          2147483647  0            0
    in-doubt xact resolution            0            2           0            0
    lightweight pooling                 0            1           0            0
    locks                               5000         2147483647  0            0
    max degree of parallelism           0            64          0            0
    max full-text crawl range           0            256         4            4
    max server memory (MB)              16           2147483647  2147483647   2147483647
    max text repl size (B)              -1           2147483647  65536        65536
    max worker threads                  128          32767       0            0
    media retention                     0            365         0            0
    min memory per query (KB)           512          2147483647  1024         1024
    min server memory (MB)              0            2147483647  0            0
    nested triggers                     0            1           1            1
    network packet size (B)             512          32767       4096         4096
    Ole Automation Procedures           0            1           0            0
    open objects                        0            2147483647  0            0
    optimize for ad hoc workloads       0            1           0            0
    PH timeout (s)                      1            3600        60           60
    precompute rank                     0            1           0            0
    priority boost                      0            1           0            0
    query governor cost limit           0            2147483647  0            0
    query wait (s)                      -1           2147483647  -1           -1
    recovery interval (min)             0            32767       0            0
    remote access                       0            1           1            1
    remote admin connections            0            1           0            0
    remote login timeout (s)            0            2147483647  20           20
    remote proc trans                   0            1           0            0
    remote query timeout (s)            0            2147483647  600          600
    Replication XPs                     0            1           0            0
    scan for startup procs              0            1           0            0
    server trigger recursion            0            1           1            1
    set working set size                0            1           0            0
    show advanced options               0            1           1            1
    SMO and DMO XPs                     0            1           1            1
    SQL Mail XPs                        0            1           0            0
    transform noise words               0            1           0            0
    two digit year cutoff               1753         9999        2049         2049
    user connections                    0            32767       0            0
    user options                        0            32767       0            0
    xp_cmdshell                         0            1           0            0

. Disk configurations, sizes, and current size availability (use standard OS directory listing commands on all disk volumes being used).

. Capture the sa login password and OS administrator password so that anything can be accessed and anything can be installed (or reinstalled).

. Document all contact information for your vendors:

    . Microsoft support services contacts (do you use "Premier Product Support Services"?)
    . Storage vendor contact info
    . Hardware vendor contact info
    . Offsite storage contact info (to get your archived copy fast)
    . Network/telecom contact info
    . Your CTO, CIO, and other senior management contact info

. CD-ROMs available for everything (SQL Server, service packs, operating system, utilities, and so on)

sqldiag.exe

One good way to get a complete environmental picture is to run the sqldiag.exe program provided with SQL Server 2008 on your production box (which you would have to re-create on an alternate site if a disaster occurred). It is located in the Binn directory where all SQL Server executables reside (C:\Program Files\Microsoft SQL Server\100\Tools\Binn). It shows how the server is configured, all hardware and software components (and their versions), memory sizes, CPU types, operating system version and build information, paging file information, environment variables, and so on. If you run this program on your production server periodically, it serves as good environment documentation to supplement your disaster recovery plan.

This utility is also used to capture and diagnose SQL Server-wide issues and has a prompt that you must respond to when re-creating issues on which you want to collect diagnosis information. For the purposes of this chapter, when prompted for the SQLDIAG Collection, you can just terminate that portion by pressing Ctrl+C. Figure 56.12 shows the expected execution DOS window and system information dialog window.

FIGURE 56.12 sqldiag.exe execution.

To run this utility, you open a DOS command prompt and change directory to the SQL Server Binn directory. Then, at the command prompt, you run sqldiag.exe:

    C:\Program Files\Microsoft SQL Server\100\Tools\Binn> sqldiag.exe

The results are written into several text files within the SQLDIAG subdirectory.
Each file contains different types of data about the physical machine (server) that SQL Server is running on and information about each SQL Server instance. The machine (server) information is stored in a file named XYZ_MSINFO32.TXT, where XYZ is the machine name. It contains a verbose snapshot of everything that relates to SQL Server in one way or another, including the hardware configuration, drivers, and so on. This is the metadata and configuration information most tightly coupled to the SQL Server instance. The following is an example of what it contains:

    System Information report written at: 09/11/09 22:13:16
    System Name: DBARCH-LT2
    [System Summary]

    Item                             Value
    OS Name                          Microsoft® Windows Vista™ Home Premium
    Version                          6.0.6001 Service Pack 1 Build 6001
    Other OS Description             Not Available
    OS Manufacturer                  Microsoft Corporation
    System Name                      DBARCH-LT2
    System Manufacturer              Hewlett-Packard
    System Model                     HP G60 Notebook PC
    System Type                      x64-based PC
    Processor                        Pentium(R) Dual-Core CPU T4300 @ 2.10GHz, 2100 Mhz, 2 Core(s), 2 Logical Processor(s)
    BIOS Version/Date                Hewlett-Packard F.3C, 6/23/2009
    SMBIOS Version                   2.4
    Windows Directory                C:\Windows
    System Directory                 C:\Windows\system32
    Boot Device                      \Device\HarddiskVolume1
    Locale                           United States
    Hardware Abstraction Layer       Version = "6.0.6001.18000"
    User Name                        DBARCH-LT2\DBARCH
    Time Zone                        Pacific Daylight Time
    Installed Physical Memory (RAM)  Not Available
    Total Physical Memory            3.90 GB
    Available Physical Memory        1.87 GB
    Total Virtual Memory             8.04 GB
    Available Virtual Memory         5.63 GB
    Page File Space                  4.20 GB
    Page File                        C:\pagefile.sys

and so on.

A separate file is generated for each SQL Server instance you have installed on a server. These files are named XYZ_ABC_sp_sqldiag_Shutdown.OUT, where XYZ is the machine name and ABC is the SQL Server instance name.
This file contains most of the internal SQL Server information regarding how it is configured, including a snapshot of the SQL Server log as this server is operating on this machine. The following example shows this critical information from the DBARCH-LT2_SQL08DE01_sp_sqldiag_Shutdown.OUT file:

    2009-09-07 23:50:21.540 Server Microsoft SQL Server 2008 (SP1) - 10.0.2531.0 (X64)
        Mar 29 2009 10:11:52
        Copyright (c) 1988-2008 Microsoft Corporation
        Developer Edition (64-bit) on Windows NT 6.0 <X64> (Build 6001: Service Pack 1)
    2009-09-07 23:50:21.560 Server (c) 2005 Microsoft Corporation.
    2009-09-07 23:50:21.560 Server All rights reserved.
    2009-09-07 23:50:21.560 Server Server process ID is 1884.
    2009-09-07 23:50:21.560 Server Logging SQL Server messages in file
        'C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\Log\ERRORLOG'.
    2009-09-07 23:50:21.570 Server Registry startup parameters:
        -d C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\DATA\master.mdf
        -e C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\Log\ERRORLOG
        -l C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\DATA\mastlog.ldf
    2009-09-07 23:50:21.610 Server Detected 2 CPUs. This is an informational message; no user action is required.
    2009-09-07 23:50:21.910 Server Using dynamic lock allocation. Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node. This is an informational message only. No user action is required.
    2009-09-07 23:50:23.050 spid7s FILESTREAM: effective level = 3, configured level = 3, file system access share name = 'SQL08DE01'.
    2009-09-07 23:50:23.820 spid7s Server name is 'DBARCH-LT2\SQL08DE01'. This is an informational message only. No user action is required.

From this output, you are able to ascertain the complete SQL Server instance information as it was running on the primary site. It is excellent documentation for your SQL Server implementation.
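
Much of the same instance-level detail can also be pulled on demand with T-SQL and filed alongside the sqldiag.exe output. The following is a minimal sketch, assuming a SQL Server 2008 instance and a login with enough rights to see all database files (for example, a sysadmin login). The column aliases are illustrative only and are not part of any standard report.

```sql
-- Sketch only: column aliases are illustrative, not a standard report layout.

-- Instance identity and version, comparable to the header of the sqldiag output.
SELECT
    SERVERPROPERTY('ServerName')     AS server_name,
    SERVERPROPERTY('MachineName')    AS machine_name,
    SERVERPROPERTY('InstanceName')   AS instance_name,   -- NULL for a default instance
    SERVERPROPERTY('ProductVersion') AS product_version,
    SERVERPROPERTY('ProductLevel')   AS product_level,
    SERVERPROPERTY('Edition')        AS edition,
    SERVERPROPERTY('Collation')      AS server_collation;
GO

-- Physical file layout for every database on the instance.
-- size is stored in 8-KB pages, so size * 8 / 1024 gives whole megabytes.
SELECT
    DB_NAME(database_id)               AS database_name,
    name                               AS logical_name,
    type_desc,
    physical_name,
    CAST(size AS bigint) * 8 / 1024    AS size_mb
FROM sys.master_files
ORDER BY database_name, type_desc;
GO
```

Saving the result sets to a dated text file gives you a compact record of file locations and versions to rebuild against at the alternate site.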
We suggest that you run sqldiag.exe regularly and compare the output with that of prior executions so that you know exactly what you must have in place in case of a disaster.

Planning and Executing a Disaster Recovery

The process of planning and executing a complete disaster recovery is serious business, and many companies around the globe set aside a few days a year to perform this exact task. Here's what it involves:

. Simulate a disaster.

. Record all actions taken.

. Time all events from start to finish. Sometimes this means someone is standing around with a stopwatch.

. Hold a postmortem following the DR simulation.

Many companies tie the results of a DR simulation to the IT group's salaries (their raise percentage). This is more than enough motivation for IT to get this drill right and to perform well. Correcting any failures or issues that occur is critical. The next time might not be a simulation.

Have You Detached a Database Recently?

We suggest that you consider all methods of backup and recovery when dealing with DR. Another crude but extremely powerful method for creating a snapshot of a database (for any purpose, even for backup and recovery) is to simply detach the database and attach it in another location, which can be pretty much anywhere. The database is unavailable while you detach it, compress the database files (.mdf and .ldf), transfer these files (or a single zipped file) from one location to another, uncompress them, and finally attach the database (which takes seconds). All in all, it is a very reliable way to move an entire database from one place to another. This approach is crude, but fairly fast and extremely safe. To give you an example of what it takes, a database that is about 30GB can be detached, compressed, moved to another server across a network (with a 1GB backbone), uncompressed, and attached in about 10 minutes.
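
The detach and attach steps described above can be sketched in T-SQL as follows. This is a minimal outline under stated assumptions, not a complete procedure: dbnamexyz is the placeholder database name used earlier in this chapter, and the D:\SQLData and D:\SQLLogs paths and file names are hypothetical. Compressing and copying the files happens outside SQL Server, with whatever OS tools you prefer.

```sql
-- On the SOURCE instance.
-- dbnamexyz is the chapter's placeholder name; substitute your own database.
USE master;
GO

-- Disconnect other users so the detach can get exclusive access.
-- ROLLBACK IMMEDIATE rolls back any open transactions.
ALTER DATABASE dbnamexyz SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO

EXEC sp_detach_db @dbname = N'dbnamexyz';
GO

-- Outside SQL Server: compress the .mdf and .ldf files, copy them to the
-- target server, and uncompress them there.

-- On the TARGET instance.
-- The file paths below are hypothetical; point them at wherever you placed the files.
CREATE DATABASE dbnamexyz
ON (FILENAME = N'D:\SQLData\dbnamexyz.mdf'),
   (FILENAME = N'D:\SQLLogs\dbnamexyz_log.ldf')
FOR ATTACH;
GO
```

If the source database has to come back online afterward, attach it again on the source server with the same CREATE DATABASE ... FOR ATTACH statement pointed at the original file locations. A database that was detached while in single-user mode may need ALTER DATABASE dbnamexyz SET MULTI_USER after it is attached.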
You should make sure your administrators know they can do this in a pinch.

Third-Party Disaster Recovery Alternatives

Third-party alternatives to the replication, mirroring, and synchronization approaches for supporting disaster recovery are fairly prevalent. Symantec and a handful of other companies lead the way with very viable, but often expensive, solutions. However, many are bundled with their disk subsystems (which makes them easy to use and manage out of the box). Following are some very strong solutions:

. Symantec: The Symantec replication solutions, including Veritas Storage Replicator and Veritas Volume Replicator, can create duplicate copies of data across any distance for data protection. These are certified with SQL Server. See www.symantec.com.

. SteelEye Technologies: The SteelEye LifeKeeper family of data replication, high-availability clustering, and disaster recovery products is for Linux and Windows environments. They are all certified solutions (on a variety of other vendor products) across a wide range of applications and databases running on Windows and Linux, including mySAP, Exchange, Oracle, DB2, and SQL Server. See www.steeleye.com.

. EMC: EMC Corporation provides cost-effective, continuous remote replication and continuous data protection via tools such as AutoStart, MirrorView, Open Migrator/LM, Replication Manager, and RepliStor. The Legato AA family of products includes capabilities required to manage systems performance and to automate recovery from failures. Legato AA also automates data mirroring and replication to enable data consolidation, migration, distribution, and preservation through failures and disasters. See www.emc.com.

Our recommendation is that if you are already a customer of one of these vendors, you should look closely at these solutions because they may be available with a product you are already using.

Summary

Perhaps thousands of considerations must be dealt with when you are building a viable production implementation, let alone one that needs to have disaster recovery built in. You would be well advised to make the extra effort of first determining which disaster recovery solution matches your company's needs and only then switching focus to the most effective way to implement that chosen solution. If, for example, you choose data replication to support your DR needs, you must determine the right type of replication model to use (such as central publisher or peer-to-peer), what the limitations might be, the failover process that needs to be devised, and so on. Understanding other characteristics of your DR needs, such as which applications or databases are tightly coupled to your most important revenue-generating applications, is paramount.

Disaster recovery planning is important, but testing the DR solution to make sure it works is even more important. You don't want to test your DR solution for the first time when your primary site has actually failed.

You need to set some short-term, attainable goals for getting to DR Level 1. This gets you to a basic level of protection (mitigating some of the risk from a disaster). Then you can start pushing upward to Level 2 and beyond to create the highest DR capability possible within your budget and capabilities.