CHAPTER 38 Database Design and Performance

Cache Performance

One of the reasons SANs can offer superior performance to locally attached storage is that they are typically configured with a significant amount of cache space. This is normally a good thing. However, because the SAN provides storage services to multiple servers, the available cache space is shared as well. If there is significant activity against the SAN, there can be extensive cache turnover. This means that the large cache space may not always be available to SQL Server, so some of the performance gains provided by the large cache are not realized.

NOTE
Cache turnover in a SAN can lead to widely varying physical I/O response times. When SQL Server performs I/O against the SAN, it is considered a physical I/O whether or not the data resides in the SAN cache. When physical I/O performance is measured for SQL Server, it can be orders of magnitude faster when the data resides in the SAN cache than when the data has to be physically read from the disks in the SAN. It is important that you perform benchmarking with your SAN vendor to ensure that your SAN cache is adequate to provide optimal database performance.

Avoid Disk Drive Contention

SAN storage is divided into LUNs. Servers attached to the SAN recognize one or more of these units as a disk partition or drive. However, these LUNs may share the same disk drives. For example, consider six 100GB drives in the SAN. Theoretically, these could be divided into two LUNs of 300GB each. Although each LUN may be allocated to a different SQL Server, any drives shared between the two LUNs receive I/O from both servers and could experience up to twice the I/O they would see if they were dedicated to a single server. To avoid this situation, most SANs support zoning, which allows the SAN administrator to dedicate entire disks in the SAN to your LUN so that the I/O on those drives is isolated to your SQL Server.
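The effect of sharing drives between LUNs can be sketched with a small Python model of the six-drive example. This is a simplified illustration, not a SAN sizing tool: it assumes each LUN spreads its I/O evenly across the drives it occupies, and the LUN names and IOPS figures are hypothetical values chosen only to make the arithmetic visible.

```python
def per_drive_iops(layout, lun_iops):
    """Total IOPS landing on each drive.

    layout   -- {lun_name: [ids of the drives the LUN is striped across]}
    lun_iops -- {lun_name: IOPS issued against that LUN}
    Simplifying assumption: a LUN spreads its I/O evenly over its drives.
    """
    load = {}
    for lun, drives in layout.items():
        share = lun_iops[lun] / len(drives)
        for drive in drives:
            load[drive] = load.get(drive, 0.0) + share
    return load


def foreign_iops(layout, lun_iops, lun):
    """Worst-case IOPS from other LUNs on any drive that backs `lun`."""
    others = {name: drives for name, drives in layout.items() if name != lun}
    other_load = per_drive_iops(others, lun_iops)
    return max(other_load.get(drive, 0.0) for drive in layout[lun])


# Six 100GB drives carved into two 300GB LUNs.
# "shared": both LUNs are spread over all six drives.
# "dedicated": each LUN owns three whole drives.
shared = {"sql_lun": [0, 1, 2, 3, 4, 5], "other_lun": [0, 1, 2, 3, 4, 5]}
dedicated = {"sql_lun": [0, 1, 2], "other_lun": [3, 4, 5]}

# Hypothetical workloads: the other server is three times busier.
workload = {"sql_lun": 600, "other_lun": 1800}

if __name__ == "__main__":
    for name, layout in (("shared", shared), ("dedicated", dedicated)):
        print(name, per_drive_iops(layout, workload),
              "foreign IOPS on SQL Server drives:",
              foreign_iops(layout, workload, "sql_lun"))
```

In the shared layout, every drive behind the SQL Server LUN also carries the other server's I/O, so a burst on that server shows up directly as extra load on your drives. In the dedicated layout, the SQL Server drives see only SQL Server I/O. Under these assumptions the model says nothing about cache effects, which is why benchmarking on the real SAN is still needed.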
In addition, you should try to ensure that your database log files are on a LUN consisting of dedicated drives, separate from the LUN (or LUNs) used for your SQL Server data files. Log files are typically written sequentially, unlike data files, where access tends to consist more of random reads and writes. Sharing a LUN between data files and log files generally does not provide optimal I/O performance. Unfortunately, your SAN administrator may not permit you to dedicate a separate disk or set of disks to your log files. An alternative may be to place your log files on a local RAID 1 or RAID 10 array. However, you might want to benchmark to determine which solution provides better performance because the caching capabilities of the SAN may offset the potential drive contention in the SAN.

Additional SAN Performance Considerations

Some SAN administrators may attempt to convince you to use RAID 5 for all data and log files. Before following their advice, you should benchmark the system using a representative load to ensure that RAID 5 will offer the best performance for your log files, tempdb, and any write-intensive filegroups.

You should also ensure that the hardware your SQL Server system uses to connect to the SAN provides optimal performance. Make sure that you have the correct and most up-to-date drivers for your SAN components. If you can, consider using multiple high-speed host bus adapters (HBAs) to connect your servers to your SAN to avoid the I/O contention that can occur with a single HBA. If you do use multiple HBAs, try to ensure that they are on different buses to prevent bus saturation and that they are plugged into the PCI slots offering the highest speed.

SANs are complex, and delivering optimal performance for a SQL Server solution using a SAN is challenging. Benchmark your SQL Server to determine whether bottlenecks exist with your SAN.
Be willing to work with your SAN administrator or vendor to fine-tune your SAN configuration, and carefully consider and benchmark any recommendations they make to ensure optimal performance.

Summary

A good database design is the best place to start to ensure that your database application runs smoothly. This chapter outlined some of the fundamental aspects of database design that you should consider. If you have the luxury of designing the database system from the ground up, be sure to use what you learned in this chapter in the early stages of developing your database system. If you inherit a database with an inadequate design, the design principles described in this chapter still apply, but they may be a bit harder to implement.

The next chapter, “Monitoring SQL Server Performance,” delves into the tools and techniques you can use to evaluate the performance of your SQL Server instance. Monitoring is tightly linked to database design and is essential in achieving optimal database performance.

CHAPTER 39
Monitoring SQL Server Performance

IN THIS CHAPTER
• What’s New in Monitoring SQL Server Performance
• Performance Monitoring Tools
• A Performance Monitoring Approach

No SQL Server implementation is perfect out of the box. As you build and add SQL Server–based applications to your server, you should take an active approach to monitoring performance. You also need to keep reevaluating as more and more load is placed on your servers and data volumes grow. This chapter focuses on SQL Server monitoring and leaves monitoring of the other types of servers (including application servers, backup servers, domain controllers, file and print servers, mail/messaging servers, and web servers) to the specialists in those areas.

You can monitor many things on your SQL Server platform, ranging from physical and logical I/O to the network packets passing between the server and your client applications.
To make this monitoring task a little cleaner, this chapter classifies the key monitoring elements into network, processors, memory/cache, and disk systems. Figure 39.1 shows how these key elements interrelate with SQL Server 2008 and Windows. The aspect of utilization, whether CPU utilization, memory utilization, or something else, is at the center of most of the discussions in this chapter. The important concept to remember is how to monitor or measure utilization and how to make changes to improve it, because you do not live in a perfect world of infinite CPU power, infinite disk space, infinite network load capability, and infinite memory.

FIGURE 39.1 Key elements of SQL Server 2008 performance monitoring: network, processors, memory/cache, and disks.

It is essential that you know which tools you can use to get this valuable information. These tools include SQL Server Management Studio (SSMS) Activity Monitor, Data Collector, Extended Events, Windows Performance Monitor and its various counters, a few SQL Server DBCC options, SQL Server Profiler, and a variety of SQL Server dynamic management views (DMVs). Many third-party products are also available for performance monitoring, some of which do a fantastic job of gathering and aggregating performance data from a number of sources, but there is not enough space in this chapter to cover all of them and their features. Instead, this chapter focuses on the performance monitoring tools provided out of the box with SQL Server 2008.

What’s New in Monitoring SQL Server Performance

Performance tuning and troubleshooting are time-consuming tasks for the administrator. To help provide insights quickly into performance issues, SQL Server 2008 provides a number of new and enhanced features for monitoring SQL Server performance.
In SQL Server 2005, the Activity Monitor in SSMS showed only the currently running processes and the locks currently being held in the system. In SQL Server 2008, the Activity Monitor has been enhanced and now provides a graphical overview of SQL Server activity, as well as information on active user tasks, resource waits, data file I/O, and recent expensive queries. Unfortunately, the new information and fancier interface come with the loss of the useful lock monitor that was available in the SQL Server 2005 Activity Monitor.

SQL Server 2008 also introduced a new performance monitoring tool called the Data Collector. The Data Collector gathers performance data from multiple sources and stores the data in the management data warehouse (MDW). The MDW is simply a database set up within a SQL Server instance where the data is collected for subsequent viewing and reporting.

SQL Server Extended Events is a general event-handling system for the server. The Extended Events infrastructure is a lightweight mechanism that supports capturing, filtering, and acting on events generated by the server process. Extended Events is designed to be a foundation that users can configure to monitor and capture different types of data, including performance data. It is a flexible and powerful way to obtain fine-grained, low-level information about the server system. Events can be used to diagnose runtime problems by adding contextual data, such as Transact-SQL (T-SQL) call stacks or query plan handles, to any event. Events can be captured into several different output types, including Event Tracing for Windows (ETW), which enables you to correlate events with operating system and application events and performance counters.

SQL Server 2008 also introduces a number of new dynamic management views to simplify retrieval of information that can be helpful with memory troubleshooting.
These new DMVs are described in more detail later in this chapter.

Performance Monitoring Tools

In prior versions of SQL Server, the tools available for monitoring SQL Server performance were somewhat limited. Yes, you had the Windows Performance Monitor, Activity Monitor, SQL Server Profiler, and SQL Trace, but performing in-depth performance monitoring usually required the purchase of third-party tools to collect, monitor, and view performance information in a useful way.

SQL Server 2008 provides a number of tools you can use to collect, analyze, monitor, and report performance-related data. The usual old-timers such as SQL Server Profiler and Database Engine Tuning Advisor still exist and are available to you, but SQL Server 2008 also includes a new Activity Monitor, the Data Collector and management data warehouse, SQL Utility, and SQL Server Extended Events.

NOTE
For a discussion on using SQL Server Profiler for monitoring and analyzing performance, see Chapter 6, “SQL Server Profiler.” For more information on the Database Engine Tuning Advisor, see Chapter 55, “Configuring, Tuning, and Optimizing SQL Server Options.” In addition, the Activity Monitor is already covered in detail in Chapter 4, “SQL Server Management Studio,” so detailed information on Activity Monitor is not provided in this chapter.

The Data Collector and the MDW

As mentioned previously, SQL Server 2008 introduces a new performance monitoring tool called the Data Collector. The Data Collector is designed to collect performance-related data from multiple sources on one or more SQL Servers, store it in a central data warehouse, and present the data through reports in SQL Server Management Studio. The main purpose of the Data Collector is to provide an easy way to automate the collection of critical performance data. The Data Collector gathers information from Windows performance counters, snapshots of data grabbed from dynamic management views, and details on disk utilization.
Data collection can be configured to run continuously or on a user-defined schedule. You can adjust the scope of data collection to suit the needs of your test and production environments. The Data Collector provides a single central point for data collection across your database servers and applications and, unlike SQL Trace, is not limited to collecting performance data only.

The Data Collector feature consists of the following components:

• Data collection sets: These are the definitions and scheduled jobs for collecting performance data. They are stored in the msdb system database.
• The Data Collector runtime component: This standalone process, called Dcexec.exe, is responsible for loading and executing the SSIS packages that are part of a collection set.
• SQL Server Integration Services (SSIS) packages: These packages are used to collect and upload the data.
• The management data warehouse database: This is a relational database where the collected data is stored. It also contains the views and stored procedures needed for collection management.
• MDW Reports: These reports are built in to SSMS for viewing the collected performance data.

Figure 39.2 provides an overview of the Data Collector architecture and how the various components interact.

FIGURE 39.2 Data Collector architecture. (Diagram labels: Monitored SQL Server, SQL Agent, SSIS, Data Collector Runtime, Data Collection Sets and Items, SSIS Pack, Job Definitions, Audit and History, msdb, Temporary File Cache, MDW, Client Workstation, SQL Server Mgmt Studio, SSMS Reports.)

NOTE
The Data Collector is not a zero-impact monitoring solution. It incurs approximately a 2% to 5% performance hit on the servers where it is collecting data. This performance hit is mainly on the CPU.

Data Collection Sets

A data collection set is a group of collection items.
A collection set is the unit of data collection that a user can interact with through the user interface. Data collection sets are defined and deployed on a SQL Server 2008 instance and can be run independently of each other. Each collection set is run by a SQL Server Agent job or jobs, and data is uploaded to the management data warehouse on a predefined schedule.

Out of the box, SQL Server 2008 provides the following built-in system data collection sets and reports:

• Disk Usage: Collects local disk usage information for all the databases of the SQL Server instance. This information can help you determine current space utilization and future disk space requirements for disk capacity planning.
• Server Activity: Collects SQL Server instance-level resource usage information such as CPU, memory, and I/O. This information can help you monitor short-term to long-term resource usage trends and identify potential resource bottlenecks on the system. It can also be used for resource capacity planning.
• Query Statistics: Collects individual statement-level query statistics, including query text and query plans. This information can help you identify the top resource-consuming queries that may need performance tuning.

The definitions of the system collection sets cannot be modified. However, you can define your own collection sets or define your own custom reports for this data.

Data Collector Runtime Component

The Data Collector runtime component is invoked by a standalone process called Dcexec.exe. This component manages data collection based on the definitions provided in a collection set. The Data Collector runtime component is responsible for loading and executing the SSIS packages that are part of a collection set.

A collection set can be run in one of the following collection and upload modes:

• Noncached mode: Data collection and upload are executed on the same schedule. The packages collect data as scheduled and then immediately upload the data.
• Cached mode: Data collection and upload are performed on different schedules. The collection package continues to collect and cache data until stopped. Data is uploaded from the local cache according to the schedule specified by the user.

NOTE
The Data Collector runtime component can perform only data collection or data upload. It cannot run these tasks concurrently.

SSIS Packages

The Data Collector is implemented as SSIS packages that are invoked by the Data Collector runtime component. These packages can be configured to run manually, continuously, or on a schedule as SQL Server Agent jobs to periodically collect and upload data to the management data warehouse.

The two most important tasks for the SSIS packages are data collection and data upload. These tasks are carried out by separate packages. A collection package gathers data from a data provider and keeps it in temporary storage. An upload package reads the data in temporary storage, processes the data as required (for example, removing unnecessary data points, normalizing the data, and aggregating it), and then uploads the data to the management data warehouse. The upload is done as a bulk insert to minimize the impact on server performance.

The separation of data collection and data upload into separate packages provides more flexibility and efficiency. This design supports scenarios in which snapshots of the data are captured at frequent intervals (for example, every 15 seconds), but the collected data needs to be uploaded only every hour. Data collection and upload frequency should be determined by the monitoring requirements of a particular SQL Server installation.

The Management Data Warehouse

The management data warehouse is a relational database where the Data Collector stores its data. A single MDW database can serve as the central repository for data collectors running on one or more target SQL Server instances.
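The cached and noncached modes described earlier determine how much data waits on the target server before it reaches the warehouse. The following Python sketch models only that scheduling arithmetic. It does not call any Data Collector API; the function name and the one-second tick are assumptions made for the illustration.

```python
def upload_batches(collect_every_s, upload_every_s, duration_s):
    """Number of snapshots flushed to the warehouse at each upload.

    A toy model with one-second ticks: a snapshot is added to the local
    cache every `collect_every_s` seconds, and the cache is emptied every
    `upload_every_s` seconds. Collection and upload happen one after the
    other within a tick, never at the same time.
    """
    if collect_every_s <= 0 or upload_every_s <= 0:
        raise ValueError("intervals must be positive")
    cached = 0
    batches = []
    for tick in range(1, duration_s + 1):
        if tick % collect_every_s == 0:
            cached += 1              # collection package stores a snapshot
        if tick % upload_every_s == 0:
            batches.append(cached)   # upload package flushes the cache
            cached = 0
    return batches


if __name__ == "__main__":
    # Cached mode: collect every 15 seconds, upload hourly, for two hours.
    print(upload_batches(15, 3600, 7200))
    # Noncached mode: collect and upload on the same 15-minute schedule.
    print(upload_batches(900, 900, 3600))
```

With a 15-second collection interval and an hourly upload, each upload carries 240 snapshots, which is the data that sits in the temporary cache between uploads. In the noncached case, each upload carries a single snapshot.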
A data collector is configured on each target server, and it collects and uploads data to the MDW database, which may be on a remote server. Between the time the data is captured and the time it is uploaded, the Data Collector may write temporary data into cache files on the target server.

NOTE
You can install the MDW on the same instance of SQL Server that is running the Data Collector. However, if server resources or performance are an issue on the server that is being monitored, you might want to install the management data warehouse on a different computer to avoid additional CPU and I/O contention.

The MDW can become quite large, growing at approximately 250–500MB per day. That works out to roughly 2GB to 3.5GB of database storage per server each week. You need to decide how long you want to retain the data based on your performance monitoring needs and your storage availability. For the most part, you can probably stick with the default retention settings, which are 14 days for the Query Statistics and Server Activity History data collections and two years for the Disk Usage Summary collections.

The required schemas and the objects to support the predefined system collection sets are created when you run the wizard to create the MDW. Two schemas are created: core and snapshots. The core schema describes the tables, stored procedures, and views used to organize and identify collected data. These tables are shared among all the data tables created for individual collector types. The snapshots schema describes the objects needed to store and maintain the data collected by the collector types that are provided. A third schema, custom_snapshots, is created if you create your own user-defined collection sets that include collection items that use the Generic T-SQL Query collector type.

CAUTION
You should not directly modify any data stored in the management data warehouse.
Changing the data that you have collected invalidates it. Also, instead of directly accessing the MDW tables, you should always use the documented stored procedures and functions provided with the Data Collector to access instance and application data.

MDW Reports

The MDW reports included in SSMS present the information gathered by the Data Collector in the following areas:

• Query performance statistics and use of indexes
• Server activity information, including waiting processes, memory usage, CPU/scheduler usage, and disk I/O
• Disk usage information

Each of the reports presents a summary of the data at a high level, with the capability to drill down into the details. Sometimes the reports can provide information to help direct you to a solution for a performance problem. For example, if the query performance statistics report shows an extremely slow-running query, you can drill down through the report to expose more details on the query, right down to the query plan. The query plan could indicate that there is a missing index on a table, and creating that index could make a major difference in the query performance.

Installing and Configuring the Data Collector

Before you can use the Data Collector, you must complete the following tasks:

• Create logins and map them to Data Collector roles.
• Configure the management data warehouse.
NOTE The management data warehouse can be installed only on a server running SQL Server 2008 or SQL Server 2008 R2.