
SQL Server Tacklebox - P20


Common causes of space issues

The following issues are among the most common causes of database space-related sorrow:

• Poorly configured Model database – subsequent databases adopt properties (AutoGrowth, Recovery Model, and so on) that are inappropriate for their intended use.

• Inefficient Delete, Insert or Bulk Insert statements – such processes, plus those that create temp tables, can very quickly fill the log file with unnecessary data. The situation is exacerbated by an incorrectly configured Model database.

• Indexes and large row counts – clustered indexes can take up a lot of space for tables that contain millions of rows of data. However, you simply need to plan for this, because the consequences of not having these indexes can severely impact performance.

• Blatant misuse of TempDB – temporary tables often play an important role when developers are tasked with comparing millions of rows of data to return a small subset of results. This practice can have unwanted consequences, such as inadvertently filling the TempDB database. It is our job, as DBAs, to make sure this does not happen, often by performing a code review and offering an alternative solution.

Over the coming sections, I am going to delve into each of these issues and discuss the techniques I have used to analyze and, where possible, fix each one. I say "where possible" because sometimes data growth really does exceed all expectation and confounds even the most rigorous capacity planning. The only course of action in such cases is to expand disks or add SAN space, things only peripherally known to many DBAs.

I want to stress that this chapter is not going to shine a light on SQL Server internals. I will not be taking you on a journey to the heart of the database engine to explore the esoteric concepts of leaf-level storage. Every DBA needs to understand where and how objects such as tables and indexes use up space on their servers, and to be very familiar with core concepts such as pages, extents, fill factors, and internal and external fragmentation. However, I will leave those details to Books Online. Here, I intend to drive the All Terrain Vehicle of my experience right to the source of the space allocation issues that wreak havoc on the waking and sleeping life of the on-call DBA.

Being a model DBA

This chapter is about space utilization in SQL Server, and there is no better place to begin than with the Model database. The first thing I will say about the Model database is that, if it were up to me, I would rename it. Out of the box, there is nothing "model" about it; it is not a "model" citizen, nor should it be considered a "role model" for other databases. Nevertheless, it is the template upon which all subsequent databases are based, including TempDB. In other words, new databases created on the server will, unless otherwise specified, inherit the configuration settings of the Model database.

The full list of options for the Model database, including their default settings, can be found at http://technet.microsoft.com/en-us/library/ms186388.aspx. The defaults for most of the options are fine for most databases.
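Since every new database silently inherits them, it is worth knowing exactly what your Model database currently specifies before trouble starts. The following query is one way to check; it is a sketch using the standard sys.databases and sys.master_files catalog views, with the 8KB-page counts converted to MB:

SELECT d.name AS database_name,
       d.recovery_model_desc,
       mf.name AS logical_file_name,
       mf.size * 8 / 1024 AS size_mb,        -- size is stored in 8KB pages
       CASE WHEN mf.is_percent_growth = 1
            THEN CAST(mf.growth AS varchar(12)) + '%'
            ELSE CAST(mf.growth * 8 / 1024 AS varchar(12)) + ' MB'
       END AS growth_increment
FROM sys.databases AS d
JOIN sys.master_files AS mf
    ON mf.database_id = d.database_id
WHERE d.name = 'model';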
Most significantly, however, the Model database settings determine the following:

• Autogrowth properties for the data and log files
• Recovery model for the database

The default settings for each of these are definitely not appropriate for all databases, and it's easy for new DBAs, or even us old, haggard DBAs, to forget to check these settings, especially when working with a server configured by a previous DBA.

Beware of default autogrowth and recovery

By default, the data file (modeldev) for the Model database, in both SQL Server 2005 and 2008, is roughly 3MB in size initially, and is set to autogrow in 1MB (1024KB) increments, unrestricted, until the disk is full. The log file has an initial size of 2MB and is set to grow in 10% increments, again until the disk is full. These settings are shown in Figure 4.1.

NOTE
Microsoft SQL Server 2008 Books Online states: "The sizes of these files can vary slightly for different editions of SQL Server." I am using Standard Edition for the examples in this chapter.

In SQL Server storage terms, 1024KB is 128 pages; each page is 8KB. For applications that will potentially load millions of records, growing the data file of a database every 128 pages incurs a large performance hit, given that one of the major bottlenecks of SQL Server is I/O requests.

Figure 4.1: Initial sizes and growth characteristics for the model database data and log files.

Rather than accept these defaults, it is a much better practice to size the data file appropriately at the outset, at say 2G. The same advice applies to the log file. Generally, growth based on a percentage is fine until the file reaches a threshold where the next growth will consume the entire disk. Let's say you had a 40G log file on a 50G drive. Two 10% growths would take it to 48.4G, and the third, needing almost another 5G, would fill the disk; then the alerts go out and you must awake, bleary-eyed, to shrink log files and curse the Model database.

Coupled with the file growth characteristics described above, our databases will also inherit from the default Model database a recovery model of Full. Transactions in the log file of a Full recovery database are only ever removed from the log upon a transaction log backup. This is wonderful for providing point-in-time recovery for business-critical applications that must meet Service Level Agreements (SLAs), but it does mean that, if you do not back up the transaction log, you run the risk of eventually filling your log drive.

If you have a database that is subject to hefty and/or regular (e.g. daily) bulk insert operations, and you are forcing the data file to grow regularly, in small increments, then it's likely that the performance hit will be significant. It is also likely that the size of your log file will increase rapidly, unless you are performing regular transaction log backups.

To find out how significant an impact this can have, let's take a look at an example. I'll create a database called All_Books_Ever_Read, based on a default Model database, and then load several million rows of data into a table in that database, while monitoring file growth and disk I/O activity using Profiler and PerfMon, respectively. Loading this amount of data may sound like an extreme case, but it's actually small fry compared to many enterprise companies, which accumulate, dispense and disperse terabytes of data.
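Before moving on to the example, here is roughly what the remedy described above looks like in T-SQL. This is a sketch only: the 2G data file figure comes from the earlier advice, while the other sizes, and the switch to SIMPLE recovery, are illustrative choices that must be weighed against your own recovery requirements. The demonstration that follows deliberately leaves the defaults in place, to show their cost.

-- Pre-size model's files and replace the tiny/percentage growth
-- increments with fixed-MB growth. Sizes are illustrative.
ALTER DATABASE model
    MODIFY FILE (NAME = modeldev, SIZE = 2GB, FILEGROWTH = 512MB);

ALTER DATABASE model
    MODIFY FILE (NAME = modellog, SIZE = 512MB, FILEGROWTH = 256MB);

-- A policy decision, not a blanket recommendation: if point-in-time
-- recovery is not the right default for new databases on this server,
-- the inherited recovery model can be changed as well.
ALTER DATABASE model SET RECOVERY SIMPLE;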
NOTE
I just happen to own a file, Books-List.txt, that allegedly contains a listing of all books ever read by everyone on the planet Earth, which I'll use to fill the table. Surprisingly, the file is only 33MB. People are just not reading much any more.

The first step is to create the All_Books_Ever_Read database. The initial sizes of the data and log files, and their growth characteristics, will be inherited from the Model database, as described in Figure 4.1. Once I've created the database, I can verify that the initial data (mdf) and log (ldf) file sizes are around 3MB and 2MB respectively, as shown in Figure 4.2.

Figure 4.2: Data and log file sizes prior to data load.

The next step is to back up the database. It's important to realize that, until I have performed a full database backup, the log file will not act like a typical log file in a database set to Full recovery mode. In fact, when there is no full backup of the database, it is not even possible to perform a transaction log backup, as demonstrated in Figure 4.3.

Figure 4.3: Can't back up the log if no full database backup exists.

Until the first full backup of the database is performed, the database acts as if it is in Simple recovery mode, and the transaction log is regularly truncated at checkpoints, so you will not see the full impact of the data load on the size of the log file.

With the database backed up, I need to set up Profiler and PerfMon so that I can monitor the data load. To monitor autogrowth behavior using Profiler, simply start it up, connect to the SQL Server 2008 instance that holds the All_Books_Ever_Read database, and then set up a trace to monitor Data and Log file Auto Grow events, as shown in Figure 4.4.

Figure 4.4: Setting up SQL Server Profiler to capture data and log file growth.

All you have to do then is click "Run". Next, I'll set up PerfMon (Administrative Tools | Performance) in order to monitor disk I/O activity. Click the "+" button in the toolbar of the graph; PerfMon will connect to the local server by default. Select "Physical Disk" as the performance object, as shown in Figure 4.5, select "% Disk Time" as the counter, and click "Add". Then, for the same Physical Disk object, add the "Average Disk Queue Length" and "Current Disk Queue Length" counters. These settings will capture the amount of disk activity, to review after the data load.
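For anyone who prefers to script the database creation and backup steps above, rather than click through the SSMS dialogs, they reduce to a few statements. A minimal sketch; the backup path is illustrative:

-- Create the demo database; it inherits model's sizes, growth settings
-- and Full recovery model.
CREATE DATABASE All_Books_Ever_Read;
GO

-- Verify the inherited file sizes (the size column counts 8KB pages).
SELECT name, size * 8 / 1024 AS size_mb
FROM All_Books_Ever_Read.sys.database_files;

-- With no full backup yet, a log backup fails (the error behind Figure 4.3):
-- BACKUP LOG All_Books_Ever_Read TO DISK = 'C:\Backups\ABER.trn';

-- Take the first full backup. From this point the log stops truncating
-- on checkpoint and accumulates until a log backup clears it.
BACKUP DATABASE All_Books_Ever_Read TO DISK = 'C:\Backups\ABER.bak';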
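Profiler is the tool used here, but the same Auto Grow events are also recorded by the default trace, so they can be queried after the fact. A sketch, assuming the default trace has not been disabled on the server:

-- Event classes 92 and 93 are Data File Auto Grow and Log File Auto Grow.
DECLARE @trace_path nvarchar(260);

SELECT @trace_path = path
FROM sys.traces
WHERE is_default = 1;

SELECT te.name AS event_name,
       t.DatabaseName,
       t.FileName,
       t.StartTime,
       t.Duration / 1000 AS duration_ms       -- Duration is in microseconds
FROM sys.fn_trace_gettable(@trace_path, DEFAULT) AS t
JOIN sys.trace_events AS te
    ON te.trace_event_id = t.EventClass
WHERE t.EventClass IN (92, 93)
ORDER BY t.StartTime DESC;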
