Microsoft SQL Server 2008 R2 Unleashed - P116

CHAPTER 34 Data Structures, Indexes, and Performance

Database Files and Filegroups

SQL Server uses the file location information visible in the sys.master_files catalog view most of the time. However, the Database Engine uses the file location information stored in the primary file to initialize the file location entries in the master database when attaching a database using the CREATE DATABASE statement with either the FOR ATTACH or FOR ATTACH_REBUILD_LOG option.

Every database can have three types of files:

• Primary data file
• Secondary data files
• Log files

In addition, in SQL Server 2008, databases can also have FILESTREAM data files and full-text data files.

TABLE 34.1 The sysfiles Table

Column Name          Description
file_id              A file identification number that is unique within each database
file_guid            GUID for the file
type                 File type (0 = rows [that is, data files], 1 = log, 2 = FILESTREAM, 4 = full-text catalogs prior to SQL Server 2008)
type_desc            Description of the file type (ROWS, LOG, FILESTREAM, FULLTEXT)
data_space_id        0 represents a log file; values > 0 represent the ID of the filegroup the data file belongs to
name                 The logical name of the file
filename             The physical name of the file, including path
state                File state (0 = ONLINE, 1 = RESTORING, 2 = RECOVERING, 3 = RECOVERY_PENDING, 4 = SUSPECT, 6 = OFFLINE, 7 = DEFUNCT)
state_desc           Description of the file state (ONLINE, RESTORING, RECOVERING, RECOVERY_PENDING, SUSPECT, OFFLINE, DEFUNCT)
size                 Current size of the file in 8KB pages
max_size             Maximum file size in 8KB pages
growth               File growth setting (0 = fixed; > 0 = autogrow in units of 8KB pages, or by percentage if is_percent_growth is set to 1)
is_media_read_only   1 = file is on read-only media
is_read_only         1 = file is marked read-only
is_sparse            1 = file is a sparse file
is_percent_growth    1 = growth value of the file is a percentage

Primary Data File

Every database has only one primary database file.
The location of the primary database file is stored in the master database (visible via the physical_name column in the sys.master_files view). When SQL Server opens a database, it looks for this file and then reads from it the information on the other files defined for the database. The file extension for the primary database file defaults to .mdf. The primary database file always belongs to the primary filegroup.

It is often sufficient to have only one database file for storing your tables and indexes (the primary database file). The file can, of course, be created on a RAID partition to help spread I/O. However, if you need finer control over placement of your tables across disks or disk arrays, or if you want to be able to back up only a portion of your database via filegroups, you can create additional, secondary data files for a database.

Secondary Data Files

A database can have any number of secondary files (in reality, the maximum number of files per database is 32,767, but that should be sufficient for most implementations). You can put a secondary file in the default filegroup or in another filegroup defined for the database. Secondary data files have the file extension .ndf by default.

Following are some situations in which the use of secondary database files might be beneficial:

• You want to perform a partial backup. A backup can be performed for the entire database or a subset of the database. The subset is specified as a set of files or filegroups. The partial backup feature is useful for large databases, where it is impractical to back up the entire database. When recovering with partial backups, a transaction log backup must also be available. For more information about backups, see Chapter 14, “Database Backup and Restore.”
• You want more control over placement of database objects. When you create a table or index, you can specify the filegroup in which the object is created.
  This could help you spread I/O by placing your most active tables or indexes on separate filegroups defined on separate disks or disk arrays.
• Creating multiple files on a single disk provides no real performance benefit but could help in recovery. If you have a 90GB database in a single file and have to restore it, you need to have enough disk space available to create a new 90GB file. If you don’t have 90GB of space available on a single disk, you cannot restore the database. On the other hand, if the database was created with three files, each 30GB in size, you are more likely to find three 30GB chunks of space available on your server.

The Log File

Each database must have at least one log file. The log file contains the transaction log records of all changes made in a database (for more information on what is contained in the transaction log, see Chapter 31, “Transaction Management and the Transaction Log”). By default, log files have the file extension .ldf.

A database can have several log files, and each log file can have a maximum size of 2TB. A log file cannot be part of a filegroup. No information other than transaction log records can be written to a log file. For more information on the log file and log file management, see Chapter 31.

File Management

In SQL Server 2008, you can specify that a database file should grow automatically as space is needed. SQL Server can also shrink the size of the database if the space is not needed. You can control whether autogrowth is used, along with the increment by which the file is expanded. The increment can be specified as a fixed number of megabytes or as a percentage of the current size of the file. You can also set a limit on the maximum size of the file or allow it to grow until no more space is available on the disk.
Listing 34.1 provides an example of a database being created with a 10MB growth increment for the first database file, a 20MB growth increment for the second, and a 20% growth increment for the log file.

LISTING 34.1 Creating a Database with Autogrowth

    CREATE DATABASE Customer
    ON
       -- Primary data file: 50MB initially, grows by 10MB up to 100MB
       ( NAME='Customer_Data',
         FILENAME='D:\SQL_data\Customer_Data1.mdf',
         SIZE=50,
         MAXSIZE=100,
         FILEGROWTH=10),
       -- Secondary data file: 100MB initially, grows by 20MB with no size limit
       ( NAME='Customer_Data2',
         FILENAME='E:\SQL_data\Customer_Data2.ndf',
         SIZE=100,
         FILEGROWTH=20)
    LOG ON
       -- Log file: 50MB initially, grows by 20% of its current size
       ( NAME='Customer_Log',
         FILENAME='F:\SQL_data\Customer_Log.ldf',
         SIZE=50,
         FILEGROWTH=20%)
    GO

The Customer_Data file has an initial size of 50MB, a maximum size of 100MB, and a file growth increment of 10MB. The Customer_Data2 file has an initial size of 100MB, has a file growth increment of 20MB, and can grow until the E: disk partition is full.

The transaction log has an initial size of 50MB; the file increases by 20% with each file growth. The increment is based on the current file size, not the size originally specified.

When creating or expanding data files in SQL Server 2008, SQL Server uses fast file initialization (also known as instant file initialization), provided the SQL Server service account has been granted the Windows Perform Volume Maintenance Tasks right. This allows for the fast execution of file creation and growth. With fast file initialization, the space is added to the data file immediately, but without initializing the logical pages in the data file with zeros. The existing disk content in the data file is not overwritten until new data is written to the files. This provides a huge performance advantage when a data file autogrows while an application is attempting to write data to the database. The application does not need to wait until the space is initialized; it can begin writing to the database immediately.

SQL Server also provides an option to autoshrink databases as well as manually shrink databases. However, shrinking a database is a resource-intensive process and should be done only if it is absolutely imperative to reclaim disk space.
Also, if a data file is constantly shrinking and growing, it can lead to excessive file fragmentation at the file system level as well as excessive logical fragmentation within the file, both of which can lead to poor I/O performance.

Using Filegroups

All databases have a primary filegroup that contains the primary data file. There can be only one primary filegroup. If you don’t create any other filegroups, or change the default filegroup to a filegroup other than the primary filegroup, all files are placed in the primary filegroup unless you specifically place them in another filegroup.

In addition to the primary filegroup, you can add one or more filegroups to the database, and a filegroup can contain one or more files. The main purpose of using filegroups is to provide more control over the placement of files and data on your server. When you create a table or index, you can map it to a specific filegroup, thus controlling the placement of data. A typical SQL Server database installation generally uses a single RAID array to spread I/O across disks and creates all files in the primary filegroup; more advanced installations, or installations with very large databases spread across multiple array sets, can benefit from the finer level of control over file and data placement afforded by additional filegroups.

For example, for a simple database such as AdventureWorks, you can create just one primary file that contains all data and objects and a log file that contains the transaction log information. For a larger and more complex database, such as a securities trading system where large data volumes and strict performance criteria are the norm, you might create the database with one primary file and four additional secondary files. You can then set up filegroups so you can place the data and objects within the database across all five files.
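
As a rough illustration of the five-file layout just described, the following sketch creates one primary file and four secondary files split across two user-defined filegroups. The database name (Trading), the filegroup names (TradeData and TradeIndex), the drive letters, paths, and sizes are all hypothetical and chosen only for illustration; they are not taken from the book.

```sql
-- Minimal sketch: one primary file plus four secondary files in two filegroups.
-- All names, paths, and sizes below are illustrative assumptions.
CREATE DATABASE Trading
ON PRIMARY
   ( NAME = 'Trading_Primary',
     FILENAME = 'D:\SQLData\Trading_Primary.mdf',
     SIZE = 100MB, FILEGROWTH = 50MB ),
FILEGROUP TradeData
   ( NAME = 'Trading_Data1',
     FILENAME = 'E:\SQLData\Trading_Data1.ndf',
     SIZE = 500MB, FILEGROWTH = 100MB ),
   ( NAME = 'Trading_Data2',
     FILENAME = 'F:\SQLData\Trading_Data2.ndf',
     SIZE = 500MB, FILEGROWTH = 100MB ),
FILEGROUP TradeIndex
   ( NAME = 'Trading_Index1',
     FILENAME = 'G:\SQLData\Trading_Index1.ndf',
     SIZE = 250MB, FILEGROWTH = 50MB ),
   ( NAME = 'Trading_Index2',
     FILENAME = 'H:\SQLData\Trading_Index2.ndf',
     SIZE = 250MB, FILEGROWTH = 50MB )
LOG ON
   ( NAME = 'Trading_Log',
     FILENAME = 'I:\SQLLogs\Trading_Log.ldf',
     SIZE = 200MB, FILEGROWTH = 100MB );
GO

-- Optionally make TradeData the default filegroup so that new tables and
-- indexes created without an ON clause are placed there instead of PRIMARY.
ALTER DATABASE Trading MODIFY FILEGROUP TradeData DEFAULT;
GO
```

Whether to change the default filegroup is a design choice; leaving PRIMARY as the default and naming a filegroup explicitly in each CREATE TABLE or CREATE INDEX statement works equally well.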
If you have a table that itself needs to be spread across multiple disk arrays for performance reasons, you can place multiple files in a filegroup, each of which resides on a different disk, and create the table on that filegroup. For example, you can create three files (Data1.ndf, Data2.ndf, and Data3.ndf) on three disk arrays, respectively, and then assign them to the filegroup called spread_group. Your table can then be created specifically on the filegroup spread_group. Queries for data from the table are spread across the three disk arrays, thereby improving I/O performance.

If a filegroup contains more than one file, when space is allocated to objects stored in that filegroup, the data is stored proportionally across the files. In other words, if you have one file in a filegroup with twice as much free space as another, the first file has two extents allocated from it for each extent allocated from the second file (extents and space allocation are discussed in more detail later in this chapter).

Listing 34.2 provides an example of using filegroups in a database to control the file placement of the customer_info table.
LISTING 34.2 Using a Filegroup to Control Placement for a Table

    -- Create the database with a primary data file and a log file
    CREATE DATABASE Customer
    ON
       ( NAME='Customer_Data',
         FILENAME='C:\SQLData\Customer_Data1.mdf',
         SIZE=50,
         MAXSIZE=100,
         FILEGROWTH=10)
    LOG ON
       ( NAME='Customer_Log',
         FILENAME='C:\SQLData\Customer_Log.ldf',
         SIZE=50,
         FILEGROWTH=20%)
    GO

    -- Add a new filegroup to the database
    ALTER DATABASE Customer
    ADD FILEGROUP Cust_table
    GO

    -- Add a secondary data file to the new filegroup
    ALTER DATABASE Customer
    ADD FILE
       ( NAME='Customer_Data2',
         FILENAME='G:\SQLData\Customer_Data2.ndf',
         SIZE=100,
         FILEGROWTH=20)
    TO FILEGROUP Cust_Table
    GO

    -- Create the table on the new filegroup
    USE Customer
    CREATE TABLE customer_info
       (cust_no INT,
        cust_address NCHAR(200),
        info NVARCHAR(3000))
    ON Cust_Table
    GO

The CREATE DATABASE statement in Listing 34.2 creates a database with a primary database file and log file. The first ALTER DATABASE statement adds a filegroup. A secondary database file is added with the second ALTER DATABASE command. This file is added to the Cust_Table filegroup. The CREATE TABLE statement creates a table; the ON Cust_Table clause places the table in the Cust_Table filegroup (the Customer_Data2 file on the G: disk partition).

The sys.filegroups system catalog view contains information about the database filegroups defined within a database, as shown in Table 34.2.

TABLE 34.2 The sys.filegroups System Catalog View

Column Name        Description
name               Name of the data space, unique within the database.
data_space_id      Data space ID number, unique within the database.
type               FG = Filegroup.
type_desc          Description of data space type: ROWS_FILEGROUP.
is_default         1 = This is the default data space. The default data space is used when a filegroup or partition scheme is not specified in a CREATE TABLE or CREATE INDEX statement. 0 = This is not the default data space.
filegroup_guid     GUID for the filegroup. NULL = PRIMARY filegroup.
log_filegroup_id   Not used; value is NULL.
is_read_only       1 = Filegroup is read-only. 0 = Filegroup is read/write.
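
As a quick check of the columns described for sys.filegroups, a query along the following lines lists each filegroup in the current database and shows which one is the default. This is only a sketch; it assumes the Customer database from Listing 34.2 exists and selects only columns named in Table 34.2.

```sql
-- Minimal sketch: list the filegroups defined in the Customer database.
USE Customer;
GO

SELECT name,           -- filegroup name
       data_space_id,  -- ID that files reference through data_space_id
       type_desc,      -- ROWS_FILEGROUP for regular filegroups
       is_default,     -- 1 = used when no filegroup is specified for an object
       is_read_only    -- 1 = filegroup is read-only
FROM sys.filegroups
ORDER BY data_space_id;
GO
```

Because Listing 34.2 never changes the default filegroup, PRIMARY should be the row reported with is_default = 1.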
The following statement returns the filename, size in megabytes (not including autogrow), and the name of the filegroup to which each file belongs:

    SELECT convert(varchar(30), sf.name) AS filename,
           size/128 AS size_in_MB,
           convert(varchar(30), sfg.name) AS filegroupname
    FROM sys.database_files sf
         INNER JOIN sys.filegroups sfg
            ON sf.data_space_id = sfg.data_space_id
    GO

    filename          size_in_MB   filegroupname
    Customer_Data             50   PRIMARY
    Customer_Data2           100   Cust_table

FILESTREAM Filegroups

FILESTREAM storage is a new feature in SQL Server 2008 for storing unstructured data, such as documents, images, and videos. It addresses the issues with using unstructured data by integrating the SQL Server Database Engine with the NTFS file system: the unstructured data is stored on the file system, and the database stores a pointer to the data. Although the actual data resides outside the database in the NTFS file system, you can still use Transact-SQL (T-SQL) statements to insert, update, query, and back up FILESTREAM data, while maintaining transactional consistency between the unstructured data and the corresponding structured data with the same level of security.

NOTE
To use FILESTREAM storage, you must first enable FILESTREAM storage at the Windows level as well as at the SQL Server instance level. You can enable FILESTREAM at the Windows level during installation of SQL Server 2008 or at any time using SQL Server Configuration Manager. After you enable FILESTREAM at the Windows level, you next need to enable FILESTREAM for the SQL Server instance. You can do this either through SQL Server Management Studio (SSMS) or via T-SQL.

After you enable FILESTREAM for the SQL Server instance, you can enable it for a database by creating a FILESTREAM filegroup.
You can do this when the database is created (or for an existing database) by adding a filegroup and including the CONTAINS FILESTREAM clause. Unlike regular filegroups, a FILESTREAM filegroup can contain only a single file reference, which is actually a file system folder rather than an actual file. The folder itself must not exist (although the path up to the folder must exist); SQL Server creates the FILESTREAM folder. For example, in Listing 34.3, the code adds a FILESTREAM filegroup called Cust_FSGroup and adds the folder G:\SQLData\custinfo_FS to the filegroup. This custinfo_FS folder is created by SQL Server in the G:\SQLData folder.

LISTING 34.3 Adding a FILESTREAM Filegroup to a Database

    -- Add a filegroup that holds FILESTREAM data
    ALTER DATABASE Customer
    ADD FILEGROUP Cust_FSGroup CONTAINS FILESTREAM

    -- Add the FILESTREAM container (a folder, not a file) to the filegroup
    ALTER DATABASE Customer
    ADD FILE
       ( NAME=custinfo_FS,
         FILENAME = 'G:\SQLData\custinfo_FS')
    TO FILEGROUP Cust_FSGroup
    GO

If you look in the G:\SQLData\custinfo_FS folder, you should see a Filestream.hdr file and an $FSLOG folder. The Filestream.hdr file is a FILESTREAM container header file that should not be moved or modified.

As you can see in the example in Listing 34.3, for FILESTREAM files or filegroups, unlike regular files, you do not specify size or growth information. No space is preallocated. The file and filegroup grow as data is added to tables that have been created with FILESTREAM columns.

As you create tables with FILESTREAM columns, a subfolder is created in the filegroup folder for each table. The subfolder names are GUIDs. Each FILESTREAM column created in the table results in another subfolder created under the table subfolder. The column subfolder name is also a GUID. At this point, there still are no actual files created. That happens after you start adding rows to the table. A file is created in the column subfolder for each row inserted into the table with a non-NULL value for the FILESTREAM column.
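
To make the folder and file behavior described above more concrete, the following sketch creates a table with one FILESTREAM column on the Cust_FSGroup filegroup from Listing 34.3 and inserts a single row. The table name (customer_docs), its columns, and the sample value are hypothetical. The sketch assumes FILESTREAM is already enabled at the Windows and instance levels; a table with a FILESTREAM column also needs a unique, non-NULL UNIQUEIDENTIFIER ROWGUIDCOL column, which is why doc_id is declared that way.

```sql
-- Minimal sketch: a table with a FILESTREAM column (names are illustrative).
USE Customer;
GO

CREATE TABLE customer_docs
(
    -- Required for FILESTREAM tables: a unique, non-NULL ROWGUIDCOL column
    doc_id   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    cust_no  INT NOT NULL,
    -- The FILESTREAM attribute stores this column's data in the file system
    doc      VARBINARY(MAX) FILESTREAM NULL
)
FILESTREAM_ON Cust_FSGroup;
GO

-- Inserting a non-NULL value into the FILESTREAM column should result in
-- one file being created in the column subfolder for this row.
INSERT INTO customer_docs (cust_no, doc)
VALUES (1, CAST('sample document text' AS VARBINARY(MAX)));
GO
```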
For more information on creating and using tables with FILESTREAM columns, see Chapter 42, “What’s New for Transact-SQL in SQL Server 2008.”

Database Pages

All information in SQL Server is stored at the page level. The page is the smallest level of I/O in SQL Server and is the fundamental storage unit. Pages contain the data itself or information about the physical layout of the data. The page size is the same for all page types: 8KB, or 8,192 bytes. The pages are arranged in two basic types of storage structures: linked data pages and index trees.

Databases are divided into logical 8KB pages. Within each file allocated to a database, the pages are numbered contiguously from 0 to n. The actual number of pages in the database file depends on the size of the file. Pages in a database are uniquely referenced by specifying the database ID, the file ID for the file the page resides in, and the page number within the file. When you expand a database with ALTER DATABASE, the new space is added at the end of the file, and the page numbers continue incrementing from the previous last page in the file. If you add a completely new file, its first page number is 0. When you shrink a database, pages are removed from the end of the file only, starting at the highest page in the database and moving toward lower-numbered pages until the database reaches the specified size or a used page that cannot be removed. This ensures that page numbers within a file are always contiguous.
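
The page numbering just described can be related to the file sizes reported by sys.database_files, where the size column is expressed in 8KB pages. The following query is a small sketch of that arithmetic for the data files of the current database; the column aliases are illustrative, and the highest page number is simply derived from the statement that pages are numbered contiguously starting at 0.

```sql
-- Minimal sketch: page counts and page-number ranges for each data file.
SELECT DB_ID()    AS database_id,          -- first part of a page reference
       file_id,                            -- second part of a page reference
       name,
       size       AS pages_in_file,        -- size is stored in 8KB pages
       size - 1   AS highest_page_number,  -- pages are numbered from 0
       size / 128 AS size_in_MB            -- 128 pages x 8KB = 1MB
FROM sys.database_files
WHERE type_desc = 'ROWS';                  -- data files only; excludes the log
```

For the 50MB Customer_Data file from Listing 34.2, for example, this arithmetic gives 6,400 pages, numbered 0 through 6,399.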

Page Types

There are eight page types in SQL Server, as listed in Table 34.3.

TABLE 34.3 Page Types

Page Type                  Stores
Data                       Data rows for all data except text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data
Row Overflow               Data columns that cause a data row to exceed the 8,060-byte row size limit
LOB                        Large object types (text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data, and varchar, nvarchar, varbinary, and sql_variant when the data row size exceeds 8KB)
Index                      Index entries and pointers
Global Allocation Map      Information about allocated (used) extents
Page Free Space            Information about page allocation and free space on pages
Index Allocation Map       Information about extents used by a table or an index
Differential Changed Map   Information about which extents have been modified since the last full database backup
Bulk Changed Map           Information about which extents have been used in a minimally logged or bulk-logged operation since the last BACKUP LOG statement

All pages, regardless of type, have a similar layout. They all have a page header, which is 96 bytes, and a body, which consequently is 8,096 bytes. The page layout is shown in Figure 34.1.

[Figure: an 8KB page (8,192 bytes) made up of a 96-byte header followed by an 8,096-byte body]
FIGURE 34.1 SQL Server page layout.

Data Pages

The actual data rows in tables are stored on data pages. Figure 34.2 shows the basic structure of a data page. The following sections discuss and examine the contents of the data page.

The Page Header

The page header contains control information for the page. Some fields assist when SQL Server checks for consistency among its storage structures, and some fields are used when navigating among the pages that constitute a table. Table 34.4 describes the more useful fields contained in the page header.

[Figure: a data page showing the page header, data rows (Row 0, Row 1, and Row 2 at byte addresses 96, 118, and 140), and the row offset table that maps each row ID to its byte address]
FIGURE 34.2 The structure of a SQL Server data page.
