If SQL Server had to search an entire database file to find free extents, the process wouldn't be efficient. Instead, SQL Server uses two special types of pages to record which extents have been allocated to tables or indexes and whether the allocation is for a mixed or uniform extent:

- Global allocation map pages (GAMs)
- Shared global allocation map pages (SGAMs)

Global and Shared Global Allocation Map Pages

The allocation map pages track whether extents have been allocated to objects and indexes and whether the allocation is for mixed extents or uniform extents. As mentioned in the preceding section, there are two types of allocation map pages:

- Global allocation map (GAM)—The GAM keeps track of all allocated extents in a database, regardless of what they are allocated to. The structure of the GAM is straightforward: each bit in the page outside the page header represents one extent in the file, where 1 means that the extent is not allocated and 0 means that the extent is allocated. Nearly 8,000 bytes (64,000 bits) are available in a GAM page after the header and other overhead bytes are taken into account. Therefore, a single GAM covers approximately 64,000 extents, or 4GB (64,000 * 64KB) of data.

- Shared global allocation map (SGAM)—The SGAM keeps track of mixed extents that have free space available. An SGAM has a structure similar to a GAM, with each bit representing an extent. A value of 1 means that the extent is a mixed extent and there is free space (at least one unused page) available on the extent. A value of 0 means that the extent is not currently allocated, that the extent is a uniform extent, or that the extent is a mixed extent with no free pages.

Table 34.6 summarizes the meaning of the bits in GAMs and SGAMs.

TABLE 34.6 Meaning of the GAM and SGAM Bits

Extent Usage                            GAM Bit   SGAM Bit
Free, not used                          1         0
Uniform or mixed with no free pages     0         0
Mixed, with free pages available        0         1

When SQL Server needs to allocate a uniform extent, it simply searches the GAM for a bit with a value of 1 and sets it to 0 to indicate that the extent has been allocated. To find a mixed extent with free pages, it searches the SGAM for a bit set to 1. When all pages in a mixed extent are used, its corresponding bit is set to 0. When a new mixed extent needs to be allocated, SQL Server searches the GAM for an extent whose bit is set to 1, sets that bit to 0, and sets the corresponding SGAM bit to 1. There is some additional processing involved as well, such as spreading the data evenly across database files, but the allocation algorithms are still relatively simple.

SQL Server is able to easily locate GAM pages in a database because the first GAM page is located at the third page in the file (page number 2). There is another GAM every 511,230 pages after the first GAM. The fourth page (page number 3) in each database file is the SGAM page, and there is another SGAM every 511,230 pages after the first SGAM.
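If you want to look at these allocation pages directly, the undocumented DBCC PAGE command can dump them. The following is a minimal sketch; the database name is simply the sample database used elsewhere in this chapter, and trace flag 3604 sends the output to the client session instead of the error log:

DBCC TRACEON(3604);
DBCC PAGE ('bigpubs2008', 1, 2, 3);   -- file 1, page 2: the first GAM page
DBCC PAGE ('bigpubs2008', 1, 3, 3);   -- file 1, page 3: the first SGAM page
DBCC TRACEOFF(3604);

With the detail level set to 3, the output for an allocation page is presented as ranges of extents and their allocation status rather than as data rows.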
Page Free Space Pages

A page free space (PFS) page records whether each page is allocated and the amount of free space available on the page. Each PFS page covers 8,088 contiguous pages in the file. For each of those 8,088 pages, the PFS has a 1-byte record indicating whether the page is allocated and whether it is empty, 1 to 50% full, 51 to 80% full, 81 to 95% full, or more than 95% full.

The first PFS page in a file is located at page number 1, the second PFS page is located at page 8088, and each additional PFS page is located every 8,088 pages after that. SQL Server uses PFS pages to find free pages on extents and to find pages with space available when a new row needs to be added to a table or index. Figure 34.6 shows the layout of GAM, SGAM, and PFS pages in a database file. Note that every file has a single file header located at page 0.

FIGURE 34.6 The layout of GAM, SGAM, and PFS pages in a database file.

Index Allocation Map Pages

Index allocation map (IAM) pages keep track of the extents used by a heap or index. Each heap table and index has at least one IAM page for each file where it has extents. An IAM cannot reference pages in other database files; if the heap or index spreads to a new database file, a new IAM for the heap or index is created in that file. IAM pages are allocated as needed and are spread randomly throughout the database files.

An IAM page contains a small header that has the address of the first extent in the range of pages being mapped by the IAM. It also contains eight page pointers that keep track of index or heap pages that are in mixed extents. These pointers might or might not contain any information, depending on whether any data has been deleted from the table and the page(s) released. Remember, an index or heap will have no more than eight pages in mixed extents (after eight pages, it begins using uniform extents), so only the first IAM page stores this information.

The remainder of the IAM page is for the allocation bitmap. The IAM bitmap works similarly to the GAM, indicating which extents over the range of extents covered by the IAM are used by the heap or index the IAM belongs to. Each IAM covers a possible range of 63,903 extents (511,224 pages), a 4GB section of a file. Each bit represents an extent within that range, whether or not the extent is allocated to the object that the IAM belongs to. If a bit is set to 1, the corresponding extent in the range is allocated to the index or heap. If a bit is set to 0, the extent is either not allocated or might be allocated to another heap or index.

For example, assume that an IAM page resides at page 649 in the file. If the bit pattern in the first byte of the IAM is 1010 0100, the first, third, and sixth extents within the range of the IAM are allocated to the heap or index. The second, fourth, fifth, seventh, and eighth extents are not.

NOTE

For a heap table, the data pages and rows within them are not stored in any specific order. Unlike versions of SQL Server prior to 7.0, the pages in a heap structure are not linked together in a page chain. The only logical connection between data pages is the information recorded in the IAM pages, which are linked together. The structure of heap tables is examined in more detail later in this chapter.
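To see the IAM pages that belong to a particular heap or index, you can use the undocumented DBCC IND command. The following is a sketch using the sample database and table referenced elsewhere in this chapter:

DBCC TRACEON(3604);
DBCC IND ('bigpubs2008', 'titles', -1);   -- -1 returns pages for all indexes on the table
DBCC TRACEOFF(3604);

In the output, rows with a PageType of 10 are IAM pages, and the IAMFID and IAMPID columns on the remaining rows identify the IAM page that maps each page.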
Differential Changed Map Pages

The seventh page (page number 6), and every 511,232nd page thereafter, in the database file is the differential changed map (DCM) page. This page keeps track of which extents in a file have been modified since the last full database backup. When an extent has been modified, its corresponding bit in the DCM is turned on. This information is used when a differential backup is performed on the database. A differential backup copies only the extents changed since the last full backup was made. Using the DCM, SQL Server can quickly tell which extents need to be backed up by examining the bits on the DCM pages for each data file in the database. When a full backup is performed for the database, all the bits are set back to 0.

Bulk Changed Map Pages

The eighth page (page number 7), and every 511,232nd page thereafter, in the database file is the bulk changed map (BCM) page. When you perform a minimally logged (bulk-logged) operation in SQL Server 2008 under the BULK_LOGGED recovery model, SQL Server logs only the fact that the operation occurred and doesn't log the actual data changes. The operation is still fully recoverable because SQL Server keeps track of which extents were actually modified by the bulk operation in the BCM pages. Similar to the DCM page, each bit on a BCM page represents an extent within its range; if the bit is set to 1, the corresponding extent has been changed by a minimally logged bulk operation since the last full database backup. All the bits on the BCM page are reset to 0 whenever a full database backup or log backup occurs.

When you initiate a log backup for a database using the BULK_LOGGED recovery model, SQL Server scans the BCM pages and backs up all the modified extents along with the contents of the transaction log itself. You should be aware that the log file itself might be small, but the backup of the log can be many times larger if a large bulk operation has been performed since the last log backup.
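The backup commands that read and reset these map pages are the standard ones. The following sketch shows the relationship; the database name and backup file paths are placeholders:

-- A full backup resets the DCM bits for the database.
BACKUP DATABASE bigpubs2008
    TO DISK = 'C:\Backups\bigpubs2008_full.bak';

-- A differential backup reads the DCM pages and copies only the changed extents.
BACKUP DATABASE bigpubs2008
    TO DISK = 'C:\Backups\bigpubs2008_diff.bak'
    WITH DIFFERENTIAL;

-- Under the BULK_LOGGED recovery model, a log backup also reads the BCM pages
-- and copies the extents touched by minimally logged operations.
BACKUP LOG bigpubs2008
    TO DISK = 'C:\Backups\bigpubs2008_log.trn';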
Data Compression

SQL Server 2008 introduced a new data compression feature that is available in the Enterprise and Datacenter Editions. Data compression helps to reduce both storage and memory requirements because the data is compressed both on disk and when brought into the SQL Server data cache. When compression is enabled and data is written to disk, it is compressed and stored in the designated compressed format. When the data is read from disk into the buffer cache, it remains in its compressed format. This also reduces I/O because more data fits on a data page when it's compressed. When the data is passed to another component of SQL Server, however, the Database Engine has to uncompress the data on the fly. In other words, every time data has to be passed to or from the buffer cache, it has to be compressed or uncompressed, which requires extra CPU overhead. However, in most cases, the amount of I/O and buffer cache saved by compression more than makes up for the CPU costs, boosting the overall performance of SQL Server.

Data compression can be applied to the following database objects:

- Tables (clustered or heap)
- Nonclustered indexes
- Indexed views

As the DBA, you need to evaluate which of the preceding objects in your database could benefit from compression and then decide whether to compress them using either row-level or page-level compression. Compression is enabled or disabled at the object level; there is no single option you can enable that turns compression on or off for all objects in the database. Fortunately, other than turning compression on or off for the preceding objects, you don't have to do anything else to use data compression. SQL Server handles data compression transparently, without your having to re-architect your database or your applications.

Row-Level Compression

Row-level compression isn't true data compression. Instead, space savings are achieved by using a more efficient storage format for fixed-length data so that it uses only the minimum amount of space required. For example, the int data type uses 4 bytes of storage regardless of the value stored, even NULL, yet only a single byte is required to store a value of 100. Row-level compression allows fixed-length values to use only the amount of storage space required. Row-level compression saves space and reduces I/O by

- Reducing the amount of metadata required to store data rows
- Storing fixed-length numeric data types as if they were variable-length data types, using only as many bytes as necessary to store the actual value
- Storing CHAR data types as variable-length data types
- Not storing NULL or 0 values

Row-level data compression provides less compression than page-level data compression, but it also incurs less overhead, reducing the amount of CPU resources required to implement it.

Row-level compression can be enabled when creating a table or index, or afterward using the ALTER TABLE or ALTER INDEX commands, by specifying the WITH (DATA_COMPRESSION = ROW) option. The following example enables row compression on the titles table in the bigpubs2008 database:

ALTER TABLE titles REBUILD WITH (DATA_COMPRESSION=ROW)

Additionally, if a table or index is partitioned, you can apply compression at the partition level.
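For example, the following statements are a minimal sketch of partition-level compression; they assume a partitioned version of the sales_big table used later in this chapter (the partition function and scheme are not shown):

-- Row-compress partitions 1 through 3 and leave partition 4 uncompressed.
ALTER TABLE sales_big
REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = ROW ON PARTITIONS (1 TO 3),
      DATA_COMPRESSION = NONE ON PARTITIONS (4));

-- A single partition can also be rebuilt on its own with a new setting.
ALTER TABLE sales_big
REBUILD PARTITION = 4
WITH (DATA_COMPRESSION = ROW);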
When row-level compression is applied to a table, a new row format is used that is unlike the standard data row format discussed previously, which has a fixed-length data section separate from a variable-length data section (see Figure 34.3). This new row format is referred to as column descriptor, or CD, format. The name refers to the fact that every column has description information contained in the row itself. Figure 34.7 illustrates a representative view of the CD format (a definitive view is difficult because, except for the header, the number of bytes in each region is completely dependent on the values in the data row).

FIGURE 34.7 A representative structure of a CD format row: a 1-byte header, followed by the CD region, the short data region, the long data region, and the special information section.

The row header is always 1 byte in length and contains information similar to Status Bits A in a normal data row:

- Bit 0—This bit indicates the type of record (1 = CD record format).
- Bit 1—This bit indicates whether the row contains versioning information.
- Bits 2–4—This three-bit value indicates what kind of information is stored in the row (such as primary record, ghost record, forwarding record, or index record).
- Bit 5—This bit indicates whether the row contains a long data region (with values greater than 8 bytes in length).
- Bits 6 and 7—These bits are not used.

The CD region consists of two parts. The first is either a 1- or 2-byte value indicating the number of short columns (8 bytes or less). If the most significant bit of the first byte is set to 0, it's a 1-byte field representing up to 127 columns; if it's 1, it's a 2-byte field representing up to 32,767 columns. Following the first 1 or 2 bytes is the CD array. The CD array uses 4 bits for each column in the table to represent information about the length of the column. A bit representation of 0 indicates the column is NULL. A bit representation of the values 1 to 9 indicates the column is 0 to 8 bytes in length, respectively. A bit representation of 10 (0xa) indicates that the corresponding column value is a long data value and uses no space in the short data region. A bit representation of 11 (0xb) represents a bit column with a value of 1, and a bit representation of 12 (0xc) indicates that the corresponding value is a 1-byte symbol representing a value in the page compression dictionary (the page compression dictionary is discussed in the page-level compression section that follows).

The short data region contains each of the short data values. However, because accessing the last columns can be expensive if there are hundreds of columns in the table, columns are grouped into clusters of 30 columns. At the beginning of the short data region is an area called the short data cluster array. Each entry in the array is a single byte indicating the sum of the sizes of all the data in the previous cluster in the short data region; the value is essentially a pointer to the first column of the cluster (no row offset is needed for the first cluster because it starts immediately after the CD region).

Any data value in the row longer than 8 bytes is stored in the long data region. This can include LOB and row-overflow pointers. Long data needs an actual offset value to allow SQL Server to locate each value, and this offset array looks similar to the offset array used in the standard data row structure. The long data region consists of three parts: an offset array, a long data cluster array, and the long data itself. The long data cluster array is similar to the short data cluster array; it has one entry for each 30-column cluster (except for the last one) and serves to limit the cost of locating columns near the end of a long list of columns.

The special information section at the end of the row contains three optional pieces of information. The existence of any or all of this information is indicated by bits in the 1-byte header at the beginning of the row. The three special pieces of information are

- Forwarding pointer—This pointer is used in a heap when a row is forwarded due to an update (forward pointers are discussed later in this chapter).
- Back pointer—If the row is a forwarded row, it contains a pointer back to the original row location.
- Versioning information—If snapshot isolation is being used, 14 bytes of versioning information are appended to the row.
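Whichever row format is in use, you can check the compression setting currently in effect for each table, index, and partition by querying the catalog views. The following query is a simple sketch based on sys.partitions, which exposes a data_compression_desc column in SQL Server 2008:

-- List the compression setting (NONE, ROW, or PAGE) for every partition
-- of every table and index in the current database.
SELECT OBJECT_NAME(p.object_id) AS table_name,
       i.name AS index_name,
       p.partition_number,
       p.data_compression_desc
FROM sys.partitions AS p
JOIN sys.indexes AS i
  ON i.object_id = p.object_id
 AND i.index_id = p.index_id
ORDER BY table_name, index_name, p.partition_number;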
Page-Level Compression

Page-level compression is an implementation of true data compression, using both column prefix and dictionary-based compression. Data is compressed by storing repeating values or common prefixes only once and then referencing those values from other columns and rows. When you implement page compression for a table, row compression is applied as well. Page-level compression offers increased data compression over row-level compression alone, but at the expense of greater CPU utilization. It works using these techniques:

- First, row-level data compression is applied to fit as many rows as possible on a single page.
- Next, column prefix compression is run. Essentially, repeating patterns of data at the beginning of the values of a given column are removed and substituted with an abbreviated reference, which is stored in the compression information (CI) structure that follows the page header.
- Finally, dictionary compression is applied to the page. Dictionary compression searches for repeated values anywhere on the page and stores them in the CI structure.

Page compression is applied only after a page is full and only if SQL Server determines that compressing the page will save a meaningful amount of space. The amount of compression provided by page-level data compression is highly dependent on the data stored in a table or index. If a lot of the data repeats itself, compression is more efficient. If the data consists of more randomly discrete values, fewer benefits are gained from using page-level compression.

Column prefix compression looks at the column values on a single page and chooses a common prefix that can be used to reduce the storage space required for the values in that column. The longest value in the column that contains the prefix is chosen as the anchor value. A row that represents the prefix values for each column is created and stored in the CI structure that immediately follows the page header. Each column is then stored as a delta from the anchor value, where repeated prefix values in the column are replaced by a reference to the corresponding prefix. If the value in a row does not exactly match the selected prefix value, a partial match can still be indicated.

For example, consider a page that contains the data rows shown in Figure 34.8 before prefix compression.

FIGURE 34.8 Sample page of a table before prefix compression. The page holds three data rows: (aaabb, aaaab, abcd, abc), (aaabccc, bbbbb, abcd, mno), and (aaaccc, aaaacc, bbbb, xyz).

After you apply column prefix compression on the page, the CI structure is stored after the page header, holding the prefix values for each column. The columns then are stored as the difference between the prefix and the column value, as shown in Figure 34.9.

FIGURE 34.9 Sample page of a table after prefix compression.

In the first column of the first data row, the value 4b indicates that the first four characters of the prefix (aaab) are present at the beginning of the column value for that row, followed by the character b. If you append the character b to the first four characters of the prefix, it rebuilds the original value of aaabb. Any column value that is [empty] matches the prefix value exactly. Any column value that starts with 0 means that none of the leading characters of the column match the prefix. For the fourth column, there is no common prefix value in the column, so no prefix value is stored in the CI structure.
The following illustrates the same page shown previously after dictionary compres- sion has been applied: The dictionary is stored as a set of these duplicate values and a symbol to represent these values in the columns on the page. As you can see in this example, 4b is repeated in multiple columns in multiple rows, and the value is replaced by the symbol 0 throughout the page. The value 0bbbb is replaced by the symbol 1. SQL Server recognizes that the value stored in the column is a symbol and not a data value by examining the coding in the CD array, as discussed earlier. Not all pages contain both the prefix record and a dictionary. Having them both depends on whether the data has enough repeating values or patterns to warrant either a prefix record or a dictionary. Data Rows Page Header 0 0 [empty] abcd [empty] 1 [empty] mno 3ccc [empty] 1 xyz aaabccc aaaacc abcd [NULL] 4b 0bbbb FIGURE 34.10 Sample page of a table after dictionary compression. Download from www.wowebook.com ptg 1122 CHAPTER 34 Data Structures, Indexes, and Performance The CI Record The CI record is the only main structural change to a page when it is page compressed versus a page that uses row compression only. As shown in the previous examples, the CI record is located immediately after the page header. There is no entry for the CI record in the row offset table because its location is always the same. A bit is set in the page header to indicate whether the page is page compressed. When this bit is present, SQL Server knows to look for the CI record. The CI record contains the data elements shown in Table 34.7. Implementing Page Compression Page compression can be implemented for a table at the time it is created or by using the ALTER TABLE command, as in the following example: ALTER TABLE sales_big REBUILD WITH (DATA_COMPRESSION=PAGE) Unlike row compression, which is applied immediately on the rows, page compression isn’t applied until the page is full. The rows cannot be compressed until SQL Server can determine what encodings for prefix and dictionary substitution are going to be used to replace the actual data. When you enable page compression for a table or a partition, SQL Server examines every full page to determine the possible space savings. Any pages that are not full are not considered for compression. During the compression analysis, the prefix and dictionary values are created, and the column values are modified to reflect the prefix and dictionary values. Then row compression is applied. If the new compressed page can hold at least five additional rows, or 25% more rows than the page currently TABLE 34.7 Data Elements Within the CI Record Name Description Header This structure contains 1 byte to keep track of information about the CI. Bit 0 is the version (currently always 0), Bit 1 indicates the presence of a column prefix anchor record, and Bit 2 indicates the presence of a compression dictionary. PageModCount This value keeps track of the number of changes to the page to determine whether the compression on the page should be reevaluated and the CI record rebuilt. Offsets This element contains values to help SQL Server find the dictionary. It contains the offset of the end of the Column prefix anchor record and offset of the end of the CI record itself. Anchor Record This record looks exactly like a regular CD record (see Figure 34.7). Values stored are the common prefix values for each column, some of which might be NULL. 
Dictionary The first 2 bytes represent the number of entries in the dictionary, followed by an offset array of 2-byte entries, which indicate the end offset of each dictionary entry, and then the actual dictionary values. Download from www.wowebook.com ptg 1123 Data Compression 34 holds, the page is compressed. If neither one of these criteria is met, the compressed version of the page is discarded. New rows inserted into a compressed page are compressed as they are inserted. However, new entries are not added to the prefix list or dictionary based on a single new row. The prefix values and dictionary symbols are rebuilt only on an all-or-nothing basis. After the page is changed a sufficient number of times, SQL Server evaluates whether to rebuild the CI record. The PageModCount field in the CI record is used to keep track of the number of changes to the page since the CI record was last built or rebuilt. This value is updated every time a row is updated, deleted, or inserted. If SQL Server encounters a full page during a data modification and the PageModCount is greater than 25 or the PageModCount divided by the number of rows on the page is greater than 25%, SQL Server reapplies the compression analysis on the page. Again, only if recompressing the page creates room for five additional rows, or 25% more rows than the page currently holds, the new compressed page replaces the existing page. In B-tree structures (nonclustered indexes or a clustered table), only the leaf-level and data pages are considered for compression. When you insert a new row into a leaf or data page, if the compressed row fits, it is inserted and nothing more is done. If it doesn’t fit, SQL Server attempts to recompress the page and then recompress the row based on the new CI record. If the row fits after recompression, it is inserted and nothing more is done. If the row still doesn’t fit, the page needs to be split. When a compressed page is split, the CI record is copied to the new page exactly as it was, along with the rows moved to the new page. However, the PageModCount value is set to 25, so that when the new page gets full, it will be immediately analyzed for recompression. Leaf and data pages are also checked for recompression whenever you run an index rebuild or shrink operation. If you enable compression on a heap table, pages are evaluated for compression only during rebuild and shrink operations. Also, if you drop a clustered index on a table, turning it into a heap, SQL Server runs compression analysis on any full pages. Compression is avoided during normal data modification operations on a heap to avoid changes to the Row IDs, which are used as the row locators for any indexes on the heap. (See the “Understanding Index Structures” section later in this chapter for a discussion of row locators.) Although the RowModCounter is still maintained, SQL Server essentially ignores it and never tries to recompress a page based on the RowModCounter value. Evaluating Page Compression Before choosing to implement page compression, you should determine if the overhead of page compression will provide sufficient benefit in space savings. To determine how changing the compression state will affect a table or an index, you can use the SQL Server 2008 sp_estimate_data_compression_savings stored procedure, which is available only in the editions of SQL Server that support data compression. 
This stored procedure evalu- ates the effects of compression by sampling up to 5,000 pages in the table and creating a copy of these 5,000 pages of the table in tempdb, performing the compression, and then using the sample to estimate the overall size for the table after compression. The syntax for sp_estimate_data_compression_savings is as follows: Download from www.wowebook.com . changed map (BCM). When you perform a minimally or bulk-logged operation in SQL Server 2008 in BULK_LOGGED recovery mode, SQL Server logs only the fact that the operation occurred and doesn’t log. table or an index, you can use the SQL Server 2008 sp_estimate_data_compression_savings stored procedure, which is available only in the editions of SQL Server that support data compression SQL Server encounters a full page during a data modification and the PageModCount is greater than 25 or the PageModCount divided by the number of rows on the page is greater than 25%, SQL Server