Chapter 4 Creating Indexes

Lesson Review

The following questions are intended to reinforce key information presented in this lesson. The questions are also available on the companion CD if you prefer to review them in electronic form.

NOTE Answers
Answers to these questions and explanations of why each answer choice is right or wrong are located in the “Answers” section at the end of the book.

1. Which type of index physically orders the rows in a table?
   A. Unique index
   B. Clustered index
   C. Nonclustered index
   D. Foreign key

2. Which index option causes SQL Server to create an index with empty space on the leaf level of the index?
   A. PAD_INDEX
   B. FILLFACTOR
   C. MAXDOP
   D. IGNORE_DUP_KEY

C0462271X.fm Page 160 Friday, April 29, 2005 7:31 PM

Lesson 3: Creating Nonclustered Indexes

After you build your clustered index, you can create nonclustered indexes on the table. In contrast with a clustered index, a nonclustered index does not force a sort order on the data in a table. In addition, you can create multiple nonclustered indexes to most efficiently return results based on the most common queries you execute against the table. In this lesson, you will see how to create nonclustered indexes, including how to build a covering index that can satisfy a query by itself. And you will learn the importance of balancing the number of indexes you create with the overhead needed to maintain them.

After this lesson, you will be able to:
■ Implement nonclustered indexes.
■ Build a covering index.
■ Balance index creation with maintenance requirements.

Estimated lesson time: 20 minutes

Implementing a Nonclustered Index

Because a nonclustered index does not impose a sort order on a table, you can create as many as 249 nonclustered indexes on a single table. Nonclustered indexes, just like clustered indexes, create a B-tree structure.
However, unlike a clustered index, in a nonclustered index, the leaf level of the index contains a pointer to the data instead of the actual data. This pointer can reference one of two items. If the table has a clustered index, the pointer points to the clustering key. If the table does not have a clustered index, the pointer is a row identifier (RID), which is a reference to the physical location of the row within a data page.

When a query uses a nonclustered index, it transits the B-tree structure of the index. When the query reaches the leaf level, it uses the pointer to find the clustering key. The query then transits the clustered index to reach the actual row of data. If a clustered index does not exist on the table, the pointer is a RID, which causes SQL Server to scan an internal allocation map to locate the page referenced by the RID so that it can return the requested data.

You use the same CREATE…INDEX command to create a nonclustered index as you do to create a clustered index, except that you specify the NONCLUSTERED keyword.

Creating a Covering Index

An index contains all the values contained in the column or columns that define the index. SQL Server stores this data in a sorted format on pages in a doubly linked list. So an index is essentially a miniature representation of a table.

This structure can have an interesting effect on certain queries. If the query needs to return data from only columns within an index, it does not need to access the data pages of the actual table. By transiting the index, it has already located all the data it requires.

For example, let’s say you are using the Customer table that we created in Chapter 3 to find the names of all customers who have a credit line greater than $10,000.
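A minimal sketch of this example follows. It assumes a table named dbo.Customer with CustomerName and CreditLine columns; those names, and the index names, are illustrative and may not match the Chapter 3 table definition. The query comes first, followed by the single-column index and the two-column index that the discussion goes on to compare.

```sql
-- Hypothetical names throughout: dbo.Customer, CustomerName, CreditLine,
-- and both index names are assumptions made for illustration.

-- The query: names of all customers with a credit line above 10,000.
SELECT CustomerName
FROM dbo.Customer
WHERE CreditLine > 10000;

-- A single-column nonclustered index. It can locate the matching rows,
-- but the customer names still have to be read through the clustered index.
CREATE NONCLUSTERED INDEX idx_Customer_CreditLine
    ON dbo.Customer (CreditLine);

-- A two-column nonclustered index. Both columns the query needs are
-- stored in the index itself, so the index can satisfy the query alone.
CREATE NONCLUSTERED INDEX idx_Customer_CreditLine_CustomerName
    ON dbo.Customer (CreditLine, CustomerName);
```

CreditLine is listed first in the two-column index because it is the column the query filters on.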
Without an index, SQL Server would scan the table to locate all the rows with a value greater than 10,000 in the Credit Line column, which would be very inefficient. If you then created an index on the Credit Line column, SQL Server would use the index to quickly locate all the rows that matched this criterion. Then it would transit the primary key, because it is clustered, to return the customer names. However, if you created a nonclustered index that had two columns in it (Credit Line and Customer Name), SQL Server would not have to access the clustered index to locate the rows of data. When SQL Server used the nonclustered index to find all the rows where the credit line was greater than 10,000, it also located all the customer names.

An index that SQL Server can use to satisfy a query without having to access the table is called a covering index.

Even more interesting, SQL Server can use more than one index for a given query. In the preceding example, you could create nonclustered indexes on the credit line and on the customer name, which SQL Server could then use together to satisfy a query.

NOTE Index selection
SQL Server determines whether to use an index by examining only the first column defined in the index. For example, if you defined an index on FirstName, LastName and a query were looking for LastName, this index would not be used to satisfy the query.

Balancing Index Maintenance

Why wouldn’t you just create dozens or hundreds of indexes on a table? At first glance, knowing how useful indexes are, this approach might seem like a good idea. However, remember how an index is constructed. The values from the column that the index is created on are used to build the index. And the values within the index are also sorted. Now, let’s say a new row is added to the table.
Before the operation can complete, the value from this new row must be added to the correct location within the index. If you have only one index on the table, one write to the table also causes one write to the index. If there are 30 indexes on the table, one write to the table causes 30 additional writes to the indexes.

It gets a little more complicated. If the leaf-level index page does not have room for the new value, SQL Server has to perform an operation called a page split. During this operation, SQL Server allocates an empty page to the index, moving half the values on the page that was filled to the new page. If this page split also causes an intermediate-level index page to overflow, a page split occurs at that level as well. And if the new row causes the root page to overflow, SQL Server splits the root page into a new intermediate level, causing a new root page to be created.

As you can see, indexes can improve query performance, but each index you create degrades performance on all data-manipulation operations. Therefore, you need to carefully balance the number of indexes for optimal operations. As a general rule of thumb, if you have five or more indexes on a table designed for online transactional processing (OLTP) operations, you probably need to reevaluate why those indexes exist. Tables designed for read operations or data warehouse types of queries generally have 10 or more indexes because you don’t have to worry about the impact of write operations.

Using Included Columns

In addition to considering the performance degradation caused by write operations, keep in mind that index keys are limited to a maximum of 900 bytes. This limit can create a challenge in constructing more complex covering indexes.

An interesting new indexing feature in SQL Server 2005 called included columns helps you deal with this challenge. Included columns become part of the index at the leaf level only.
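As a hedged sketch of the syntax, the INCLUDE clause of CREATE INDEX lists the columns that are stored only at the leaf level. The table, column, and index names below (dbo.Customer, CreditLine, CustomerName, idx_Customer_CreditLine_Incl) are assumptions for illustration and are not taken from the Chapter 3 table definitions.

```sql
-- Hypothetical names; adjust them to match your own schema.
-- CreditLine is the index key column.
-- CustomerName is an included column, carried at the leaf level only.
CREATE NONCLUSTERED INDEX idx_Customer_CreditLine_Incl
    ON dbo.Customer (CreditLine)
    INCLUDE (CustomerName);
```

A query that filters on CreditLine and returns only CustomerName could be covered by an index of this shape.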
Values from included columns do not appear in the root or intermediate levels of an index and do not count against the 900-byte limit for an index.

Quick Check
■ What are the two most important things to consider for nonclustered indexes?

Quick Check Answer
■ The number of indexes must be balanced against the overhead required to maintain them when rows are added, removed, or modified in the table.
■ You need to make sure that the order of the columns defined in the index matches what the queries need, ensuring that the first column in the index is used in the query so that the query optimizer will use the index.

PRACTICE Create Nonclustered Indexes

In this practice, you will add a nonclustered index to the tables that you created in Chapter 3.

1. If necessary, launch SSMS, connect to your instance, and open a new query window.
2. Because users commonly search for a customer by city, add a nonclustered index to the CustomerAddress table on the City column, as follows:

   CREATE NONCLUSTERED INDEX idx_CustomerAddress_City
       ON dbo.CustomerAddress(City);

Lesson Summary
■ You can create up to 249 nonclustered indexes on a table.
■ The number of indexes you create must be balanced against the overhead incurred when data is modified.
■ An important factor to consider when creating indexes is whether an index can be used to satisfy a query in its entirety, thereby saving additional reads from either the clustered index or data pages in the table. Such an index is called a covering index.
■ SQL Server 2005’s new included columns indexing feature enables you to add values to the leaf level of an index only so that you can create more complex index implementations within the index size limit.

Lesson Review

The following questions are intended to reinforce key information presented in this lesson.
The questions are also available on the companion CD if you prefer to review them in electronic form.

NOTE Answers
Answers to these questions and explanations of why each answer choice is right or wrong are located in the “Answers” section at the end of the book.

1. Which index option causes an index to be created with empty space on the intermediate levels of the index?
   A. PAD_INDEX
   B. FILLFACTOR
   C. MAXDOP
   D. IGNORE_DUP_KEY

Chapter Review

To further practice and reinforce the skills you learned in this chapter, you can
■ Review the chapter summary.
■ Review the list of key terms introduced in this chapter.
■ Complete the case scenario. This scenario sets up a real-world situation involving the topics of this chapter and asks you to create a solution.
■ Complete the suggested practices.
■ Take a practice test.

Chapter Summary
■ Indexes on SQL Server tables, just like indexes on books, provide a way to quickly access the data you are looking for, even in very large tables.
■ Clustered indexes cause rows to be sorted according to the clustering key. In general, every table should have a clustered index. And you can have only one clustered index per table, usually built on the primary key.
■ Nonclustered indexes do not sort rows in a table, and you can create up to 249 per table to help quickly satisfy the most common queries.
■ By constructing covering indexes, you can satisfy queries without needing to access the underlying table.

Key Terms

Do you know what these key terms mean? You can check your answers by looking up the terms in the glossary at the end of the book.
■ B-tree
■ clustered index
■ clustering key
■ covering index
■ intermediate level
■ leaf level
■ nonclustered index
■ online index creation
■ page split
■ root node

Case Scenario: Indexing a Database

In the following case scenario, you will apply what you’ve learned in this chapter. You can find answers to these questions in the “Answers” section at the end of this book.

Contoso Limited, a health care company located in Bothell, WA, has just implemented a new patient claims database. Over the course of one month, more than 100 employees entered all the records that used to be contained in massive filing cabinets in the basements of several new clients. Contoso formed a temporary department to validate all the data entry. As soon as the data-validation process started, the IT staff began to receive user complaints about the new database’s performance.

As the new database administrator (DBA) for the company, everything that occurs with the data is in your domain, and you need to resolve the performance problem. You sit down with several employees to determine what they are searching for. Armed with this knowledge, what should you do?

Suggested Practices

To help you successfully master the exam objectives presented in this chapter, complete the following practice tasks.

Creating Indexes
■ Practice 1 Locate all the tables in your databases that do not have primary keys. Add a primary key to each of these tables.
■ Practice 2 Locate all the tables in your databases that do not have clustered indexes. Add a clustered index or change the primary key to clustered for each of these tables.
■ Practice 3 Identify poorly performing queries in your environment. Create nonclustered indexes that the query optimizer can use to satisfy these queries.
■ Practice 4 Identify the queries that can take advantage of covering indexes.
If indexes do not already exist that cover the queries, use the included columns clause to add additional columns to the appropriate index to turn it into a covering index.

Take a Practice Test

The practice tests on this book’s companion CD offer many options. For example, you can test yourself on just the content covered in this chapter, or you can test yourself on all the 70-431 certification exam content. You can set up the test so that it closely simulates the experience of taking a certification exam, or you can set it up in study mode so that you can look at the correct answers and explanations after you answer each question.

MORE INFO Practice tests
For details about all the practice test options available, see the “How to Use the Practice Tests” section in this book’s Introduction.

Chapter 5 Working with Transact-SQL

The query language that Microsoft SQL Server uses is a variant of the ANSI-standard Structured Query Language, SQL. The SQL Server variant is called Transact-SQL. Database administrators and database developers must have a thorough knowledge of Transact-SQL to read data from and write data to SQL Server databases. Using Transact-SQL is the only way to work with the data.

Exam objectives in this chapter:
■ Retrieve data to support ad hoc and recurring queries.
  ❑ Construct SQL queries to return data.
  ❑ Format the results of SQL queries.
  ❑ Identify collation details.
■ Manipulate relational data.
  ❑ Insert, update, and delete data.
  ❑ Handle exceptions and errors.
  ❑ Manage transactions.

Lessons in this chapter:
■ Lesson 1: Querying Data
■ Lesson 2: Formatting Result Sets
■ Lesson 3: Modifying Data
■ Lesson 4: Working with Transactions

Before You Begin

To complete the lessons in this chapter, you must have
■ SQL Server 2005 installed.
■ A connection to a SQL Server 2005 instance in SQL Server Management Studio (SSMS).
■ The AdventureWorks database installed.

C0562271X.fm Page 169 Friday, April 29, 2005 7:32 PM

[...]

... INFO Learning query basics
For more information about writing queries, see the “Query Fundamentals” topic in SQL Server 2005 Books Online, which is installed as part of SQL Server 2005. Updates for SQL Server 2005 Books Online are available for download at www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx.

How to Create Subqueries
Subqueries are queries that are nested in other queries...

MORE INFO Locks
If you’re not familiar with the SQL Server locking mechanisms, see the “Locking in the Database Engine” topic in SQL Server 2005 Books Online.

BEST PRACTICES Try to steer clear of cursors
Avoid cursors whenever possible. Ideally, cursors should be used only for administrative purposes when a set-based solution...
... whatever column in your base table was used as the unique index when creating the full-text index.

MORE INFO Creating full-text indexes
For information on creating full-text indexes, see the “CREATE FULLTEXT INDEX (Transact-SQL)” topic in SQL Server 2005 Books Online.

Quick Check
■ Which function should you use to query...

Real World
Adam Machanic
In my work as a database consultant, I am frequently asked by clients to review queries that aren’t performing well. More often than not, the problem is simple: Whoever wrote the query clearly did not understand how Transact-SQL works or how best to use it to solve problems. Transact-SQL is a fairly... they end up writing less-than-desirable code. If you feel like your query is getting more complex than it should be, it probably is. Take a step back and rethink the problem. The key to creating well-performing Transact-SQL queries is to think in terms of sets instead of row-by-row operations, as you would in a procedural system.

... tells SQL Server that you’re using a UDF rather than a system function.

Quick Check
■ What is the main difference between querying a UDF and a built-in function?

Quick Check Answer
■ When querying a UDF, you must specify the function’s schema. Built-in functions do not participate in schemas.

Querying CLR User-Defined...
... not familiar with the SQL case expression, see the “CASE (Transact-SQL)” topic in SQL Server 2005 Books Online.

This query conditionally checks the value of the SalariedFlag column, returning the total of the VacationHours and SickLeaveHours columns if the employee is salaried. Otherwise, only the SickLeaveHours column value is returned.

... comma-delimited string, and the column should be called “EmpData”.

1. If necessary, open SSMS and connect to your SQL Server.
2. Open a new query window and select AdventureWorks as the active database.
3. Type the following query and execute it:

   SELECT HireDate, VacationHours, LoginId
   FROM HumanResources.Employee

... inside of SQL Server can be performed using set-based techniques, that is, using standard SELECT statements. Even when working with very complex formatting requirements, this holds true. However, you can develop nonset-based SQL Server code by using cursors. Cursors operate by iterating through a data set one row at a time, letting the developer operate on individual rows rather than on sets of data. SQL Server...

... aggregate function processes a group of rows to produce a single output value. Transact-SQL has several built-in aggregate functions, and you can also define aggregate functions by using Microsoft .NET languages. Table 5-1 lists commonly used built-in aggregate functions and what they do.

Table 5-1 Commonly Used Built-in Aggregate Functions
Function    Description
AVG         Returns the average value of the rows...