Dropping Rules

If you want to completely eliminate a rule from your database, you use the same DROP syntax that we've already become familiar with for tables:

    DROP RULE <rule name>

Defaults

Defaults are even more similar to their cousin, the DEFAULT constraint, than a rule is to a CHECK constraint. Indeed, they work identically. The only real differences are the way that they are attached to a table and the fact that the default (the object, not the constraint) can be bound to a user-defined data type.

The syntax for defining a default works much as it did for a rule:

    CREATE DEFAULT <default name>
    AS <default value>

Therefore, to define a default of zero for our Salary:

    CREATE DEFAULT SalaryDefault
    AS 0;

Again, a default is worthless without being bound to something. To bind it, we make use of sp_bindefault, which is, other than the procedure name, identical in syntax to the sp_bindrule procedure:

    EXEC sp_bindefault 'SalaryDefault', 'Employees.Salary';

To unbind the default from the table, we use sp_unbindefault:

    EXEC sp_unbindefault 'Employees.Salary';

Keep in mind that the futureonly_flag also applies to this stored procedure; it is just not used here.

Dropping Defaults

If you want to completely eliminate a default from your database, you use the same DROP syntax that we've already become familiar with for tables and rules:

    DROP DEFAULT <default name>

The concept of defaults vs. DEFAULT constraints is wildly difficult for a lot of people to grasp. After all, they have almost the same name. If we refer to a "default," then we are referring either to the object-based default (what we're talking about in this section) or, as shorthand, to the actual default value (the one that will be supplied if we don't provide an explicit value). If we refer to a "DEFAULT constraint," then we are talking about the non-object-based solution, the one that is an integral part of the table definition.
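To make the distinction concrete, here is a minimal sketch that gives Salary a default of zero both ways. The Employees table definition and the constraint name DF_Employees_Salary are assumptions made purely for illustration; only SalaryDefault and the sp_bindefault and sp_unbindefault calls come from the text above. GO is the batch separator understood by tools such as SQL Server Management Studio and sqlcmd, and it is needed here because CREATE DEFAULT cannot share a batch with other statements.

```sql
-- Hypothetical table, assumed for this sketch only.
CREATE TABLE Employees
(
    EmployeeID int   NOT NULL PRIMARY KEY,
    Salary     money NOT NULL
);
GO

-- Option 1: a DEFAULT constraint, which is part of the table definition.
ALTER TABLE Employees
    ADD CONSTRAINT DF_Employees_Salary DEFAULT 0 FOR Salary;
GO

-- A column cannot carry a DEFAULT constraint and a bound default object
-- at the same time, so drop the constraint before trying option 2.
ALTER TABLE Employees
    DROP CONSTRAINT DF_Employees_Salary;
GO

-- Option 2: a default object, created on its own and then bound to the column.
CREATE DEFAULT SalaryDefault
AS 0;
GO

EXEC sp_bindefault 'SalaryDefault', 'Employees.Salary';
GO

-- With either option in place, omitting Salary should store the default value.
INSERT INTO Employees (EmployeeID) VALUES (1);
SELECT EmployeeID, Salary FROM Employees;
GO

-- Clean up: a default object must be unbound before it can be dropped.
EXEC sp_unbindefault 'Employees.Salary';
GO
DROP DEFAULT SalaryDefault;
GO
```

The practical difference shows up in the cleanup: the constraint goes away with an ALTER TABLE, while the default object has its own lifetime and has to be unbound and dropped separately.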
Chapter 6: Constraints

Determining Which Tables and Data Types Use a Given Rule or Default

If you ever go to delete or alter your rules or defaults, you may first want to take a look at which tables and data types are making use of them. Again, SQL Server comes to the rescue with a system stored procedure. This one is called sp_depends. Its syntax looks like this:

    EXEC sp_depends <object name>

sp_depends provides a listing of all the objects that depend on the object you've requested information about.

Unfortunately, sp_depends is not a sure bet to tell you about every object that depends on a parent object. SQL Server supports something called "deferred name resolution." Basically, deferred name resolution means that you can create objects (primarily stored procedures) that depend on another object even before the second object (the target of the dependency) is created. For example, SQL Server will now allow you to create a stored procedure that refers to a table even before the said table is created. In this instance, SQL Server isn't able to list the table as having a dependency on it. Even after you add the table, it will not have any dependency listing if you use sp_depends.

Triggers for Data Integrity

We've got a whole chapter coming up on triggers, but any discussion of constraints, rules, and defaults would not be complete without at least a mention of triggers.

One of the most common uses of triggers is to implement data integrity rules. Since we have that chapter coming up, I'm not going to get into it very deeply here, other than to say that triggers can do a very large number of things data integrity-wise that a constraint or rule could never hope to do. The downside (and you knew there had to be one) is that they incur substantial additional overhead and are, therefore, much (very much) slower in almost any circumstance. They are procedural in nature (which is where they get their power), but they also happen after everything else is done and should be used only as a relatively last resort.

Choosing What to Use

Wow. Here you are with all these choices, and now how do you figure out which is the right one to use? Some of the constraints are fairly independent (PRIMARY and FOREIGN KEYs, UNIQUE constraints): you are using either them or nothing. The rest have some level of overlap with each other, and it can be rather confusing when deciding what to use. You've gotten some hints from me as we've been going through this chapter about what some of the strengths and weaknesses are of each of the options, but it will probably make a lot more sense if we look at them all together for a bit.

Restriction: Constraints
    Pros: Fast. Can reference other columns. Happen before the command occurs. ANSI-compliant.
    Cons: Must be redefined for each table. Can't reference other tables. Can't be bound to data types.

Restriction: Rules, Defaults
    Pros: Independent objects. Reusable. Can be bound to data types. Happen before the command occurs.
    Cons: Slightly slower. Can't reference across columns. Can't reference other tables. Really meant for backward compatibility only!

Restriction: Triggers
    Pros: Ultimate flexibility. Can reference other columns and other tables. Can even use .NET to reference information that is external to your SQL Server.
    Cons: Happen after the command occurs. High overhead.

The main time to use rules and defaults is if you are implementing a rather robust logical model and are making extensive use of user-defined data types. In this instance, rules and defaults can provide a lot of functionality and ease of management without much programmatic overhead. You just need to be aware that they may go away someday. Probably not soon, but someday.

Triggers should only be used when a constraint is not an option. Like constraints, they are attached to the table and must be redefined with every table you create. On the bright side, they can do most things that you are likely to want to do data integrity-wise. Indeed, they used to be the common method of enforcing foreign keys (before FOREIGN KEY constraints were added). I will cover these in some detail later in the book.

That leaves constraints, which should become your data integrity solution of choice. They are fast and not that difficult to create. Their downfall is that they can be limiting (they can't reference other tables except for a FOREIGN KEY), and they can be tedious to redefine over and over again if you have a common constraint logic.

Regardless of what kind of integrity mechanism you're putting in place (keys, triggers, constraints, rules, defaults), the thing to remember can best be summed up in just one word: balance. Every new thing that you add to your database adds more overhead, so you need to make sure that whatever you're adding honestly has value to it before you stick it in your database. Avoid things like redundant integrity implementations (for example, I can't tell you how often I've come across a database that has both foreign keys defined for referential integrity and triggers to do the same thing). Make sure you know what constraints you have before you put the next one on, and make sure you know exactly what you hope to accomplish with it.

Summary

The different types of data integrity mechanisms described in this chapter are part of the backbone of a sound database. Perhaps the biggest power of RDBMSs is that the database can now take responsibility for data integrity rather than depending on the application.
This means that even ad hoc queries are subject to the data rules and that multiple applications are all treated equally with regard to data integrity issues.

In the chapters to come, we will look at the tie between some forms of constraints and indexes, along with taking a look at the advanced data integrity rules that can be implemented using triggers. We'll also begin looking at how the choices between these different mechanisms affect our design decisions.

Chapter 7: Adding More to Our Queries

When I first started writing about SQL Server a number of years ago, I was faced with the question of when exactly to introduce more complex queries into the knowledge mix, and this book faces that question all over again. At issue is something of a "chicken or egg" thing: talk about scripting, variables, and the like first, or get to some things that a beginning user might make use of long before they do server-side scripting. This time around, the notion of more queries early won out.

Some of the concepts in this chapter are going to challenge you with a new way of thinking. You already had a taste of this just dealing with joins, but you haven't had to deal with the kind of depth that I want to challenge you with in this chapter. Even if you don't have that much procedural programming experience, the fact is that your brain has a natural tendency to break complex problems down into their smaller subparts (sub-procedures, logical steps) as opposed to solving them whole (the "set," or SQL way).

While SQL Server 2008 supports procedural language concepts now more than ever, my challenge to you is to try and see the question as a whole first. Be certain that you can't get it in a single query.
Even if you can't think of a way, quite often you can break it up into several small queries and then combine them, one at a time, back into a larger query that does it all in one task. Try to see it as a whole and, if you can't, then go ahead and break it down, but then combine it into the whole again to the largest extent that makes sense.

This is really what's at the heart of my challenge of a new way of thinking: conceptualizing the question as a whole rather than in steps. When we program in most languages, we usually work in a linear fashion. With SQL, however, you need to think more in terms of set theory. You can liken this to math class and the notion of A union B, or A intersect B. We need to think less in terms of steps to resolve the data and more about how the data fits together.

In this chapter, we're going to be using this concept of data fit to ask what amounts to multiple questions in just one query. Essentially, we're going to look at ways of taking what seem like multiple queries and placing them into something that will execute as a complete unit. In addition, we'll also be taking a look at query performance and what you can do to get the most out of queries.

Among the topics we'll be covering in this chapter are:

❑ Nested subqueries
❑ Correlated subqueries
❑ Derived tables
❑ Making use of the EXISTS operator
❑ MERGE
❑ Optimizing query performance

We'll see how, using subqueries, we can make the seemingly impossible completely possible, and how an odd tweak here and there can make a big difference in your query performance.

What Is a Subquery?

A subquery is a normal T-SQL query that is nested inside another query. Subqueries are created using parentheses when you have a SELECT statement that serves as the basis for either part of the data or the condition in another query.
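As a rough illustration of those two roles, the sketch below uses the same parenthesized SELECT once as the condition and once as part of the data. It assumes the AdventureWorks sample database used elsewhere in this chapter, and in particular that Production.Product has a ListPrice column; the alias AvgListPrice is made up for the example.

```sql
-- Subquery as the condition: products priced above the overall average.
SELECT ProductID, Name, ListPrice
FROM Production.Product
WHERE ListPrice > (SELECT AVG(ListPrice)
                   FROM Production.Product);

-- Subquery as part of the data: show the overall average next to every product.
SELECT ProductID,
       Name,
       ListPrice,
       (SELECT AVG(ListPrice)
        FROM Production.Product) AS AvgListPrice
FROM Production.Product;
```

In both statements the inner SELECT runs on its own terms and hands a single value back to the outer query, which is the simplest shape a subquery can take.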
Subqueries are generally used to fill one of a few needs:

❑ To break a query into a series of logical steps
❑ To provide a listing to be the target of a WHERE clause together with [IN|EXISTS|ANY|ALL]
❑ To provide a lookup driven by each individual record in a parent query

Some subqueries are very easy to think of and build, but some are extremely complex. It usually depends on the complexity of the relationship between the inner (the sub) and outer (the top) queries.

It's also worth noting that most subqueries (but definitely not all) can also be written using a join. In places where you can use a join instead, the join is usually the preferable choice for a variety of reasons we will continue to explore over the remainder of the book.

I once got into a rather lengthy debate (perhaps 20 or 30 e-mails flying back and forth, with examples, reasons, and so on over a few days) with a coworker over the joins versus subqueries issue. Traditional logic says to always use the join, and that was what I was pushing (due to experience rather than traditional logic; you've already seen several places in this book where I've pointed out how traditional thinking can be bogus). My coworker was pushing the notion that a subquery would actually cause less overhead, so I decided to try it out. What I found was essentially (as you might expect) that we were both right in certain circumstances. We will explore these circumstances fully toward the end of the chapter after you have a bit more background.

Now that we know what a subquery theoretically is, let's look at some specific types and examples of subqueries.

Building a Nested Subquery

A nested subquery is one that goes in only one direction, returning either a single value for use in the outer query, or perhaps a full list of values to be used with the IN operator.
In the event you want to use an explicit = operator, then you're going to be using a query that returns a single value, which means one column from one row. If you are expecting a list back, then you'll need to use the IN operator with your outer query.

In the loosest sense, your query syntax is going to look something like one of these two syntax templates:

    SELECT <SELECT list>
    FROM <SomeTable>
    WHERE <SomeColumn> = (
        SELECT <single column>
        FROM <SomeTable>
        WHERE <condition that results in only one row returned>)

Or:

    SELECT <SELECT list>
    FROM <SomeTable>
    WHERE <SomeColumn> IN (
        SELECT <single column>
        FROM <SomeTable>
        [WHERE <condition>])

Obviously, the exact syntax will vary, not only because you will be substituting the select list and exact table names, but also because you may have a multitable join in either the inner or outer queries, or both.

Nested Queries Using Single-Value SELECT Statements

Let's get down to the nitty-gritty with an explicit example. Let's say, for example, that we wanted to know the ProductIDs of every item sold on the first day any product was purchased from the system.

If you already know the first day that an order was placed in the system, then it's no problem; the query would look something like this:

    SELECT DISTINCT sod.ProductID
    FROM Sales.SalesOrderHeader soh
    JOIN Sales.SalesOrderDetail sod
        ON soh.SalesOrderID = sod.SalesOrderID
    WHERE OrderDate = '07/01/2001'; -- This is the first OrderDate in the system

This yields the correct results:

    ProductID
    707
    708
    709
    …
    …
    776
    777
    778

    (47 row(s) affected)

But let's say, just for instance, that we are regularly purging data from the system, and we still want to ask this same question as part of an automated report.

Since it's going to be automated, we can't run a query to find out what the first date in the system is and manually plug that into our query. Or can we?
Actually, the answer is: "Yes, we can," by putting it all into just one statement:

    SELECT DISTINCT soh.OrderDate, sod.ProductID
    FROM Sales.SalesOrderHeader soh
    JOIN Sales.SalesOrderDetail sod
        ON soh.SalesOrderID = sod.SalesOrderID
    WHERE OrderDate = (SELECT MIN(OrderDate) FROM Sales.SalesOrderHeader);

It's just that quick and easy. The inner query (SELECT MIN ...) retrieves a single value for use in the outer query. Since we're using an equal sign, the inner query absolutely must return only one column from one single row or you will get a runtime error.

Notice that I added the order date to this new query. While it did not have to be there for the query to report the appropriate ProductIDs, it does clarify what date those ProductIDs are from. Under the first query, we knew what date because we had explicitly said it, but under this new query, the date is data driven, so it is often worthwhile to provide it as part of the result.

Nested Queries Using Subqueries That Return Multiple Values

Perhaps the most common of all subqueries implemented in the world are those that retrieve some form of domain list and use it as criteria for a query.

For this one, let's revisit a problem we looked at in Chapter 4 when we were examining outer joins. What you want is a list of all the products that have special offers.

We might write something like this:

    SELECT ProductID, Name
    FROM Production.Product
    WHERE ProductID IN (
        SELECT ProductID
        FROM Sales.SpecialOfferProduct);

This returns 295 rows:

    ProductID   Name
    680         HL Road Frame - Black, 58
    706         HL Road Frame - Red, 58
    707         Sport-100 Helmet, Red
    …
    …
    997         Road-750 Black, 44
    998         Road-750 Black, 48
    999         Road-750 Black, 52

    (295 row(s) affected)

While this works just fine, queries of this type almost always fall into the category of those that can be done using an inner join rather than a nested SELECT.
For example, we could get the same results as the preceding subquery by running this simple join:

    SELECT DISTINCT pp.ProductID, Name
    FROM Production.Product pp
    JOIN Sales.SpecialOfferProduct ssop
        ON pp.ProductID = ssop.ProductID;

For performance reasons, you want to use the join method as your default solution if you don't have a specific reason for using the nested SELECT. We'll discuss this more before the chapter's done.

SQL Server is actually pretty smart about this kind of thing. In the lion's share of situations, SQL Server will actually resolve the nested subquery solution to the same query plan it would use on the join. Indeed, if you checked the query plan for both the nested subquery and the previous join, you'd find it was the exact same plan. So, with that in mind, the truth is that most of the time, there really isn't that much difference. The problem, of course, is that I just said most of the time. When the query plans vary, the join is usually the better choice, and thus the recommendation to use that syntax by default.

Using a Nested SELECT to Find Orphaned Records

This type of nested SELECT is nearly identical to our previous example, except that we add the NOT operator. The difference this makes when you are converting to join syntax is that you are equating to an outer join rather than an inner join.

Before we do the nested SELECT syntax, let's review one of our examples of an outer join from Chapter 4. In this query, we were trying to identify all the special offers that do not have matching products:

    SELECT Description
    FROM Sales.SpecialOfferProduct ssop
    RIGHT OUTER JOIN Sales.SpecialOffer sso
        ON ssop.SpecialOfferID = sso.SpecialOfferID
    WHERE sso.SpecialOfferID != 1
      AND ssop.SpecialOfferID IS NULL;

This returned one row:

    Description
    Volume Discount over 60

    (1 row(s) affected)

This is the way that, typically speaking, things should be done (or as a LEFT JOIN). I can't say, however, that it's the way that things are usually done.
The join usually takes a bit more thought, so we usually wind up with the nested SELECT instead.

See if you can write this nested SELECT on your own. Once you're done, come back and take a look.

…

    The Customer has an Order numbered 43659
    The Customer has an Order numbered 44305
    The Customer has an Order numbered 45061
    The Customer has an Order numbered 45779
    The Customer has an Order numbered 46604
    The Customer has an Order numbered 47693
    The Customer has an Order numbered 48730
    The Customer has an Order numbered 49822
    The Customer has an Order numbered 51081
    The Customer has an Order numbered 55234
    The Customer …

… back the same 19,134 rows:

    CustomerID   SalesOrderID   OrderDate
    11000        43793          2001-07-22 00:00:00.000
    11001        43767          2001-07-18 00:00:00.000
    11002        43736          2001-07-10 00:00:00.000
    …
    …
    30116        51106          2003-07-01 00:00:00.000
    30117        43865          2001-08-01 00:00:00.000
    30118        47378          2002-09-01 00:00:00.000

    (19134 row(s) affected)

There are a few key things to notice in this query:

❑ We see only one …

…

    CustomerID   SalesOrderID   OrderDate
    11000        43793          2001-07-22 00:00:00.000
    11001        43767          2001-07-18 00:00:00.000
    11002        43736          2001-07-10 00:00:00.000
    …
    …
    30116        51106          2003-07-01 00:00:00.000
    30117        43865          2001-08-01 00:00:00.000
    30118        47378          2002-09-01 00:00:00.000

    (19134 row(s) affected)

As previously stated, …

…

    …            Name
    AW00029484   Southeast
    AW00029490   Northwest
    AW00029499   Canada
    …
    …
    AW00030108   Canada
    AW00030113   United Kingdom
    AW00030118   Central

    (58 row(s) affected)

If you want to check things out on this, just run the queries for the two derived tables separately and compare the results.

For this particular query, I needed to use the DISTINCT keyword. If I didn't, then I would have potentially …
… CONVERT or CASE.

Keep in mind that you can set a split point that SQL Server will use to determine whether a two-digit year should have a 20 added on the front or a 19. The default breaking point is 49/50: a two-digit year of 49 or less will be converted using a 20 on the front. Anything higher will use a 19. These can be changed in the database server configuration (administrative issues are discussed in …

…

    …
    …
    AW00000696   NULL
    AW00000697   NULL
    AW00000698   NULL
    AW00011012   2003-09-17 00:00:00.000
    AW00011013   2003-10-15 00:00:00.000
    AW00011014   2003-09-24 00:00:00.000
    …
    …

to something a bit more useful:

    …
    …
    AW00000696   NEVER ORDERED
    AW00000697   NEVER ORDERED
    AW00000698   NEVER ORDERED
    AW00011012   Sep 17 2003 12:00AM
    AW00011013   Oct 15 2003 12:00AM
    AW00011014   Sep 24 2003 12:00AM
    …
    …

Notice that I also had to put the CAST() function …

… that things usually work. The word "usually" is extremely operative here. There are very few rules in SQL that will be true 100 percent of the time. In a world full of exceptions, SQL has to be at the pinnacle of that: exceptions are a dime a dozen when you try to describe the performance world in SQL Server.

In short, you need to gauge just how important the performance of a given query is. If performance …

… a wide variety of data-type conversions that you'll need to do when SQL Server won't do it implicitly for you. For example, converting a number to a string is a very common need. To illustrate:

    SELECT 'The Customer has an Order numbered ' + SalesOrderID
    FROM Sales.SalesOrderHeader
    WHERE CustomerID = 29825;

will yield an error:

    Msg 245, Level 16, State 1, Line 1
    Conversion failed when converting the varchar …
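The failing concatenation above can be repaired by converting the number explicitly. The following is a minimal sketch rather than the book's own listing: it reuses the query from the text and assumes varchar(10) is wide enough for an integer SalesOrderID; the column alias OrderMessage is made up for the example.

```sql
-- CAST the integer key to a string before concatenating it.
SELECT 'The Customer has an Order numbered '
       + CAST(SalesOrderID AS varchar(10)) AS OrderMessage
FROM Sales.SalesOrderHeader
WHERE CustomerID = 29825;

-- CONVERT does the same job with the target type as its first argument.
SELECT 'The Customer has an Order numbered '
       + CONVERT(varchar(10), SalesOrderID) AS OrderMessage
FROM Sales.SalesOrderHeader
WHERE CustomerID = 29825;
```

The original statement fails because int outranks varchar in SQL Server's data type precedence, so the server tries to turn the text into a number instead of the other way around. Making the conversion explicit removes that ambiguity.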
… the front. Anything higher will use a 19. These can be changed in the database server configuration (administrative issues are discussed in Chapter 19).

The MERGE Command

The MERGE command is new with SQL Server 2008 and provides a somewhat different way of thinking about DML statements. With MERGE, we have the prospect of combining multiple DML action statements (INSERT, UPDATE, DELETE) into one overall …

…

    …        Year   Month   ProductID   QtySold   Year   Month   ProductID   QtySold
    INSERT   2003   8       928         2         NULL   NULL    NULL        NULL
    INSERT   2003   8       882         1         NULL   NULL    NULL        NULL
    UPDATE   2003   8       707         249       2003   8       707         242
    …
    …
    UPDATE   2003   8       963         32        2003   8       963         31
    UPDATE   2003   8       970         54        2003   8       970         53
    UPDATE   2003   8       998         139       2003   8       998         138

    (38 row(s) affected)

Performance Considerations

We've already touched on some of the …