Nielsen c16.tex V4 - 07/21/2009 12:53pm Page 392 Part II Manipulating Data with Select

explicitly ROLLBACK the transaction, the data-modification operation will go through as originally intended. Unlike INSTEAD OF triggers, AFTER triggers normally report an error code if an operation is rolled back.

As Chapter 66, "Managing Transactions, Locking, and Blocking," discusses in greater detail, every DML command implicitly occurs within a transaction, even if no BEGIN TRANSACTION command exists. The AFTER trigger takes place after the modification but before the implicit commit, so the transaction is still open when the AFTER trigger is fired. Therefore, a transaction ROLLBACK command in the trigger will roll back all pending transactions.

This code sample creates the AfterDemo AFTER trigger on the Guide table, which includes the RAISERROR and ROLLBACK TRANSACTION commands:

    USE CHA2;
    GO  -- CREATE TRIGGER must be the first statement in its batch
    CREATE TRIGGER AfterDemo ON Guide
    AFTER INSERT, UPDATE
    AS
      PRINT 'After Trigger Demo';
      -- logic in a real trigger would decide what to do here
      RAISERROR ('Sample Error', 16, 1);
      ROLLBACK TRAN;

With the AFTER trigger applied to the Guide table, the following INSERT raises the error and is rolled back:

    INSERT Guide(LastName, FirstName,
      Qualifications, DateOfBirth, DateHire)
    VALUES ('Harrison', 'Nancy',
      'Pilot, Sky Diver, Hang Glider, Emergency Paramedic',
      '19690625', '20000714');

Result:

    After Trigger Demo
    Server: Msg 50000, Level 16, State 1,
    Procedure AfterDemo, Line 7
    Sample Error

A SELECT searching for Nancy Harrison would find no such row because the AFTER trigger rolled back the transaction.

Note that the sample code in the file for this chapter drops the AfterDemo trigger so that the code in the remainder of the chapter will function.

Non-Updateable Views

Non-updateable views may affect INSERT, UPDATE, and DELETE operations.

Several factors will cause a view to become non-updateable. The most common causes of non-updateable views are aggregate functions (including DISTINCT), GROUP BY clauses, and joins.
If the view includes other nested views, any nested view that is non-updateable will cause the final view to be non-updateable as well.

The view vMedGuide, created in the following sample code, is non-updateable because the DISTINCT predicate eliminates duplicates, making it impossible for SQL to be sure which underlying row should be updated:

    CREATE VIEW dbo.vMedGuide
    AS
    SELECT DISTINCT GuideID, LastName, Qualifications
      FROM dbo.Guide
      WHERE Qualifications LIKE '%Aid%'
        OR Qualifications LIKE '%medic%'
        OR Qualifications LIKE '%Physician%';

To test the updateability of the view, the next query attempts to perform an UPDATE command through the view:

    UPDATE dbo.vMedGuide
      SET Qualifications = 'E.R. Physician, Diver'
      WHERE GuideID = 1;

Result:

    Server: Msg 4404, Level 16, State 1, Line 1
    View or function 'dbo.vMedGuide' is not updatable
    because the definition contains the DISTINCT clause.

For more information about creating views and a more complete list of the causes of non-updateable views, refer to Chapter 14, "Projecting Data Through Views."

Views With Check Option

Views WITH CHECK OPTION may affect INSERT and UPDATE operations.

Views can cause two specific problems, both related to the WITH CHECK OPTION. A special situation called disappearing rows occurs when rows are returned from a view and then updated such that they no longer meet the WHERE clause's requirements for the view. The rows are still in the database but they are no longer visible in the view.

For more about disappearing rows, the WITH CHECK OPTION, and their implications for security, refer to Chapter 14, "Projecting Data Through Views."

Adding the WITH CHECK OPTION to a view prohibits disappearing rows, but causes another problem.
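Before looking at that second problem, the disappearing-row behavior can be made concrete with a small sketch. The table dbo.GuideDemo and the view dbo.vMedGuideDemo below are hypothetical names invented for this illustration (they are not part of the CHA2 sample database), and the script assumes a tool such as Management Studio or sqlcmd that recognizes GO as a batch separator.

```sql
USE tempdb;
GO

-- Hypothetical demo table; not part of the book's sample databases.
CREATE TABLE dbo.GuideDemo (
  GuideID INT NOT NULL PRIMARY KEY,
  Qualifications VARCHAR(100) NOT NULL
);
GO

-- A filtered view created WITHOUT the WITH CHECK OPTION.
CREATE VIEW dbo.vMedGuideDemo
AS
SELECT GuideID, Qualifications
  FROM dbo.GuideDemo
  WHERE Qualifications LIKE '%Physician%';
GO

INSERT dbo.GuideDemo (GuideID, Qualifications)
  VALUES (1, 'E.R. Physician');

-- The update succeeds, but the row no longer satisfies the view's WHERE clause.
UPDATE dbo.vMedGuideDemo
  SET Qualifications = 'Diver'
  WHERE GuideID = 1;

-- The view now returns no rows ...
SELECT GuideID, Qualifications FROM dbo.vMedGuideDemo;

-- ... while the base table still holds the updated row.
SELECT GuideID, Qualifications FROM dbo.GuideDemo;

-- Clean up the demo objects.
DROP VIEW dbo.vMedGuideDemo;
DROP TABLE dbo.GuideDemo;
```

Because the view has no WITH CHECK OPTION, nothing stops the update from moving the row outside the view's filter, which is exactly the situation the option is designed to prevent.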
A view that includes the WITH CHECK OPTION will apply the WHERE clause condition to both data being retrieved through the view and data being inserted or updated through the view. If the data being inserted or updated will not be retrievable through the view after the insert or update operation, the WITH CHECK OPTION will cause the data-modification operation to fail.

The following code sample modifies the previous view to add the WITH CHECK OPTION and then attempts two updates. The first update passes the WHERE clause requirements. The second update would remove the rows from the result set returned by the view, so it fails:

    ALTER VIEW dbo.vMedGuide
    AS
    SELECT GuideID, LastName, Qualifications
      FROM dbo.Guide
      WHERE Qualifications LIKE '%Aid%'
        OR Qualifications LIKE '%medic%'
        OR Qualifications LIKE '%Physician%'
    WITH CHECK OPTION;

The following queries test the view's WITH CHECK OPTION. The first one will pass because the qualifications include 'Physician', but the second query will fail:

    UPDATE dbo.vMedGuide
      SET Qualifications = 'E.R. Physician, Diver'
      WHERE GuideID = 1;

    UPDATE dbo.vMedGuide
      SET Qualifications = 'Diver'
      WHERE GuideID = 1;

Result:

    Server: Msg 550, Level 16, State 1, Line 1
    The attempted insert or update failed because the target view
    either specifies WITH CHECK OPTION or spans a view that specifies
    WITH CHECK OPTION and one or more rows resulting from the
    operation did not qualify under the CHECK OPTION constraint.
    The statement has been terminated.

Calculated Columns

A related issue to non-updateable views involves updating calculated columns. Essentially, a calculated column is a read-only generated value. Just like non-updateable views, attempting to write to a calculated column will block the data-modification statement.
To demonstrate how a calculated column can block an insert or update, the following code builds a table with a calculated column and inserts a couple of sample rows:

    USE tempdb;

    CREATE TABLE CalcCol (
      ID INT NOT NULL IDENTITY PRIMARY KEY,
      Col1 CHAR(2),
      Col2 CHAR(2),
      Calc AS Col1 + Col2
    );

    INSERT CalcCol (Col1, Col2)
    VALUES ('ab', 'cd'),
           ('12', '34');

    SELECT Col1, Col2, Calc
      FROM CalcCol;

Result:

    Col1 Col2 Calc
    ---- ---- ----
    ab   cd   abcd
    12   34   1234

The last SELECT proved that the Calc column did indeed calculate its contents. The next query attempts to write into the Calc column and generates an error:

    INSERT CalcCol (Col1, Col2, Calc)
    VALUES ('qw', 'er', 'qwer');

Result:

    Msg 271, Level 16, State 1, Line 1
    The column "Calc" cannot be modified because it is either a
    computed column or is the result of a UNION operator.

Security Constraints

Security may affect INSERT, UPDATE, and DELETE operations.

A number of security settings and roles can cause any operation to fail. Typically, security is not an issue during development; but for production databases, security is often paramount. Documenting the security settings and security roles will help you solve data-modification problems caused by security.

Best Practice

Every data-modification obstacle is easily within the SQL developer's or DBA's ability to surmount. Understanding SQL Server and documenting the database, as well as being familiar with the database schema, stored procedures, and triggers, will prevent most data-modification problems.

Summary

Having read through this exhaustive list of potential obstacles, the question is, are these obstacles a problem or is it good that SQL Server behaves like this? Clearly, the answer is yes, it is good.
Most of these issues involve constraints that enforce some type of integrity (see Chapter 3, "Relational Database Design," and Chapter 20, "Creating the Physical Schema"). A well-architected database requires keys, constraints, and security. The real value of this chapter isn't learning about problems, but how to work with SQL Server the way it is supposed to work.

One last critical point: If your database leaves out these constraints in order to make data modification easier, that's just plain wrong.

This concludes a nine-chapter study on using the SELECT command, and its INSERT, UPDATE, and DELETE variations, to manipulate data. The next part moves beyond the traditional relational data types and discusses working with spatial data, hierarchies, full-text search, BLOBs, and XML.

Beyond Relational

IN THIS PART
Chapter 17 Traversing Hierarchies
Chapter 18 Manipulating XML Data
Chapter 19 Using Integrated Full-Text Search

The database world has grown in the past few years to include data types that were once considered outside the realm of the traditional database. Websites and corporate applications require storing and searching of media, geography, XML, full-text, and hierarchical data. SQL Server 2008 is up for the job.

Part III covers storing, searching, and retrieving five types of beyond relational data. Some of these data types require unique forms of table design and very different means of selecting the data. It may not look at all like SQL, but it's SQL Server.

If SQL Server is the box, then Part III is all about retooling the box into a more friendly, inclusive type of box that holds all sorts of data.

Traversing Hierarchies

IN THIS CHAPTER
Hierarchical patterns
Hierarchical user-defined functions (UDFs)
Recursive CTE queries
Materialized lists
HierarchyID data type

Traditionally, SQL has had a hard time getting along with data that doesn't fit well into a relational grid, and that includes hierarchical data. The lack of an elegant solution became obvious when working with family trees, bills of materials, organizational charts, layers of jurisdictions, or modeling O-O class inheritance. At best, the older methods of handling hierarchies were more of a clumsy work-around than a solution.

The problems surrounding hierarchical data involve modeling the data, navigating the tree, selecting multiple generations of ancestors or descendants, or manipulating the tree (that is, moving portions of the tree to another location or inserting items). When the requirements demand a many-to-many relationship, such as a bill of materials, the relationships become even more complex.

New query methods, new data types, and a better understanding of hierarchical information by the SQL community have coalesced to make this an area where SQL Server offers intelligent, scalable solutions to hierarchical problems.

Is managing hierarchical data as easy as SELECT * FROM foo? No. Hierarchies still don't fit the traditional relational model, so it takes some work to understand and code a database that includes a hierarchy.

The initial question when working with hierarchical data is how to store the hierarchy, as hierarchies aren't natural to the relational model. There are several possibilities to choose from.
In this chapter I'll explain three techniques (there are other methods), each with its own unique pros and cons:

■ Adjacency list: By far, the most popular method
■ Materialized path: My personal favorite method
■ HierarchyID: Microsoft's new method

What's New with Hierarchies?

SQL Server 2005 introduced ANSI SQL 99's recursive common table expressions (CTEs), which lessened the pain of querying hierarchies modeled using the adjacency list pattern. Recursive CTEs are simpler and often faster than the SQL Server 2000 methods of writing a user-defined function to iteratively query each level.

The big news for SQL Server 2008 is the HierarchyID data type, a fast, compact CLR data type with several methods for querying and managing hierarchies.

Hierarchies are common in the real world, which is why they are so important for database design and development. The word hierarchy comes from the Greek and was first used in English in 1880 to describe the order of angels. I highly recommend you take the time to read the Wikipedia article on hierarchies; it's fascinating stuff.

Simple hierarchies only have a one parent to many children relationship and may be modeled using any of the three techniques. Examples of simple hierarchies include the following:

■ Organizational charts
■ Object-oriented classes
■ Jurisdictions
■ Taxonomy, or a species tree
■ Network marketing schemes
■ File system directory folders

More complex hierarchies, referred to in the mathematics world as graphs, involve multiple cardinalities and can only be modeled using variations of the adjacency list pattern:

■ Genealogies: A child may have multiple parent relationships.
■ Bill of materials: An assembly may consist of multiple sub-assemblies or parts, and the assembly may be used in multiple parent assemblies.
■ Social Network Follows: A person may follow multiple other people, and multiple people may follow the person.
■ Complex organizational charts: An employee may report to one supervisor for local administration duties and another supervisor for technical issues on a global scale.

This chapter walks through the three aforementioned patterns (adjacency lists, materialized paths, and HierarchyIDs) for storing hierarchical data and presents several ways of working with the data for each pattern. For each method, I present the tasks in the flow that makes the most sense for that method.

It's possible that some hierarchies have more than one top node. For example, a jurisdiction hierarchy with countries, states, counties, and cities could have multiple country nodes at the top. This pattern is sometimes referred to as a hierarchical forest (with many trees).

Because the material for this chapter exceeds the allotted page count, you can find several additional code examples in the chapter script file, which you can download from www.sqlserverbible.com.

Adjacency List Pattern

The traditional pattern used to model hierarchical data is the adjacency list pattern, informally called the self-join pattern, which was presented by Dr. Edgar F. Codd. The adjacency list pattern stores both the current node's key and its immediate parent's key in the current node row. (This chapter refers to the two data elements in the data pair as current node and parent node.)

The most familiar example of the adjacency list pattern is a simple organizational chart like the one in the AdventureWorks2008 sample database, partially illustrated in Figure 17-1.

In a basic organizational chart, there's a one-to-many relationship between employees who play the role of supervisor, and employees who report to supervisors.
Supervisors may have multiple employees reporting to them, but every employee can have only one supervisor. An employee may both be a supervisor and report to another supervisor. The adjacency list pattern handles this one-to-many relationship by storing the parent node's primary key in the current node. This allows multiple current nodes to point to a single parent node.

In the case of the basic organizational chart, the employee is the current node, and the manager is the parent node. The employee's row points to the employee's manager by storing the manager's ID in the employee's ManagerID column.
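As a minimal sketch of the adjacency list pattern just described, the following script builds a tiny organizational chart and queries it two ways: a self-join for each employee's immediate manager, and a recursive CTE (mentioned earlier as the SQL Server 2005 addition) to walk every level. The table dbo.OrgChartDemo, its columns other than ManagerID, and the sample rows are all hypothetical and invented for illustration; this is not the AdventureWorks2008 schema.

```sql
USE tempdb;

-- Hypothetical demo table: each row stores its own key (current node)
-- and its immediate parent's key (parent node).
CREATE TABLE dbo.OrgChartDemo (
  EmployeeID INT NOT NULL PRIMARY KEY,
  ManagerID INT NULL
    REFERENCES dbo.OrgChartDemo (EmployeeID),  -- NULL marks the top node
  EmployeeName VARCHAR(50) NOT NULL
);

INSERT dbo.OrgChartDemo (EmployeeID, ManagerID, EmployeeName)
VALUES (1, NULL, 'Chief Executive'),
       (2, 1, 'VP Engineering'),
       (3, 1, 'VP Sales'),
       (4, 2, 'Design Engineer');

-- Self-join: pair every employee with his or her immediate manager.
-- The outer join keeps the top node, whose manager is NULL.
SELECT E.EmployeeName AS Employee, M.EmployeeName AS Manager
  FROM dbo.OrgChartDemo AS E
    LEFT OUTER JOIN dbo.OrgChartDemo AS M
      ON E.ManagerID = M.EmployeeID
  ORDER BY E.EmployeeID;

-- Recursive CTE: start at the top node and descend one level per iteration.
WITH OrgTree (EmployeeID, EmployeeName, TreeLevel)
AS (
  SELECT EmployeeID, EmployeeName, 1
    FROM dbo.OrgChartDemo
    WHERE ManagerID IS NULL
  UNION ALL
  SELECT C.EmployeeID, C.EmployeeName, P.TreeLevel + 1
    FROM dbo.OrgChartDemo AS C
      JOIN OrgTree AS P
        ON C.ManagerID = P.EmployeeID
)
SELECT EmployeeID, EmployeeName, TreeLevel
  FROM OrgTree
  ORDER BY TreeLevel, EmployeeID;

-- Clean up the demo table.
DROP TABLE dbo.OrgChartDemo;
```

The self-referencing foreign key is a design choice for the sketch rather than a requirement of the pattern; it simply guarantees that every ManagerID points to an existing employee row.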