CHAPTER 42 What’s New for Transact-SQL in SQL Server 2008

with a slash sign used as a separator between the levels. For example, you can run the following query to get both the binary and logical representations of the hid value:

select cast(hid as varbinary(6)) as hid,
       substring(hid.ToString(), 1, 12) as path,
       lvl,
       partid,
       partname
from parts_hierarchy
go

hid        path        lvl  partid  partname
---------  ----------  ---  ------  ----------------
0x         /           0    22      Car
0x58       /1/         1    1       DriveTrain
0x68       /2/         1    23      Body
0x78       /3/         1    24      Frame
0x5AC0     /1/1/       2    2       Engine
0x5B40     /1/2/       2    3       Transmission
0x5BC0     /1/3/       2    4       Axle
0x5C20     /1/4/       2    12      Drive Shaft
0x5B56     /1/2/1/     3    9       Flywheel
0x5B5A     /1/2/2/     3    10      Clutch
0x5B5E     /1/2/3/     3    16      Gear Box
0x5AD6     /1/1/1/     3    5       Radiator
0x5ADA     /1/1/2/     3    6       Intake Manifold
0x5ADE     /1/1/3/     3    7       Exhaust Manifold
0x5AE1     /1/1/4/     3    8       Carburetor
0x5AE3     /1/1/5/     3    13      Piston
0x5AE5     /1/1/6/     3    14      Crankshaft
0x5AE358   /1/1/5/1/   4    21      Piston Rings
0x5AE158   /1/1/4/1/   4    11      Float Valve
0x5B5EB0   /1/2/3/1/   4    15      Reverse Gear
0x5B5ED0   /1/2/3/2/   4    17      First Gear
0x5B5EF0   /1/2/3/3/   4    18      Second Gear
0x5B5F08   /1/2/3/4/   4    19      Third Gear
0x5B5F18   /1/2/3/5/   4    20      Fourth Gear

As stated previously, the values stored in a Hierarchyid column provide topological sorting of the nodes in the hierarchy. The GetLevel method returns a node's level in the hierarchy (it was used to populate the computed lvl column in the Parts_hierarchy table).
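As a quick check of how the path string and the level relate, the following sketch calls ToString and GetLevel on standalone Hierarchyid values, so it needs no table. The path /1/2/3/ is a made-up value chosen only for illustration; the second query assumes the Parts_hierarchy table from this chapter exists.

```sql
-- Sketch: GetLevel() and ToString() on standalone values (no table required).
DECLARE @root hierarchyid = hierarchyid::GetRoot();         -- the root node
DECLARE @node hierarchyid = hierarchyid::Parse('/1/2/3/');  -- hypothetical path, illustration only

SELECT @root.ToString() AS root_path,   -- path of the root
       @root.GetLevel() AS root_level,  -- the root is at level 0
       @node.ToString() AS node_path,
       @node.GetLevel() AS node_level;  -- one level per path segment

-- Against the chapter's table (assumed to exist), the computed lvl column
-- and the method should report the same value for every row.
SELECT partname, lvl, hid.GetLevel() AS lvl_from_method
FROM Parts_hierarchy
ORDER BY hid;
```

Keep in mind that Hierarchyid method names are case sensitive, so GetLevel and ToString must be written exactly as shown.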
Using the lvl column or the GetLevel method, you can easily produce a graphical depiction of the hierarchy by sorting the rows by hid and indenting each row based on the lvl column, as shown in the following example:

SELECT REPLICATE(' ', lvl) + right('>', lvl) + partname AS partname
FROM Parts_hierarchy
order by hid
go

partname
------------------------
Car
 >DriveTrain
  >Engine
   >Radiator
   >Intake Manifold
   >Exhaust Manifold
   >Carburetor
    >Float Valve
   >Piston
    >Piston Rings
   >Crankshaft
  >Transmission
   >Flywheel
   >Clutch
   >Gear Box
    >Reverse Gear
    >First Gear
    >Second Gear
    >Third Gear
    >Fourth Gear
  >Axle
  >Drive Shaft
 >Body
 >Frame

To return only the subparts of a specific part, you can use the IsDescendantOf method. The parameter passed to this method is a node’s Hierarchyid value. The method returns 1 if the queried node is a descendant of the input node. A node is considered a descendant of itself, which is why the input node also appears in the results. For example, the following query returns all subparts of the engine:

select child.partid, child.partname, child.lvl
from parts_hierarchy as parent
     inner join parts_hierarchy as child
        on parent.partname = 'Engine'
       and child.hid.IsDescendantOf(parent.hid) = 1
go

partid  partname          lvl
------  ----------------  ---
2       Engine            2
5       Radiator          3
6       Intake Manifold   3
7       Exhaust Manifold  3
8       Carburetor        3
13      Piston            3
14      Crankshaft        3
21      Piston Rings      4
11      Float Valve       4

You can also use the IsDescendantOf method to return all parent parts of a given part:

select parent.partid, parent.partname, parent.lvl
from parts_hierarchy as parent
     inner join parts_hierarchy as child
        on child.partname = 'Piston'
       and child.hid.IsDescendantOf(parent.hid) = 1
go

partid  partname    lvl
------  ----------  ---
22      Car         0
1       DriveTrain  1
2       Engine      2
13      Piston      3

To return a specific level of subparts for a given part, you can use the GetAncestor method. You pass this method an integer n indicating how many levels below the parent you want to display.
The function returns the Hierarchyid value of the ancestor n levels above the queried node. For example, the following query returns all the subparts two levels down from the drivetrain:

select child.partid, child.partname, child.lvl
from parts_hierarchy as parent
     inner join parts_hierarchy as child
        on parent.partname = 'Drivetrain'
       and child.hid.GetAncestor(2) = parent.hid
go

partid  partname          lvl
------  ----------------  ---
9       Flywheel          3
10      Clutch            3
16      Gear Box          3
5       Radiator          3
6       Intake Manifold   3
7       Exhaust Manifold  3
8       Carburetor        3
13      Piston            3
14      Crankshaft        3

Modifying the Hierarchy

The script in Listing 42.15 performs the initial population of the Parts_hierarchy table. What if you need to add more records to the table? Let’s look at how to use the GetDescendant method to add new records at different levels of the hierarchy.

For example, to add a child part to the Body node (node /2/), you can call the GetDescendant method with both arguments set to NULL to add the new row below the Body node at node /2/1/:

INSERT Parts_hierarchy (hid, partid, partname)
select hid.GetDescendant(null, null), 25, 'left front fender'
from Parts_hierarchy
where partname = 'Body'

To add a new row as a higher descendant node at the same level as the left front fender inserted in the previous example, you use the GetDescendant method again, but this time you pass the Hierarchyid of the existing child node as the first parameter. This specifies that the new node will follow the existing node, becoming /2/2/.

There are a couple of ways to specify the Hierarchyid of the existing child node. You can retrieve it from the table as a Hierarchyid data type, or if you know the string representation of the node, you can use the Parse method. The Parse method converts a canonical string representation of a hierarchical value to Hierarchyid. Parse is also called implicitly when a conversion from a string type to Hierarchyid occurs, as in CAST (input AS hierarchyid).
Parse is essentially the opposite of the ToString method.

INSERT Parts_hierarchy (hid, partid, partname)
select hid.GetDescendant(hierarchyid::Parse('/2/1/'), null), 26, 'right front fender'
from Parts_hierarchy
where partname = 'Body'

Now, what if you need to add a new node between the two existing nodes you just added? Again, you use the GetDescendant method, but this time, you pass it the hierarchy IDs of both existing nodes between which you want to insert the new node:

declare @child1 hierarchyid,
        @child2 hierarchyid

select @child1 = hid from Parts_hierarchy where partname = 'left front fender'
select @child2 = hid from Parts_hierarchy where partname = 'right front fender'

INSERT Parts_hierarchy (hid, partid, partname)
select hid.GetDescendant(@child1, @child2), 27, 'front bumper'
from Parts_hierarchy
where partname = 'Body'

Now, let’s run a query of the Body subtree to examine the newly inserted child nodes:

select child.partid, child.partname, child.lvl,
       substring(child.hid.ToString(), 1, 12) as path
from parts_hierarchy as parent
     inner join parts_hierarchy as child
        on parent.partname = 'Body'
       and child.hid.IsDescendantOf(parent.hid) = 1
order by child.hid
go

partid  partname            lvl  path
------  ------------------  ---  --------
23      Body                1    /2/
25      left front fender   2    /2/1/
27      front bumper        2    /2/1.1/
26      right front fender  2    /2/2/

Notice that the first child added (left front fender) has a node path of /2/1/, and the second row added (right front fender) has a node path of /2/2/. The new child node inserted between these two nodes (front bumper) was given a node path of /2/1.1/ so that it maintains the designated topological ordering of the nodes.

What if you need to make other types of changes within hierarchies? For example, you might need to move a whole subtree of parts from one part to another (that is, move a part and all its subordinates).
To move nodes or subtrees in a hierarchy, you can use the GetReparentedValue method of the Hierarchyid data type. You invoke this method on the Hierarchyid value of the node you want to reparent and provide as inputs the value of the old parent and the value of the new parent.

Note that this method doesn’t change the Hierarchyid value for the existing node that you want to move. Instead, it returns a new Hierarchyid value that you can use to update the target node’s Hierarchyid value. Logically, the GetReparentedValue method simply substitutes the part of the existing node’s path that represents the old parent’s path with the new parent’s path. For example, if the path of the existing node is /1/2/1/, the path of the old parent is /1/2/, and the path of the new parent is /2/1/3/, the GetReparentedValue method would return /2/1/3/1/.

You have to be careful, though. If the target parent node already has child nodes, the GetReparentedValue method may not produce a unique hierarchy path. If you reparent node /1/2/1/ from old parent /1/2/ to new parent /2/1/3/, and /2/1/3/ already has a child /2/1/3/1/, you generate a duplicate value. To avoid this situation when moving a single node from one parent to another, you should not use the GetReparentedValue method but instead use the GetDescendant method to produce a completely new value for the single node.

For example, let’s assume you want to move the Flywheel part from the Transmission node to the Engine node. A sample approach is shown in Listing 42.16. This example uses the GetDescendant method to generate a new Hierarchyid under the Engine node following the last child node and updates the hid column for the Flywheel record to the new Hierarchyid generated.
LISTING 42.16 Moving a Single Node in a Hierarchy

declare @newhid hierarchyid,
        @maxchild hierarchyid

-- first, find the max child node under the Engine node
-- this is the node we will move the Flywheel node after
select @maxchild = max(child.hid)
from parts_hierarchy as parent
     inner join parts_hierarchy as child
        on parent.partname = 'Engine'
       and child.hid.GetAncestor(1) = parent.hid

select 'Child to insert after' = @maxchild.ToString()

-- Now, generate a new descendant hid for the Engine node
-- after the max child node
select @newhid = hid.GetDescendant(@maxchild, null)
from Parts_hierarchy
where partname = 'Engine'

-- Update the hid for the Flywheel node to the new hid
update Parts_hierarchy
set hid = @newhid
where partname = 'Flywheel'
go

Child to insert after
---------------------
/1/1/6/

If you need to move an entire subtree within a hierarchy, you can use the GetReparentedValue method in conjunction with the GetDescendant method. For example, suppose you want to move the whole Engine subtree from its current parent node of Drivetrain to the new parent node of Car. The Car node obviously already has children. If you want to avoid conflicts, the best approach is to generate a new Hierarchyid value for the root node of the subtree. You can achieve this with the following steps:

1. Use the GetDescendant method to produce a completely new Hierarchyid value for the root node of the subtree.
2. Update the Hierarchyid value of all nodes in the subtree to the value returned by the GetReparentedValue method.

Because you are generating a completely new Hierarchyid value under the target parent, this new child node has no existing children, which avoids any duplicate Hierarchyid values. Listing 42.17 provides an example for changing the parent node of the Engine subtree from Drivetrain to Car.
LISTING 42.17 Reparenting a Subtree in a Hierarchy

DECLARE @old_root AS HIERARCHYID,
        @new_root AS HIERARCHYID,
        @new_parent_hid AS HIERARCHYID,
        @max_child AS HIERARCHYID

-- Get the hid of the new parent
select @new_parent_hid = hid
FROM dbo.parts_hierarchy
WHERE partname = 'Car'

-- Get the hid of the current root of the subtree
select @old_root = hid
FROM dbo.parts_hierarchy
WHERE partname = 'Engine'

-- Get the max hid of child nodes of the new parent
select @max_child = MAX(hid)
FROM parts_hierarchy
WHERE hid.GetAncestor(1) = @new_parent_hid

-- get a new hid for the moving child node
-- that is after the current max child node of the new parent
SET @new_root = @new_parent_hid.GetDescendant(@max_child, null)

-- Next, reparent the moving child node and all descendants
UPDATE dbo.parts_hierarchy
SET hid = hid.GetReparentedValue(@old_root, @new_root)
WHERE hid.IsDescendantOf(@old_root) = 1

Now, let’s reexamine the hierarchy after the updates made in Listings 42.16 and 42.17:

SELECT left(REPLICATE(' ', lvl) + right('>', lvl) + partname, 30) AS partname,
       hid.ToString() AS path
FROM Parts_hierarchy
order by hid
go

partname                        path
------------------------------  ----------
Car                             /
 >DriveTrain                    /1/
  >Transmission                 /1/2/
   >Clutch                      /1/2/2/
   >Gear Box                    /1/2/3/
    >Reverse Gear               /1/2/3/1/
    >First Gear                 /1/2/3/2/
    >Second Gear                /1/2/3/3/
    >Third Gear                 /1/2/3/4/
    >Fourth Gear                /1/2/3/5/
  >Axle                         /1/3/
  >Drive Shaft                  /1/4/
 >Body                          /2/
  >left front fender            /2/1/
  >front bumper                 /2/1.1/
  >right front fender           /2/2/
 >Frame                         /3/
 >Engine                        /4/
  >Radiator                     /4/1/
  >Intake Manifold              /4/2/
  >Exhaust Manifold             /4/3/
  >Carburetor                   /4/4/
   >Float Valve                 /4/4/1/
  >Piston                       /4/5/
   >Piston Rings                /4/5/1/
  >Crankshaft                   /4/6/
  >Flywheel                     /4/7/

As you can see from the results, the Flywheel node is now under the Engine node, and the entire Engine subtree is now under the Car node.
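The path substitution performed by GetReparentedValue can also be tried in isolation. The following sketch reuses the paths from the prose example earlier in this section (/1/2/1/, /1/2/, and /2/1/3/) as standalone variables, so the first query touches no table; according to that example, it should return /2/1/3/1/. The second query is an assumed sanity check, not part of the chapter's listings, that looks for duplicate paths in Parts_hierarchy after nodes have been moved.

```sql
-- Sketch: GetReparentedValue() on standalone values (paths taken from the prose example).
DECLARE @node       hierarchyid = hierarchyid::Parse('/1/2/1/');  -- node being moved
DECLARE @old_parent hierarchyid = hierarchyid::Parse('/1/2/');    -- its current parent
DECLARE @new_parent hierarchyid = hierarchyid::Parse('/2/1/3/');  -- the target parent

-- Replaces the old parent's prefix in the node's path with the new parent's path.
-- The variable @node itself is not modified; a new value is returned.
SELECT @node.GetReparentedValue(@old_parent, @new_parent).ToString() AS reparented_path;

-- Assumed sanity check: list any path that occurs more than once in the table.
-- An empty result means the moves left every node with a distinct path.
SELECT p.path, COUNT(*) AS occurrences
FROM (SELECT hid.ToString() AS path FROM Parts_hierarchy) AS p
GROUP BY p.path
HAVING COUNT(*) > 1;
```

Running the duplicate check after each move is one inexpensive way to catch the collision case described above when the table has no unique constraint on the hid column.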
Using FILESTREAM Storage

In versions of SQL Server prior to SQL Server 2008, there were two ways of storing unstructured data: as a binary large object (BLOB) in an image or varbinary(max) column, or in files outside the database, separate from the structured relational data, storing a reference or pathname to the file in a varchar column. Neither of these methods is ideal for handling unstructured data.

Storing the data outside the database makes managing the unstructured data and keeping it associated with structured data more complex. This approach lacks transactional consistency, coordinating backups and restores with the structured data in the database is difficult, and implementing proper data security can be quite cumbersome.

Storing the unstructured data in the database solves the transactional consistency, backup/restore, and security issues, but BLOBs have different usage patterns than relational data. SQL Server’s storage engine is primarily concerned with doing I/O on relational data stored in pages and extents, not streaming large BLOBs. I/O performance typically degrades dramatically if the size of the BLOB data increases beyond 1MB. Accessing BLOB data stored inside a SQL Server database is generally slower than accessing it from an external location such as the NTFS file system. In addition, BLOB storage is not as efficient as the file system for storing large data values, so more storage space is required.

FILESTREAM storage, introduced in SQL Server 2008, helps to solve the issues with using unstructured data by integrating the SQL Server Database Engine with the NTFS file system for storing unstructured data such as documents and images on the file system with a pointer to the data in the database. The file pointer is implemented in SQL Server as a varbinary(max) column, and the actual data is stored in files in the file system.
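To make the varbinary(max) pointer column concrete, here is a minimal sketch of a table definition with one FILESTREAM column. The table and column names (dbo.PartDocuments, DocId, DocName, DocData) are hypothetical, and the statement assumes that FILESTREAM is already enabled on the instance and that the current database already has a FILESTREAM filegroup; without those, the CREATE TABLE fails.

```sql
-- Sketch only: hypothetical table; assumes a FILESTREAM-enabled instance
-- and a database that already contains a FILESTREAM filegroup.
CREATE TABLE dbo.PartDocuments
(
    -- A table with FILESTREAM columns needs a unique, non-null ROWGUIDCOL column.
    DocId   uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE,
    DocName varchar(260) NOT NULL,
    -- The FILESTREAM attribute tells SQL Server to keep this column's
    -- contents in the file system rather than in the data pages.
    DocData varbinary(max) FILESTREAM NULL
);
```

From the T-SQL side, DocData behaves like any other varbinary(max) column; only the physical storage location differs.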
In addition to enabling client applications to leverage the rich NTFS streaming APIs and the performance of the file system for storing and retrieving unstructured data, other advantages of FILESTREAM storage include the following:

• You are able to use T-SQL statements to insert, update, query, and back up FILESTREAM data even though the actual data resides outside the database in the NTFS file system.
• You are able to maintain transactional consistency between the unstructured data and corresponding structured data.
• You are able to enforce the same level of security on the unstructured data as with your relational data using built-in SQL Server security mechanisms.
• FILESTREAM uses the NT system cache for caching file data rather than caching the data in the SQL Server buffer pool, leaving more memory available for query processing.
• FILESTREAM storage also eliminates the size limitation of BLOBs stored in the database. Whereas standard image and varbinary(max) columns have a size limitation of 2GB, the sizes of the FILESTREAM BLOBs are limited only by the available space of the file system.

Columns with the FILESTREAM attribute set can be managed just like any other BLOB column in SQL Server. Administrators can use the manageability and security capabilities of SQL Server to integrate FILESTREAM data management with the rest of the data in the relational database, without needing to manage the file system data separately. This includes maintenance operations such as backup and restore, complete integration with the SQL Server security model, and full transaction support to ensure data-level consistency between the relational data in the database and the unstructured data physically stored on the file system.
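The first two advantages in the list above (ordinary T-SQL access and transactional consistency) can be sketched with a short script. It creates its own throwaway table, dbo.FsDemo, so that it can be run on its own; the table name, column names, and sample values are all hypothetical, and the script assumes a FILESTREAM-enabled instance and a database with a FILESTREAM filegroup.

```sql
-- Sketch only: hypothetical table and data; assumes FILESTREAM is enabled
-- and the current database has a FILESTREAM filegroup.
CREATE TABLE dbo.FsDemo
(
    DocId   uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE,
    DocName varchar(100) NOT NULL,
    DocData varbinary(max) FILESTREAM NULL
);
GO

-- Plain T-SQL insert: the bytes end up in a file managed by SQL Server.
INSERT INTO dbo.FsDemo (DocId, DocName, DocData)
VALUES (NEWID(), 'note.txt', CAST('first version' AS varbinary(max)));

-- The update takes part in the transaction like any other column change,
-- so rolling back restores the original contents.
BEGIN TRANSACTION;
    UPDATE dbo.FsDemo
    SET DocData = CAST('second version' AS varbinary(max))
    WHERE DocName = 'note.txt';
ROLLBACK TRANSACTION;

-- Query the data back; after the rollback the row should still hold 'first version'.
SELECT DocName,
       CAST(DocData AS varchar(100)) AS contents,
       DATALENGTH(DocData)           AS bytes_stored
FROM dbo.FsDemo;
```

Small text values are used here only to keep the script readable; as the following discussion notes, FILESTREAM is intended for much larger objects.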
Whether you should use database storage or file system storage for your BLOB data is determined by the size and use of the unstructured data. If the following conditions are true, you should consider using FILESTREAM:

• The objects being stored as BLOBs are, on average, larger than 1MB.
• Fast read access is important.
• You are developing applications that use a middle tier for application logic.

Enabling FILESTREAM Storage

If you decide to use FILESTREAM storage, it first needs to be enabled at both the Windows level and the SQL Server instance level. FILESTREAM storage can be enabled automatically during SQL Server installation or manually after installation.

If you are enabling FILESTREAM during SQL Server installation, you need to provide the Windows share location where the FILESTREAM data will be stored. You can also choose whether to allow remote clients to access the FILESTREAM data. For more information on how to enable FILESTREAM storage during installation, see Chapter 8, “Installing SQL Server 2008.”

If you did not enable the FILESTREAM option during installation, you can enable it for a running instance of SQL Server 2008 at any time using SQL Server Configuration Manager (SSCM). In SSCM, right-click the SQL Server service and select Properties. Then select the FILESTREAM tab, which provides options similar to those displayed during SQL Server installation (see Figure 42.1). This enables SQL Server to work directly with the Windows