Microsoft SQL Server 2008 Tutorial, Part 47

Nielsen c17.tex V4 - 07/21/2009 12:57pm Page 422
Part III Beyond Relational

Subtree queries

The primary work of a hierarchy is returning the hierarchy as a set. The adjacency list method used similar methods for scanning up or down the hierarchy. Not so with materialized path. Searching down a materialized path is a piece of cake, but searching up the tree is a real pain.

Searching down the hierarchy with materialized path

Navigating down the hierarchy and returning a subtree of all nodes under a given node is where the materialized path method really shines. Check out the simplicity of this query:

    SELECT BusinessEntityID, ManagerID, MaterializedPath
      FROM HumanResources.Employee
      WHERE MaterializedPath LIKE '1,263,%'

Result:

    BusinessEntityID ManagerID MaterializedPath
    263              1         1,263,
    264              263       1,263,264,
    265              264       1,263,264,265,
    266              264       1,263,264,266,
    267              263       1,263,267,
    268              263       1,263,268,
    269              263       1,263,269,
    270              263       1,263,270,
    271              263       1,263,271,
    272              263       1,263,272

That's all it takes to find a node's subtree. Because the materialized path for every node in the subtree is just a string that begins with the subtree's parent's materialized path, it's easily searched with the LIKE operator and a % wildcard in the WHERE clause.

It's important that the LIKE search string includes the comma before the % wildcard; otherwise, searching for 1,263% would also find 1,2635, which would be an error, of course.

Searching up the hierarchy with materialized path

Searching up the hierarchy means searching for all the ancestors, or the chain of command, for a given node. The nice thing about a materialized path is that the full list of ancestors is right there in the materialized path. There's no need to read any other rows to find out who the ancestors are.

Therefore, to get the parent nodes, you need to parse the materialized path to return the IDs of each parent node and then join to this set of IDs to get the parent nodes. The trick is to extract it quickly. Unfortunately, SQL Server lacks a simple split function.
There are two options: build a CLR function that uses the C# Split method, or build a T-SQL user-defined function to parse the string.

A C# CLR function to split a string is a relatively straightforward task:

    using Microsoft.SqlServer.Server;
    using System.Data.SqlClient;
    using System;
    using System.Collections;

    public class ListFunctionClass
    {
        // Table-valued CLR function: returns one row per comma-separated item
        [SqlFunction(FillRowMethodName = "FillRow",
            TableDefinition = "list nvarchar(max)")]
        public static IEnumerator ListSplitFunction(string list)
        {
            string[] listArray = list.Split(new char[] {','});
            Array array = listArray;
            return array.GetEnumerator();
        }

        // Called once per item to populate the output row
        public static void FillRow(Object obj, out String sc)
        {
            sc = (String)obj;
        }
    }

Adam Machanic, SQL Server MVP and one of the sharpest SQL Server programmers around, went on a quest to write the fastest CLR split function possible. The result is posted on SQLBlog.com at http://tinyurl.com/dycmxb.

But I'm a T-SQL guy, so unless there's a compelling need to use CLR, I'll opt for T-SQL. There are a number of T-SQL string-split solutions available. I've found that the performance depends on the length of the delimited strings. Erland Sommarskog's website analyzes several T-SQL split solutions: http://www.sommarskog.se/arrays-in-sql-2005.html.
Of Erland's solutions, the one I prefer for shorter strings such as these is the ParseString user-defined function:

    -- up the hierarchy
    -- parse the string
    CREATE FUNCTION dbo.ParseString (@list varchar(200))  -- use ALTER to modify an existing copy
    RETURNS @tbl TABLE (ID INT) AS
    BEGIN
      -- code by Erland Sommarskog
      -- Erland's Website: http://www.sommarskog.se/arrays-in-sql-2005.html
      DECLARE @valuelen int,
              @pos int,
              @nextpos int
      SELECT @pos = 0, @nextpos = 1
      WHILE @nextpos > 0
      BEGIN
        -- find the next comma after the current position
        SELECT @nextpos = charindex(',', @list, @pos + 1)
        -- length of the value between the current position and the next comma (or end of string)
        SELECT @valuelen = CASE WHEN @nextpos > 0 THEN @nextpos
                                ELSE len(@list) + 1
                           END - @pos - 1
        INSERT @tbl (ID)
          VALUES (substring(@list, @pos + 1, @valuelen))
        SELECT @pos = @nextpos
      END
      RETURN
    END
    go

    SELECT ID
      FROM HumanResources.Employee
        CROSS APPLY dbo.ParseString(MaterializedPath)
      WHERE BusinessEntityID = 270
    go

    DECLARE @MatPath VARCHAR(200)
    SELECT @MatPath = MaterializedPath
      FROM HumanResources.Employee
      WHERE BusinessEntityID = 270

    SELECT E.BusinessEntityID, MaterializedPath
      FROM dbo.ParseString(@MatPath) AS P
        JOIN HumanResources.Employee E
          ON P.ID = E.BusinessEntityID
      ORDER BY MaterializedPath

Is the node in the subtree?

Because the materialized-path pattern is so efficient at finding subtrees, the best way to determine whether a node is in a subtree is to reference the LIKE-based subtree query in a WHERE clause, similar to the adjacency list solution:

    -- Does 270 work for 263
    SELECT 'True'
      WHERE 270 IN
        (SELECT BusinessEntityID
           FROM HumanResources.Employee
           WHERE MaterializedPath LIKE '1,263,%')

Determining the node level

Determining the current node level using the materialized-path pattern is as simple as counting the commas in the materialized path.
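As a quick check on the comma-counting idea, the count can also be sketched as a single expression rather than a loop. This is only an illustration, not the function the chapter builds: the sample path is made up, and the sketch assumes paths are stored with a trailing comma, as in the earlier subtree results.

```sql
-- Hypothetical sample path: four IDs, each followed by a comma
DECLARE @Path VARCHAR(200) = '1,263,264,265,';

-- Removing the commas shortens the string by exactly the number of commas,
-- so the difference in length is the comma count (and therefore the level).
SELECT LEN(@Path) - LEN(REPLACE(@Path, ',', '')) AS CommaCount;  -- 14 - 10 = 4
```

One caveat with this form: LEN ignores trailing spaces, so it is only safe when the path neither ends in nor is padded with spaces.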
The following function uses CHARINDEX to locate the commas and make quick work of the task:

    CREATE FUNCTION dbo.MaterializedPathLevel
      (@Path VARCHAR(200))
    RETURNS TINYINT
    AS
    BEGIN
      DECLARE
        @Position TINYINT = 1,
        @Lv TINYINT = 0;
      -- each pass finds the next comma; the loop ends when none remain
      WHILE @Position > 0
      BEGIN;
        SET @Lv += 1;
        SELECT @Position = CHARINDEX(',', @Path, @Position + 1);
      END;
      RETURN @Lv - 1
    END;

Testing the function:

    SELECT dbo.MaterializedPathLevel('1,20,56,345,1010,') AS Level

Result:

    Level
    5

A function may be easily called within an update query, so pre-calculating and storing the level is a trivial process. The next script adds a Level column, updates it using the new function, and then takes a look at the data:

    ALTER TABLE HumanResources.Employee
      ADD Level TINYINT

    UPDATE HumanResources.Employee
      SET Level = dbo.MaterializedPathLevel(MaterializedPath)

    SELECT BusinessEntityID, MaterializedPath, Level
      FROM HumanResources.Employee

Result (abbreviated):

    BusinessEntityID MaterializedPath Level
    1                1,               1
    2                1,2,             2
    3                1,2,3,           3
    4                1,2,3,4,         4
    5                1,2,3,5,         4
    6                1,2,3,6,         4
    7                1,2,3,7,         4
    8                1,2,3,7,8,       5
    9                1,2,3,7,9,       5
    10               1,2,3,7,10,      5

Storing the level can be useful; for example, being able to query the node's level makes writing single-level queries significantly easier. Using the function in a persisted calculated column with an index works great.

Single-level queries

The adjacency list pattern was simpler for single-level queries than for returning complete subtrees. The materialized-path pattern is the reverse: it excels at returning subtrees, but returning just a single level is more difficult.

Although neither solution excels at returning a specific level in a hierarchy on its own, it is possible with the adjacency pattern but requires some recursive functionality. For the materialized-path pattern, if the node's level is also stored in the table, then the level can be easily added to the WHERE clause, and the queries become simple.
This query locates all the nodes one level down from the CEO. The CTE locates the MaterializedPath and the Level for the CEO, and the main query's join conditions filter the query to the next level down:

    -- Query Search 1 level down
    WITH CurrentNode(MaterializedPath, Level)
    AS
      (SELECT MaterializedPath, Level
         FROM HumanResources.Employee
         WHERE BusinessEntityID = 1)
    SELECT BusinessEntityID, ManagerID, E.MaterializedPath, E.Level
      FROM HumanResources.Employee E
        JOIN CurrentNode C
          ON E.MaterializedPath LIKE C.MaterializedPath + '%'
          AND E.Level = C.Level + 1

Result:

    BusinessEntityID ManagerID MaterializedPath Level
    16               1         1,16,            2
    2                1         1,2,             2
    234              1         1,234,           2
    25               1         1,25,            2
    263              1         1,263,           2
    273              1         1,273,           2

An advantage of this method over the single-join method used for single-level queries with the adjacency list pattern is that this method can be used to find any specific level, not just the nearest level.

Locating the single-level query up the hierarchy is the same basic outer query, but the CTE/subquery uses the up-the-hierarchy subtree query instead, parsing the materialized path string.

Reparenting the materialized path

Because the materialized-path pattern stores the entire tree in the materialized path value in each node, when the tree is modified by inserting, updating, or deleting a node, the entire affected subtree must have its materialized path recalculated.

Each node's path contains the path of its parent node, so if the parent node's path changes, so do the children. This will propagate down and affect all descendants of the node being changed.

The brute force method is to reexecute the user-defined function that calculates the materialized path. A more elegant method, when it applies, is to use the REPLACE T-SQL function.

Indexing the materialized path

Indexing the materialized path requires only a non-clustered index on the materialized path column.
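As a rough sketch of what that index could look like: the index names below are hypothetical, and the statements assume MaterializedPath is a VARCHAR column short enough to serve as an index key and that the Level column added earlier is in place. The second statement sketches the composite variant that the text goes on to describe.

```sql
-- Hypothetical index name; supports the LIKE '1,263,%' prefix searches
CREATE NONCLUSTERED INDEX IX_Employee_MaterializedPath
  ON HumanResources.Employee (MaterializedPath);

-- Hypothetical composite alternative for searches that also filter on Level
CREATE NONCLUSTERED INDEX IX_Employee_Level_MaterializedPath
  ON HumanResources.Employee (Level, MaterializedPath);
```

A prefix search can use an index seek only because the wildcard sits at the end of the pattern; a pattern that starts with % would force a scan.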
Because the level column is used in some searches, depending on the usage, it's also a candidate for a non-clustered index. If so, then a composite index of the level and materialized path columns would be the best-performing option.

Materialized path pros and cons

There are some points in favor of the materialized-path pattern:

■ The strongest point in its favor is that it contains the actual references to every node in its hierarchy. This gives the pattern considerable durability and consistency. If a node is deleted or updated accidentally, the remaining nodes in its subtree are not orphaned. The tree can be reconstructed. If Jean Trenary is deleted, the materialized path of the IT department employees remains intact.
■ The materialized-path pattern is the only pattern that can retrieve an entire subtree with a single index seek. It's wicked fast.
■ Reading a materialized path is simple and intuitive. The keys are there to read in plain text.

On the down side, there are a number of issues, including the following:

■ The key sizes can become large; at 10 levels deep with an integer key, the keys can be 40–80 bytes in size. This is large for a key.
■ Constraining the hierarchy is difficult without the use of triggers or complex check constraints. Unlike the adjacency list pattern, you cannot easily enforce that a parent node exists.
■ Simple operations like "get me the parent node" are more complex without the aid of helper functions.
■ Inserting new nodes requires calculating the materialized path, and reparenting the materialized path requires recalculating the materialized paths for every node in the affected subtree. For an OLTP system this can be a very expensive operation and lead to a large amount of contention. Offloading the maintenance of the hierarchy to a background process can alleviate this.
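To make the reparenting cost in the last point concrete, here is a minimal sketch of a subtree move using the REPLACE function mentioned earlier. The move itself is invented for illustration (employee 264 and her reports shift from manager 263 to manager 273), and the sketch assumes the ManagerID and MaterializedPath columns are both maintained on HumanResources.Employee.

```sql
-- Hypothetical move: employee 264 and everyone under her now report to 273
DECLARE @OldPrefix VARCHAR(200) = '1,263,264,';  -- current path of the moved node
DECLARE @NewPrefix VARCHAR(200) = '1,273,264,';  -- new parent's path plus the moved node's ID

BEGIN TRANSACTION;

-- Keep the adjacency column in step with the path
UPDATE HumanResources.Employee
  SET ManagerID = 273
  WHERE BusinessEntityID = 264;

-- Rewrite the path prefix for the moved node and all of its descendants
UPDATE HumanResources.Employee
  SET MaterializedPath = REPLACE(MaterializedPath, @OldPrefix, @NewPrefix)
  WHERE MaterializedPath LIKE @OldPrefix + '%';

COMMIT TRANSACTION;
```

Every row in the moved subtree is touched by the second update, which is where the contention comes from. In this example the depth does not change, but a move to a different depth would also require recalculating any stored Level values.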
An option is to combine adjacency and path solutions; one provides ease of maintenance and one provides performance for querying.

The materialized path is my favorite hierarchy pattern and the one I use in Nordic (my SQL Server object relational façade) to store the class structure.

Using the New HierarchyID

For SQL Server 2008, Microsoft has released a new data type targeted specifically at solving the hierarchy problem. Working through the materialized-path pattern was a good introduction to HierarchyID because HierarchyID is basically a binary version of materialized path.

HierarchyID is implemented as a CLR data type with CLR methods, but you don't need to enable CLR to use HierarchyID. Technically speaking, the CLR is always running. Disabling the CLR only disables installing and running user-programmed CLR assemblies.

To jump right into the HierarchyID, this first query exposes the raw data. The OrganizationNode column in the HumanResources.Employee table is a HierarchyID column. The second column simply returns the binary data from OrganizationNode. The third column, HierarchyID.ToString(), uses the .ToString() method to convert the HierarchyID data to text.
The final column returns the values stored in a calculated column that's set to the .GetLevel() method:

    -- View raw HierarchyID Data
    SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name',
        OrganizationNode,
        OrganizationNode.ToString() as 'HierarchyID.ToString()',
        OrganizationLevel
      FROM HumanResources.Employee E
        JOIN Person.Person P
          ON E.BusinessEntityID = P.BusinessEntityID

Result (abbreviated):

    BusinessEntityID OrganizationNode HierarchyID.ToString() OrganizationLevel
    1                0x               /                      0
    2                0x58             /1/                    1
    16               0x68             /2/                    1
    25               0x78             /3/                    1
    234              0x84             /4/                    1
    263              0x8C             /5/                    1
    273              0x94             /6/                    1
    3                0x5AC0           /1/1/                  2
    17               0x6AC0           /2/1/                  2

In the third column, you can see data that looks similar to the materialized path pattern, but there's a significant difference. Instead of storing a delimited path of ancestor primary keys, HierarchyID is intended to store the relative node position, as shown in Figure 17-6.

FIGURE 17-6
The AdventureWorks Information Services Department with HierarchyID nodes displayed

    1 Ken Sánchez /
      263 Jean Trenary /5/
        264 Stephanie Conroy /5/1/
          265 Ashvini Sharma /5/1/1/
          266 Peter Connelly /5/1/2/
        267 Karen Berg /5/2/
        268 Ramesh Meyyappan /5/3/
        269 Dan Bacon /5/4/
        270 François Ajenstat /5/5/
        271 Dan Wilson /5/6/
        272 Janaina Bueno /5/7/

Walking through a few examples in this hierarchy, note the following:

■ The CEO is the root node, so his HierarchyID is just /.
■ If all the nodes under Ken were displayed, then Jean would be the fifth node. Her relative node position is the fifth node under Ken, so her HierarchyID is /5/.
■ Stephanie is the first node under Jean, so her HierarchyID is /5/1/.
■ Ashvini is the first node under Stephanie, so his node is /5/1/1/.
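The relative positions in these examples can be checked without touching the table. The sketch below is an illustration only: it builds Ashvini's node from its text form and uses the HierarchyID methods Parse, GetLevel, and GetAncestor (the last is not covered in this section) to walk back up the path.

```sql
-- Build a HierarchyID value from its text form
DECLARE @Ashvini HierarchyID = hierarchyid::Parse('/5/1/1/');

SELECT @Ashvini.ToString()                AS Node,         -- /5/1/1/
       @Ashvini.GetLevel()                AS NodeLevel,    -- 3
       @Ashvini.GetAncestor(1).ToString() AS Parent,       -- /5/1/  (Stephanie)
       @Ashvini.GetAncestor(2).ToString() AS Grandparent;  -- /5/    (Jean)
```

HierarchyID method names are case-sensitive, so GetLevel() and ToString() must be written exactly as shown.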
Selecting a single node

Even though HierarchyID stores the data in binary, it's possible to filter by a HierarchyID data type column in a WHERE clause using the text form of the data:

    SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name',
        E.JobTitle
      FROM HumanResources.Employee E
        JOIN Person.Person P
          ON E.BusinessEntityID = P.BusinessEntityID
      WHERE OrganizationNode = '/5/5/'

Result:

    BusinessEntityID Name              JobTitle
    270              François Ajenstat Database Administrator

Scanning for ancestors

Searching for all ancestor nodes is relatively easy with HierarchyID. There's a great CLR method, IsDescendantOf(), that tests any node to determine whether it's a descendant of another node and returns either true or false. The following WHERE clause tests each row to determine whether the @EmployeeNode is a descendant of that row's OrganizationNode:

    WHERE @EmployeeNode.IsDescendantOf(OrganizationNode) = 1

The full query returns the ancestor list for François. The script must first store François' HierarchyID value in a local variable. Because the variable is a HierarchyID, the IsDescendantOf() method may be applied.
The fourth column displays the same test used in the WHERE clause:

    DECLARE @EmployeeNode HierarchyID
    SELECT @EmployeeNode = OrganizationNode
      FROM HumanResources.Employee
      WHERE OrganizationNode = '/5/5/' -- François Ajenstat the DBA

    SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name',
        E.JobTitle,
        @EmployeeNode.IsDescendantOf(OrganizationNode) as Test
      FROM HumanResources.Employee E
        JOIN Person.Person P
          ON E.BusinessEntityID = P.BusinessEntityID
      WHERE @EmployeeNode.IsDescendantOf(OrganizationNode) = 1

Result:

    BusinessEntityID Name              JobTitle                     Test
    1                Ken Sánchez       Chief Executive Officer      1
    263              Jean Trenary      Information Services Manager 1
    270              François Ajenstat Database Administrator       1

Performing a subtree search

The IsDescendantOf() method is easily flipped around to perform a subtree search locating all descendants. The trick is that either side of the IsDescendantOf() method can use a variable or column. In this case the variable goes in the parameter and the method is applied to the column.
The result is the now familiar AdventureWorks Information Service Department:

    DECLARE @ManagerNode HierarchyID
    SELECT @ManagerNode = OrganizationNode
      FROM HumanResources.Employee
      WHERE OrganizationNode = '/5/' -- Jean Trenary - IT Manager

    SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name',
        OrganizationNode.ToString() as 'HierarchyID.ToString()',
        OrganizationLevel
      FROM HumanResources.Employee E
        JOIN Person.Person P
          ON E.BusinessEntityID = P.BusinessEntityID
      WHERE OrganizationNode.IsDescendantOf(@ManagerNode) = 1

Result:

    BusinessEntityID Name              HierarchyID.ToString() OrganizationLevel
    263              Jean Trenary      /5/                    1
    264              Stephanie Conroy  /5/1/                  2
    265              Ashvini Sharma    /5/1/1/                3
    266              Peter Connelly    /5/1/2/                3
    267              Karen Berg        /5/2/                  2
    268              Ramesh Meyyappan  /5/3/                  2
    269              Dan Bacon         /5/4/                  2
    270              François Ajenstat /5/5/                  2
    271              Dan Wilson        /5/6/                  2
    272              Janaina Bueno     /5/7/                  2

Single-level searches

Single-level searches were presented first for the adjacency list pattern because they were the simpler searches. For HierarchyID searches, a single-level search is more complex and builds on the previous searches. In fact, a single-level HierarchyID search is really nothing more than an IsDescendantOf() search with the organizational level filter in the WHERE clause.
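A minimal sketch of such a single-level search follows, assuming the same OrganizationNode and OrganizationLevel columns used in the earlier queries. It is an illustration of the idea described above rather than the book's own listing: the subtree test is kept as is, and one extra predicate restricts the rows to one level below the manager.

```sql
DECLARE @ManagerNode HierarchyID
SELECT @ManagerNode = OrganizationNode
  FROM HumanResources.Employee
  WHERE OrganizationNode = '/5/' -- Jean Trenary - IT Manager

SELECT E.BusinessEntityID,
    OrganizationNode.ToString() as 'HierarchyID.ToString()',
    OrganizationLevel
  FROM HumanResources.Employee E
  WHERE OrganizationNode.IsDescendantOf(@ManagerNode) = 1
    -- keep only the nodes exactly one level below the manager
    AND OrganizationLevel = @ManagerNode.GetLevel() + 1
```

Going by the subtree listing above, this should return the seven level-2 rows (264 and 267 through 272) and leave out Jean herself and the two level-3 nodes.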
