Microsoft SQL Server 2008 R2 Unleashed

CHAPTER 47 Using XML in SQL Server 2008

    <ScrapReason>12
      <!-- Comment: Name = Thermoform temperature too high -->
      <?ModDatePI 1998-06-01T00:00:00?>
      <WorkOrders WorkOrderIds="72370 72273 70875 69474 69173 68573 65970 60472 56975 56875 55275 53771 50370 47670 45773 42071 41975 39372 36673 36671 32872 32775 32770 31073 29370 27771 24174 22673 22670 17674 16073 13073 10274 9071 7771 4972 2573" />
    </ScrapReason>
    </ScrappedWorkOrders>

Let's review the selected columns in Listing 47.11. The first is aliased with the asterisk (*) character. This character tells SQL Server to inline-generate the data for that column (as text). (Using the text() node test would do the same in this case.)

Next, the comment() node test is specified for Name, telling the XML generator to output its value in a comment. For clarity's sake, we added a little syntactic sugar in this statement by prepending the text 'Comment: Name = ' to the value produced inside the comment.

Next, the processing-instruction() node test is specified to output each value of ModifiedDate to a new processing instruction called ModDatePI.

Finally, the fourth column is produced as a list of WorkOrderId values, using the data() keyword in a nested FOR XML PATH statement. data() tells SQL Server to generate a space-delimited list of atomic column values, one value for each row in the result set.

Note that the nested query is used only to generate the list of WorkOrderId values. The empty string is given for the PATH keyword, telling the XML engine not to generate a default row element, so the nested query emits no markup at all, only the list of values. You can extract and run the nested statement on its own to see this in action.

The nested query applies the same WHERE clause as its parent, filtering WorkOrderId values to those where ScrapReasonId is 12. This keeps the nested data relevant to the outer query. The resulting list of values is grafted onto the XML of the outer statement by using the column alias 'WorkOrders/@WorkOrderIds'.
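The data() technique just described can be tried without the AdventureWorks tables. The following is a minimal sketch, not Listing 47.11 itself: the table variable @WorkOrder and its three sample rows are hypothetical stand-ins for Production.WorkOrder, and only the inline (*) column and the nested data() list are reproduced.

```sql
-- Hypothetical sample data standing in for Production.WorkOrder.
DECLARE @WorkOrder TABLE (WorkOrderID int, ScrapReasonID int);
INSERT INTO @WorkOrder (WorkOrderID, ScrapReasonID)
VALUES (72370, 12), (72273, 12), (70875, 12), (100, 7);

DECLARE @ScrapReasonID int = 12;

SELECT
    -- '*' inlines the value as text inside the ScrapReason element.
    @ScrapReasonID AS '*',
    -- PATH('') suppresses the row element, and data() turns each row
    -- into one item of a space-delimited list.
    (
        SELECT WorkOrderID AS 'data()'
        FROM @WorkOrder
        WHERE ScrapReasonID = @ScrapReasonID
        ORDER BY WorkOrderID DESC
        FOR XML PATH('')
    ) AS 'WorkOrders/@WorkOrderIds'
FOR XML PATH('ScrapReason'), ROOT('ScrappedWorkOrders');
```

Running the nested SELECT by itself should return only the list of matching IDs, with no element markup around it, which is what allows the outer query to place the list in an attribute.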
FOR XML and the xml Data Type

By default, the results of any FOR XML query (in any of the four modes) are streamed to output as a one-column/one-row dataset with a column named XML_F52E2B61-18A1-11d1-B105-00805F49916B of type nvarchar(max). (In SQL Server 2000, this was a stream of XML split into multiple varchar(8000) rows.)

One of the biggest limitations of SQL Server 2000's XML production was the inability to save the results of a FOR XML query to a variable or to store them in a column directly. Middleware code first had to save the XML as a string, insert it back into an ntext or nvarchar column, and then select it out again.

Today, SQL Server 2008 natively supports column storage of XML, using the xml data type. Be sure to read the section "Using the xml Data Type," later in this chapter, for a complete overview.

You can easily convert FOR XML results to instances of xml by using the TYPE directive with all four modes (RAW, AUTO, EXPLICIT, and PATH). Listing 47.12 demonstrates the use of FOR XML PATH with the TYPE directive.
LISTING 47.12 Using FOR XML PATH, TYPE to Create an Instance of the xml Data Type

    SELECT *
    FROM Production.WorkOrder WorkOrder
    WHERE ScrapReasonId = 12
      AND WorkOrderId = 72370
    FOR XML PATH('WorkOrder'), ELEMENTS XSINIL, ROOT('WorkOrders'), TYPE
    go

    <WorkOrders xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <WorkOrder>
        <WorkOrderID>72370</WorkOrderID>
        <ProductID>329</ProductID>
        <OrderQty>48</OrderQty>
        <StockedQty>47</StockedQty>
        <ScrappedQty>1</ScrappedQty>
        <StartDate>2008-07-01T00:00:00</StartDate>
        <EndDate>2008-07-11T00:00:00</EndDate>
        <DueDate>2008-07-12T00:00:00</DueDate>
        <ScrapReasonID>12</ScrapReasonID>
        <ModifiedDate>2008-07-11T00:00:00</ModifiedDate>
      </WorkOrder>
    </WorkOrders>

Notice that, in contrast to the preceding FOR XML examples, the query window in SQL Server Management Studio (SSMS) no longer displays the lengthy XML column UUID in the results frame or on the window tab. The results have been cast to a single instance of the xml data type, ready to be assigned to variables of type xml, used in subsequent queries, inserted into xml columns, or returned to the client.

The five xml data type methods (value(), exist(), nodes(), query(), and modify(), discussed later in this chapter, in the section "The Built-in xml Data Type Methods") can be intermixed with relational queries in all FOR XML modes. This makes it even easier to shape your XML exactly the way you want.

Listing 47.13 demonstrates how you can nest XQuery queries inside regular FOR XML T-SQL to produce XML documents built from both relational and XML sources.

LISTING 47.13 Bridging the Gap Between Relational and XML Data by Using FOR XML PATH and the xml Data Type

    SELECT FirstName, LastName, E.JobTitle,
        Resume.query('
            declare namespace ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
            //ns:Education
        ') '*'
    FROM HumanResources.Employee E
    JOIN Person.Person C
        ON E.BusinessEntityID = C.BusinessEntityID
    JOIN HumanResources.JobCandidate J
        ON J.BusinessEntityID = E.BusinessEntityID
    WHERE J.JobCandidateId = 8
    FOR XML PATH('AWorthyJobCandidate'), TYPE
    go

    <AWorthyJobCandidate>
      <FirstName>Peng</FirstName>
      <LastName>Wu</LastName>
      <JobTitle>Quality Assurance Supervisor</JobTitle>
      <ns:Education xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume">
        <ns:Edu.Level> </ns:Edu.Level>
        <ns:Edu.StartDate>1986-09-15Z</ns:Edu.StartDate>
        <ns:Edu.EndDate>1990-05-15Z</ns:Edu.EndDate>
        <ns:Edu.Degree>Bachelor of Science</ns:Edu.Degree>
        <ns:Edu.Major> </ns:Edu.Major>
        <ns:Edu.Minor />
        <ns:Edu.GPA>3.3</ns:Edu.GPA>
        <ns:Edu.GPAScale>4</ns:Edu.GPAScale>
        <ns:Edu.School>Western University</ns:Edu.School>
        <ns:Edu.Location>
          <ns:Location>
            <ns:Loc.CountryRegion>US </ns:Loc.CountryRegion>
            <ns:Loc.State>WA </ns:Loc.State>
            <ns:Loc.City>Seattle</ns:Loc.City>
          </ns:Location>
        </ns:Edu.Location>
      </ns:Education>
    </AWorthyJobCandidate>

In this example, the asterisk (*) is used as a column alias for the results of the nested query (on HumanResources.JobCandidate.Resume), telling SQL Server to simply output the XML inline with the other nodes.

XML As Relational Data: Using OPENXML

This section covers what might be called the inverse of FOR XML: OPENXML. You use OPENXML in T-SQL queries to read XML data and shred (or decompose) it into relational result sets. OPENXML is used in the FROM clause of a SELECT statement, where it generates a table from an XML source.

The first step required in this process is a call to the system stored procedure sp_xml_preparedocument, which creates an in-memory representation of an XML document tree for use in querying. It takes the following parameters:

• An integer output parameter for storing a handle to the document tree
• The XML input data
• An optional XML namespace declaration, used in subsequent OPENXML queries

sp_xml_preparedocument is able to convert the following data types into internal XML objects: text, ntext, varchar, nvarchar, single-quoted literal strings, and untyped XML (data from an xml column having no associated schema collection). This is its syntax:

    sp_xml_preparedocument integer_variable OUTPUT[, xmltext ][, xpath_namespaces ]

And here is an example of OPENXML in use:

    DECLARE @XmlDoc XML, @iXml int
    SET @XmlDoc = '
    <ex:ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples">
      <ex:foo>hello</ex:foo>
      <ex:bar>sql!</ex:bar>
    </ex:ExampleDoc>'

    EXEC sp_xml_preparedocument @iXml OUTPUT, @XmlDoc,
        '<ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples"/>'

    SELECT id, parentid, nodetype, localname, prefix
    FROM OPENXML(@iXml, '/ex:ExampleDoc/ex:foo')
    -- WITH (foo varchar(10) '/ex:ExampleDoc/ex:foo')

    EXEC sp_xml_removedocument @iXml
    go

    id  parentid  nodetype  localname  prefix
    3   0         1         foo        ex
    5   3         3         #text      NULL

Notice in the example that the WITH predicate has been commented out. This is to illustrate in the query results what is known as an edge table: the XML document in its relational form. Edge is a term taken from graph theory; it refers to the line that connects two nodes.

If the edge table looks familiar, the reason is probably that it bears a resemblance to the universal table that must be created for EXPLICIT mode. As with the universal table, the edge table follows the adjacency list model for its hierarchical relationships: each row carries its own id and the id of its parent.

The node types of the input XML are marked in the nodetype column (1 = element, 2 = attribute, 3 = text). The full edge table contains further columns that are not selected here: namespaces are stored in namespaceuri, and the data of each node is stored in the text column.

If you uncomment the WITH predicate and change the select list to SELECT foo, you get back a one-row/one-column table with a column called foo that has the varchar(10) value hello.
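As a complement to the edge-table output, here is a hedged sketch of the same document shredded with the WITH predicate active. It is a variation on the chapter's example rather than a copy of it: the row pattern is moved up to /ex:ExampleDoc, the column patterns 'ex:foo' and 'ex:bar' are written relative to that row node, and the bar column is an addition made for illustration.

```sql
DECLARE @XmlDoc xml, @iXml int;

SET @XmlDoc = '
<ex:ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples">
  <ex:foo>hello</ex:foo>
  <ex:bar>sql!</ex:bar>
</ex:ExampleDoc>';

-- The third argument declares the ex prefix for use in the XPath patterns.
EXEC sp_xml_preparedocument @iXml OUTPUT, @XmlDoc,
    '<ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples"/>';

-- One row per ExampleDoc element; each column pattern is evaluated
-- relative to that row node.
SELECT foo, bar
FROM OPENXML(@iXml, '/ex:ExampleDoc')
WITH (
    foo varchar(10) 'ex:foo',
    bar varchar(10) 'ex:bar'
);

-- Always release the in-memory document when finished.
EXEC sp_xml_removedocument @iXml;
```

Under these assumptions the query should return a single row holding the text of the two child elements. Anchoring the row pattern on the parent element is a design choice: it lets several sibling values land in the same row instead of producing one row per child.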
This shows that the WITH predicate instructs OPENXML how to decompose the nodes to columns by using XPath syntax. The syntax for OPENXML (including the WITH predicate) is as follows:

    OPENXML(integer_document_handle_variable int, rowpattern nvarchar,[flags byte])
    [WITH (SchemaDeclaration | TableName)]

Let's match this syntax with the values in the example:

• The first parameter is the local variable @iXml, which acts as a handle to the internal XML representation.
• The next parameter is a row pattern in XPath syntax that tells OPENXML how to select nodes into rows. OPENXML generates one row in the result set for each node that matches this row pattern. This is similar to the .NET XmlDocument object's SelectNodes() method, insofar as every node matching rowpattern returns a row in the rowset.
• The result set's columns are then defined, using each matching node as the context and the XPath in the column definitions of the WITH predicate to find the values relative to that node.
• The flags parameter is a combinable byte value that controls how the selected XML nodes are to be decomposed. The following values are possible:
    • 0: Uses attribute-centric decomposition. In this case, each attribute in the source XML is decomposed into a column. This is the default.
    • 1: Uses attribute-centric decomposition. May be combined with flag 2 (that is, the value 3 may be specified). Combining flags 1 and 2 tells the rowset generator how to deal with the values in the XML not yet accounted for in the downward parse of the XML document from nodes into rows: attribute-centric decomposition takes place first, and element-centric decomposition is then applied to whatever remains. This point is important because without the combinability of the flags, only one or the other decomposition happens, and (lacking a WITH predicate that captures all the nodes) some nodes would not make it into the rowset.
    • 2: Uses element-centric decomposition. Combinable with flag 1 (that is, specify 3).
    • 8: Tells the rowset generator how to deal with text data in the metaproperties (not covered in this chapter). Can be combined with flags 1, 2, or both.

Note that the column generation determined by the flags 0, 1, and 2 can be overridden by the XPath expressions in the lines of the WITH predicate. For example, if the 1 flag is specified to map a particular attribute to a column, but the line of the WITH predicate for that same column contains an XPath that maps the value from an XML element, the WITH predicate takes precedence. In most cases it's best to set the value of flags to 3, unless you have a reason to ignore attributes or elements.

The syntax of the WITH predicate tells the rowset generator which column names and data types to use when mapping the XML to rows. If the structure of the input XML matches the schema of a particular table in your database, the name of that table may be specified. An example of this case occurs when the input XML has been produced from an existing table, using FOR XML. The values in the FOR XML-produced document have been updated, and the new values need to make it back into the table.
The following code example illustrates this common scenario:

    DECLARE @JobCandidateXmlDoc XML, @iXml int
    SET @JobCandidateXmlDoc = '
    <JobCandidateUpdate>
      <ModifiedDate> 10/5/2008 12:34PM </ModifiedDate>
    </JobCandidateUpdate>'

    EXEC sp_xml_preparedocument @iXml OUTPUT, @JobCandidateXmlDoc,
        '<JobCandidateUpdate xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume"/>';

    UPDATE HumanResources.JobCandidate
    SET ModifiedDate = OXML.ModifiedDate
    FROM (
        SELECT *
        FROM OPENXML(@iXml, '/JobCandidateUpdate', 2)
        WITH HumanResources.JobCandidate
    ) AS OXML
    -- Qualified so the filter applies to the target table, not to the OPENXML rowset
    WHERE HumanResources.JobCandidate.JobCandidateId = 8

    EXEC sp_xml_removedocument @iXml
    go

    (1 row(s) affected)

If a table name is not specified, you need to specify a comma-separated list of lines, using the following syntax:

    column_name datatype 'XPath'

The following list explains each part of the preceding syntax:

• column_name: Provides a relational name for the XML-produced column.
• datatype: Provides a T-SQL data type for the XML-produced column.
• 'XPath': Specifies a column pattern, evaluated relative to each node matched by the row pattern, that locates the nodes whose values are to be mapped to the XML-produced column.

When you're done reading out the XML, it's important to free the memory used to hold the internal XML document. You accomplish this by calling the system stored procedure sp_xml_removedocument, as in the following example:

    EXEC sp_xml_removedocument @iXml

Using the xml Data Type

The xml data type is a real problem solver for those who use both XML and SQL Server on a daily basis. Relational columns and XML data can be stored side by side in the same table, in an implementation that plays to the strengths of both. With SQL Server's powerful XML storage, validation, querying, and indexing capabilities, it's bound to cause quite a stir in the field of XML content management and beyond.

Some of the benefits of storing XML on the database tier can be realized immediately.
Building middleware using the .NET Framework to manage XML stored in columns is a far more robust solution than depending on the filesystem; plus, it's a lot easier to access the content from anywhere. SQL Server inherently provides to stored XML the traditional DBMS benefits of backup and restoration, replication and failover, query optimization, granular locking, indexing, and content validation.

The xml data type can be used with local variable declarations, as the output of user-defined functions, as input parameters to stored procedures and functions, and much more. XML instances containing up to 128 levels of nesting can be stored in xml columns; deeper instances cannot be inserted, nor may existing instances be made to increase beyond this depth via the modify() data type method.

xml columns can also be used to store code files such as XSLT, XSD, XHTML, and any other well-formed content. These files can then be retrieved by user-defined functions written in managed code hosted by SQL Server. (See Chapter 53, "SQL Server 2008 Reporting Services," for a full review of SQL Server-managed hosting.)

NOTE
In some cases, it's still a perfectly valid scenario to store XML on the filesystem or in [n]varchar(max), [n]text, or [n]varbinary(max) columns. In a few cases this usage is actually recommended. The following summary details some possible XML usage scenarios and makes suggestions for each.

XML data is stored in an internal binary format and can be up to 2GB in size.

Before we dig into the many uses of the xml data type, it's worthwhile to consider some of the different ways you can leverage your institution's XML with SQL Server:

• XML can be used solely as a temporary output format produced from relational data, using FOR XML. This applies in scenarios in which the relational tables hold the real-time data and XML is produced only for read-only application uses, as in the display of dynamic web pages. In this scenario, the XML really just provides a DBMS-independent, easy-to-transform view of the data.

• XML can be stored in relational (nvarchar and so on) columns, as was done before the xml data type existed. This might be the best option when your XML is sometimes not well formed or when the learning curve to XQuery is too high for an application-delivery time frame. This is also a valuable option when the byte-for-byte exactness of the XML must be preserved. Note that the latter is a necessary option in some institutions because typed XML (that is, xml data type columns associated with a schema collection) storage disregards extra whitespace characters, namespace prefixes, attribute order, and the XML declaration to make way for query optimizations. This scenario also leverages fast data retrieval because, as far as SQL Server is concerned, XML is never brought into the mix (it's all relational). The data can still be converted to the xml data type, using the methods described earlier, and applications can use OPENXML to read it as well. To read XML into SQL Server from server-side accessible files, you call the T-SQL OPENROWSET function.

• The XML can be stored as untyped XML, that is, XML stored in an xml data type column lacking an associated schema collection. This provides the benefits of querying the XML using the data type methods (discussed later in the section "The Built-in xml Data Type Methods") and provides server-side checks for well-formed XML. This scenario also allows for the possibility that XML adhering to any (or no) schemas may reside in the column. A schema collection could be added later to provide validation on the existing data (although a few intermediate editing steps may be necessary if any documents fail to validate).
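The difference between the string-column and untyped-xml scenarios above can be sketched briefly. This is an illustration under stated assumptions, not code from the chapter: the table variable @Docs, its column names, and the deliberately malformed sample string are all hypothetical.

```sql
-- A deliberately malformed document: <unclosed> is never closed.
DECLARE @Raw nvarchar(max) = N'<doc><unclosed></doc>';

-- Hypothetical storage: one relational string column, one untyped xml column.
DECLARE @Docs TABLE (RawText nvarchar(max), Parsed xml NULL);

-- The nvarchar column accepts any string, well formed or not.
INSERT INTO @Docs (RawText) VALUES (@Raw);

BEGIN TRY
    -- Converting to xml triggers the server-side well-formedness check.
    UPDATE @Docs SET Parsed = CONVERT(xml, RawText);
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

-- RawText still holds the original characters; Parsed remains NULL.
SELECT RawText, Parsed FROM @Docs;
```

The string column keeps the input byte for byte but offers no checking, whereas the xml column refuses anything that is not well formed. That trade-off is what the choice between these two scenarios comes down to.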
Safely armed with an understanding of some of the different options and uses, let's plunge into our discussion of xml.

Defining and Using xml Columns

You can add columns of type xml to any table by using a familiar Data Definition Language (DDL) syntax, with a few new twists. Much like their relational counterparts, xml columns, parameters, and variables may contain null or non-null values.

The following snippet shows the DDL used to create the table HumanResources.JobCandidate from AdventureWorks2008. The column you are concerned with is Resume:

    CREATE TABLE [HumanResources].[JobCandidate](
        [JobCandidateID] [int] IDENTITY(1,1) NOT NULL,
        [BusinessEntityID] [int] NULL,
        [Resume] [xml](CONTENT [HumanResources].[HRResumeSchemaCollection]) NULL,
        [ModifiedDate] [datetime] NOT NULL
            CONSTRAINT [DF_JobCandidate_ModifiedDate] DEFAULT (getdate()),
        CONSTRAINT [PK_JobCandidate_JobCandidateID] PRIMARY KEY CLUSTERED
        (
            [JobCandidateID] ASC
        ) ON [PRIMARY]
    ) ON [PRIMARY]

When you are defining objects of type xml that are associated with a schema collection, either of two facets may be applied:

• CONTENT: This facet specifies that well-formed XML documents as well as fragments may be inserted into the xml column or variable. (CONTENT is the default and may be omitted from the definition.) Fragments may have more than one top-level node (as is produced, by default, using FOR XML), and elements may be mixed with text-only nodes.

• DOCUMENT: This facet specifies that only complete, well-formed XML documents, each having exactly one top-level element, may be stored; fragments are rejected. As with CONTENT, the XML must also be valid according to the specified schema collection, and updates to the column must likewise result in schema-valid, well-formed documents.

XML schema collections can be associated with xml variables, parameters, or columns. The name of the schema collection is specified directly after the chosen facet, as is done in JobCandidate.Resume.
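To see the two facets side by side, consider the following minimal sketch. The schema collection dbo.ExampleNoteSchema and its single note element are hypothetical, created only for this illustration and dropped at the end; the point being shown is that a fragment with two top-level elements is accepted under CONTENT and rejected under DOCUMENT.

```sql
-- Hypothetical one-element schema collection, used only for this sketch.
CREATE XML SCHEMA COLLECTION dbo.ExampleNoteSchema AS N'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note" type="xs:string"/>
</xs:schema>';
GO

DECLARE @Fragment nvarchar(max) = N'<note>first</note><note>second</note>';
DECLARE @AsContent  xml (CONTENT  dbo.ExampleNoteSchema);
DECLARE @AsDocument xml (DOCUMENT dbo.ExampleNoteSchema);

-- Two schema-valid top-level elements: a fragment, allowed under CONTENT.
SET @AsContent = @Fragment;

BEGIN TRY
    -- The same fragment is refused under DOCUMENT, which requires
    -- exactly one top-level element.
    SET @AsDocument = @Fragment;
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS DocumentFacetError;
END CATCH;

SELECT @AsContent AS ContentValue;
GO

-- Clean up the hypothetical schema collection.
DROP XML SCHEMA COLLECTION dbo.ExampleNoteSchema;
```

The GO separators matter here: a typed variable can be declared only after the schema collection it names already exists, so the collection is created in a batch of its own.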
The following code example defines a typed xml local variable that allows only valid Resume data to be stored in it:

    DECLARE @ValidWellFormed xml (DOCUMENT HumanResources.HRResumeSchemaCollection)

Trying to assign the following well-formed but invalid document throws an error that says the first (and only) ThisBlowsUp element in the document is not declared in any of the schemas in HRResumeSchemaCollection:

    SELECT @ValidWellFormed = '<ThisBlowsUp/>'
    go

    XML Validation: Declaration not found for element 'ThisBlowsUp'. Location: /*:ThisBlowsUp[1]

When you change the facet to CONTENT (the default) and remove the schema association, the following is possible:

    DECLARE @WellFormed xml
    SELECT @WellFormed = '<ThisWorks/>'
    go

    Command(s) completed successfully.

When defining xml columns, you can specify defaults and constraints just as you do with relational columns. Consider the following example:

    CREATE TABLE XmlExample
    (
        XmlColumn xml NOT NULL
            DEFAULT CONVERT(xml,'<root/>',0)
    )

This example creates an xml column called XmlColumn that starts out having an empty root node. Notice how the string '<root/>' is converted to the xml type. This is actually not necessary because conversions from literal strings and from varchar to xml are implicit.

The next example adds a table-level constraint to XmlColumn to make sure the root node always exists. Because xml data type methods cannot be called directly in a CHECK constraint, it depends on a scalar-valued user-defined function to do its validation work:

    CREATE FUNCTION dbo.fn_XmlColumnNotNull
    (
        @XmlColumnValue xml
    )
    RETURNS bit
    AS
    BEGIN
        -- exist() returns 1 when the /root element is present, 0 otherwise
        RETURN @XmlColumnValue.exist('/root')
    END
    GO
    -- Same table definition as in the previous example
    CREATE TABLE XmlExample
    (
        XmlColumn xml NOT NULL
            DEFAULT CONVERT(xml,'<root/>',0)
    )
    GO
    ALTER TABLE XmlExample
    WITH CHECK ADD CONSTRAINT CK_XmlExample_HasRoot
    CHECK (dbo.fn_XmlColumnNotNull(XmlColumn) = 1)

The following statement thus fails:

    INSERT XmlExample SELECT '<foo/>'
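The constraint's behavior can be exercised end to end with a self-contained script. This is a sketch under assumptions: the names dbo.fn_HasRootDemo and dbo.XmlExampleDemo are hypothetical (chosen so that they do not collide with the objects above), the constraint is declared inline rather than through ALTER TABLE, and both objects are dropped at the end.

```sql
-- Hypothetical demo objects; names are chosen to avoid clashing with XmlExample.
CREATE FUNCTION dbo.fn_HasRootDemo (@Value xml)
RETURNS bit
AS
BEGIN
    -- 1 when a top-level root element exists, otherwise 0.
    RETURN @Value.exist('/root');
END
GO

CREATE TABLE dbo.XmlExampleDemo
(
    XmlColumn xml NOT NULL
        CONSTRAINT DF_XmlExampleDemo_XmlColumn DEFAULT ('<root/>')
        CONSTRAINT CK_XmlExampleDemo_HasRoot
            CHECK (dbo.fn_HasRootDemo(XmlColumn) = 1)
);
GO

-- Both of these satisfy the constraint.
INSERT dbo.XmlExampleDemo DEFAULT VALUES;
INSERT dbo.XmlExampleDemo SELECT '<root><child/></root>';

BEGIN TRY
    -- No root element, so the CHECK constraint rejects the row.
    INSERT dbo.XmlExampleDemo SELECT '<foo/>';
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

-- Only the two valid rows should be present.
SELECT XmlColumn FROM dbo.XmlExampleDemo;
GO

DROP TABLE dbo.XmlExampleDemo;
DROP FUNCTION dbo.fn_HasRootDemo;
```

Wrapping the failing INSERT in TRY...CATCH keeps the script running to the final SELECT, so the accepted rows and the constraint violation can be inspected in one pass.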