Nielsen c18.tex V4 - 07/21/2009 1:01pm Page 442 Part III Beyond Relational

creating tables with XML columns, and allows declaring XML variables and using them as parameters and return values.

XQuery is a W3C-recommended language created to query and format XML documents. XQuery can be used to query XML documents just as a SQL query is used to retrieve information from relational tables. The XML data type implements a limited subset of the XQuery specification, and a T-SQL query can use XQuery to retrieve information from XML columns or variables.

XQuery is built into the Relational Engine of SQL Server, and the Query Optimizer can build query plans that contain relational query operations as well as XQuery operations. Results of XQuery operations can be joined with relational data, or relational data can be joined with XQuery results. SQL Server supports creating special types of indexes on XML columns to optimize XQuery operations.

XML Schema Definition (XSD) is another W3C-recommended language, created for describing and validating XML documents. XSD supports creating very powerful and complex validation rules that can be applied to XML documents to verify that they are fully compliant with the business requirements. The XML data type supports XSD validation, which is explained later in this chapter.

The XML data type supports a number of methods, listed here:

■ value()
■ exist()
■ query()
■ modify()
■ nodes()

Each of these methods is explained in detail later in this chapter.

Typed and untyped XML

As mentioned earlier, support for XSD schema validation is implemented in SQL Server in the form of XML schema collections. An XML schema collection can be created from an XML schema definition. XML columns or variables can be bound to an XML schema collection. An XML column or variable that is bound to an XML schema collection is known as typed XML.
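As a minimal sketch of how an XML schema collection might be created and bound to a variable, consider the following. The schema content and the element names (Customer, CustomerNumber, Name) are assumptions made up for illustration; only the collection name CustomerSchema echoes the name used in the declarations later in this chapter.

```sql
-- Hypothetical schema: one Customer element with two child elements.
CREATE XML SCHEMA COLLECTION CustomerSchema AS N'
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Customer">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="CustomerNumber" type="xsd:int" />
        <xsd:element name="Name" type="xsd:string" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>'
GO

-- A variable bound to the collection is typed XML.
DECLARE @c XML(CustomerSchema)

-- This value conforms to the schema, so the assignment succeeds.
SET @c = N'<Customer><CustomerNumber>1001</CustomerNumber><Name>Jacob</Name></Customer>'

-- An assignment like the following would be rejected by validation,
-- because CustomerNumber is not a valid xsd:int:
-- SET @c = N'<Customer><CustomerNumber>abc</CustomerNumber><Name>Jacob</Name></Customer>'
```

The GO separator matters here: the collection has to exist before a later batch can reference it in a declaration.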
When a typed XML value is modified, SQL Server validates the new value against the rules defined in the XML schema collection. The assignment or modification operation will succeed only if the new value passes the validations defined in the XML schema collection.

Typed XML has a number of advantages over untyped XML. The most important benefit is that the validation constraints are always respected. The content of a typed XML document is always valid as per the schema with which it is associated.

With typed XML, SQL Server has better knowledge of the XML document (structure, data types, and so on) and can generate a more optimized query plan. Because SQL Server has complete knowledge of the data types of elements and attributes, storage of typed XML can be made significantly more compact than untyped XML.

Static type checking is possible with typed XML documents, and SQL Server can detect, at compile time, if an XQuery expression on a typed XML document is mistyped. Stored procedures or functions that accept typed XML parameters are protected from receiving invalid XML documents, as SQL Server will perform implicit validation of the XML value against the schema collection before accepting the parameter value.

Creating and using XML columns

The XML data type can be used like other native SQL Server data types in most cases. (Note that there are exceptions, however. For example, an XML column cannot be used as a key column of a regular index or in a comparison operation.) A table can be created with one or more XML columns, or XML columns can be added to an existing table. VARCHAR/NVARCHAR/VARBINARY/TEXT/NTEXT columns can be altered to XML data type columns if all the existing values are well-formed XML values.
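The three cases above can be sketched as follows. The table and column names (dbo.OrderArchive, dbo.LegacyOrders, Notes, OrderText) are hypothetical and exist only for this illustration.

```sql
-- 1. Create a table with an XML column.
CREATE TABLE dbo.OrderArchive (
    OrderID   INT NOT NULL PRIMARY KEY,
    OrderData XML NULL
)

-- 2. Add another XML column to the existing table.
ALTER TABLE dbo.OrderArchive ADD Notes XML NULL

-- 3. Convert an existing string column to XML.
CREATE TABLE dbo.LegacyOrders (
    OrderID   INT NOT NULL,
    OrderText NVARCHAR(MAX) NULL
)

INSERT INTO dbo.LegacyOrders (OrderID, OrderText)
SELECT 1, N'<Items><Item ItemNumber="1001" Quantity="1" Price="950"/></Items>'

-- This succeeds only because every existing value is well-formed XML;
-- a row containing, say, '<Items>' without a closing tag would make it fail.
ALTER TABLE dbo.LegacyOrders ALTER COLUMN OrderText XML NULL
```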
Entire XML documents can be retrieved as part of a SELECT query, or specific information can be extracted from within the XML documents. The following example shows a SELECT query that selects a column from a table and a value from the XML document stored in each row:

    DECLARE @t TABLE (OrderID INT, OrderData XML)

    INSERT INTO @t (OrderID, OrderData)
    SELECT 1,
    '<CustomerNumber>1001</CustomerNumber>
    <Items>
        <Item ItemNumber="1001" Quantity="1" Price="950"/>
        <Item ItemNumber="1002" Quantity="1" Price="650"/>
    </Items>'

    SELECT
        OrderID,
        OrderData.value('CustomerNumber[1]','CHAR(4)') AS CustomerNumber
    FROM @t

    /*
    OrderID  CustomerNumber
    1        1001
    */

The code might get a little more complex if the query needs to retrieve more than one element from the XML document stored in each row. Such a query needs to generate more than one row for each row stored in the base table. The nodes() method of the XML data type can be used to obtain an accessor to each element within the XML document. The XML element collection returned by the nodes() method can be joined with the base table using the CROSS APPLY operator, as shown in the following example:

    DECLARE @t TABLE (OrderID INT, OrderData XML)

    INSERT INTO @t (OrderID, OrderData)
    SELECT 1,
    '<CustomerNumber>1001</CustomerNumber>
    <Items>
        <Item ItemNumber="1001" Quantity="1" Price="950"/>
        <Item ItemNumber="1002" Quantity="1" Price="650"/>
    </Items>'

    SELECT
        OrderID,
        o.value('@ItemNumber','CHAR(4)') AS ItemNumber,
        o.value('@Quantity','INT') AS Quantity,
        o.value('@Price','MONEY') AS Price
    FROM @t
    CROSS APPLY OrderData.nodes('/Items/Item') x(o)

    /*
    OrderID  ItemNumber  Quantity  Price
    1        1001        1         950.00
    1        1002        1         650.00
    */

The preceding examples use the value() method exposed by the XML data type. XML data type methods are explained in detail later in this section.
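Because nodes() yields one row per matching element, the shredded rows can be fed to ordinary relational operators such as GROUP BY. The following is a minimal sketch that reuses the same sample data; the aggregation and the OrderTotal alias are illustrative additions, not part of the original examples.

```sql
DECLARE @t TABLE (OrderID INT, OrderData XML)

INSERT INTO @t (OrderID, OrderData)
SELECT 1,
'<CustomerNumber>1001</CustomerNumber>
<Items>
    <Item ItemNumber="1001" Quantity="1" Price="950"/>
    <Item ItemNumber="1002" Quantity="1" Price="650"/>
</Items>'

-- One row per Item element is produced by CROSS APPLY,
-- then collapsed back to one row per order by the aggregate.
SELECT
    OrderID,
    SUM(o.value('@Quantity','INT') * o.value('@Price','MONEY')) AS OrderTotal
FROM @t
CROSS APPLY OrderData.nodes('/Items/Item') x(o)
GROUP BY OrderID

/*
OrderID  OrderTotal
1        1600.00
*/
```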
Declaring and using XML variables

Just like other SQL Server native data types, XML variables can be created and used in T-SQL batches, stored procedures, functions, and so on. The following example demonstrates a few different ways an XML variable can be declared (each declaration is an alternative, not part of a single batch):

    -- Declare an XML variable
    DECLARE @x XML

    -- Declare a TYPED XML variable
    DECLARE @x XML(CustomerSchema)

    -- Declare a TYPED XML DOCUMENT variable
    DECLARE @x XML(DOCUMENT CustomerSchema)

    -- Declare a TYPED XML CONTENT variable
    DECLARE @x XML(CONTENT CustomerSchema)

The first example creates an untyped XML variable, and the second example creates a typed one. The third example creates a DOCUMENT type variable, and the last one creates a CONTENT type variable. DOCUMENT and CONTENT types are explained later in this chapter.

There is a slight difference in the way that an XQuery expression needs to be written for an XML variable versus an XML column. While working with an XML variable, the query will always process only one document at a time. However, while working with an XML column, more than one XML document may be processed in a single batch operation. Because of this, the CROSS APPLY operator is required while running such a query on an XML column (as demonstrated in the previous example).
What follows is the version of the prior query that operates on an XML variable:

    DECLARE @x XML
    SELECT @x = '
    <CustomerNumber>1001</CustomerNumber>
    <Items>
        <Item ItemNumber="1001" Quantity="1" Price="950"/>
        <Item ItemNumber="1002" Quantity="1" Price="650"/>
    </Items>'

    SELECT
        o.value('@ItemNumber','CHAR(4)') AS ItemNumber,
        o.value('@Quantity','INT') AS Quantity,
        o.value('@Price','MONEY') AS Price
    FROM @x.nodes('/Items/Item') x(o)

    /*
    ItemNumber  Quantity  Price
    1001        1         950.00
    1002        1         650.00
    */

An XML variable may be initialized by a static XML string, from another XML or VARCHAR/NVARCHAR/VARBINARY variable, from the return value of a function, or from the result of a FOR XML query. The following example shows how to initialize an XML variable from the result of a FOR XML query:

    DECLARE @x XML
    SELECT @x = (
        SELECT OrderID
        FROM OrderHeader
        FOR XML AUTO, TYPE)

XML variables can also be initialized from an XML file, as demonstrated later in the section "Loading/querying XML documents from disk files."

Using XML parameters and return values

Typed and untyped XML parameters can be passed to a stored procedure as INPUT as well as OUTPUT parameters. XML values can be used as arguments and return values of scalar functions, or as result columns of table-valued functions.
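For the stored procedure case, a minimal sketch might look like the following. The procedure name dbo.CountOrderItems, its parameter names, and the Summary/ItemCount element names are all hypothetical, chosen only to show one XML input parameter and one XML OUTPUT parameter side by side.

```sql
CREATE PROCEDURE dbo.CountOrderItems
    @OrderData XML,          -- untyped XML input parameter
    @Summary   XML OUTPUT    -- untyped XML output parameter
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @ItemCount INT

    -- Count the Item elements in the incoming document.
    SELECT @ItemCount = COUNT(*)
    FROM @OrderData.nodes('/Items/Item') x(o)

    -- Build the output document with FOR XML.
    SET @Summary = (
        SELECT @ItemCount AS ItemCount
        FOR XML PATH('Summary'), TYPE)
END
GO

DECLARE @result XML

EXEC dbo.CountOrderItems
    @OrderData = N'<Items><Item ItemNumber="1001"/><Item ItemNumber="1002"/></Items>',
    @Summary   = @result OUTPUT

SELECT @result AS Summary

/*
Summary
<Summary><ItemCount>2</ItemCount></Summary>
*/
```

The string literal passed for @OrderData is converted to XML implicitly; declaring the parameter as XML(CustomerSchema) instead would additionally validate it against that schema collection.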
When a function returns an XML data type value, XML data type methods can be directly called on the return value, as shown in the following example:

    -- Create a function that returns an XML value
    CREATE FUNCTION GetOrderInfo(
        @OrderID INT
    ) RETURNS XML
    AS
    BEGIN
        DECLARE @x XML
        SELECT @x = (
            SELECT OrderID, CustomerID
            FROM OrderHeader
            WHERE OrderID = @OrderID
            FOR XML PATH(''), ROOT('OrderInfo'))
        RETURN @x
    END
    GO

    -- Call the function and invoke the value() method
    SELECT dbo.GetOrderInfo(1).value('(OrderInfo/CustomerID)[1]','INT')
        AS CustomerID

    /*
    CustomerID
    1
    */

Loading/querying XML documents from disk files

The capability to load XML documents from disk files is one of the very interesting XML features available with SQL Server. This is achieved by using the BULK rowset provider for OPENROWSET. The following example shows how to load the content of an XML file into an XML variable:

    /* The sample code below assumes that a file named "items.xml"
    exists in folder c:\temp with the following content.

    <Items>
        <Item ItemNumber="1001" Quantity="1" Price="950"/>
        <Item ItemNumber="1002" Quantity="1" Price="650"/>
    </Items>
    */

    DECLARE @xml XML
    SELECT
        @xml = CAST(BulkColumn AS XML)
    FROM OPENROWSET(BULK 'C:\temp\items.xml', SINGLE_BLOB) AS x

    SELECT
        x.value('@ItemNumber','CHAR(4)') AS ItemNumber,
        x.value('@Quantity','INT') AS Quantity,
        x.value('@Price','MONEY') AS Price
    FROM @xml.nodes('/Items/Item') i(x)

    /*
    ItemNumber  Quantity  Price
    1001        1         950.00
    1002        1         650.00
    */

OPENROWSET(BULK [filename, option]) can even query the data in the file directly without loading it to a table or variable. It can also be used as the source of an INSERT/UPDATE operation.
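As a minimal sketch of the INSERT case, the file content can be written straight into an XML column. The table dbo.ItemFiles and its columns are hypothetical, and the file path is the same assumed c:\temp\items.xml used above.

```sql
-- Hypothetical staging table with one XML column.
CREATE TABLE dbo.ItemFiles (
    FileID   INT IDENTITY(1,1) PRIMARY KEY,
    FileData XML NOT NULL
)

-- SINGLE_BLOB returns the whole file as one VARBINARY(MAX) value
-- in a column named BulkColumn; the CAST parses it as XML.
INSERT INTO dbo.ItemFiles (FileData)
SELECT CAST(BulkColumn AS XML)
FROM OPENROWSET(BULK 'C:\temp\items.xml', SINGLE_BLOB) AS f
```

Reading the file as SINGLE_BLOB rather than as character data lets the XML parser honor any encoding declared inside the file itself.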
The following example queries the XML file directly:

    SELECT
        x.value('@ItemNumber','CHAR(4)') AS ItemNumber,
        x.value('@Quantity','INT') AS Quantity,
        x.value('@Price','MONEY') AS Price
    FROM (
        SELECT CAST(BulkColumn AS XML) AS data
        FROM OPENROWSET(BULK 'C:\temp\items.xml', SINGLE_BLOB) AS x
    ) a
    CROSS APPLY data.nodes('/Items/Item') i(x)

    /*
    ItemNumber  Quantity  Price
    1001        1         950.00
    1002        1         650.00
    */

To use the OPENROWSET(BULK) option, the user must have the ADMINISTER BULK OPERATIONS permission.

Limitations of the XML data type

Though the XML data type comes with a number of very interesting capabilities, it has a number of limitations as well. However, the limitations are not really "limiting," considering the extensive set of functionalities provided by the data type.

The stored representation of an XML data type instance cannot exceed 2 GB. The term "stored representation" is important in the preceding statement, because SQL Server converts XML data type values to an internal structure and stores it. This internal representation takes much less space than the textual representation of the XML value. The following example demonstrates the reduction in size when a value is stored as an XML data type value:

    DECLARE @EmployeeXML XML, @EmployeeText NVARCHAR(500)
    SELECT @EmployeeText = '
    <EmployeeInfo>
        <EmployeeName>Jacob</EmployeeName>
        <EmployeeName>Steve</EmployeeName>
        <EmployeeName>Bob</EmployeeName>
    </EmployeeInfo>'

    SELECT DATALENGTH(@EmployeeText) AS StringSize

    /*
    StringSize
    284
    */

    SELECT @EmployeeXML = @EmployeeText
    SELECT DATALENGTH(@EmployeeXML) AS XMLSize

    /*
    XMLSize
    109
    */

The stored representation of the XML data type value is different from, and much more compact than, the textual representation, and the limit of 2 GB applies to the stored representation.
This means that an XML data type column may be able to store XML documents containing more than 2 * 1024 * 1024 * 1024 VARCHAR characters.

Unlike other data types, XML data type values cannot be sorted or used in a GROUP BY expression. They cannot be used in a comparison operation. However, they can be used with the IS NULL operator to determine if the value is NULL.

XML data type columns cannot be used in the key of an index. They can only be used as INCLUDED columns of an index. To facilitate faster querying and searching over XML columns, SQL Server supports a special type of index called an XML index. XML indexes are different from regular indexes and are discussed later in this chapter.

Understanding XML Data Type Methods

The XML data type supports a number of methods that allow various operations on the XML document. The most common operations needed on an XML document might be reading values from elements or attributes, querying for specific information, or modifying the document by inserting, updating, or deleting XML elements or attributes. The XML data type comes with a number of methods to support all these operations.

Any operation on an XML document is applied on one or more elements or attributes at a specific location. To perform an operation, the location of the specific element or attribute has to be specified.
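To make the idea of a "location" concrete, the sketch below passes XQuery path expressions to two of the methods listed earlier. The sample document mirrors the items data used in this chapter; the column aliases are illustrative only.

```sql
DECLARE @x XML
SET @x = N'
<Items>
    <Item ItemNumber="1001" Quantity="1" Price="950"/>
    <Item ItemNumber="1002" Quantity="1" Price="650"/>
</Items>'

SELECT
    -- Location: the Price attribute of the second Item element.
    @x.value('(/Items/Item/@Price)[2]', 'MONEY') AS SecondPrice,
    -- Location: any Item element whose ItemNumber attribute is 1002.
    @x.exist('/Items/Item[@ItemNumber="1002"]') AS HasItem1002

/*
SecondPrice  HasItem1002
650.00       1
*/
```

In both calls the first argument is the path that identifies where in the document the operation applies; value() additionally needs the SQL type to convert the located value to.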