CHAPTER 42 What’s New for Transact-SQL in SQL Server 2008

Using FILESTREAM Storage

FIGURE 42.1 Setting FILESTREAM options in SQL Server Configuration Manager.

file system for storing FILESTREAM data. You have three options for how FILESTREAM functionality will be enabled:

• Allowing only T-SQL access (by checking only the Enable FILESTREAM for Transact-SQL Access option).
• Allowing both T-SQL and Win32 access to FILESTREAM data (by checking the Enable FILESTREAM for File I/O Streaming Access option and providing a Windows share name to be used to access the FILESTREAM data). This allows Win32 file system interfaces to provide streaming access to the data.
• Allowing remote clients to have access to the FILESTREAM data that is stored on this share (by selecting the Allow Remote Clients to Have Streaming Access to FILESTREAM Data option).

NOTE
You need to be a Windows Administrator on the local system and have sysadmin rights to enable FILESTREAM for SQL Server.

After you enable FILESTREAM in SQL Server Configuration Manager, a new share is created on the host system with the name specified. This share is intended only to allow very low-level streaming interaction between SQL Server and authorized clients. It is recommended that only the service account used by the SQL Server instance have access to this share. Also, because this change takes place at the OS level and not from within SQL Server, you need to stop and restart the SQL Server instance for the change to take effect.

After restarting the SQL Server instance to enable FILESTREAM at the Windows OS level, you next need to enable FILESTREAM for the SQL Server instance. You can do this either through SQL Server Management Studio or via T-SQL.
To enable FILESTREAM for the SQL Server instance using SQL Server Management Studio, right-click the SQL Server instance in the Object Explorer, select Properties, select the Advanced page, and set the Filestream Access Level property as shown in Figure 42.2. The available options are

• Disabled (0): FILESTREAM access is not permitted.
• Transact-SQL Access Enabled (1): FILESTREAM data can be accessed only by T-SQL commands.
• Full Access Enabled (2): Both T-SQL and Win32 access to FILESTREAM data are permitted.

Alternatively, you can enable FILESTREAM for the SQL Server instance using the sp_configure system procedure, specifying 'filestream access level' as the setting and passing a value of 0 (disabled), 1 (T-SQL access), or 2 (full access). The following example shows full access being enabled for the current SQL Server instance:

    EXEC sp_configure 'filestream access level', 2
    GO
    RECONFIGURE
    GO

FIGURE 42.2 Enabling FILESTREAM for a SQL Server Instance in SSMS.

After you configure the SQL Server instance for FILESTREAM access, the next step is to set up a database to store FILESTREAM data.

Setting Up a Database for FILESTREAM Storage

After you enable FILESTREAM for the SQL Server instance, you can store FILESTREAM data in a database by creating a FILESTREAM filegroup. You can do this when creating the database or by adding a new filegroup to an existing database. The filegroup designated for FILESTREAM storage must be defined with the CONTAINS FILESTREAM clause. The code in Listing 42.18 creates the Customer database and then adds a FILESTREAM filegroup.
LISTING 42.18 Setting Up a Database for FILESTREAM Storage

    CREATE DATABASE Customer
    ON ( NAME='Customer_Data',
         FILENAME='C:\SQLData\Customer_Data1.mdf',
         SIZE=50,
         MAXSIZE=100,
         FILEGROWTH=10)
    LOG ON ( NAME='Customer_Log',
         FILENAME='C:\SQLData\Customer_Log.ldf',
         SIZE=50,
         FILEGROWTH=20%)
    GO
    ALTER DATABASE Customer
    ADD FILEGROUP Cust_FSGroup CONTAINS FILESTREAM
    GO
    ALTER DATABASE Customer
    ADD FILE ( NAME=custinfo_FS,
         FILENAME = 'G:\SQLData\custinfo_FS')
    TO FILEGROUP Cust_FSGroup
    GO

Notice in Listing 42.18 that the FILESTREAM filegroup points to a file system folder rather than an actual file. This folder must not exist already (although the path up to the folder must exist); SQL Server creates the FILESTREAM folder (for example, in Listing 42.18, the custinfo_FS folder is created automatically by SQL Server in the G:\SQLData folder). The FILESTREAM files and file data are stored in the created folder. A FILESTREAM filegroup is restricted to referencing only a single file folder.

Using FILESTREAM Storage for Data Columns

Once FILESTREAM storage is enabled for a database, you can specify the FILESTREAM attribute on a varbinary(max) column to indicate that the column should store its data in the FILESTREAM filegroup on the file system. When columns are defined with the FILESTREAM attribute, the Database Engine stores all data for that column on the file system instead of in the database file.

In addition to a varbinary(max) column with the FILESTREAM attribute, tables used to store FILESTREAM data also require a UNIQUE ROWGUIDCOL column, as shown in Listing 42.19, which creates a custinfo table on the FILESTREAM filegroup. CUSTDATA is defined as the FILESTREAM column, and ID is defined as the unique ROWGUID column.
LISTING 42.19 Creating a FILESTREAM-Enabled Table

    CREATE TABLE CUSTINFO
    (ID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
     CUSTDATA VARBINARY (MAX) FILESTREAM NULL )
    FILESTREAM_ON Cust_FSGroup

Each table created with one or more FILESTREAM columns gets a new subfolder in the FILESTREAM filegroup folder, and each FILESTREAM column in the table gets a separate subfolder under the table folder. These column folders are where the actual FILESTREAM files are stored. The folders remain empty until you start adding rows to the table. A file is created in the column subfolder for each row inserted into the table with a non-NULL value for the FILESTREAM column.

NOTE
For more detailed information on how FILESTREAM data is stored and managed, see Chapter 34.

To ensure that SQL Server creates a new, blank file within the FILESTREAM storage folder for each row inserted in the table, you can specify a default value of 0x for the FILESTREAM column:

    alter table CUSTINFO add constraint custdata_def default 0x for CUSTDATA

Creating a default is not required if all access to the FILESTREAM data is going to be done through T-SQL. However, if you will be using Win32 streaming clients to upload file contents into the FILESTREAM column, the file needs to exist already. Without the default to ensure creation of a “blank” file for each row, new files would have to be created first by inserting contents directly through T-SQL before they could be accessed via Win32 client streaming applications.
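The effect of the default can be checked with a short T-SQL sketch. It assumes the CUSTINFO table from Listing 42.19 exists and that the custdata_def default shown above has been added; the DataBytes and FileState column aliases are illustrative names only.

```sql
-- Insert a row without supplying CUSTDATA. With the custdata_def default
-- in place, the column receives 0x (a zero-length, non-NULL value), so a
-- blank FILESTREAM file should be created for the row.
INSERT CUSTINFO (ID) VALUES (NEWID())
GO

-- DATALENGTH reports 0 for a zero-length value and NULL for a NULL value,
-- which distinguishes "blank file" rows from "no file" rows.
SELECT ID,
       DATALENGTH(CUSTDATA) AS DataBytes,
       CASE WHEN CUSTDATA IS NULL THEN 'no file'
            ELSE 'file exists'
       END AS FileState
FROM CUSTINFO
GO
```

If the default has not been created, the same INSERT stores NULL in CUSTDATA and the row is reported as having no file.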
To insert data into a FILESTREAM column, you use a normal INSERT statement and provide a varbinary(max) value to store into the FILESTREAM column:

    INSERT CUSTINFO (ID, CUSTDATA)
    VALUES (NEWID(), CONVERT(VARBINARY(MAX), REPLICATE ('CUST DATA', 100000)))

To retrieve FILESTREAM data, you can use a simple T-SQL SELECT statement, although you may need to convert the varbinary(max) value to varchar to be able to display text data:

    select ID, CONVERT(varchar(40), CUSTDATA) as CUSTDATA
    from CUSTINFO
    go

    ID                                   CUSTDATA
    FA67BF05-51B5-4BA7-A383-7F88DAAE9C49 CUST DATACUST DATACUST DATACUST DATACUST

The preceding examples work fine if the FILESTREAM data is essentially text data. However, neither SQL Server Management Studio nor SQL Server itself has a user interface or native way to stream the contents of an actual file into a varbinary(max) column that has been marked with the FILESTREAM attribute. In other words, if you have a .jpg or .mp3 file that you want to store within SQL Server, there is no native functionality to convert that file’s byte stream into something you could put, for example, into a simple INSERT statement. To read or store this type of data, you need to use Win32 to read and write data to a FILESTREAM BLOB. Following are the steps you need to perform in your client applications:

1. Read the FILESTREAM file path.
2. Read the current transaction context.
3. Obtain a Win32 handle and use the handle to read and write data to the FILESTREAM BLOB.

Each cell in a FILESTREAM table has a file path associated with it.
You can use the PathName property to retrieve the file path of a varbinary(max) column in a T-SQL statement:

    DECLARE @filePath varchar(max)
    SELECT @filePath = CUSTDATA.PathName()
    FROM CUSTINFO
    WHERE ID = 'FA67BF05-51B5-4BA7-A383-7F88DAAE9C49'
    PRINT @filePath
    go

    \\LATITUDED830-W7\FILESTREAM\v1\Customer\dbo\CUSTINFO\CUSTDATA\FA67BF05-51B5-4BA7-A383-7F88DAAE9C49

Next, to obtain the current transaction context and return it to the client application, use the GET_FILESTREAM_TRANSACTION_CONTEXT() T-SQL function:

    BEGIN TRAN
    SELECT GET_FILESTREAM_TRANSACTION_CONTEXT()

After you obtain the transaction context, the next step in your application code is to obtain a Win32 file handle to read or write the data to the FILESTREAM column. To obtain a Win32 file handle, you call the OpenSqlFilestream API. The returned handle can then be passed to any of the following Win32 APIs to read and write data to a FILESTREAM BLOB:

• ReadFile
• WriteFile
• TransmitFile
• SetFilePointer
• SetEndOfFile
• FlushFileBuffers

To summarize, the steps you perform to upload a file to a FILESTREAM column are as follows:

1. Start a new transaction and obtain the transaction context ID that can be used to initiate the Win32 file-streaming process.
2. Use a SqlDataReader to pull back the full path (in SQL Server) of the FILESTREAM file to which you will be uploading data.
3. Initiate a straight file-streaming operation using the System.Data.SqlTypes.SqlFileStream class.
4. Create a new System.IO.FileStream object to read the file locally and buffer bytes along to the SqlFileStream object until there are no more bytes to transfer.
5. Close the transaction.

NOTE
Because you’re streaming file contents via a Win32 process, you need to use integrated security to connect to SQL Server because native SQL logins can’t generate the needed security tokens to access the underlying file system where the FILESTREAM data is stored.
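The server-side half of these steps can be sketched in T-SQL as a single batch. This is only an outline under stated assumptions: it reuses the CUSTINFO table and the sample ID value from the earlier examples, the FilePath and TxContext aliases are illustrative names, and the client-side streaming itself (OpenSqlFilestream or SqlFileStream) is not shown because it happens in application code on the same connection and transaction.

```sql
-- Step 1: open the transaction that scopes the Win32 streaming access.
BEGIN TRANSACTION

-- Steps 2 and 3 (server side): return the logical path of the FILESTREAM
-- cell together with the transaction context token. The function returns
-- NULL if no transaction is active.
SELECT CUSTDATA.PathName()                  AS FilePath,
       GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext
FROM CUSTINFO
WHERE ID = 'FA67BF05-51B5-4BA7-A383-7F88DAAE9C49'

-- The client application would now pass FilePath and TxContext to
-- OpenSqlFilestream (or SqlFileStream) and stream the bytes.

-- Final step: end the transaction once the client has closed its handle.
COMMIT TRANSACTION
```

Returning both values from one SELECT saves a round trip, but fetching them in two separate statements inside the same transaction works the same way.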
To retrieve data from a FILESTREAM column to a file on the client, you follow essentially the same steps as you do for inserting data; however, you instead pull data from a SqlFileStream object into a buffer and push it into a local FileStream object until there are no more bytes left to retrieve.

TIP
Refer to the “Managing FILESTREAM Data by Using Win32” topic in SQL Server 2008 R2 Books Online for specific C#, Visual Basic, and Visual C++ application code examples showing how to obtain a Win32 file handle and use it to read and write data to a FILESTREAM column.

Sparse Columns

SQL Server 2008 provides a new space-saving storage option referred to as sparse columns. Sparse columns can provide optimized and efficient storage for columns that contain predominantly NULL values. The NULL values require no storage space, but these space savings come at a cost of increased space for storing non-NULL values (an additional 2–4 bytes of space is needed for non-NULL values). For this reason, Microsoft recommends using sparse columns only when the space saved is at least 20% to 40%. However, the consensus rule of thumb that is emerging from experience with sparse columns is that it is best to use them only when more than 90% of the values are NULL.

There are a number of restrictions and limitations regarding the use of sparse columns, including the following:

• Sparse columns cannot be defined with the ROWGUIDCOL or IDENTITY properties.
• Sparse columns cannot be defined with a default value.
• Sparse columns cannot be used in a user-defined table type.
• Although sparse columns allow up to 30,000 columns per table, the total row size is reduced to 8,018 bytes due to the additional overhead for sparse columns.
• If a table has sparse columns, you can’t compress it at either the row or page level.
• Columns defined with the geography, geometry, text, ntext, timestamp, image, or user-defined data types cannot be defined as sparse columns.
• You can’t define varbinary(max) fields that use FILESTREAM storage as sparse columns.
• You can’t define a computed column as sparse, but you can use a sparse column in the calculation of a computed column.
• A table cannot have more than 1,024 non-sparse columns.

Column Sets

Column sets provide an alternative way to view and work with all the sparse columns in a table. The sparse columns are aggregated into a single untyped XML column, which simplifies working with many sparse columns in a table. The XML column used for a column set is similar to a computed column in that it is not physically stored, but unlike computed columns, it is updateable.

There are some restrictions on column sets:

• You cannot add a column set to a table that already has sparse columns.
• You can define only one column set per table.
• Constraints or default values cannot be defined on a column set.
• Computed columns cannot contain column set columns.
• A column set cannot be changed; you must delete and re-create the column set. However, sparse columns can be added to the table after a column set has been defined and are automatically included in the column set.
• Distributed queries, replication, and Change Data Capture do not support column sets.
• A column set cannot be part of any kind of index, including XML indexes, full-text indexes, and indexed views.

NOTE
Sparse columns and column sets are defined by using the CREATE TABLE or ALTER TABLE statements. This chapter focuses on using and working with sparse columns. For more information on defining sparse columns and column sets, see Chapter 24, “Creating and Managing Tables.”

Working with Sparse Columns

Querying and manipulating sparse columns is the same as for regular columns, with one exception described later in this chapter.
There’s nothing functionally different about a table that includes sparse columns, except the way the sparse columns are stored. You can still use all the standard INSERT, UPDATE, and DELETE statements on tables with sparse columns, just as you would on a table that doesn’t have sparse columns. You can also wrap operations on a table with sparse columns in transactions as usual.

To work with sparse columns, let’s first create a table with sparse columns. Listing 42.20 creates a version of the Product table in the AdventureWorks2008R2 database and then populates the table with data from the Production.Product table. The Color, Weight, and SellEndDate columns are defined as sparse columns (the source data contains a significant number of NULL values for these columns). These columns are also defined as part of the column set, ProductInfo.

LISTING 42.20 Creating a Table with Sparse Columns

    USE AdventureWorks2008R2
    GO
    CREATE TABLE Product_sparse
    (
        ProductID INT NOT NULL PRIMARY KEY,
        ProductName NVARCHAR(50) NOT NULL,
        Color NVARCHAR(15) SPARSE NULL,
        Weight DECIMAL(8,2) SPARSE NULL,
        SellEndDate DATETIME SPARSE NULL,
        ProductInfo XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
    )
    GO
    INSERT INTO Product_sparse
        (ProductID, ProductName, Color, Weight, SellEndDate)
    SELECT ProductID, Name, Color, Weight, SellEndDate
    FROM Production.Product
    GO

You can reference the sparse columns in your queries just as you would any type of column:

    SELECT ProductID, ProductName, Color, Weight, SellEndDate
    FROM Product_sparse
    where ProductID < 320
    go

    ProductID ProductName            Color  Weight  SellEndDate
    1         Adjustable Race        NULL   NULL    NULL
    2         Bearing Ball           NULL   NULL    NULL
    3         BB Ball Bearing        NULL   NULL    NULL
    4         Headset Ball Bearings  NULL   NULL    NULL
    316       Blade                  NULL   NULL    NULL
    317       LL Crankarm            Black  NULL    NULL
    318       ML Crankarm            Black  NULL    NULL
    319       HL Crankarm            Black  NULL    NULL

Note, however, that if you use SELECT * in a query and the table has a column set defined for the sparse
columns, the column set is returned as a single XML column instead of the individual columns:

    SELECT *
    FROM Product_sparse
    where ProductID < 320
    go

    ProductID ProductName            ProductInfo
    1         Adjustable Race        NULL
    2         Bearing Ball           NULL
    3         BB Ball Bearing        NULL
    4         Headset Ball Bearings  NULL
    316       Blade                  NULL
    317       LL Crankarm            <Color>Black</Color>
    318       ML Crankarm            <Color>Black</Color>
    319       HL Crankarm            <Color>Black</Color>

You need to explicitly list the columns in the SELECT clause to have the result columns returned as relational columns.

When the column set is defined, you can also operate on the column set by using XML operations instead of relational operations. For example, the following code inserts a row into the table by using the column set and specifying a value for Weight as XML:

    INSERT Product_sparse(ProductID, ProductName, ProductInfo)
    VALUES(5, 'ValveStem', '<Weight>.12</Weight>')
    go
    SELECT ProductID, ProductName, Color, Weight, SellEndDate
    FROM Product_sparse
    where ProductID = 5
    go

    ProductID ProductName  Color  Weight  SellEndDate
    5         ValveStem    NULL   0.12    NULL

Notice that NULL is assumed for any column omitted from the XML value, such as Color and SellEndDate in this example.

When updating a column set using an XML value, you must include values for all the columns in the column set you want to set, including any existing values. Any values not specified in the XML string are set to NULL. For example, the following query sets both Color and Weight where ProductID = 5:

    Update Product_sparse
    set ProductInfo = '<Color>black</Color><Weight>.20</Weight>'
    where ProductID = 5
    SELECT ProductID, ProductName, Color, Weight, SellEndDate
    FROM Product_sparse
    where ProductID = 5
    go

    ProductID ProductName  Color  Weight  SellEndDate
    5         ValveStem    black  0.20    NULL
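When only one sparse column needs to change, an ordinary relational UPDATE avoids having to restate the other values in the XML. The following is a minimal sketch, assuming the Product_sparse table from Listing 42.20 and the ProductID 5 row from the preceding examples; the new weight of 0.25 is an arbitrary illustrative value.

```sql
-- Update a single sparse column by name. Unlike an update through the
-- ProductInfo column set, this leaves the other sparse columns
-- (Color, SellEndDate) at their current values.
UPDATE Product_sparse
SET Weight = 0.25
WHERE ProductID = 5
GO

-- List the columns explicitly so they come back as relational columns
-- rather than as the single XML column set.
SELECT ProductID, ProductName, Color, Weight, SellEndDate
FROM Product_sparse
WHERE ProductID = 5
GO
```

Updating through the column set remains convenient when many sparse columns are set at once, as long as every value to be kept is included in the XML.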