528 Hands-On Microsoft SQL Server 2008 Integration Services If Not Row.LastName_IsNull Then .Write(Row.LastName) End If .Write(columnDelimiter) If Not Row.Title_IsNull Then .Write(Row.Title) End If .Write(columnDelimiter) If Not Row.SalesAmount_IsNull Then .Write(Row.SalesAmount) End If .Write(columnDelimiter) If Not Row.IsBonusPaid_IsNull Then .Write(Row.IsBonusPaid) End If .Write(columnDelimiter) If Not Row.BonusAmount_IsNull Then .Write(Row.BonusAmount) End If .WriteLine() End With Close the VSTA IDE and the Script Transformation Editor. We will now move on to create a script destination to write into a nonstandard text file. Nonstandard Text File Destination In this part you will write the asynchronous output to a nonstandard text file. Considering that you can now do some of the configurations yourself, this section is kept short and quick. 52. Add a File Connection Manager to create a SalesSummary.txt file in the C:\SSIS\RawFiles folder. 53. Add a Script destination component, rename it as Script Non Standard Destination, and join it with the asynchronous output from the Script Transformation component. 54. Perform the following in the editor: Change the language. c Select both the input columns. c Rename the Input 0 to c DstAsyncInput. Add the SalesSummary.txt Connection Manager and name it as c SalesSummary. Chapter 11: Programming Integration Services 529 55. In the Code-design mode, perform the following: Add the System.IO namespace in the Imports section. c Create two script variables: c Dim cmSalesSummary As String Private textWriter As StreamWriter Add the following code for the AcquireConnections method: c Public Overrides Sub AcquireConnections(ByVal Transaction As Object) Dim connMgr As IDTSConnectionManager100 = Me.Connections.SalesSummary cmSalesSummary = CType(connMgr.AcquireConnection(Nothing), String) End Sub 56. Initialize the textWriter in the PreExecute method: textWriter = New StreamWriter(cmSalesSummary, False) 57. Close the textWriter in the PostExecute Method: textWriter.Close() 58. Last, type the following code in the DstAsyncInput_ProcessInputRow method: With textWriter .Write("This is a Summary Report for Sales") .WriteLine() .WriteLine() .Write("Total Sales Amount: " + Row.TotalSalesAmount.ToString) .WriteLine() .Write("Total Bonus Amount: " + Row.TotalBonusAmount.ToString) .WriteLine() End With 59. Execute the package and check out the output files. You will see the data from synchronous output written into a comma-delimited text file and the data from asynchronous output written into a nonstandard text file (Figure 11-19). Figure 11-19 A nonstandard text file written with the data from asynchronous output 530 Hands-On Microsoft SQL Server 2008 Integration Services Debugging Techniques for Script Component In the previous parts, you’ve used a combination of Row Count transformation and the data viewers to break the data flow and inspect the records visually. While this technique is very helpful in debugging and identifying the component that is causing issues, it fails if you want to know what’s happening within your own developed component. You’ve used breakpoints within your scripts while working with the Script task to step through the code to isolate the error causing rows, but unfortunately the Script component does not support breakpoints. So, in a Script component you’ve to use alternate techniques to capture the information within your code. SSIS provides the following methods to capture information from within your script. Raise informational messages c You can raise messages from within the script using the MessageBox.Show, Console.Write, or Console.WriteLine methods. You will be using these methods during development when you want to find out what rows or values are causing errors. For example, when you use the MessageBox. Show method, the output will be displayed in a modal message box, which you have to click to proceed further. You won’t be using this method for the normal run; otherwise, you will have to keep clicking through the pipeline to progress the package execution. You can test your package by adding the following line in the Script Transformation component in the XfrInput_ProcessInputRow method just before the line where you pass the BonusPaid value to the Row.IsBonusPaid column (refer to Figure 11-20). is will show you the Bonus for each row, and you will have to click OK to progress to the next row. MessageBox.Show("BonusAmount: " + Bonus.ToString) Raise events c You can raise various events to let the package know about some information, warnings, or errors raised within the script. e Script component provides the following methods that you use to raise different events. Event Method Description FireCustomEvent Fires a user-defined custom event—e.g., you may want to capture certain data-related attributes. FireError Fires an error event. FireInformation Fires an informational message—e.g., information for auditing purposes. FireProgress Fires the on-progress event. FireWarning Fires a warning that is less critical than an error—e.g., a warning for a varchar(50) column is being mapped to varchar(10), which will work as long as the data doesn’t exceed ten characters. Chapter 11: Programming Integration Services 531 As you know from the Script task, these methods were exposed by the event property of the Dts global object, but the Script component doesn’t have any such object and though the event-firing methods have the same names, they are the methods of the IDTSComponentMetaData100 interface exposed by the ComponentMetaData property of the ScriptMain class. This also means the usage will be different than what you used in the Script task. Once the events have been raised by the Script component, the package event handlers then manage the event notifications. You can build your own event handlers to respond to a certain event—e.g., you can change the course of a package work flow in response to an event raised by the Script component. To raise an event, you need to declare a variable for this interface as Dim myMetadata as IDTSComponentMetaData100 and then add the following lines in the code where you earlier put the MessageBox (Figure 11-20). If Bonus > 1000 Then myMetaData = Me.ComponentMetaData myMetaData.FireInformation(0, "Script Transformation Component", "Bonus paid: " + Bonus.ToString, String.Empty, 0, False) End If Figure 11-20 Raising events in a Script component 532 Hands-On Microsoft SQL Server 2008 Integration Services This will raise informational messages to tell the package the amount of bonuses paid whenever the bonus is more than 1000. You can see these messages displayed in the output window and in the Progress tab. Refer to Books Online to know how different methods can be written and how they print their messages. Log information c Much as when raising events, you can log detailed information about the execution progress, results, and errors. You can also raise events first and then log those events later. e main difference between first raising events and then logging compared to direct-logging to a log provider is that the direct logging method doesn’t provide you an opportunity to build event handlers to respond to the events. If you want to use a log directly without using any other log event, then you must use the ScriptComponentLogEntry event while configuring logging for the Data Flow task as shown in Figure 11-21. e ScriptComponentLogEntry Figure 11-21 Custom log entry for the Script component Chapter 11: Programming Integration Services 533 event gets added to the Data Flow task event in the Log Details tab when you add a Script component into the Data Flow task. After selecting this log entry, use the Log method within the ScriptMain class to log directly to the log providers enabled in the package. For example, if you add the following code in the Script Transformation component instead of the information event, the package will log the lines where the bonus is more than 1000: Dim bt(0) As Byte If Bonus > 1000 Then Me.Log(Bonus.ToString, 0, bt) End If Review You’ve used the Script component to configure a source, a transformation, and a destination in this exercise. You’ve also learned the different types of outputs—i.e., synchronous and asynchronous outputs—that you can configure for a Script Transformation component. The general steps you followed to create any of these components is to configure the component first in the metadata design mode and then in the code design mode to write custom code for the processing rules. Do not underestimate the power of metadata design mode, as the settings you use in this mode are used to create code for you in the three project items—BufferWrapper.vb, ComponentWrapper.vb, and main.vb. While the code is auto-generated for both the BufferWrapper.vb and ComponentWrapper.vb items whenever you make a change, this does not happen for the main.vb project item. The code for the main.vb project item is auto-generated only once, when you invoke the code design mode for the first time. Though you can always write the code to fix anything that has been missed by the auto-generated code, it will be far easier and effective to let the auto-generation engine work for you, especially during the initial days of your working with the Script component. Also, make sure you understand the differences between a Script task and the Script component, especially how the different properties and methods are exposed within both the objects. For example, the variables and the connections are exposed by the Dts object in the Script task, while they are exposed using strongly typed accessor properties within the Script component. Last, the debugging techniques are different in both the objects—while you can use breakpoints to debug a Script task, you’ve to use alternate methods to debug a Script component such as MessageBoxes, data viewers, and events. 534 Hands-On Microsoft SQL Server 2008 Integration Services Summary You’ve worked with the object model of Integration Services in this chapter and learned the different options it provides to extend an SSIS package. Based on your problem and the skills you have in-house, you can choose an appropriate method from scripting, developing custom objects, or programming packages from scratch. Later you worked with the Script task and the Script component and understood the various methods and properties provided by their object models. You also learned that though you have to write code differently for both the objects to use their methods, they provide the same functions, such as access to package variables or connection managers and the ability to raise events and log information. There are many samples contained in the Books Online that can enhance your understanding about the workings of SSIS object model. Also, check out the CodePlex site as well to see some rich programming content to hone your programming skills. Data Warehousing and SQL Server 2008 Enhancements Chapter 12 In This Chapter c The Need for a Data Warehouse c Data Warehouse Design Approaches c Data Warehouse Architecture c Data Warehouse Data Models c Loading a Dimension Using a Slowly Changing Dimension c Data Model Schema c Building a Star Schema c SQL Server 2008 R2 Features and Enhancements c SQL Server 2008 R2 Data Warehouse Editions c SQL Server 2008 R2 Data Warehouse Solutions c SQL Server 2008 R2 Data Warehouse Enhancements c Summary 536 Hands-On Microsoft SQL Server 2008 Integration Services N ow that you have gone through all the development chapters on Integration Services, it is time to take a slight digression to understand the facilities that are available outside SSIS that can be very helpful in developing a well-performing data operations and management system. Soon after developers start using an ETL tool such as Integration Services, they will find themselves working on some operation of a data mart or a data warehouse. It is therefore necessary that you understand the data warehousing and the processes involved along with the tools and utilities that are available in SQL Server 2008, which can be exploited to develop data warehousing processes. You can use the tools and utilities provided in SQL Server 2008 to enhance an SSIS package or replace some of the package functionality so that overall solution performs well. This chapter is primarily split in two parts, with first part focusing on data warehousing and the second part on SQL Server 2008 database engine enhancements, such as change data capture (CDC), backup compression, and the MERGE statement, along with two new editions specially targeted to data warehouses. The purpose of this chapter is not to cover how to use or configure the new functionalities, as it would be too great a deviation from the subject of this book, but to know how you can use them alongside SSIS to best utilize the toolset you have. However, if I succeed in my little effort to induce enough interest in you in data warehousing, there are tons of books and other online material available that I would recommend for you to check out. The Need for a Data Warehouse As a business starts, it needs to capture information and record transactions to support business needs. The applications that help to serve this need are called line of business (LOB) applications. The systems used to support line of business applications are referred to as online transactional processing systems. OLTP systems are designed to respond very quickly for accessing and recording small chunks of information. Generally, lots of users are connected to such systems performing transactions to support activities such as sales, billing, and order management. Technically, OLTP systems are highly normalized to capture the current state of information and not to keep the historical data. As the business grows, the business managers need some reporting out of the information systems to take business forward, for example, how many sales yesterday or the how many deliveries completed. These kinds of reports are called operational reporting and to some extent can be served from OLTP systems. However, when the business has grown enough and is in such a good position that the OLTP systems are running busy most of the time, you can’t run reports on them. Running reports touches more data and requires more resources that affect the performance of already-busy OLTP systems and can result in lost sales. So, the IT extracts data from OLTP systems into new systems so as to serve reports. As the report recipients generally do not want to Chapter 12: Data Warehousing and SQL Server 2008 Enhancements 537 look at detailed information related to individual transactions, it makes sense to extract data to only the lowest level of granularity required. Think of a web-based business that allows customers to buy products online. The underlying OLTP system keeps information about all the activities of the customers that may not be important for creating business reports. For example, a customer might have added three similar types of products in the basket, but while checking out removed two of the products from the basket after looking at the delivery charges for each of the items. While this data might be interesting for purposes such as data mining or to understand customer behavior, it is not very interesting for the business manager, who is concerned about the sales of the day only. So, while taking out data from an OLTP system, data is cleaned at the individual sales level, removing unwanted details. This data is still the detailed data as per requirements of the business manager and is kept more or less in the same format as that of the source OLTP system. Furthermore, this data might be combined with data from other source systems to create a bigger picture. In our example, consider joining the sales information with the information coming from the customer repository system to know who is an active buyer. These systems that keep cleaned, subject-oriented information at a detailed level and collect this information in real- or near-real-time mode are called operational data stores, or ODS in short. Once the business creates an ODS system, they look forward to get the right information at the right time to make well-informed decisions. However, an ODS is limited by its very definition and cannot answer the questions that are spread across the enterprise or are related to multiple departments. This is the beginning of the need to have a decision support system (DSS) within the organization. As an organization grows further where it becomes important to improve business performance and make strategically right decisions, more complex questions are being put to information systems. These complex questions generally contain comparisons with a previous period such as sales this quarter compared to sales in the same quarter last year. This creates a need to keep data over longer periods of time. Some businesses that fall under strict regulations—e.g., the financial services industry—may be required to keep data as long as five to ten years. Keeping data over longer periods is not just to comply with regulations, but also to get some answers that otherwise would not be possible. A business dependent on sales around Christmas day might ask the sales comparison over a week not with the last year, but with the year when Christmas was on the same day of the week (say Thursday), previously. Also, as the organization evolves and starts to sell multiple products across multiple zones and countries, another need arises to see the single version of truth. This will mean that the data from multiple source systems needs to be collected in a common place and business rules need to be applied to collate the data to create an information store that can answer the complex questions across countries, currencies, products, departments, and time. The operational reports created on top . a Star Schema c SQL Server 2008 R2 Features and Enhancements c SQL Server 2008 R2 Data Warehouse Editions c SQL Server 2008 R2 Data Warehouse Solutions c SQL Server 2008 R2 Data Warehouse. MessageBoxes, data viewers, and events. 534 Hands-On Microsoft SQL Server 2008 Integration Services Summary You’ve worked with the object model of Integration Services in this chapter and learned the. Enhancements c Summary 536 Hands-On Microsoft SQL Server 2008 Integration Services N ow that you have gone through all the development chapters on Integration Services, it is time to take