298 Hands-On Microsoft SQL Server 2008 Integration Services

13. Expand the Iterating October Opportunities container and select the Mailing Opportunities task. Note that the events on the Details tab remain the same, because the logging options have been inherited from the parent. Click twice on the check box to the left of Mailing Opportunities to place a highlighted tick in the check box and change the LoggingMode to Enabled for this task. As you do that, you will notice that three new events (SendMailTaskBegin, SendMailTaskEnd, and SendMailTaskInfo) have been added to the list in the Details pane. These are the custom log entries provided by the Send Mail task; as you know, many Integration Services objects provide custom log entries for logging information specific to their own functionality.

14. Click Advanced and you will see the information fields that will be logged when the selected events occur. These log schema fields have been described earlier in this chapter. Select the OnProgress, SendMailTaskBegin, SendMailTaskEnd, and SendMailTaskInfo events only, as shown in Figure 8-3. Note that all the fields of the log schema are selected by default.

15. Go to the Providers and Logs tab. Select the check box for SMTP task level log to Text file, because you want to log events to a text file for the Mailing Opportunities task. Click OK to close the Configure SSIS Logs dialog box. You’ve successfully configured logging for this package.

Figure 8-2 Enabling events to log

Chapter 8: Advanced Features of Integration Services 299

Exercise (Analyze Integration Services Logs)

Finally, let us execute the package to record log entries and check them using SQL Server Management Studio.

16. Press the F5 key to execute the package. You will see the components changing color from yellow to green. When the package execution completes, press SHIFT-F5 to return to design mode.

17. Run SQL Server Management Studio and connect to the database engine.
Run the following query in the Query pane:

SELECT * FROM [Campaign].[dbo].[sysssislog]

Look through the results and particularly note the OnPreExecute event and the OnPostExecute event of Iterating October Opportunities. Note the start time and end time of these events.

18. Browse to the C:\SSIS\Projects\Contacting Opportunities with Logging folder and open the SMTPTaskLevel.log file using Notepad. Scroll to the end of the file and note that the Send Mail task has been initiated and completed seven times, corresponding to the seven messages it sent out. These events occurred between the start time and end time of the Iterating October Opportunities container.

Figure 8-3 Configuring unique log entries for the Mailing Opportunities task

Review

You’ve enabled and configured logging options in this exercise, with unique logging options on the Mailing Opportunities task. You’ve seen how the inherited logging options work and that they require no additional configuration. However, if you need to save disk space and restrict logging to the required events only, you can do so by configuring unique log entries, as you have done with the Mailing Opportunities task. One interesting thing to note is that events such as OnError and OnWarning that are not captured at the task level do not get lost; instead, they travel up and can be captured at the container level. This means that you do not need to log everything at every component level. You can choose discrete levels in the package where you want to capture errors and warnings. This helps maintain performance and keeps the disks free of unwanted logs. As a best practice, you should log heavily during the development and debugging phases and reduce logging to the minimum required level for normal production runs.
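To review which events were logged for each source, and when, the log table queried in Step 17 can be summarized rather than read row by row. The following is only a sketch: it assumes the package logs to the SQL Server log provider in the Campaign database, that the table has the standard sysssislog columns (source, event, starttime, endtime), and that the listed events were selected for logging. Adjust the event list to whatever you actually enabled.

```sql
-- Sketch: count logged events per source and show when they occurred.
-- Assumes the same log table that Step 17 queries.
SELECT [source],
       [event],
       COUNT(*)         AS EventCount,
       MIN([starttime]) AS FirstLogged,
       MAX([endtime])   AS LastLogged
FROM [Campaign].[dbo].[sysssislog]
WHERE [event] IN ('OnError', 'OnWarning', 'OnPreExecute', 'OnPostExecute')
GROUP BY [source], [event]
ORDER BY [source], [event];
```

Events that were not selected for logging simply do not appear in the result, which makes this a quick way to confirm that a reduced logging configuration still records what you need.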
Some developers log to the Windows Event log in the development environment, because this is very simple to set up and they don’t have to manage or clean up log files.

Transactions in Integration Services Packages

Transactions enable you to implement data integrity and consistency within your packages. In the DBMS world, a transaction is considered an atomic piece of work that must be completed entirely or rolled back altogether. Sometimes it becomes imperative to use transactions when working with database systems. For example, in banking systems, when somebody transfers money from one account to another, the money has to be debited and credited in a single transaction: either both the debit and the credit happen, or both are rolled back if the system is unable to complete the transaction at any stage. In this way, you can be sure that data integrity will be maintained. Transactions are required to possess the ACID properties (Atomicity, Consistency, Isolation, and Durability), without which the integrity of the transacted data cannot be guaranteed.

Integration Services also allows you to use transactions within a package or across multiple packages. Containers within an Integration Services package can be configured to use transactions by setting the TransactionOption property to Required or Supported in the Transactions section of the container’s Properties window. Because tasks are enclosed within task host containers, tasks can also be configured to use transactions using the TransactionOption property. This property can have one of three possible values:

• Required. The container must participate in a transaction when the TransactionOption is set to Required. That is, if the parent of this container has already started a transaction, this container will join that transaction; if no transaction exists, this container will start a new transaction.
• Supported. The container will participate in a transaction if one already exists when the TransactionOption is set to Supported. However, if a transaction doesn’t exist, it will not start a new transaction.

• NotSupported. The container will not participate in a transaction even if a transaction exists. That is, containers with TransactionOption set to NotSupported will neither start a transaction nor join an existing transaction.

You can use transactions in a variety of scenarios. Because Integration Services tasks let you program against SQL Server, you can use the native transactions supported by SQL Server. But for most of the work, you will be using the Microsoft Distributed Transaction Coordinator (MSDTC) to support transactions in Integration Services. MSDTC is a Windows service that provides a transaction infrastructure across multiple computer systems or distributed computing environments. You can configure transactions involving tasks in a container, you can involve multiple containers in a transaction, or you can have a transaction spanning multiple packages. These packages can be running on different machines, effectively resulting in a transaction running in a distributed environment.

Multiple transactions can be included in a package, and multiple packages can be run under one transaction. How package transactions run depends on how you configure transactions on the containers and subcontainers and how you configure transactions when the package is run as a child package under the context of an Execute Package task. When a package is run under the context of an Execute Package task, it can inherit the transaction started by the parent package if the Execute Package task and the package are configured to join this transaction. In addition, while configuring multiple transactions within a package, you may have one transaction running inside another.
These nested transactions go hand in hand with the container hierarchy within a package. However, issues arise when you use nested transactions that are not related, that is, when they do not fall within a single parent transaction; this condition can cause tasks not to roll back completely as desired. While configuring transactions within your package, try to keep child transactions within the scope of a single parent transaction and, as a best practice, test thoroughly before deploying to production.

Hands-On: Maintaining Data Integrity with Transactions

You are tasked with making sure that the data is always kept in a consistent state, irrespective of the data sources from which it comes. While trying to identify how the data could become inconsistent, you determine that one of the causes could be the data import process, when part of a record could not be imported. Various packages and processes are importing data, and you want to determine exactly how transactions can protect your data.

Method

In this exercise, you will simulate various data import scenarios and work with transaction options to maintain data integrity. You can configure transactions within a package in various ways, but to help you understand how they work, you will deal with three main cases in which transactions can be used. The main steps involved in this exercise are as follows:

• Create a package and understand how data consistency can be affected by loading operations.
• Case I covers the use of a transaction involving multiple tasks in a single container.
• Case II covers transactions spanning multiple containers.
• Case III covers transactions spanning multiple packages.

Exercise (Create a Simulation Package for Data Consistency Issues)

You begin this exercise by creating NewCustomer, EmailAddress, and Vehicle tables using SQL Server Management Studio.
Then you move on to BIDS and create a new Integration Services project to simulate data consistency issues.

1. Open SQL Server Management Studio, connect to the database engine, open a new query pane, and run the following queries to create three tables:

USE [Campaign]
GO
CREATE TABLE [dbo].[NewCustomer](
    [CustomerID] [varchar](10) NOT NULL,
    [FirstName] [varchar](50) NULL,
    [SurName] [varchar](50) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[EmailAddress](
    [CustomerID] [varchar](10) NOT NULL,
    [Email] [varchar](100) NOT NULL,
    [Type] [varchar](50) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Vehicle](
    [CustomerID] [varchar](10) NULL,
    [VIN] [varchar](20) NOT NULL,
    [Series] [varchar](50) NULL,
    [Model] [varchar](50) NULL
) ON [PRIMARY]
GO

2. Open BIDS and create a new Integration Services project with the following details:

Name: Maintaining data Integrity with Transactions
Location: C:\SSIS\Projects

3. When the new package loads, right-click in the Connection Managers area and choose New OLE DB Connection. In the Configure OLE DB Connection Manager dialog box, choose localhost.Campaign and click OK. You’ve added a connection manager to connect to the Campaign database.

4. Drop an Execute SQL Task from the Toolbox onto the Control Flow surface. Rename this task as Loading NewCustomer.

5. Open the Execute SQL Task Editor by double-clicking its icon. In the Connection field’s drop-down list, choose localhost.Campaign.

6. With SQLSourceType set to Direct Input, click in the SQLStatement field, and then click the ellipsis button that appears in the right corner of the field to open the Enter SQL Query dialog box. Type the following SQL statement in this dialog box:

INSERT INTO NewCustomer (CustomerID, FirstName, SurName)
VALUES ('N501', 'Will', 'Harrison')

Click OK twice to close the Editor.

7. Repeat Steps 4 to 6 to add a second Execute SQL task to your package.
Rename this task Loading EmailAddress and edit it to add the localhost.Campaign connection manager. Close the task after assigning the following SQL statement to it:

INSERT INTO EmailAddress (CustomerID, Email, Type)
VALUES ('N501', 'wharrison@AffordingIT.co.uk', 'Work')

Join the Loading NewCustomer task with this task using an on-success precedence constraint.

8. Repeat Steps 4 to 6 to add a third Execute SQL task to your package. Rename this task Loading Vehicle and edit it to add the localhost.Campaign connection manager. Close the task after assigning the following SQL statement to it:

INSERT INTO Vehicle (CustomerID, VIN, Series, Model)
VALUES ('N501', 'UV123WX456YZ789', 'X11 Series', 'Saloon')

Join the Loading EmailAddress task with this third task using an on-success precedence constraint. Your package should look like the one shown in Figure 8-4.

9. Execute this package by pressing the F5 key. When all the tasks have changed to green and the package has completed execution, press SHIFT-F5 to return to design mode.

10. Switch to SQL Server Management Studio and run the following query:

SELECT n.[CustomerID], [FirstName], [SurName], [Email], [Type], [VIN], [Series], [Model]
FROM [Campaign].[dbo].[NewCustomer] n
LEFT OUTER JOIN [Campaign].[dbo].[EmailAddress] e ON n.CustomerID = e.CustomerID
LEFT OUTER JOIN [Campaign].[dbo].[Vehicle] v ON n.CustomerID = v.CustomerID

You will see that the result contains the customer details, the customer’s e-mail address, and the vehicle details, all as you wanted (see Figure 8-5), regardless of the fact that the data is stored in different tables.

11. Note the TransactionOption property of the package in Figure 8-4, which is set at the default value of Supported. This property has a Supported value for all three tasks as well.
Only containers with TransactionOption set to Required can start a transaction, so no container started a transaction while this package was executed. So far, so good. Now, let’s see what happens if a task fails and doesn’t import a particular row.

Figure 8-4 Creating a simulation package for data consistency checks

Data uploading can fail for a particular record for various reasons. The most common is a data type mismatch or a constraint on the column. If you go to the Campaign database, expand the Vehicle table, and then look in the Columns node in Object Explorer in SQL Server Management Studio, you will see the column properties of the Vehicle table. Note that the VIN field is a mandatory (NOT NULL) field and must have a known value. If no data is received for this field during the import process, that row will not be inserted.

12. Before you proceed, let’s clean up our tables. Run the following queries in the Query pane to delete all the data from the existing tables:

DELETE [Campaign].[dbo].[NewCustomer]
DELETE [Campaign].[dbo].[EmailAddress]
DELETE [Campaign].[dbo].[Vehicle]

13. Switch to BIDS and modify the SQL statement in the Loading Vehicle task by removing the VIN information; your statement should look as follows:

INSERT INTO Vehicle (CustomerID, Series, Model)
VALUES ('N501', 'X11 Series', 'Saloon')

14. After you’ve made the change, press F5 to execute the package again. Keep in mind that you’ve still not implemented a transaction in the package. This time, you will see that the first two tasks execute successfully while the Loading Vehicle task fails and turns red. Stop debugging the package.

Figure 8-5 The data consistency you want to see in your database

15. Switch to SQL Server Management Studio and execute the query written in Step 10 to see the results this time.
You will see that the three fields (VIN, Series, and Model) have NULL values, because the customer and e-mail rows were committed while the vehicle row was rejected. You have discovered how your data can become inconsistent during the loading process.

16. Run the delete SQL statements written in Step 12 to cleanse this data from all the tables.

Exercise (Case I: Avoiding Inconsistency in a Single Container)

In real life, you either commit all the data from all the tasks in the tables or roll back the failing rows, throw those rows out, and deal with them separately. In the following steps, you will see how you can roll back the data by using transactions.

17. Click anywhere on the blank surface of the Control Flow panel and press F4 to open the Properties window. Scroll down the Properties window and locate the Transactions section. Change the TransactionOption value from Supported to Required.

18. Press F5 to execute the package. You will again see that the first two tasks complete successfully whereas the Loading Vehicle task fails as expected. Stop debugging the package.

19. Switch to SQL Server Management Studio and execute the query written in Step 10 in the preceding sequence to see the results. This time, you will see no data at all in the query output. Setting the TransactionOption to Required resulted in the use of a transaction under which all three tasks were executed. As the third task failed, the data loaded by the first two tasks was rolled back.

Exercise (Case II: Transaction Spanning over Multiple Containers)

The preceding exercise demonstrated how you can use a transaction to avoid inconsistency in data during a loading operation. This was quite simple, involving only one container in the package. What if a data-loading package uses multiple containers? Let’s see how that can be handled.

20. Drop two Sequence Containers on the Control Flow surface from the Toolbox. Delete the precedence constraint connecting Loading EmailAddress with Loading Vehicle.

21.
Select Loading NewCustomer and Loading EmailAddress along with the precedence constraint, either by using the mouse or by pressing and holding the CTRL key on the keyboard while clicking the tasks one by one. With these tasks selected, drag and drop them inside the first Sequence container.

22. Drag and drop the Loading Vehicle task inside Sequence Container 1. Drag the green arrow from the Sequence Container and drop it on Sequence Container 1 to join these containers so that the tables are loaded in sequence. (Note, though, that this step is not required for the package to run.) Your package will look like the one shown in Figure 8-6.

23. Your package is now using two sequence containers to load the data. Verify that the two new containers have their TransactionOption set to the default value of Supported and that the Package container’s TransactionOption property still has the Required value set previously.

24. Press F5 to run the package. You will see that the first container with the two tasks in it successfully executes and turns green. However, the second sequence container with the Loading Vehicle task in it fails and turns red. Press SHIFT-F5 to switch back to design mode.

25. Switch to SQL Server Management Studio and run the SQL statement you wrote in Step 10 of the preceding sequence to see what’s been loaded in the database. You will see that no data for the NewCustomer and EmailAddress tables has been added, even though the loading tasks were successful. This is

Figure 8-6 Package consisting of multiple containers
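The all-or-nothing behavior seen in Cases I and II can also be reproduced outside the package with a native SQL Server transaction. The sketch below is an analogy only: it uses T-SQL TRY...CATCH with an explicit transaction rather than the MSDTC transaction that Integration Services starts when TransactionOption is set to Required, and it assumes the three tables created in Step 1 have been emptied as in Step 16.

```sql
USE [Campaign];
GO
BEGIN TRY
    BEGIN TRANSACTION;

    INSERT INTO dbo.NewCustomer (CustomerID, FirstName, SurName)
    VALUES ('N501', 'Will', 'Harrison');

    INSERT INTO dbo.EmailAddress (CustomerID, Email, Type)
    VALUES ('N501', 'wharrison@AffordingIT.co.uk', 'Work');

    -- VIN is omitted on purpose: the column is NOT NULL, so this insert fails
    -- in the same way as the modified Loading Vehicle task.
    INSERT INTO dbo.Vehicle (CustomerID, Series, Model)
    VALUES ('N501', 'X11 Series', 'Saloon');

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo the two successful inserts as well, then report the error.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    PRINT ERROR_MESSAGE();
END CATCH;
GO

-- Should return 0 if the tables were empty beforehand, because the
-- customer row was rolled back together with the failed vehicle insert.
SELECT COUNT(*) AS CustomerRows
FROM dbo.NewCustomer
WHERE CustomerID = 'N501';
```

Restoring the VIN value in the third insert lets the whole batch commit, which mirrors what happens in the package once the Loading Vehicle statement is corrected.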