Hands-On Microsoft SQL Server 2008 Integration Services
Chapter 4: Integration Services Control Flow Containers

11. You are now ready to start debugging: choose Start Debugging from the Debug menu or press F5. The For Loop Container turns yellow and package execution stops at the Execute SQL task. Open the Locals window from the bottom left of the BIDS screen. If you don’t find it there, you can open it with the Debug | Windows | Locals menu command. Expand the variables and scroll down to locate the User::MonthCounter variable. Note that the value is set to 1. Now press F5 to continue processing. The Execute SQL task runs and turns green. The For Loop Container then starts the second iteration by incrementing the value of User::MonthCounter to 2, and the package breaks again before executing the Delete monthly records task, as shown in Figure 4-11. Note that the value of the User::MonthCounter variable is now 2.

12. Switch to SQL Server Management Studio and run the following SQL command to see the records left in the Sales table.

SELECT * FROM [Campaign].[dbo].[Sales]

Note that the records belonging to month 1 have been deleted, leaving only 22 records in the table.

Figure 4-9 Setting an input parameter for Execute SQL task

13. Keep running the package to completion while checking the variable values and the records deleted from the Sales table. You will notice that the tasks turn from yellow to green and, in particular, that the Delete monthly records Execute SQL task inside the For Loop Container turns yellow and green in turn, demonstrating that this task runs many times. To be precise, it runs 12 times, thus deleting one year’s worth of data. Once the package completes execution, stop debugging the package by pressing Shift-F5 and click the Execution Results tab to see how the package execution progressed (Figure 4-12).
Look in the middle of the window for the Looping for deleting monthly data section, which lists the For Loop Container execution results. Notice the Start (12), then 12 lines specifying the query task 100 percent complete, and then Stop (12). This signifies that the “Delete monthly records” task ran 12 times successfully and that, on the thirteenth pass, the For Loop Container evaluation expression evaluated to False, causing the process to exit from the For Loop Container iterations.

14. Press Ctrl-Shift-S to save all the objects in the package. Close the project and exit from BIDS. Switch to SQL Server Management Studio and verify that, at the end of the package, all the records have been deleted from the table.

Figure 4-10 Setting an OnPreExecute event break condition on the Execute SQL task

Review
In this hands-on exercise, you learned how to use and configure the For Loop Container. You used the For Loop Container to delete data from a table on a month-by-month basis. Whichever way you use the For Loop Container, the bottom line is that it lets you add iteration, that is, looping functionality, to your packages whenever you need it. If you want to run this exercise a second time, you will have to restore the Sales table to its original state, which you can do by using the Sales_Original table, which contains an initial copy of the Sales table.

Figure 4-11 Breaking the package and monitoring the values of variables

Sequence Container
The Sequence Container helps organize a complex package with several tasks by allowing logically related tasks to be grouped together. A Sequence Container is a subset of the package control flow containing tasks and containers within it.
You can organize clusters of tasks with Sequence Containers and hence divide the package into smaller containers, rather than dealing with one huge container (Package Container) with lots of tasks.

It is worth noting here that a similar grouping feature is also provided in the Control Flow of an SSIS package. If you have a very complex package that consists of several tasks and containers, you may sometimes find it hard to locate the tasks that you want to work on; in such cases you may choose to group certain tasks together to simplify the visual look and feel of the Control Flow area. You can group the tasks and containers together by first selecting them and then right-clicking one of them to select the Group option. Once grouped together, these groups can be collapsed and expanded as needed to free up the working area on the desktop. However, note that the grouping feature is a design-time feature only and is provided to simplify the visual display of the Control Flow surface. Such groups have no properties or configurations assigned to them, and they do not affect the execution or run-time behavior of a package.

Figure 4-12 Execution results of the Deleting data month-by-month package

On the other hand, a Sequence Container is a very powerful object that allows you to create multiple separate Control Flows in a package and not only group tasks and containers but also affect their run-time behavior with its own properties. For example, you can disable a Sequence Container to avoid execution of that section of the package, while you can’t do the same with the simple grouping feature. While developing your Integration Services project, you can add this container from the Toolbox into the Control Flow of your package.
When you double-click the container, you won’t see any user interface popping up where you can configure it; rather, the context is changed to the Properties window. The Sequence Container has no custom user interface and can be configured using its Properties window; thus it behaves like a Package Container. You can build your control flow within the Sequence Container by dragging and dropping tasks and containers into it. Following are some of the uses for which Sequence Containers can be helpful:

• Organizing a package by logically grouping tasks (tasks focusing on one functional area of business) makes the package easy to work with. You may choose to run only one subset of the package, depending upon the need.
• Setting properties on a Sequence Container may apply those settings to multiple tasks within the container; this allows you to configure properties in one place instead of on individual tasks, making management easier. For example, you can use the TransactionOption property to manage the transaction support for all the tasks in the Sequence Container, that is, if one task fails, all tasks fail. This means you can build a rollback facility for one piece of logic.
• Such grouping of tasks makes it easy to debug the package. You can disable a Sequence Container to switch off functionality for some business area and focus on debugging the other areas.
• Sometimes you may need to isolate variables from certain tasks. In such cases, you can limit the scope of variables to a group of tasks and containers by keeping them inside a Sequence Container.

Task Host Container
You have worked with the package, the Foreach Loop Container, the For Loop Container, and the Sequence Container in BIDS by using their Editors or Properties windows.
The benefits of having a container design in SSIS include sharing of variables between parent and child containers and being able to manage the events happening within a child container. The Task Host Container is not available in the BIDS Toolbox as a container, so you can’t configure it directly and drop tasks in it; rather, it is a wrapper around each Control Flow task. The Task Host Container encapsulates each task when the task is embedded in a workflow on the designer surface. Encapsulating a task enables the task to use features of a container, such as Event Handlers. This container is configured when you configure the properties of the task it encapsulates.

Summary
The workflow in Integration Services consists of containers. Two containers, For Loop and Foreach Loop, provide the repeating logic, which, before SSIS, was available only in procedural languages. With the advent of SSIS, you have at your disposal the For Loop Container to apply repeating logic to the tasks contained within it, the Foreach Loop Container to apply the repeating logic for each item in a collection, and the Sequence Container to group tasks and containers, allowing you to have multiple control flows by having multiple Sequence Containers under one package.

At the beginning of the chapter, when you learned about the Package Container, you read that within the Integration Services architecture design, a package sits at the top of the container hierarchy. Yet Integration Services provides a facility to embed a package in another (parent) package by wrapping the (child) package in a wrapper task called the Execute Package task. You will study this task and do a Hands-On exercise for the Execute Package task in the next chapter. Finally, you also learned that each task is treated like a container in Integration Services, as the Task Host Container encapsulates each task. This allows tasks to inherit the benefits of containers.
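As a recap of the For Loop Container exercise in this chapter, the following Python sketch mimics the loop's three expressions (InitExpression, EvalExpression, AssignExpression) around a single delete statement. It is only an illustration of the control logic, not SSIS code: the in-memory SQLite database, the sale_month column, the sample rows, and the run_for_loop function are all assumptions invented for this example, since the exercise's actual table definition and DELETE statement are not reproduced here.

```python
import sqlite3


def run_for_loop(conn, first_month=1, last_month=12):
    """Mimic a For Loop Container: initialize, evaluate, run the task, assign."""
    executions = 0
    month_counter = first_month          # InitExpression: counter starts at 1
    while month_counter <= last_month:   # EvalExpression: checked before each pass
        # Stand-in for the "Delete monthly records" Execute SQL task, with the
        # counter passed in as an input parameter (hypothetical column name).
        conn.execute("DELETE FROM Sales WHERE sale_month = ?", (month_counter,))
        executions += 1
        month_counter += 1               # AssignExpression: increment after each pass
    return executions, month_counter


# Hypothetical sample data: two rows for each of the 12 months.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (id INTEGER PRIMARY KEY, sale_month INTEGER)")
conn.executemany(
    "INSERT INTO Sales (sale_month) VALUES (?)",
    [(month,) for month in range(1, 13) for _ in range(2)],
)

executions, final_counter = run_for_loop(conn)
remaining = conn.execute("SELECT COUNT(*) FROM Sales").fetchone()[0]
print(executions, final_counter, remaining)
```

In this sketch the delete runs 12 times, and the loop exits when the counter reaches 13 and the evaluation condition becomes false, which parallels the Start (12) and Stop (12) entries seen in the Execution Results tab.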
Chapter 5
Integration Services Control Flow Tasks

In This Chapter
• Categories of Control Flow Tasks
• Control Flow Tasks in Detail
• FTP Task
• Execute Process Task
• File System Task
• Web Service Task
• XML Task
• Execute SQL Task
• Bulk Insert Task
• Message Queue Task
• Execute Package Task
• Send Mail Task
• WMI Data Reader Task
• WMI Event Watcher Task
• Transfer Database Task
• Transfer Error Messages Task
• Transfer Jobs Task
• Transfer Logins Task
• Transfer Master Stored Procedures Task
• Transfer SQL Server Objects Task
• Back Up Database Task
• Check Database Integrity Task
• Execute SQL Server Agent Job Task
• Execute T-SQL Statement Task
• History Cleanup Task
• Maintenance Cleanup Task
• Notify Operator Task
• Rebuild Index Task
• Reorganize Index Task
• Shrink Database Task
• Update Statistics Task
• Summary

Now that you understand the architecture of Integration Services, how it uses system and user variables, and how the control flow containers can be used in the workflow of a package, it is time to learn more about the control flow tasks that can be combined into a complex workflow to perform a sensible piece of work. This chapter is divided into two main parts: the first covers tasks that play a direct role in building up workflow for your packages, and the second covers tasks that are designed to perform SQL Server maintenance, though they can also be used in the workflow. You have already used some of these control flow tasks, such as the Execute SQL task and the Send Mail task, in the packages you have built so far.

The tasks in SQL Server Integration Services (SSIS) define the unit of work necessary to create the control flow in a package. This unit of work can comprise downloading files, moving files, creating folders, running other processes and packages, sending notification mails, and the like.
You will be using some of these tasks to build a couple of control flow scenarios to demonstrate how these components can be used together. SSIS also provides a facility to create your own custom tasks if the existing control flow tasks don’t fit the bill. Integration Services supports the Microsoft .NET Framework, so you can use any .NET-compliant programming language, such as Visual Basic .NET or C#, to write custom tasks.

In the latter part of this chapter, you will read about Maintenance Plan tasks that can be used to perform database maintenance functions for SQL Server 2000 and above. Using these tasks, you can create quite an elaborate control flow for a package that is designed to manage and maintain databases. Though these tasks do not directly contribute to building workflow for a package that is designed to solve a business problem, they can still be included in the control flow if required to perform maintenance of SQL Server objects, for example, rebuilding an index after a bulk load.

Let’s wade through the Integration Services tasks first before swimming into deeper waters.

Categories of Control Flow Tasks
Though the Toolbox in Business Intelligence Development Studio (BIDS) shows Control Flow tasks in two main groups, the Control Flow tasks can be categorized on the basis of the functionality they provide. Read through the following introductions quickly to understand how these tasks are categorized and organized in the control flow. You can refer back to this section whenever you need a quick description of a task.

Data Flow Task
This main task performs ETL (extracting, transforming, and loading) operations using the data flow engine.

Control Flow Task | Description
Data Flow task | Perform the ETL functions, i.e., extract data from a source, apply transformations to the extracted data, and load this data to a destination.
In BIDS, this task has its own designer panel and consists mainly of source adapters, transformations, and destination adapters. This task is covered in detail in Chapters 9 and 10.

Data Preparation Tasks
These tasks help in managing and applying operations on files and folders and in identifying issues with data quality.

Control Flow Task | Description
File System task | Perform operations on files and folders in the file system, i.e., you can create, move, delete, or set attributes on directories and files. You will be using this task in a Hands-On exercise later in this chapter.
FTP task | Download and upload files and manage directories. For example, you can download data files from your customer’s remote server using this task, apply transformations to the data in these files using the Data Flow task, and then upload the transformed data file to a different folder on the customer’s FTP server, again using the FTP task.
Web Service task | Read data from a Web Service method and write that data to a variable or a file. For example, you can get a list of postal codes from the local postal company using this task and use this data to cleanse or standardize your data at loading time.
XML task | Dynamically apply operations to XML documents using XSLT (extensible style sheet language transformation) style sheets and XPath expressions, and merge, validate, and save the updated documents to files and variables, i.e., you can get XML documents from many sources, merge these XML documents to consolidate the data, and reformat them for presentation in a report layout.
Data Profiling task | You have already used this task in Chapter 2. Using this task, you can profile data in your data sources to identify data quality issues before loading the data into a data warehouse. The knowledge gained from data profiling will help you design a data cleaning and standardization process in the scheme of your data warehouse.
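To give a rough feel for what profiling data before a load can reveal, here is a minimal Python sketch that computes a few simple quality metrics for one column. It is not how the SSIS task is implemented or configured; the profile_column function, the chosen metrics, and the sample postal codes are all assumptions made up for illustration.

```python
from collections import Counter


def profile_column(values):
    """Return simple quality metrics for one column of values."""
    total = len(values)
    # Treat both None and empty strings as missing values.
    non_null = [v for v in values if v is not None and v != ""]
    nulls = total - len(non_null)
    most_common = Counter(non_null).most_common(1)
    return {
        "rows": total,
        "null_ratio": nulls / total if total else 0.0,
        "distinct": len(set(non_null)),
        "most_common": most_common[0][0] if most_common else None,
    }


# Hypothetical sample data, invented for this example only.
postal_codes = ["AB1 2CD", "", "AB1 2CD", None, "XY9 8ZW"]
profile = profile_column(postal_codes)
print(profile)
```

A high null ratio or an unexpectedly low distinct count in a column like this is the kind of finding that would feed into the design of a cleaning and standardization step.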