Hands-On Microsoft SQL Server 2008 Integration Services
Chapter 1: Introducing SQL Server Integration Services

Integration Services Architecture

Now that you understand the benefits Integration Services provides, with its vast array of features, and know about the various versions and the feature sets associated with them, it’s time to learn its architecture before we move further and get our hands dirty working with it. Once you understand the architecture, you will be able to appreciate how the various components perform their jobs to successfully execute an Integration Services package. Let’s start with the architecture diagram provided in Microsoft SQL Server 2005 Books Online and shown in Figure 1-1.

Figure 1-1 Integration Services architecture (diagram labels: custom applications, SSIS designer, SSIS wizards, and command-line utilities; native and managed object model; Integration Services service; .dtsx file and msdb database; Integration Services runtime with package, container, tasks, custom tasks, Data Flow task, event handlers, connection managers, enumerators, log providers, and data sources; Integration Services data flow with sources, transformations, destinations, and custom data flow components)

Microsoft SQL Server 2008 Integration Services consists of the following four main components:

• Integration Services service
• Integration Services object model
• Integration Services run-time engine
• Integration Services data flow engine

You’ve read a little about these components earlier in this chapter as you were going through the features and uses of Integration Services. The following discussion of each of these components and their functions will clarify how Integration Services is architected.
Integration Services Service

Shown in the top-right corner of the architecture diagram (Figure 1-1), the Integration Services service is installed as a Windows service when you specifically choose Integration Services during installation. In the next section’s Hands-On exercise, you will see where you make this choice and learn that choosing Integration Services during installation installs other components as well. The Integration Services service allows you to execute Integration Services packages on local or remote computers, stop execution of running packages on local or remote computers, monitor running packages on local or remote computers, and connect to multiple Integration Services servers to manage multiple instances.

In Figure 1-1, the Integration Services service points to a .dtsx file and the MSDB database, implying that this service can manage SSIS packages stored in the file system or in an MSDB database within SQL Server 2008. The service manages SSIS packages by importing and exporting them from one type of storage location to another. You will learn a lot more about managing packages and their storage locations in Chapter 6. You can connect to the Integration Services service using SQL Server Management Studio, as you will do later in this chapter.

Generally, with other software components, if the service is stopped, most of the components stop working. This is not true with Integration Services, because the service is used only to monitor running packages and manage their storage. You do not need to have this service running to design and run a package. You can save a package newly designed in BIDS to the file system or to the SQL Server 2008 MSDB database and then execute it as well. However, you may find package handling a bit faster when the Integration Services service is running, as it caches the metadata of the package and the connections.
Also, if you need to monitor and list the packages using SQL Server Management Studio, the Integration Services service must be running.

Integration Services Object Model

As mentioned earlier, Integration Services is a new product with an object model that supports both native and managed APIs. You can easily use this object model to write custom components such as tasks and transformations using C++ or any common language runtime (CLR)-compliant language. This object model provides easy accessibility for Integration Services tools, command-line utilities, and custom applications, as shown in the top section of Figure 1-1. You can also develop custom components, build new packages, load and modify existing packages, and then execute them programmatically. This enables you to automate maintenance and execution of your packages completely. Programming Integration Services is a vast subject and deserves a separate book altogether. A complete discussion is beyond the scope of this book.

Integration Services Run Time

The Integration Services run time provides support for the package, containers, tasks, and event handlers during package execution. It also provides run-time services such as support for logging, breakpoints, connections to data sources and data stores, and transactions. You read earlier that Integration Services has two separate engines: a run-time engine for workflow and another engine for data flow. Basically, the Integration Services run time consists of whatever you configure in the Control Flow tab of BIDS plus the run-time services.

Integration Services Data Flow

As mentioned, the second engine of Integration Services provides services to the data flow within a package. This data flow is also known as the pipeline because the data flows through various transformations one after another.
The Data Flow Task is a unique task provided in the Control Flow tab of BIDS that encapsulates the data flow engine and the data flow components. An Integration Services data flow consists of one or many Data Flow Sources; none, one, or many Data Flow Transformations; and one or more Data Flow Destinations. The data flow engine drives the data out of Data Flow Sources, brings it into the pipeline, and lets the Data Flow Transformations perform aggregations and conversions, merge data streams, conditionally split data into multiple streams, perform lookups, derive columns, and perform several other operations before loading the data into the destination data stores using Data Flow Destinations. You will work with Data Flow components in much detail in Chapters 9 and 10.

Installing Integration Services

Now is the time to move forward and get our hands dirty by installing Integration Services. But before we do that, take a few more minutes to learn about the installation options and their implications. In real life, either you will be installing Integration Services on a clean Windows platform (that is, a system where no current or previous releases of SQL Server are installed) or you will be installing it on a computer that already has SQL Server 2005 Integration Services or Data Transformation Services of SQL Server 2000 installed. You may choose to install SQL Server 2008 Integration Services alongside SQL Server 2005 Integration Services or DTS 2000, or you may choose to upgrade the existing version of SSIS 2005 or DTS 2000. All these options and their implications are discussed in the following sections.

Installing Integration Services on a Clean System

Most production systems are built using this method.
Administrators prefer to install SQL Server on a fresh installation of a Windows server to avoid later debugging caused by some old component on the server that doesn’t work properly with SQL Server 2008 Integration Services. I recommend you use a sandbox for the Hands-On exercises and install Integration Services clean so that you don’t struggle initially with unwanted issues of compatibility or coexistence. You can install Integration Services either by using the SQL Server Installation Wizard or by running the setup program from the command prompt.

You’ll install Integration Services using the SQL Server Installation Wizard in the following Hands-On exercise. You will be installing the SQL Server 2008 database engine and Integration Services together in this exercise; however, note that Integration Services does not require SQL Server in order to work. You can develop packages in Integration Services that connect to mainframes or to Oracle or DB2 database servers and write output to flat files without installing SQL Server. A couple of high-end transformations, such as the Fuzzy Lookup Transformation and the Fuzzy Grouping Transformation, need to create temporary tables in SQL Server for processing data and hence require a connection to a SQL Server. However, even in this case, you do not need to have SQL Server running on the same local machine where your Integration Services package is designed or executed. Having said that, Integration Services is a fairly independent product and does not require SQL Server to operate; however, installing the SQL Server database engine on the same server might prove beneficial, as most SSIS packages need to be run as SQL Server Agent jobs, and SQL Server Agent is a database engine feature.

Hands-On: Installing SQL Server 2008 Integration Services

This is your first Hands-On, in which you will install SQL Server 2008 Integration Services using the SQL Server Installation Wizard on a clean system.
Method

It is important that you follow this process step by step, as this installation will be used throughout this book to create and run Integration Services projects. If you do not have SQL Server 2008 Enterprise Edition or Developer Edition software, you can download the SQL Server 2008 Enterprise Evaluation Edition from Microsoft’s download web site. This version is valid for 180 days and can be used for trial purposes. Details on non-Integration Services installation options are not covered here and are beyond the scope of this book. Refer to Microsoft SQL Server 2008 Books Online for more details on these installation options.

Exercise (Running the SQL Server Installation Wizard)

Load the SQL Server 2008 DVD media in your computer’s DVD drive and start the installation as follows:

1. After you load the DVD, the Autorun feature will open the Start screen, which displays various options. If Autorun doesn’t do this, browse the DVD and run setup.exe from the root folder. Click the Installation hyperlink in the left sidebar and, to start the installation, choose either a new SQL Server stand-alone installation or the option to add features to an existing installation.

2. Setup first installs the .NET Framework 3.5 SP1 if it is not already installed. Some versions of Windows Server may require different versions of the .NET Framework. Accept the License Agreement and click Install. The Installation Wizard will install the .NET Framework, the SQL Server Native Client, and setup support files. Once installation completes, click Exit, and setup installs hot fixes for the operating system if it needs any. Click Finish to complete the preinstallation phase; a restart may be required to continue the installation. After the restart, run setup.exe again from the installation DVD and choose the New SQL Server installation link.

3. The installation program performs Setup Support Rules checks and lists pass, failure, and warning messages. Click OK to proceed further.
On the Setup Support Files screen, click Install to install the required setup support files.

4. On the Feature Selection screen, choose Database Engine Services from the Instance Features section, and choose Business Intelligence Development Studio, Integration Services, Client Tools Backwards Compatibility, Client Tools SDK, SQL Server Books Online, and Management Tools - Complete from the Shared Features section, as shown in Figure 1-2.

Figure 1-2 Feature Selection for Integration Services

This is the most important step in the installation process, as you choose to install Integration Services here. However, even if you do not specifically select Integration Services, some of its components will still be installed because they are needed to perform specific functions for other selected components. For example, the SQL Server Import and Export Wizard will be installed when you install SQL Server Database Services only. Also, most of the tasks and transformations will be available to you for developing your packages when you install Business Intelligence Development Studio without selecting Integration Services. Bear in mind that Integration Services is not required for developing and executing packages within BIDS; however, to run packages outside the development environment, you do need to choose Integration Services specifically at this step. Integration Services is installed as a Windows service; it helps you manage storage of SSIS packages in SQL Server or the file system using SQL Server Management Studio and enables you to monitor running packages in local as well as remote SSIS instances. Another benefit of installing the Integration Services service is that it caches metadata of package components and speeds up loading of packages in Business Intelligence Development Studio.
Selecting Integration Services also installs the ActiveX Script task and the DTS Package Migration Wizard.

Also, note that Integration Services is not listed under the Instance Features section. This means that you cannot install more than one instance of Integration Services. Though you can’t have multiple instances of Integration Services on the server, it is instance aware. That is, it can connect to any instance of SQL Server and is not tied to a particular SQL Server instance. All you have to do is modify the Integration Services configuration file to connect to a different SQL Server instance to store packages. More on this topic is covered in Chapter 6.

As mentioned earlier, Business Intelligence Development Studio is the design tool for your packages, and selecting it installs the 32-bit design environment along with most of the tasks and transformations. Management Tools installs SQL Server Management Studio, which is used to connect to Integration Services, manage storage of packages in the MSDB database and the file system, and monitor running packages. Selecting the Client Tools Backwards Compatibility feature installs legacy support components and the Execute DTS 2000 Package task, which enables you to run your DTS 2000 packages inside an Integration Services 2008 package. If you want the DTS 2000 run-time environment to be installed, you must install it separately. Chapter 14 covers more details on this subject. The Client Tools SDK is required to install custom-developed managed assemblies for Integration Services, and finally, SQL Server Books Online is the documentation installation feature. Select the Shared Feature Directory and click Next.

5. Confirm that on the Instance Configuration page, Default Instance is selected and MSSQLSERVER is specified in the Instance ID. Click Next. Click Next again to confirm the Disk Space Requirements.

6. Specify the account name and password on the Server Configuration page.
By default, Integration Services installs with the NT AUTHORITY\NETWORK SERVICE account. It is recommended that you choose an appropriate domain account with the minimum required permissions assigned. Proceed to the Database Engine Configuration page to choose an authentication mode. Choose Mixed Mode and specify a password for the built-in system administrator account. Click Add Current User to add yourself to the administrators group and click Next three times to reach the Ready To Install page.

7. Review the options on the Ready To Install screen and click Install when ready.

8. You will see the installation of SQL Server 2008 components in the Installation Progress screen. This may take about 15 to 30 minutes to complete on a fast machine.

9. When the process completes, you will see the installation summary log in the Complete screen. Click Close to close the Installation Wizard. If prompted, restart your computer to complete the installation process.

Review

You installed the SQL Server 2008 database engine default instance and SQL Server Integration Services in the preceding exercise. In real life, you’ll be installing Integration Services on 64-bit servers, as they are becoming more affordable and prevalent these days. Note that the 64-bit version of SQL Server 2008 software installs all the 64-bit components by default. However, BIDS is a 32-bit development environment for Integration Services, and selecting this feature installs a 32-bit version of the Integration Services tools, enabling you to run your packages in 32-bit mode on a 64-bit server. Also, BIDS is not supported on the 64-bit Itanium operating system and hence is not installed on Itanium servers.

You can now check the programs that have been installed via the Control Panel. Also, open the Services console from Administrative Tools and note that the SQL Server 2008 services have been installed along with the SQL Server Integration Services 10.0 service.
Now you can play around with the software components installed by the Installation Wizard in the Programs group to acquaint yourself with them if you haven’t had a chance to look at SQL Server 2008 until now.

Installing Integration Services from the Command Prompt

Command prompt installation can help you roll out the installation to a team, install multiple nodes of a failover cluster, or use scripted installation files as a backup plan in case the worst happens. You can install Integration Services by executing setup.exe along with parameters from the command prompt on a local or remote server. The parameters and their values can be specified either directly in the command or by use of an .ini file. The parameter-value pairs that are relevant for installing Integration Services and its components using a command prompt are as follows:

• Action: This is a required parameter that specifies the installation type: install, upgrade, or repair. See Books Online for more parameter values.
• Features: Indicates the SQL Server components to be installed, for instance, (SQL) for Database Engine and (IS) for Integration Services. Other options are (AS) for Analysis Services, (RS) for Reporting Services, and (Tools) for client tools.
• ISSVCAccount: This is a required parameter that specifies the service account for the Integration Services service.
• ISSVCPassword: Specifies a password for the ISSVCAccount.
• ISSVCStartupType: This is an optional parameter that specifies the startup type for the service: automatic, manual, or disabled.
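As a rough illustration of how these parameter-value pairs combine into a single command line, here is a minimal Python sketch. The helper name build_setup_command and its quoting rule are my own assumptions for illustration; they are not part of SQL Server setup. The sketch only builds the command string and does not run setup.exe.

```python
# Hypothetical helper (not part of SQL Server): assembles a setup.exe command
# line from the parameter-value pairs described above.
REQUIRED = ("ACTION", "ISSVCACCOUNT")      # required per the parameter list above
QUOTED = {"ISSVCACCOUNT", "ISSVCPASSWORD"}  # assumption: always quote credentials


def build_setup_command(params, quiet=True):
    """Return a setup.exe command string built from a dict of parameters."""
    # Normalize names to upper case, as in the /ACTION=... style of the text.
    normalized = {name.upper(): str(value) for name, value in params.items()}
    missing = [name for name in REQUIRED if name not in normalized]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))

    parts = ["setup.exe"]
    if quiet:
        parts.append("/q")  # quiet switch, as used in the example command
    for name, value in normalized.items():
        if name in QUOTED or " " in value:
            value = f'"{value}"'
        parts.append(f"/{name}={value}")
    return " ".join(parts)


command = build_setup_command({
    "Action": "install",
    "Features": "IS",
    "ISSVCAccount": r"DomainName\UserName",
    "ISSVCPassword": "StrongPassword",
})
print(command)
```

Keeping the parameters in a dictionary makes it easy to reuse the same values for several servers or to write them out to a file later; in practice you would never hard-code a real password in a script like this.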
For example, if you want to install Integration Services, the syntax for the command will be something like this:

    setup.exe /q /ACTION=install /FEATURES=IS /ISSVCACCOUNT="DomainName\UserName" /ISSVCPASSWORD="StrongPassword"

To install the SQL Server Database Engine, client tools, and online documentation as well, add them to the /FEATURES list and supply the additional parameters those features require. To learn about more available options, refer to Microsoft SQL Server 2008 Books Online.

Installing Side by Side

If you already have SQL Server 2005 Integration Services or SQL Server 2000 Data Transformation Services installed on a computer and don’t want to remove or upgrade them, you can still install SQL Server 2008 Integration Services alongside them. They all can coexist on the same computer because all three have different execution and design environments. This may surprise you, but SQL Server 2008 Integration Services in fact has a different designer than its predecessor; for instance, BIDS 2008 is built on a different architecture than BIDS 2005. Though they can coexist, there are some considerations to keep in mind when you’re working with multiple versions of Integration Services.

SQL Server 2008 Integration Services has been optimized for performance over its previous version, SSIS 2005. While doing that, the development team at Microsoft also made some underlying changes, such as replacing the word “dts” with “ssis” in several places. As you can expect, this means that code written against the 2005 version will most likely not work with the 2008 version. One such change affects storage of SSIS packages in SQL Server. The sysdtspackages90 table used in SQL Server 2005 to store packages in the MSDB database has been changed to the sysssispackages table in SQL Server 2008. It isn’t hard to imagine that one version of Integration Services won’t be able to access packages stored by the other version, because the storage tables are different.
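To make the table rename concrete, the following Python sketch picks the MSDB package table for a given release and builds a query that lists stored packages. The two table names come from the text above; the function name, the msdb.dbo prefix, and the use of a name column are my assumptions for illustration, and the sketch only produces the query text without connecting to a server.

```python
# Table names as given in the text; the mapping itself is an illustrative helper.
PACKAGE_TABLES = {
    2005: "sysdtspackages90",
    2008: "sysssispackages",
}


def package_list_query(version):
    """Return a T-SQL query listing packages stored in MSDB for a release."""
    try:
        table = PACKAGE_TABLES[version]
    except KeyError:
        raise ValueError(f"no package table known for version {version}") from None
    # Assumption: the table lives in msdb.dbo and exposes a 'name' column.
    return f"SELECT name FROM msdb.dbo.{table} ORDER BY name"


print(package_list_query(2005))
print(package_list_query(2008))
```

A script written for one release that hard-codes its table name would simply fail against the other release, which is the kind of breakage the rename introduces.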
This also means that you cannot store packages in the MSDB database of one version that have been developed in another version of Integration Services. You must stick to the same version to open and modify packages using Business Intelligence Development Studio. To clarify a bit more, we have a new version of Business Intelligence Development Studio in SQL Server 2008, which is based on Visual Studio 2008. BIDS also has a new scripting environment built around Visual Studio Tools for Applications (VSTA), replacing the earlier environment of Visual Studio for Applications (VSA) used in BIDS 2005. These underlying changes enable you to install BIDS 2008 side by side with BIDS 2005. However, this leaves BIDS 2008 unable to save packages in SSIS 2005 format. You can load and run packages in BIDS 2008 that have been developed in BIDS 2005; however, loading them in BIDS 2008 converts these packages into SSIS 2008 format, and hence they run in the 2008 environment. You can save such a package in SSIS 2008 format but not in SSIS 2005 format. Hence, if you want to modify your SSIS 2005 packages and keep them in 2005 format, you have to use BIDS 2005. On the other hand, BIDS 2005 cannot load packages of the higher version, SSIS 2008, at all.

For similar reasons, you have a new version of the dtexec utility. While dtexec can run both SSIS 2008 and SSIS 2005 packages, it actually executes only SSIS 2008 format packages, as it temporarily converts SSIS 2005 packages to SSIS 2008 format before execution. This means that if your SSIS 2005 package can’t be converted to SSIS 2008 format by dtexec, it can’t be run. So, you must be a bit careful when working with packages, as you may need to keep multiple versions of the same package for SSIS 2005 and SSIS 2008.
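When you keep both 2005 and 2008 copies of a package, it helps to be able to tell which format a given .dtsx file is in without opening it in a designer. The following Python sketch reads the PackageFormatVersion property from package XML. Note the assumptions, none of which come from this chapter: that the property is stored as a DTS:Property element directly under the root, that the DTS namespace is www.microsoft.com/SqlServer/Dts, and that the value 2 corresponds to SSIS 2005 and 3 to SSIS 2008. The two sample strings are cut-down stand-ins, not real packages.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and version mapping; verify against your own packages.
DTS_NS = "www.microsoft.com/SqlServer/Dts"
FORMAT_TO_RELEASE = {2: "SSIS 2005", 3: "SSIS 2008"}


def package_format(dtsx_text):
    """Return the release implied by a package's PackageFormatVersion."""
    root = ET.fromstring(dtsx_text)
    # Look only at properties that are direct children of the package root.
    for prop in root.findall(f"{{{DTS_NS}}}Property"):
        if prop.get(f"{{{DTS_NS}}}Name") == "PackageFormatVersion":
            number = int((prop.text or "").strip())
            return FORMAT_TO_RELEASE.get(number, f"unknown format {number}")
    return "no PackageFormatVersion property found"


# Minimal stand-in documents for illustration only.
SAMPLE_2005 = (
    '<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts">'
    '<DTS:Property DTS:Name="PackageFormatVersion">2</DTS:Property>'
    '</DTS:Executable>'
)
SAMPLE_2008 = SAMPLE_2005.replace(">2<", ">3<")

print(package_format(SAMPLE_2005))
print(package_format(SAMPLE_2008))
```

A check like this could run before deployment to make sure a package saved in 2008 format is not copied into a location that only SSIS 2005 tools read.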
On the other hand, things are rather simple with DTS 2000: Integration Services can coexist with it without any interoperability issues, as the two are totally different in usability, operability, and many other respects. While Integration Services provides wizards and tools to migrate from DTS 2000 to its latest version, it still offers run-time support to run DTS 2000 packages as is. The Execute DTS 2000 Package task allows you to run a DTS 2000 package inside an Integration Services package. This task is not installed by default; you must choose the Client Tools Backwards Compatibility feature during the Integration Services installation process.

Upgrading to SQL Server 2008 Integration Services

You can upgrade from SQL Server 2000 Data Transformation Services or SQL Server 2005 Integration Services. Both the Installation Wizard and the command prompt installation provide options to upgrade. With the Installation Wizard, select the Upgrade from SQL Server 2000 or SQL Server 2005 option; for a command prompt installation, specify the /ACTION=upgrade option in the parameters.

When upgrading SQL Server 2005 Integration Services, you can upgrade the Database Engine and Integration Services together, just Integration Services, or just the Database Engine. The easiest option is to upgrade both the Database Engine and Integration Services together when both are on the same computer. This option offers the fewest issues after upgrade, as the MSDB tables that store packages, metadata, and