
Microsoft SQL Server 2000 Data Transformation Services, Part 4


Part II: DTS Connections and the Data Transformation Tasks

Chapter 6: The Transform Data Task

As indicated by its name, the Transform Data task is at the heart of Data Transformation Services. This task is a data pump that moves data from a data source to a data destination, giving you the opportunity to modify each record as you move it.

Three chapters of this book are devoted to the Transform Data task:
• This chapter outlines the task's basic functionality and properties.
• Chapter 7, "Writing ActiveX Scripts for a Transform Data Task," describes the use of ActiveX scripts to programmatically control data transformations. This chapter also discusses creating and using lookups.
• Chapter 9, "The Multiphase Data Pump," shows how to use the new SQL Server 2000 capability to write code for eight different events in the operation of the Data Pump.

There are also chapters devoted to the other two data transformation tasks:
• Chapter 8, "The Data Driven Query Task," describes a task that can define several output queries in the process of data transformation.
• Chapter 10, "The Parallel Data Pump Task," describes a new task that lets the data pump use hierarchical recordsets.

Additional key information relating to the Transform Data task can be found in these chapters:
• Chapter 5, "DTS Connections"
• Chapter 27, "Handling Errors in a Package and Its Transformations"
• Chapter 28, "High Performance DTS Packages"
• Chapter 32, "Creating a Custom Transformation with VC++"

NOTE: It's possible to get confused about the naming of the Transform Data task. Some people refer to it as the Data Pump task, reflecting the DataPumpTask and DataPumpTask2 objects that implement this task. It is also called the Data Transformation task.

When to Use the Transform Data Task

I have built DTS packages that don't have any Transform Data tasks, and I have built other packages in which this task did all the movement and manipulation of the data. The Transform Data task is one of the most versatile of all the DTS tasks.
Many of the others have limitations that prevent them from being used in certain circumstances. The Transform Data task can be used with a variety of data sources and destinations, it delivers high performance, and you can manipulate data in a very precise way.

I decide whether or not to use the Transform Data task by going through a process of elimination. If another task will do the job better, I choose it. If I can't use any of the other tasks because of their limitations, I use the Transform Data task.

Consider these specialized situations where other tasks are more effective:
• If you are transferring whole databases from SQL Server 7.0/2000 to SQL Server 2000, use the Transfer Databases task.
• If you are transferring database objects (tables, views, stored procedures, and so on) from a SQL Server 7.0/2000 database to a SQL Server 7.0/2000 database, use a Transfer SQL Server Objects task.
• If you need to choose between several queries when transforming each row of data, consider using the Data Driven Query task. (But the Transform Data task in SQL Server 2000 now allows you to modify data using lookups, which removes some of the Data Driven Query task's advantage in this area.)
• If your data source is a text file, your data destination is SQL Server, you are not transforming the data as it's being imported, and you want the fastest possible speed for your data movement, use the Bulk Insert task.
• If you are moving data between tables in the same type of relational database, consider using an Execute SQL task. It will be faster than the Transform Data task, but you lose the flexibility of row-by-row processing.
• If you are moving hierarchical rowsets, take advantage of the new Parallel Data Pump task.
• If you need to move data files to another location, use the FTP task.
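The difference between row-by-row processing and a set-oriented query, which comes up in the Execute SQL item above, can be sketched outside DTS. The following is a minimal illustration only, assuming Python's standard sqlite3 module as a stand-in for SQL Server; the tables SalesStage, SalesRowByRow, and SalesSetBased are hypothetical and do not come from the book. It shows the same transformation done once per row in client code and once as a single INSERT ... SELECT statement.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for a relational source
# and destination; all table names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesStage (ProductID INTEGER, Quantity INTEGER, Price REAL);
    CREATE TABLE SalesRowByRow (ProductID INTEGER, Amount REAL);
    CREATE TABLE SalesSetBased (ProductID INTEGER, Amount REAL);
    INSERT INTO SalesStage VALUES (1, 2, 9.5), (2, 1, 20.0), (3, 4, 2.25);
""")

# Row-by-row: fetch each source row, transform it in code, insert it.
# This mirrors the flexibility (and per-row cost) of a data pump.
source_rows = conn.execute(
    "SELECT ProductID, Quantity, Price FROM SalesStage"
).fetchall()
for product_id, quantity, price in source_rows:
    conn.execute(
        "INSERT INTO SalesRowByRow VALUES (?, ?)",
        (product_id, quantity * price),
    )

# Set-based: one statement moves and transforms every row inside the database,
# which is what a single Execute SQL task would do.
conn.execute(
    "INSERT INTO SalesSetBased "
    "SELECT ProductID, Quantity * Price FROM SalesStage"
)

row_result = conn.execute(
    "SELECT ProductID, Amount FROM SalesRowByRow ORDER BY ProductID"
).fetchall()
set_result = conn.execute(
    "SELECT ProductID, Amount FROM SalesSetBased ORDER BY ProductID"
).fetchall()
print(row_result == set_result)
```

Both approaches produce the same rows here. The set-based form gives up the chance to run arbitrary code for each record, which is the flexibility the Transform Data task keeps.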
In all other cases, use the Transform Data task to transform your data.

TIP: When I was first learning DTS development, I used the Transform Data task a lot more than I do now. I've realized that there are many situations where one or more Execute SQL tasks will move my data significantly faster. The Transform Data task is a high-speed data pump, but it still has to process each row of data sequentially, and the high performance of set-oriented SQL queries can often beat it. I've also started using the Bulk Insert task more often because it delivers much better performance. If you need the Transform Data task, use it. It gives you Rapid Application Development and excellent performance. But it's also good to be aware of the alternatives.

Creating a New Transform Data Task

You can create Transform Data tasks in the Package Designer, in the DTS Import/Export Wizard, and in code.

Using the Package Designer

You can create a new Transform Data task in the Package Designer in several different ways. I recommend the new way provided in SQL Server 2000:
1. Create two connections, one for the data source and the other for the data destination.
2. Select the Transform Data task from the task palette, the toolbar, the Task menu, or Add Task on the pop-up menu.
3. An icon will appear that contains the words "Select source connection." Move the cursor to the connection you are going to use for the source and select it.
4. The icon will change and will now have the words "Select destination connection," as shown in Figure 6.1. Click on the connection to be used for the destination.
You've just created a Transform Data task.

FIGURE 6.1 An icon directs you to choose a source connection and then a destination connection.
You can also create a Transform Data task by doing any of the following:
• Reverse steps 2 and 3. If you select a connection before choosing the Transform Data task, that connection will be used as the source.
• Select a connection for the source. Press and hold the Shift key while selecting the connection for the destination. Then select the Transform Data task.
• Draw a marquee around the two connections to be used for the Transform Data task. Then select the Transform Data task. The first connection included in the marquee will usually be used as the source (but not always).

Using the DTS Import/Export Wizard

If you want to create Transform Data tasks for several tables at the same time, consider using the Import/Export Wizard. If the tables have the same names in the source and the destination, those tables will be connected automatically. If any table does not exist in the destination, the wizard will also make an Execute SQL task with a CREATE TABLE statement for that table. This statement creates a destination table with the same design and structure as the source table. The wizard sets a precedence constraint so that the table is created before the Transform Data task is executed.

Using Code

The Transform Data task is implemented in SQL Server 2000 with a DataPumpTask2 object. This object inherits all the collections, properties, and methods of the SQL Server 7.0 DataPumpTask object and adds some new properties. All these collections and properties are described in this chapter. The last two sections of the chapter have code samples showing how to create a Transform Data task and all the different types of transformations.

The Description and Name of the Task

The Source tab of the Transform Data Task Properties dialog has a place to enter a description of the task.
This sets the Description property of the task, which is displayed for each task in the DTS Designer and when the package is executed. The Description property of a task is more important than the Name property, unless you want to refer to a task in code. The names of many of the tasks, including the Transform Data task, are not shown in the Package Designer interface. If you want to view or set the Name property, you have to use Disconnected Edit or code.

The most convenient way to refer to a task in code is by using its name, as shown in this sample of VBScript:

    Dim pkg, tsk, cus
    Set pkg = DTSGlobalVariables.Parent
    Set tsk = pkg.Tasks("tskLoadSalesFact")

TIP: When I create a task using the Package Designer, I often rename it immediately using Disconnected Edit. The name has to be changed in two places: the Name property of the Task object and the TaskName property of the Step object.

The default names created by the Package Designer are not very descriptive:
    DTSTask_DTSDataPumpTask_1
    DTSTask_DTSDataPumpTask_2
    DTSTask_DTSDataPumpTask_3

The names created by the Import/Export Wizard are very descriptive, but they are long and difficult to type in code:
    Copy Data from dbEmployee to [SalesDataMart].[dbo].[Employee] Task
    Copy Data from dbCustomer to [SalesDataMart].[dbo].[Customer] Task
    Copy Data from dbProductInfo to [SalesDataMart].[dbo].[Product] Task

I prefer task names that are short but also descriptive:
    tskLoadEmployee
    tskLoadCustomer
    tskLoadProduct

Make sure you change the TaskName of the Step object at the same time as you change the Name of the Task object. If you don't, the task will not be executed.
I don't believe there are any other risks in changing task names in Disconnected Edit, unless the existing names are referenced in code. If you aren't planning to refer to a task in code, you don't need to rename it. But if you are referencing your tasks in ActiveX Scripts or exporting your packages to VB for editing, you can make your code clearer by creating better task names.

The Source of a Transform Data Task

The Source tab of the Transform Data Task Properties dialog, shown in Figure 6.2, displays the name of the source connection. You cannot change this connection without using code or Disconnected Edit.

FIGURE 6.2 The first tab of the Transform Data Task Properties dialog displays the data source properties.

In some cases, you have the opportunity to specify which data from the source is to be used. Your choices differ depending on the type of source you are using: a text file, a relational database, or a multidimensional database.

Text File Source

If the data source is a text file, you don't have any more choices to make on this tab. The file, as it is specified in the connection, will be the source for the transformation.

NOTE: You cannot use binary files as the source for the Transform Data task. You have to convert them to text files first, and you cannot use any of the built-in DTS tasks to do this conversion.

SQL Table, View, or Query for a Relational Database Source

If the data source is a relational database, you can choose between using a table, a view, or a query as the source for the transformation. A list shows the names of all the tables and views.
If you elect to use a query as the transformation source, you have three options for creating the query:
• Type the query into the box on the Source tab.
• Choose the Browse button to find a file that has a SQL statement in it.
• Choose the Build Query button and design the query in the Data Transformation Services Query Designer.

There is also a Parse Query button that checks the query syntax and the validity of all the field and table names used.

TIP: Do as much of the data manipulation as possible in the source query of the data transformation. Consider using CASE statements or joins to lookup tables to homogenize data values. You can greatly improve performance, especially if you are able to move from ActiveX Script transformations to the faster Copy Column transformations.

The Data Transformation Services Query Designer

The Data Transformation Services Query Designer is shown in Figure 6.3. It is the same query designer that is available in the Enterprise Manager for looking at table data and for creating a view.

FIGURE 6.3 The Data Transformation Services Query Designer provides an interactive design environment for creating queries.

There are four panes in the Query Designer:
• The Diagram pane is shown at the top of Figure 6.3. Any changes that you make in this box are immediately reflected in the Grid and SQL panes. In the Diagram pane, you can do the following:
    • Drag tables into the pane from the table list at the left.
    • Join tables by dragging a field from one table to another.
    • Right-click the join line to choose a dialog for setting the properties of the join.
    • Select fields to include in the query output.
    • Right-click a field and choose it for sorting.
    • Highlight a field and pick the group by icon on the toolbar.
• The Grid pane provides a more detailed view for specifying how individual columns are used in the query. Changes in this pane are immediately reflected in the Diagram pane and the SQL pane.
• The SQL pane shows the text of the SQL statement that is being generated for this query. Changes here are not made immediately in the Diagram and Grid panes, but they are made as soon as you click any object outside the SQL pane.
• The Results pane shows the results of running the query you are designing. The effects of the changes you make in the query design are not reflected until you rerun the query by clicking the Execute button on the toolbar.

TIP: Right-clicking in any of the panes brings up a menu that includes the Properties dialog for the query. Among other things, you can choose the TOP X or TOP X PERCENT of the records in a resultset.

MDX Query for a Multidimensional Cube Source

You may also want to get data from an OLAP cube. You can connect to Microsoft OLAP Services cubes with the Microsoft OLE DB Provider for OLAP Services.

On the Source tab of the Transform Data Task Properties dialog, select SQL Query and type your MDX statement in the box. You can also use the Browse button to find a file that has the MDX statement in it. Don't try to use the Query Designer. It's not ready to generate MDX queries yet!

I've used MDX statements to return a single value to verify the results of a data load and cube process.
For example, if I know the number of new orders that are being imported into the cube's fact table, I can query the cube before and after it's processed to verify that number:

    select {[Measures].[Order Count]} on columns from OrdersCube

NOTE: You could choose to use a Table/View option, but the choices that show up in the list are entire cubes. You will generate a cellset that returns every cell of the cube. The lowest level of every dimension is returned. It can take a long time to load even a small cube like Warehouse from the Foodmart sample OLAP database.

NOTE: The MDX language allows you to return a cellset of any number of dimensions from 0 to 64. The Transform Data task can only handle 1- and 2-dimension cellsets. The task won't handle the following valid MDX query, which returns a 0-dimension cellset:

    select from warehouse

This query fails because it doesn't supply a column heading, so the resulting value can't be referenced to create a transformation.

Using XML as the Source

You can use an XML document as the data source for a Transform Data task, if you have an OLE DB provider that supports XML.

NOTE: An XML provider was not shipped with the initial release of SQL Server 2000. I have used the DataDirect XML ADO Provider from Merant.

Using Parameters in a Source Query

One of the new features in SQL Server 2000 is the ability to use parameters in a source query of the Transform Data task:

    SELECT ProductID, Quantity, Price, SalesDate
    FROM Sales
    WHERE SalesDate = ?

You assign a value to the parameter by using a global variable. This reference is resolved at runtime. You make the assignments by clicking on the Parameters button. Then, on the Parameter Mapping dialog (shown in Figure 6.4), choose a global variable to use as the Input Global Variable for each of your parameters.
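The idea of binding each ? placeholder to a named variable at runtime can be illustrated with a small sketch. This is not DTS code: it assumes Python's standard sqlite3 module as a stand-in database, and the global_variables dictionary, the parameter_mapping list, and the name gvSalesDate are hypothetical names invented for the example. The query text is the one shown above.

```python
import sqlite3

# Hypothetical stand-ins for a package's global variables and for the
# ordered mapping that the Parameter Mapping dialog records (one entry
# per "?" placeholder, in order of appearance).
global_variables = {"gvSalesDate": "2000-11-13"}
parameter_mapping = ["gvSalesDate"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (ProductID INTEGER, Quantity INTEGER,
                        Price REAL, SalesDate TEXT);
    INSERT INTO Sales VALUES (1, 5, 9.5, '2000-11-13'),
                             (2, 3, 4.0, '2000-11-14');
""")

source_query = (
    "SELECT ProductID, Quantity, Price, SalesDate "
    "FROM Sales WHERE SalesDate = ?"
)

# Resolve the mapping at run time: look up each mapped variable and pass the
# values positionally, so the query text itself never changes.
params = [global_variables[name] for name in parameter_mapping]
rows = conn.execute(source_query, params).fetchall()
print(rows)
```

Changing the value stored under gvSalesDate changes which rows are selected on the next run without editing the query, which is the point of mapping parameters to global variables.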
FIGURE 6.4 You map the parameters in your source query to global variables using the Parameter Mapping dialog.

If you want to create a new global variable, click the Create Global Variables button. Within the Global Variables dialog, you can create, modify, or delete each global variable in the DTS package. Each global variable must have a unique name and a datatype. You can also assign the variable a default value.

[...]

... in the Testing Transformation dialog, and the data produced by the test is shown in the View Data dialog.

The Collections That Implement a Transformation

A Transform Data task has a Transformations collection that contains one object for each transformation that has been defined. Each mapping line corresponds to one Transformation object.

    'Assume DTS.Transformation variable...

... file
• 2 (DTSExceptionFile_ErrorFile): Create the SQL Server 2000 Error Text file. This cannot be used at the same time as the 7.0 format file because they are assigned the same filename.
• 4 (DTSExceptionFile_SourceRowFile): Create the SQL Server 2000 Source Row file.
• 8 (DTSExceptionFile_DestRowFile): Create the SQL Server 2000 Destination Row file.
• 256 (DTSExceptionFile_Ansi): Create...

Properties of the Transform Data Task

You can set error handling, data movement, and SQL Server-specific properties on the Options tab of the Transform Data Task Properties dialog, shown in Figure 6.22.

FIGURE 6.22 Error handling and data movement are among the properties set on the Options tab of the Transform Data Task Properties dialog.

...
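The exception file values listed above (2, 4, 8, 256) are powers of two, which suggests they are meant to be added together into one option value. The following is a minimal sketch of that bit-flag arithmetic, assuming Python's enum.IntFlag; the class name ExceptionFileOption and the helper describe are hypothetical, only the four constants that survive in this excerpt are included, and the sketch does not touch the real DTS object model.

```python
from enum import IntFlag


class ExceptionFileOption(IntFlag):
    """The four option values that appear in this excerpt (others omitted)."""
    DTSExceptionFile_ErrorFile = 2
    DTSExceptionFile_SourceRowFile = 4
    DTSExceptionFile_DestRowFile = 8
    DTSExceptionFile_Ansi = 256


def describe(value: int) -> list[str]:
    """Return the names of the listed options whose bit is set in value."""
    return [flag.name for flag in ExceptionFileOption if value & flag]


# Combine the three SQL Server 2000 file options into a single number.
options = (
    ExceptionFileOption.DTSExceptionFile_ErrorFile
    | ExceptionFileOption.DTSExceptionFile_SourceRowFile
    | ExceptionFileOption.DTSExceptionFile_DestRowFile
)
print(int(options))
print(describe(int(options)))
```

Because each option occupies its own bit, a stored number can always be decoded back into the individual choices, as describe does here.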
TransformDataTest, the following assignments would be made:
• Task Description: TransformDataTest
• Step Description: TransformDataTest
• Source Connection Description: SourceTransformDataTest
• Destination Connection Description: DestTransformDataTest
• Task Name: tskTransformDataTest
• Step Name: stpTransformDataTest
• Destination Connection Name: conDestTransformDataTest...

... this transformation

The Transformation object itself has two collections, one containing the source columns and the other containing the destination columns. These collections are referenced in Visual Basic as the SourceColumns and DestinationColumns of the Transformation object:

The Transformation Types

In the SQL Server 7.0 version of Data Transformation Services, you could choose between two types of transformations, Copy Column or ActiveX script. There are seven more choices in SQL Server 2000.

The DateTime String

In the previous version of DTS, it was possible to convert dates to new formats, but it took a lot of ActiveX programming. You can get the same results much faster with the new DateTime String transformation...

NOTE: ... provider for SQL Server for the destination connection.

Chapter 28, "High Performance DTS Packages," has charts showing the relative performance of the Transform Data task with different options. The most important performance choice with the Transform Data task is to use fast load, which is selected by default. A data transformation with fast load executes about 130 times faster than a data transformation...
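The kind of conversion the DateTime String transformation performs, reading a date in one text format and writing it in another, can be shown with a short sketch. This is only an illustration in Python's standard datetime module; the function reformat_date is hypothetical, and the %-style format codes are Python's, not the format specifiers DTS uses.

```python
from datetime import datetime


def reformat_date(value: str, source_format: str, dest_format: str) -> str:
    """Parse a date string in source_format and render it in dest_format."""
    return datetime.strptime(value, source_format).strftime(dest_format)


# Convert a US-style month/day/year string to year-month-day.
converted = reformat_date("11/13/2000", "%m/%d/%Y", "%Y-%m-%d")
print(converted)  # 2000-11-13
```

Declaring the source and destination formats once, instead of writing string-handling code for every row, is the convenience the passage above attributes to the built-in transformation.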
the Custom Transformation

You can create a new type of transformation, or use a Custom Transformation that someone else has made. For more information about Custom Transformations, refer to Chapter 32.

FIGURE 6.21 The ActiveX Script Transformation Properties dialog gives you a place to write code that executes for each row of data.

Other...

requires references to the Microsoft DTSPackage Object Library and the Microsoft DTSDataPump Scripting Object Library.

LISTING 6.1 The Visual Basic Code to Create a Transform Data Task

    Option Explicit

    Public Function fctCreateTransformDataTask( _
        pkg As DTS.Package2, _
        Optional sBaseName As String = "TransformDataTask", _
        Optional sSourceDataSource As String = "", _
        Optional sDestDataSource As String = "",...

However, a new feature in SQL Server 2000 is the addition of the Populate from Source button on the Define Columns dialog. Clicking this button automatically rematches the columns from the source.

DataPumpTask Destination Properties

The properties for the destination of a Transform Data task are similar to those for the source:
• DestinationConnectionID: An...

Date posted: 26/01/2014, 15:20
