Lookup, Rank and Normalizer Transformation in Informatica


May, 2006
Prepared at INFOSYS TECHNOLOGIES LIMITED, India

Document Name: Lookup, Rank and Normalizer transformation in Informatica.doc
Version: Rev. 0.0a
Author's Name: Rajat Kashyap
Author's Email: Rajat_Kashyap@infosys.com

Table of Contents

1.1 Active Transformation
1.2 Passive Transformation
2. Lookup Transformation
2.1 Connected lookup
2.2 Unconnected lookups
2.3 Specifying database location for a Lookup transformation
2.4 SQL override in Lookups
3.1 The Rank Port
3.2 Rank Index
4. Normalizer Transformation
5. References

1. Transformations Overview

A transformation is a repository object that generates, modifies, or passes data. Transformations in a mapping represent the operations the Informatica Server performs on the data. Data passes into and out of transformations through ports, which are linked in a mapping or mapplet. Transformations can be broadly classified as Active or Passive, and as Connected or Unconnected.

1.1 Active Transformation

An Active transformation is one that can increase or reduce the number of rows passing through it. The Filter transformation is an example: it passes only those rows that match a specified condition. Another example is the Normalizer, which can increase the number of rows passing through it.

1.2 Passive Transformation

A Passive transformation does not change the number of rows that pass through it. An example is the Expression transformation, which performs a calculation on the data and passes all rows through.

Transformations can also be Connected or Unconnected. A connected transformation is connected to other transformations in the mapping. An unconnected transformation is not connected to other transformations in the mapping; it is called from within another transformation and returns a value to that transformation.
Filter, Joiner and Expression are some examples of connected transformations, whereas Lookup and Stored Procedure can be either connected or unconnected.

2. Lookup Transformation

The Lookup transformation is a Passive transformation and can be either Connected or Unconnected. It is used to look up data in a relational table or view. A lookup definition can be imported from either source or target tables. Import a lookup definition from any relational database to which both the Informatica Client and Server can connect. You can use multiple Lookup transformations in a mapping.

You can create a lookup by clicking Transformations > Create > Lookup in the Designer.

Fig Lkp1.1

A lookup definition can be imported from either source or target tables.

Fig Lkp1.2

Lookups can be configured to be connected or unconnected, and cached or uncached.

The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. The result of the lookup can be passed to other transformations and to the target. Connected and unconnected lookups receive input and send output in different ways.

You can improve session performance by caching the lookup table. If you cache the lookup table, you can choose a dynamic or a static cache. By default, the lookup cache is static and does not change during the session. With a dynamic cache, the Informatica Server inserts or updates rows in the cache during the session. When you cache the target table as the lookup, you can look up values in the target and insert them if they do not exist, or update them if they do.

You can configure a connected Lookup transformation to receive input directly from the mapping pipeline, or you can configure an unconnected Lookup transformation to receive input from the result of an expression in another transformation.
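The difference between a static and a dynamic lookup cache can be illustrated with a small Python sketch. This is only a conceptual model, not Informatica code: the class names, the method lookup_or_upsert and the returned labels "insert", "update" and "unchanged" are hypothetical names chosen for this illustration.

```python
from typing import Dict, Hashable, Optional


class StaticLookupCache:
    """Hypothetical model of a static cache: built once, read-only afterwards."""

    def __init__(self, rows: Dict[Hashable, dict]):
        # Copy the lookup table contents into the cache when the session starts.
        self._rows = dict(rows)

    def lookup(self, key: Hashable) -> Optional[dict]:
        # Return the cached row, or None when there is no match.
        return self._rows.get(key)


class DynamicLookupCache(StaticLookupCache):
    """Hypothetical model of a dynamic cache: rows can change during the session."""

    def lookup_or_upsert(self, key: Hashable, row: dict) -> str:
        # Insert the row if the key is not cached yet.
        if key not in self._rows:
            self._rows[key] = row
            return "insert"
        # Update the cached row if the incoming row differs.
        if self._rows[key] != row:
            self._rows[key] = row
            return "update"
        # Otherwise leave the cache unchanged.
        return "unchanged"


cache = DynamicLookupCache({101: {"last_name": "Rao"}})
print(cache.lookup_or_upsert(102, {"last_name": "Iyer"}))
print(cache.lookup_or_upsert(101, {"last_name": "Rao"}))
```

The static class never exposes a way to change its rows, which mirrors the default behaviour described above; the dynamic subclass adds the insert-or-update step used when the target table itself is cached as the lookup.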
2.1 Connected lookup

A connected lookup receives input values directly from the pipeline, uses a dynamic or static cache, and can return multiple columns from the same row or insert into the dynamic lookup cache. If there is no match for the lookup condition, the Informatica Server returns the default value for all output ports. If you configure dynamic caching, the Informatica Server inserts rows into the cache or leaves it unchanged. If there is a match for the lookup condition, the Informatica Server returns the result of the lookup condition for all lookup/output ports. A connected lookup passes multiple output values to another transformation by linking its lookup/output ports to that transformation, and it supports user-defined default values.

For each input row, the Informatica Server queries the lookup table or cache based on the lookup ports and the condition in the transformation.

Fig Lkp1.3

In Fig Lkp1.3, EmployeeId is taken as an input port and the lookup condition is based on it. If you are building a shared lookup that will be used in different mappings with different conditions, create as many input ports as required. You can specify the lookup condition in the Condition tab:

Fig Lkp1.4

Fields are fetched based on the above lookup condition. When creating a shared lookup, make sure that you specify at least one lookup condition. You can add conditions based on your business need.

Fig Lkp 1.5

2.2 Unconnected lookups

An unconnected lookup receives input values from the result of a :LKP expression in another transformation. You can only use a static cache here, and you designate one return port (R), i.e. you can return only one column from each row. If there is no match for the lookup condition, the Informatica Server returns NULL. If there is a match for the lookup condition, the Informatica Server returns the result of the lookup condition into the return port.
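The contrast between the two lookup styles on a match and on a miss can be sketched in Python as follows. This is a rough model under stated assumptions: the table contents, the default value "UNKNOWN" and the function names connected_lookup and unconnected_lookup are invented for illustration and are not part of Informatica.

```python
from typing import Dict, Optional

# Hypothetical lookup table keyed by employee id.
EMPLOYEES: Dict[int, Dict[str, str]] = {
    101: {"first_name": "Asha", "last_name": "Rao"},
}

# Hypothetical user-defined default values for the output ports.
DEFAULTS: Dict[str, str] = {"first_name": "UNKNOWN", "last_name": "UNKNOWN"}


def connected_lookup(employee_id: int) -> Dict[str, str]:
    """Return every output port; fall back to the defaults when nothing matches."""
    row = EMPLOYEES.get(employee_id)
    return dict(row) if row is not None else dict(DEFAULTS)


def unconnected_lookup(employee_id: int, return_port: str = "last_name") -> Optional[str]:
    """Return the single designated return port; None (NULL) when nothing matches."""
    row = EMPLOYEES.get(employee_id)
    return row[return_port] if row is not None else None


print(connected_lookup(101))
print(connected_lookup(999))
print(unconnected_lookup(101))
print(unconnected_lookup(999))
```

The connected variant always hands back a full set of columns, while the unconnected variant hands back exactly one value, which is why it suits calls made from inside an expression.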
Fig Lkp1.6

The lookup/output/return port passes the value to the transformation that calls the :LKP expression. The general format of the call is:

    :LKP.lookup_transformation_name(argument, argument, ...)

The arguments are local input ports that match the Lookup transformation input ports used in the lookup condition. Use the following guidelines when writing an expression that calls an unconnected Lookup transformation:

• The order in which you list each argument must match the order of the lookup conditions in the Lookup transformation. The number of arguments must also match the number of lookup conditions and input ports in the lookup.
• The datatypes of the ports in the expression must match the datatypes of the input ports in the Lookup transformation. The Designer does not validate the expression if the datatypes do not match.
• If one port in the lookup condition is not a lookup/output port, the Designer does not validate the expression.
• The arguments (ports) in the expression must be in the same order as the input ports in the lookup condition.
• If you use incorrect :LKP syntax, the Designer marks the mapping invalid.
• If you call a connected Lookup transformation in a :LKP expression, the Designer marks the mapping invalid.

Fig Lkp1.7

In the figure above, an unconnected lookup is used. The expression exp_lookupEmployee calls the unconnected lookup to fetch the last name of the employee based on Employee_ID.

Fig Lkp1.8

Fig Lkp1.9

Unconnected lookups have a major drawback. If you create a shared unconnected lookup and someone else changes it to suit their own requirements by adding an input port, all existing calls to that lookup become invalid, because they no longer match the order and number of input ports in the lookup.
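The argument rules and the shared-lookup drawback described above can be modelled with a small Python check. This is a sketch only: the InputPort class, the validate_lkp_call function and the port names are hypothetical, and the check mimics the rules in spirit rather than reproducing what the Designer does internally.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class InputPort:
    """Hypothetical description of one lookup input port."""
    name: str
    datatype: type


def validate_lkp_call(input_ports: Sequence[InputPort], arguments: Sequence[object]) -> List[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    # The number of arguments must match the number of input ports.
    if len(arguments) != len(input_ports):
        return [f"expected {len(input_ports)} argument(s), got {len(arguments)}"]
    errors = []
    # Arguments are matched to ports by position, so order and datatype both matter.
    for port, argument in zip(input_ports, arguments):
        if not isinstance(argument, port.datatype):
            errors.append(f"{port.name}: expected {port.datatype.__name__}")
    return errors


ports = [InputPort("EMPLOYEE_ID", int)]
print(validate_lkp_call(ports, [101]))

# Someone adds a second input port to the shared lookup:
changed_ports = ports + [InputPort("DEPT_ID", int)]
print(validate_lkp_call(changed_ports, [101]))
```

Because the call site passes arguments purely by position, adding a port to the shared lookup breaks every existing caller at once, which is the drawback the text warns about.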
2.3 Specifying database location for a Lookup transformation

When configuring a Lookup transformation, you can use either the $Source or the $Target variable to specify the database location. These variables are used in the Location Information property of the Lookup transformation.

Fig Lkp1.10

When you configure a session, you can specify a database connection value for $Source or $Target. This ensures that the Informatica Server uses the correct database connection for the variable when it runs the session. These parameters are passed from the Unix script that runs the ETL. You can also hardcode the connection name, but this is not considered good practice and should be limited to testing.

Sometimes the lookup has to be done on a table that is in neither the source nor the target database. In my project, for example, I had to code ETLs that move data from one table in Stage to another. In a scenario like this you cannot parameterize the database location if you are looking up a table in the warehouse, so you have to hardcode the lookup database location.

2.4 SQL override in Lookups

You can use the SQL override property, while configuring the lookup, to override the default SQL statement used to query the lookup table. It specifies the SQL statement you want the Informatica Server to use for querying lookup values, and it can be used only when the lookup cache is enabled. Enter only the SELECT, FROM, and WHERE clauses when you enter the SQL override.

By default, the Informatica Server generates an ORDER BY statement for a cached lookup that contains all lookup ports. To increase performance, you can suppress the default ORDER BY statement and enter an override ORDER BY with fewer columns.
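As a rough illustration of what such an override might look like, the Python helper below assembles the query text. Everything here is an assumption made for the example: the function name build_lookup_override, the table and column names, and in particular the use of the SQL "--" comment as the notation placed after the override ORDER BY so that a generated ORDER BY appended afterwards would be commented out.

```python
from typing import Optional, Sequence


def build_lookup_override(
    table: str,
    select_columns: Sequence[str],
    condition_ports: Sequence[str],
    where: Optional[str] = None,
) -> str:
    """Assemble a lookup SQL override string (hypothetical helper)."""
    sql = f"SELECT {', '.join(select_columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    # Order by the condition ports only, in lookup-condition order, and end
    # with a comment marker (assumed here to be "--") so that any ORDER BY
    # appended after this text is treated as a comment.
    sql += f" ORDER BY {', '.join(condition_ports)} --"
    return sql


print(
    build_lookup_override(
        table="EMPLOYEE",
        select_columns=["EMPLOYEE_ID", "LAST_NAME"],
        condition_ports=["EMPLOYEE_ID"],
        where="ACTIVE_FLAG = 'Y'",
    )
)
```

Keeping the condition ports as a separate argument makes it harder to list them in a different order from the lookup condition, which the document identifies as a cause of session failure.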
To override the default ORDER BY statement, specify the ORDER BY statement and place a comment notation after it to suppress the default ORDER BY statement that the Informatica Server generates. Make sure that the ORDER BY statement contains the condition ports in the same order they appear in the lookup condition; otherwise the session will fail.

[...]

3. Rank Transformation

The Rank transformation is an Active and Connected transformation. It allows you to rank the incoming data based on a particular field. It also lets you select only the top or bottom rank, or a specified number of top or bottom ranks. You can use a Rank transformation to return the largest or smallest numeric value in a port or group. Rank transformation ... different from the MIN/MAX functions of the Aggregator, in the sense that MIN or MAX only lets you select a single maximum or minimum value from the data, whereas the Rank transformation allows you to select a set of top or bottom records. You connect all ports representing the same row set to the transformation. Only the rows that fall within that rank, based on some measure you set when you configure the transformation, ... the Rank transformation.

As an active transformation, the Rank transformation might change the number of rows passed through it. You might pass 100 rows to the Rank transformation but select to rank only the top 10 rows, which then pass from the Rank transformation to another transformation. While configuring the Rank transformation, the Designer asks for the number of ranks. Specify the number of top/bottom ranks ... needed in the target.

Fig Rank1.1

You can connect ports from only one transformation to the Rank transformation. The Rank transformation allows you to create local variables and write non-aggregate expressions. During the workflow, the Informatica Server caches input data until it can perform the rank calculations. The Informatica Server compares an input row with rows in the data cache. If the input row out-ranks ... row, the Informatica Server replaces the cached row with the input row.

3.1 The Rank Port

The Rank transformation includes input/output ports, variable ports and a rank port. The rank port designates the column whose values you want to rank. Only one rank port can be used in the transformation, and the rank port must be linked to another transformation.

Fig Rank1.2

In the above example, ranking is ... would be ranked based on the value of the Marks field.

3.2 Rank Index

The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica Server uses the Rank Index port to store the ranking position of each row in a group. It is an output port only and can be passed directly to the target. If two rank values match, they receive the same value in the rank index and the transformation ...

Fig Rank1.3

If all the rows must be ranked and all the ranks are needed in the target, specify in the Rank transformation the maximum number of records that can come from the source. If you do not know how many records can come from the source, specify the maximum number of ranks or records that the Rank transformation can rank, i.e. 2147483647. The Rank transformation ... will rank whatever number of records comes in.

In my project, I was supposed to rank a record based on eight different fields. I needed eight Rank transformations, because a Rank transformation can rank on one and only one field. I also needed eight expressions, one before each Rank transformation, each holding all the fields along with the previous rank index.

Rank transformation ... you to group information. While configuring the transformation, you can set one of its input/output ports as a group by port. For example, if you want to select the top 10 rankers in each class, group by class id (Fig Rank1.2). For each unique value in the group port, the transformation creates a group of rows falling within the rank definition.

4. Normalizer Transformation

The Normalizer transformation is an Active and Connected transformation. It normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. Normalization is the process of organizing data. In database terms, this includes creating normalized tables and establishing relationships between those tables. A Normalizer transformation is used mainly with COBOL ...
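The Rank transformation behaviour described in section 3 (one rank port, an optional group by port, a number of ranks, and a RANKINDEX output) can be approximated with the Python sketch below. It is an illustration under stated assumptions, not Informatica's implementation: the function name rank_top and the sample data are invented, and because the source text is cut off after the rule for ties, the sketch assumes that tied rows share a rank index and that the following index is skipped (1, 2, 2, 4).

```python
from collections import defaultdict
from typing import Dict, List


def rank_top(rows: List[dict], rank_port: str, group_by: str, number_of_ranks: int) -> List[dict]:
    """Keep the top-ranked rows of each group and add a RANKINDEX field."""
    # Collect rows per group, preserving the order in which groups first appear.
    groups: Dict[object, List[dict]] = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row)

    output = []
    for group_rows in groups.values():
        # Highest value first; Python's sort keeps tied rows in input order.
        ordered = sorted(group_rows, key=lambda r: r[rank_port], reverse=True)
        previous_value = object()  # sentinel that never equals a real value
        rank_index = 0
        for position, row in enumerate(ordered, start=1):
            # Assumption: ties share a rank index and the next index is skipped.
            if row[rank_port] != previous_value:
                rank_index = position
                previous_value = row[rank_port]
            if rank_index > number_of_ranks:
                break
            output.append({**row, "RANKINDEX": rank_index})
    return output


students = [
    {"class_id": "A", "student": "s1", "marks": 90},
    {"class_id": "A", "student": "s2", "marks": 95},
    {"class_id": "A", "student": "s3", "marks": 90},
    {"class_id": "A", "student": "s4", "marks": 70},
    {"class_id": "B", "student": "s5", "marks": 60},
]
for ranked in rank_top(students, rank_port="marks", group_by="class_id", number_of_ranks=2):
    print(ranked)
```

Fewer rows come out than go in, which is what makes the transformation active, and the function ranks on a single field, matching the one-rank-port restriction that forced the eight chained Rank transformations mentioned in the text.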

Date posted: 18/04/2014, 10:18


