• No results found

SCD Type

In document Notes Informatica (Page 85-91)

 It stores entire history of data in the dimensional table.

In type 2, you can store the data in three different ways. They are  Versioning

 Flagging  Effective Date

SCD Type 2 Versioning

: In versioning method, a sequence number is used to represent the change. The latest sequence number always represents the current row and the previous sequence numbers represents the past data.

As an example, let’s use the same example of customer who changes the location. Initially the customer is in Illions location and the data in dimension table will look as.

surrogate_key customer_id customer_name Location Version ---

1 1 Marston Illions 1

The customer moves from Illions to Seattle and the version number will be incremented. The dimension table will look as

surrogate_key customer_id customer_name Location Version --- 1 1 Marston Illions 1

2 1 Marston Seattle 2

Now again if the customer is moved to another location, a new record will be inserted into the dimension table with the next version number.

SCD Type 2 Flagging

: In flagging method, a flag column is created in the dimension table. The current record will have the flag value as 1 and the previous records will have the flag as 0.

Now for the first time, the customer dimension will look as.

surrogate_key customer_id customer_name Location flag --- 1 1 Marston Illions 1

Now when the customer moves to a new location, the old records will be updated with flag value as 0 and the latest record will have the flag value as 1.

surrogate_key customer_id customer_name Location Version ---

1 1 Marston Illions 0 2 1 Marston Seattle 1

SCD Type 2 Effective Date: In Effective Date method, the period of the change is tracked using the start_date and end_date columns in the dimension table.

surrogate_key customer_id customer_name Location Start_date End_date --- 1 1 Marston Illions 01-Mar-2010 20-Fdb-2011 2 1 Marston Seattle 21-Feb-2011 NULL

The NULL in the End_Date indicates the current version of the data and the remaining records indicate the past data.

SCD Type 3

: In type 3 methods, only the current status and previous status of the row is

maintained in the table. To track these changes two separate columns are created in the table. The customer dimension table in the type 3 method will look as

surrogate_key customer_id customer_name Current_Location previous_location --- 1 1 Marston Illions NULL

Let say, the customer moves from Illions to Seattle and the updated table will look as

--- 1 1 Marston Seattle Illions

Now again if the customer moves from seattle to NewYork, then the updated table will be

surrogate_key customer_id customer_name Current_Location previous_location --- 1 1 Marston NewYork Seattle

The type 3 method will have limited history and it depends on the number of columns you create.

Things to know

 In SCD O Dimensional Table, we just keep the data as it is and it will never change.  SCD type 4 provides a solution to handle the rapid changes in the dimension tables.

Stopping or Aborting a Session Task:

When you issue a stop command on a session, the integration service first stops reading the data from the sources. It continues processing and writing data to the targets and then commits the data.

Abort command is handled the same way as the stop command, except that the abort command has timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

Load Manager And DTM Process - Informatica

While running a Workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager Process (DTM) to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:

1. Locks the workflow and reads workflow properties. 2. Reads the parameter file and expands workflow variables. 3. Creates the workflow log file.

4. Runs workflow tasks.

5. Starts the DTM to run sessions.

6. Sends post-session email if the DTM terminates abnormally.

When the PowerCenter Server runs a session, the DTM performs the following tasks: 1. Fetches session and mapping metadata from the repository.

2. Creates and expands session variables. 3. Creates the session log file.

4. Verifies connection object permissions. 5. Runs pre-session shell commands.

6. Runs pre-session stored procedures and SQL.

7. Creates and runs mapping, reader, writer, and transformation threads to extract,transform, and load data. 8. Runs post-session stored procedures and SQL.

9. Runs post-session shell commands. 10. Sends post-session email

DTM

Data Transformation Manager

 The PowerCenter Integration Service process starts the DTM process to run a session.  The DTM is the process associated with the session task.

DTM process does the below tasks

Read the session information

The DTM retrieves the mapping and session metadata from the repository and validates it.

Perform Pushdown Optimization

If the session is configured for pushdown optimization, the DTM runs an SQL statement to push transformation logic to the source or target database

If the workflow uses a parameter file, the PowerCenter Integration Service process sends the parameter file to the DTM when it starts the DTM. The DTM creates and expands session- level, service-level, and mapping-level variables and parameters

Create the Session Log

The DTM creates logs for the session. The session log contains a complete history of the session run, including initialization, transformation, status, and error messages

Verify Connection Object Permissions

The DTM verifies that the user who started or scheduled the workflow has execute permissions for connection objects associated with the session.

Run Pre-Session Operations

After verifying connection object permissions, the DTM runs pre-session shell commands. The DTM then runs pre-session stored procedures and SQL commands

Run the Processing Threads

After initializing the session, the DTM uses reader, transformation, and writer threads to extract, transform, and load data.

Run Post-Session Operations

After the DTM runs the processing threads, it runs post-session SQL commands and stored procedures. The DTM then runs post-session shell commands.

Send Post-Session Email

When the session finishes, the DTM composes and sends email that reports session completion or failure

Test Load

 With test load the Integration service reads and transforms data without writing it to target.  The integration service generates all session files and performs pre and post-session

functions.

 The Integration Service writes data to the relational targets but rollback the data when session completes.

 For other targets, Integration Service does not write data to the targets.

 Enter the number of source rows you want to test in the Number of Rows to Test Field.

In document Notes Informatica (Page 85-91)

Related documents