• No results found

PowerCenter Developer: Tips & Tricks for Mapping Designer

N/A
N/A
Protected

Academic year: 2021

Share "PowerCenter Developer: Tips & Tricks for Mapping Designer"

Copied!
53
0
0

Loading.... (view fulltext now)

Full text

(1)
(2)

2

PowerCenter Developer:

Tips & Tricks for Mapping Designer

Lingaraju Ramasamy (Raju),

Technical Architecture Manager

Informatica Professional Services

(3)

Agenda

Introduction

Architecture Best Practices

Mapping Tips & Tricks

Transformation Techniques

Use of Metadata

Repository maintenance

(4)

4

(5)

Presenter Contact

Lingaraju Ramasamy (Raju)

[email protected]

408-368-2475 (Mobile)

Technical Architecture Manager, Informatica

Professional Services

(6)

6

(7)

Architecture Best Practices

Consistency

Applying consistent standards reduces long term complications

Naming Conventions (Velocity)

Descriptions

Environments

Documentation (Hyperlink to SharePoint)

Modularity

Develop according to a modular design

Common Error Handling

Reprocessing

Mapping Assistants

(8)

8

Architecture Best Practices

Scalability

Keep volumes in mind in order to create efficient mappings

Caching

Queries

Partitioning

Initial vs. Incremental Loads

Simplicity

Multiple simple processes are often better than few complex

processes

Multiple mappings

Simple Queries

Staging Tables

Advantages

(9)

Sample Complex Mapping

EXP_SEQ_HEA DER_HIERARC HY_RECS11 EXP_SEQ_DET AIL_HIERARCH Y_RECS SQ_SC_T_STR _ATTR_OUTL_ ORG_WK SC_T_STR_AT TR_OUTL_LAN G_FN (Oracle) JAVA_GENERA TE_MSGID JAVA_GENERA TE_SESSID JNR_OUTL_HR _HOL JNR_OUTL_HR S SEQ_HR_HDR SC_T_STR_AT TR_OUTL_HRS _WK1 (Oracle) SC_T_STR_AT TR_OUTL_TM_ FN (Oracle) SC_T_STR_AT TR_OUTL_ORG _WK (Oracle) SC_T_STR_AT TR_OUTL_HRS _WK (Oracle) SC_T_STR_AT TR_OUTL_HOL _FN (Oracle) SC_T_STR_AT TR_OUTL_FN1 (Oracle) SEQ_HOL TC_TRANSACT ION_RECS RTR_HDR_DE T_DATA SEQ_PK_FK_O UTLET EXP_SEQ_DET AIL_HOLIDAY_ RECS EXP_HRS_BOO KEND EXP_SEQ_REC _TM_ZONE SEQ_HRS EXP_LANG_BO

OKEND JNR_OUTL_LA NG_HOL_HRS

EXP_SEQ_HEA DER_TEAM_M EMBER_RECS SQ_TEAM_ME

MBER EXP_TEAM_ME MBER SEQ_TEAM_ME MBER JNR_TEAM_ME MBER JNR_HRS_HDR EXP_HRS_HDR SQ_SC_T_STR _ATTR_OUTL_ FN1 EXP_OUTL_BO OKEND SQ_SC_T_STR _ATTR_OUTL_ HRS_WK SEQ_ORG JNR_ORGANIZ ATION SEQ_LANG EXP_HOL_BOO KEND EXP_SEQ_SER VICE_REC_3 EXP_HEADER_ HOURS EXP_CREATE_ HDR_ELEMEN TS EXP_SEQ_REC _GEO_TYPE SQ_SC_T_STR _ATTR_OUTL_ HOL_FN EXP_SEQ_REC _BASIC SEQ_OTHERS EXP_DETAIL_H OURS SRT SQ_SC_T_STR _ATTR_OUTL_ LANG_FN SQ_SC_T_STR _ATTR_OUTL_ HRS_WK1 EXP_CHK_NE W_RECS EXP_OUTL_OR G_BOOKEND SEQ_OUTLET EXP_PASS_TH ROUGH

(10)

10

Simplified Complex Mapping

RTR_HDR_DE T_DATA EXP_DETAIL_L ANG EXP_CHK_NE W_RECS SEQ_OUTLET LANG_DETAIL_ STR_ATTR_OU TL_WK (Oracle) LANG_HEADER _SC_T_STR_A TTR_OUTL_W K (Oracle) SQ_SC_T_STR _ATTR_OUTL_ CHG_WK SC_T_STR_AT TR_OUTL_CHG _WK (Oracle) SC_T_STR_AT TR_OUTL_LAN G_FN (Oracle) SC_LKP_GET_ MSGID SC_EXP_CREA TE_HDR_ELEM ENTS SRT_INCM_RE CS SEQ_LANG SQ_SC_T_STR _ATTR_OUTL_ LANG_FN EXP_HEADER_ LANG SEQ_OTHERS EXP_PASS_TH ROUGH EXP_OUTL_BO OKEND JNR_OUTL_LA NG_HOL_HRS EXP_LANG_BO OKEND SQ_SC_T_STR _ATTR_OUTL_ HRS_WK SC_LKP_GET_ MSGID SC_T_STR_AT TR_OUTL_HRS _WK (Oracle) SC_EXP_CREA TE_HDR_ELEM ENTS DETAIL_HOUR S_SC_T_STR_ ATTR_OUTL_ WK1 (Oracle) HOURS_HDR_ SC_T_STR_AT TR_OUTL_WK ( Oracle) EXP_SEQ_DAY _OF_WEEK EXP_SRC_BOO KEND SEQ_HRS SQ_SC_T_STR _ATTR_OUTL_ CHG_WK SC_T_STR_AT TR_OUTL_CHG _WK1 (Oracle) EXP_HRS_BOO

KEND EXP_CHK_NE W_RECS RTR_HDR_DE T_DATA DAY_OF_WEEK _SC_T_STR_A TTR_OUTL_W K2 (Oracle) SRT_INCM_RE CS EXP_DETAIL_H OURS EXP_HEADER_ HOURS JNR_OUTL_HR S SEQ_OTHERS EXP_PASS_TH ROUGH SEQ_OUTLET SC_T_STR_AT TR_OUTL_CHG _WK (Oracle) EXP_CHK_NE W_RECS EXP_LANG_BO OKEND SC_T_STR_AT TR_OUTL_LAN G_FN (Oracle) SEQ_OUTLET LANG_DETAIL_ STR_ATTR_OU TL_WK (Oracle) RTR_HDR_DE T_DATA EXP_DETAIL_L ANG SQ_SC_T_STR _ATTR_OUTL_ CHG_WK SC_LKP_GET_ MSGID SC_EXP_CREA TE_HDR_ELEM ENTS SRT_INCM_RE CS LANG_HEADER _SC_T_STR_A TTR_OUTL_W K (Oracle) SEQ_LANG SQ_SC_T_STR _ATTR_OUTL_ LANG_FN EXP_HEADER_ LANG SEQ_OTHERS EXP_PASS_TH ROUGH EXP_OUTL_BO OKEND JNR_OUTL_LA NG_HOL_HRS

Staging

1

Staging 2

Staging 3

Staging –> WebServices

JAVA_GEN_MS G_ID SEQ_ID EXP_GET_SEQ _NUM SQ_SC_T_STR _ATTR_SITE_F N WSC_STR_ATT R_SAVE_SITE EXP_SRC_BOO KEND JAVA_GEN_SE SSID SC_T_STR_AT TR_SITE_FN1 ( Oracle) SC_T_STR_AT TR_SITE_FN (O racle)

(11)
(12)

12

Mapping Tips

Sources and Targets

Use shortcuts from shared folders

Extract only what is necessary

Limit reads on source

Distinguish between similar sources and targets

• Example

• DIM_CUSTOMER1 = DIM_CUSTOMER_insert

(13)

Mapping Tricks

Parameters & Variables

Reduce overhead of creating multiple mappings Replace hard coded values

Use to incrementally extract data

Example

UpdateDateTime >= TO_DATE (‘$$PREV_RUN_TS’)

(14)

14

Mapping Tricks

Parameters & Variables

Assign Parameter/Variable values in a Session

Pass values from one session to a subsequent session in same

workflow/worklet

On Components Tab in Session Properties

Use user-defined workflow/worklet variables

(15)

Mapping Tricks

Built-in Mapping Variables

Mapping Name

Workflow Name

Session Name

Integration Service Name

Repository Service Name

Repository User Name

Folder Name

Session Run Mode

Source Table Names

(16)

16

Mapping Tricks

Group Expression (Anchor transformation)

Add expression transformation after a source qualifier and

before a target

If source or target definition changes, reconnecting ports is

much easier

(17)

Mass Update

pmrep massupdate

Session properties

Session config attributes

Transformation instance attributes

(18)

18

Mapping Assistants

Preview Data

View Data

• Accommodate anomalies early

• Verification of extraction/loading strategies

Type of Data

• Source/Targets

• Relational, Flat file

• XML Files

For further analysis, use Informatica Analyst

• Analyze the content, quality and structure of source data

(19)

Mapping Assistants

Mapping Wizard

• Pass-Through

• Slowly Changing Dimension

• Type 1 Dimension (No History)

• Type 2 Dimension (All History)

• Version Data

• Flag Current

• Effective Date Range

(20)

20

Mapping Assistants

Standardize specifications

Enhance collaboration between analyst and developer

Improve documentation & audit ability of business logic

Mapping Analyst for Excel (MAE)

Data Analyst

Defines Business Terms

Specifies Transformation Rules Standardize Excel format

DI Developer Augments, Tunes Generated Mappings from Specifications Generate Mapping Generate Specification

(21)

Mapping Assistants

• Define consistent methodology & structure for data integration projects

• Build custom wizard based on pattern without coding

• Generate multiple mappings at one time

• Document data flow

Mapping Architect for Visio (MAV)

DI Architect

Creates & Publishes mapping template DI Developer Augments, Tunes Generated Mappings Generate Mappings Informatica Toolbar Informatica Stencil Drawing Window Template File Publish Template

(22)

22

Mapping Assistants

Mapping Architect for Visio (MAV)

Case Study #1

•7 templates were used across 2 projects to generate 600 mappings

•97% of mappings were automatically generated and required no additional changes

•3% needed to be manually modified or custom developed

Case Study #2

•1 template was used to create 150 mappings for a data migration project along with PowerCenter sessions and workflows

•Total effort was less than one day

•Equivalent effort to create 150 mappings manually would have been 2 weeks (10x effort)

(23)
(24)

24

Apply Default Query when possible

• Utilize SQ Attributes

(i.e., User Defined Join, Source Filter)

Understand advantages and limitations of the SQL override

PROS

• Utilize database optimizers (i.e., indexes, hints)

• Can accommodate complex queries CONS

• Processing impacts database resources

• Lose transformation logic in metadata searched

• Unable to utilize Partitioning or Pushdown Optimization options

Minimize complicated queries

Add the SQL Override Query to the Description

Source Qualifier

(25)

Understand Port process order

• INs or IN/OUTs

• VARIABLEs

• OUTPUTs

Reduce code complexity

• Use local variables

• Redundant calculations

• Check previous record

• Provide comments (-- or //) in expressions

Optimize Expressions

• Numeric operations are faster than string operations

• Operators are faster than functions

• Un-Nest complicated logic (use IIF or DECODE)

Expressions

(26)

26

Transformation Tips

Build complex expressions and reuse

them within repository

Two Types:

• Public: Callable from any transformation expression

• Private: Only callable from another user-defined function

Include any valid function except

aggregate functions

Can export to XML Files

(27)

Transformation Tips

Consider Source Qualifier with

a filter to limit rows within

relational sources

Filter as close to the source as

possible

Replace multiple filters with a

router

Pertaining to routers, rows will

go to each path where the

criteria is TRUE

(28)

28

Use sorted input to decrease use of aggregate caches

Limit connected input/output or output port

Filter data before aggregating

Use as early as possible

Joiners

Perform joins in Source Qualifier when possible

• Limit use to heterogeneous and flat file sources

Perform normal joins when possible

Join sorted input when possible

Designate the master source as the source with fewer rows

Transformation Tips

Aggregators

(29)

Transformation Tips

Using SQL Override in Lookup

• Similar to Source Qualifier, avoid when possible

• Can apply Parameters and Variables

• Can query against multiple tables in same database

• Suppress ORDER BY statement by appending two dashes (--)

Add indexes to database columns

Replace large lookup tables with joins in the Source Qualifier

when possible

Relational Lookups should only return ports that meet the

condition

Remove all ports not used downstream or in the SQL Override

(30)

30

Transformation Tips

Lookup Cache Types

• Persistent Caches

• Save lookup cache files for reuse

• Dynamic Caches

• Retains the latest changes to data as rows pass through the mapping

• Updating a master table

• Real-time sessions

• Slowly changing dimension

Cache Sizes

• Eliminate Paging

• Stores condition values in index, .idx, files

• Stores output values in data, .dat, files

• Apply the Cache Calculator in Session

(31)

Transformation Tips

Cache Updates

• Update the dynamic lookup cache with results of an expression

Use Case: Update QTY on hand for new timestamp

Add WHERE incoming row timestamp > cached timestamp

SQL Overrides for Uncached Lookups

• You must choose the Use Any Value on Lookup Policy on Multiple Match condition

Multiple Rows Return

Use Case: Aggregate customer orders and store the total value

Database Deadlock Resilience

Lookups

(32)

32

Transformation Tricks

Perform a lookup on an application source that is not a

relational table or file

Partial pipeline contains Source & Source Qualifier but no target

Integration Service reads source data and passes to Lookup

Transformation to create cache

Create partitions to improve performance

(33)

Transformation Tips

Transaction in PowerCenter is

a set of rows bound by commit

or rollback

Control commit and rollback

transactions based on a row or

set of rows that pass through

the transformation

Use Case: Each invoice number is committed to the target database as a single transaction

Change Tracing Level to

‘Terse’

• At higher tracing levels, every

(34)

34

Transformation Tips

Associated Source Qualifier

Use ASQ when MQ data is flat file or COBOL

(35)

Transformation Tips

Non-Reusable the counter is 0

Performance will be affected if cached is low

Increase of caching will improve the performance

This doesn’t involve any database operation

The caching allows to reserve number of rows in the

memory

(36)

36

(37)

File Source and Target Commands

Commands for File Sources

Use a command to generate flat file source data

input rows or file list or a session

Unix – any valid UNIX command or shell script

Windows – any valid DOS or batch file.

Service process variables ($PMSourceFileDir)

can be used in the command.

(38)

38

File Source Command

Input Type: Command (default: file)

Command Type: Command Generating File List

Command writes list of file names to stdout

PowerCenter interprets this as a file list.

(39)

File Source Command

Input Type: Command

(default: file)

Command Type: Command Generating Data

Command generates rows to stdout

Flat file reader reads directly from stdout

Removes need for staging data

Example use, reading compressed files

uncompress –c $PMSourceFileDir/myCompressedFile.Z

(40)

40

File Target Command

Output Type: Command

(default: file)

Flat file writer writes to the command

Writing compressed files

compress -c - > $PMTargetFileDir/myCompressedFile.Z

Sorting output data

(41)

Filename Port

Input Filename can be processed and passed

to target

(42)

42

Filename Port

Write records to a dynamically named flat file

Target Filename

(43)
(44)

44

Change Detection for Updates

Challenge: a record with a lot of columns needs

to be checked for changes

Solution: calculate an MD5 checksum on the

columns and use a lookup to compare the value

with any existing record

(45)

Sample Change data detection

Calculate MD5 for all columns except key

Create lookup for primary key and MD5 value

(46)

46

(47)

Querying the PowerCenter repository

Query in designer

Limit querying on OPB tables

Use the MX views instead

Utilize Reporting Service

Use Meta Query tool

(48)

48

(49)
(50)

50

Repository Maintenance

Purge repository versions

Define version strategy for Dev,

QA and Prod

Archieve if required for future

analysis

Purge unwanted versions

Run the purge in regular interval

daily, weekly or monthly

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n

$ADMIN_USER -X INFA_ENCRYPTED_PASSWD

(51)

Repository Maintenance

Purge repository logs

Define log strategy for Dev, QA and Prod

Archieve if required for future analysis

Purge unwanted logs

Run the purge in regular interval daily, weekly or monthly

Compute statistics on metadata tables

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME

-n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD

pmrep truncatelog -t $DAYS_TO_KEEP

(52)

52

Additional Informatica Resources

Refer the following...

• http://mysupport.informatica.com

• http://velocity.informatica.com/

• http://marketplace.informatica.com

Product manuals

(53)

References

Related documents

employment) it is important to understand that the applicable employment law may vary with the citizenship of the employee. For this reason, it is important to understand the

University of Maryland College Park, Philip Merrill College of Journalism, Capital News Service Video Essay (Single Camera Only).

Astartes, Battlefleet Gothic, Black Flamer, Black Library, the Black Library logo, BL Publishing, Blood Angels, Bloodquest, Blood Bowl, the Blood Bowl logo, the Blood Bowl

The average TCP throughput, TCP RTT and CPU utilisation results for both Windows 8.1 and Debian 7.5.0 showed that ACL, RA Guard, Validate Source MAC and VLAN were effective

accelerations. But on the whole those determinants do a very poor job of predicting the turning points. It would appear that growth accelerations are caused predominantly by

[r]

To ascertain how much influence teachers’ education has on at-risk children’s assessment scores, I suggest that licensed teachers with high levels of education in early childhood

In the Third District, the number of borrowers rose from just over 1.1 million (11.5 percent of the CCP) at the start of 2005 to just under 1.8 million (17.5 percent of the CCP)