Informatica
®
PowerCenter
®
8
Level I Developer
Student Guide
Informatica PowerCenter 8 Level I Developer Student Guide Version 8.1
April 2006
Copyright (c) 1998–2006 Informatica Corporation. All rights reserved. Printed in the USA.
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. Informatica Corporation does not warrant that this documentation is error free. Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX, and SuperGlue are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.
Portions of this software are copyrighted by DataDirect Technologies, 1999-2002.
Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.
Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.
Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration® is a registered trademark of Meta Integration Technology, Inc.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/). The Apache Software is Copyright (c) 1999-2005 The Apache Software Foundation. All rights reserved.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit and redistribution of this software is subject to terms available at http://www.openssl.org. Copyright 1998-2003 The OpenSSL Project. All Rights Reserved.
The zlib library included with this software is Copyright (c) 1995-2003 Jean-loup Gailly and Mark Adler.
The Curl license provided with this Software is Copyright 1996-200, Daniel Stenberg, <[email protected]>. All Rights Reserved.
The PCRE library included with this software is Copyright (c) 1997-2001 University of Cambridge Regular expression support is provided by the PCRE library package, which is open source software, written by Philip Hazel. The source for this library may be found at ftp://ftp.csx.cam.ac.uk/pub/software/programming/ pcre.
InstallAnywhere is Copyright 2005 Zero G Software, Inc. All Rights Reserved.
Portions of the Software are Copyright (c) 1998-2005 The OpenLDAP Foundation. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted only as authorized by the OpenLDAP Public License, available at http://www.openldap.org/software/release/license.html. This Software is protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775 and other U.S. Patents Pending. DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or changes in the products described in this documentation at any time without notice.
Table of Contents
List of Figures . . . xiii
Preface . . . xix
About This Guide . . . .xx
Purpose . . . .xx
Audience . . . .xx
Document Conventions . . . .xx
Other Informatica Resources . . . xxi
Obtaining Informatica Documentation . . . xxi
Visiting Informatica Customer Portal . . . xxi
Visiting the Informatica Web Site . . . xxi
Visiting the Informatica Developer Network . . . xxi
Visiting the Informatica Knowledge Base . . . xxi
Obtaining Informatica Professional Certification . . . xxi
Providing Feedback . . . xxii
Obtaining Technical Support . . . xxii
Unit 1: Data Integration Concepts . . . 1
Lesson 1-1. Introducing Informatica . . . 1
Lesson 1-2. Data Integration . . . 3
Lesson 1-3. Mappings and Transformations . . . 5
Mappings . . . 5
Transformations . . . 5
Lesson 1-4. Tasks and Workflows . . . 7
Tasks . . . 7
Workflows . . . 8
Lesson 1-5. Metadata . . . 9
Unit 2: PowerCenter Components and User Interface . . . 11
Lesson 2-1. PowerCenter Architecture . . . 11
Lesson 2-2. PowerCenter Client Tools . . . 12
Designer . . . 13
Workflow Manager . . . 15
Unit 2 Lab: Using the Designer and Workflow Manager . . . 17
Instructions . . . 18
Step 1: Launch the Designer and Log Into the Repository . . . 18
Step 2: Navigate Folders . . . 18
Step 4: Create and Save Shortcuts . . . 20
Step 5: Launch the Workflow Manager . . . 22
Step 6: Navigating the Workflow Manager Tools . . . 23
Step 7: Workflow Manager Task Toolbar . . . 23
Step 8: Database Connection Objects . . . 24
Unit 3: Source Qualifier . . . 27
Lesson 3-1. Source Qualifier Transformation . . . 27
Source Qualifier Transformation . . . 27
Datatype Conversion . . . 29
Lesson 3-2. Velocity Methodology . . . 30
Lab Project . . . 30
Architecture and Connectivity . . . 33
Unit 3 Lab A: Load Payment Staging Table. . . 35
Section 1: Pass-Through Mapping . . . 35
Instructions . . . 38
Step 1: Launch the Designer and Review the Source and Target Definitions . . . 38
Step 2: Create a Mapping . . . 38
Step 3: Create a Workflow and a Session Task . . . 40
Step 4: Run the Workflow and Monitor the Results . . . 43
Data Results . . . 45
Lesson 3-3. Source Qualifier Joins . . . 46
Unit 3 Lab B: Load Product Staging Table . . . 49
Section 2: Homogeneous Join . . . 49
Instructions . . . 52
Step 1: Import the Source Definitions . . . 52
Step 2: Import the Relational Target Definition . . . 53
Step 3: Create the Mapping . . . 53
Step 4: Create the Session and Workflow . . . 55
Step 5: Run the Workflow and Monitor the Results . . . 55
Lesson 3-4. Source Pipelines . . . 57
Unit 3 Lab C: Load Dealership and Promotions Staging Table . . . 59
Section 3: Two Pipeline Mapping . . . 59
Instructions . . . 62
Step 1: Import the Source Definitions . . . 62
Step 2: Import the Target Definitions . . . 62
Step 3: Create the Mapping . . . 62
Unit 4: Expression, Filter, File Lists, and Workflow Scheduler . . . 67
Lesson 4-1. Expression Transformation . . . 67
Lesson 4-2. Filter Transformation . . . 73
Lesson 4-3. File Lists . . . 75
Lesson 4-4. Workflow Scheduler . . . 76
Run Options . . . 76
Unit 4 Lab: Load the Customer Staging Table . . . 79
Instructions . . . 82
Step 1: Create a Flat File Source Definition . . . 82
Step 2: Create a Relational Target Definition . . . 82
Step 3: Create a Mapping . . . 83
Step 4: Create a Filter Transformation . . . 83
Step 5: Create an Expression Transformation . . . 85
Step 6: Create and Run the Workflow . . . 88
Step 7: Schedule a Workflow . . . 92
References . . . 95
Unit 5: Joins, Features and Techniques . . . 97
Lesson 5-1. Joiner Transformation . . . 97
Joiner Properties . . . 99
Join Types . . . 100
Joiner Cache . . . 100
Lesson 5-2. Shortcuts . . . 101
Unit 5 Lab A: Load Sales Transaction Staging Table . . . 103
Instructions . . . 106
Step 1: Create a Flat File Source Definition . . . 106
Step 2: Create a Relational Source Definition . . . 106
Step 3: Create a Relational Target Definition . . . 106
Step 4: Create a Mapping . . . 106
Step 5: Create a Joiner Transformation . . . 107
Step 6: Link the Target Table . . . 109
Step 7: Create a Workflow and Session Task . . . 110
Step 8: Start the Workflow and View Results in the Workflow Monitor . . . 111
Data Results . . . 112
Unit 5 Lab B: Features and Techniques I . . . 115
Instructions . . . 116
Open a Mapping . . . 116
Feature 1: Auto Arrange . . . 116
Feature 3: Revert to Saved . . . 118
Feature 4: Link Path . . . 119
Feature 5: Propagating Ports . . . 120
Feature 6: Autolink by Name and Position . . . 121
Feature 7: Moving Ports . . . 123
Feature 8: Shortcut to Port Editing from Normal View . . . 124
Feature 9: Create Transformation Methods . . . 124
Feature 10: Scale-to-Fit . . . 125
Feature 11: Designer Options . . . 126
Feature 12: Object Shortcuts and Copies . . . 127
Feature 13: Copy Objects Within and Between Mappings . . . 128
Unit 6: Lookups and Reusable Transformations . . . 131
Lesson 6-1. Lookup Transformation (Connected) . . . 131
Lesson 6-2. Reusable Transformations . . . 134
Unit 6 Lab A: Load Employee Staging Table . . . 137
Instructions . . . 142
Step 1: Create a Flat File Source Definition . . . 142
Step 2: Create a Relational Target Definition . . . 142
Step 3: Step Three: Create a Reusable Transformation . . . 142
Step 4: Create a Mapping . . . 144
Step 5: Create a Lookup Transformation . . . 145
Step 6: Add a Reusable Expression Transformation . . . 147
Step 7: Link Transformations . . . 147
Step 8: Create and Run the Workflow . . . 148
Unit 6 Lab B: Load Date Staging Table . . . 153
Instructions . . . 157
Step 1: Create a Flat File Source Definition . . . 157
Step 2: Create a Relational Target Definition . . . 157
Step 3: Create a Mapping . . . 157
Step 4: Create a Workflow and a Session Task . . . 159
Step 5: Run the Workflow and Monitor the Results . . . 159
Data Results . . . 161
Unit 7: Debugger . . . 163
Lesson 7-1. Debugging Mappings . . . 163
Unit 7 Lab: Using the Debugger . . . 167
Instructions . . . 170
Step 2: Step Through the Debug Wizard . . . 170
Step 3: Use the Debugger to Locate the Error . . . 172
Step 4: Fix the Error and Confirm the Data is Correct . . . 174
Unit 8: Sequence Generator . . . 177
Lesson 8-1. Sequence Generator Transformation . . . 177
Unit 8 Lab: Load Date Dimension Table . . . 181
Instructions . . . 185
Step 1: Create a Shortcut to a Shared Relational Source Table . . . 185
Step 2: Create a Shortcut to a Shared Relational Target Table . . . 185
Step 3: Create a Mapping . . . 185
Step 4: Create a Sequence Generator Transformation . . . 186
Step 5: Link the Target Table . . . 187
Step 6: Create and Run the Workflow . . . 187
Data Results . . . 189
Unit 9: Lookup Caching, More Features and Techniques. . . 191
Lesson 9-1. Lookup Caching . . . 191
Lookup Cache . . . 192
Unit 9 Lab A: Load Promotions Dimension Table (Lookup and Persistent Cache) . . . . 195
Instructions . . . 198
Step 1: Create a Shortcut to a Shared Relational Source Table . . . 198
Step 2: Create a Shortcut to Shared Relational Target Table . . . 198
Step 3: Create a Mapping . . . 198
Step 4: Create Lookups for the Start and Expiry Date Keys . . . 198
Step 5: Create and Run the Workflow . . . 201
Data Results . . . 203
Unit 9 Lab B: Features and Techniques II . . . 205
Instructions . . . 206
Open a Mapping . . . 206
Feature 1: Find in Workspace . . . 206
Feature 2: View Object Dependencies . . . 207
Feature 3: Compare Objects . . . 208
Feature 4: Overview Window . . . 211
Unit 10: Sorter, Aggregator and Self-Join . . . 213
Lesson 10-1. Sorter Transformation . . . 213
Lesson 10-2. Aggregator Transformation . . . 216
Aggregator Cache . . . 219
Lesson 10-3. Active and Passive Transformations. . . 220
Lesson 10-4. Data Concatenation . . . 221
Lesson 10-5. Self-Join . . . 222
Unit 10 Lab: Reload the Employee Staging Table . . . 225
Instructions . . . 230
Step 1: Copy an Existing Mapping . . . 230
Step 2: Examine Source Data to Determine a Key for Self-Join . . . 230
Step 3: Prepare the New Mapping for Modification . . . 231
Step 4: Create a Sorter Transformation . . . 231
Step 5: Create a Filter Transformation . . . 232
Step 6: Create an Aggregator Transformation . . . 233
Step 7: Create a Joiner Transformation for the Self-Join . . . 233
Step 8: Get Salaries from the Lookup . . . 234
Step 9: Connect the Joiner and Lookup to the Target . . . 235
Step 10: Create and Run the Workflow . . . 236
Data Results . . . 238
Unit 11: Router, Update Strategy and Overrides . . . 241
Lesson 11-1. Router Transformation . . . 241
Lesson 11-2. Update Strategy Transformation . . . 244
Lesson 11-3. Expression Default Values . . . 246
Lesson 11-4. Source Qualifier Override . . . 247
Lesson 11-5. Target Override . . . 249
Lesson 11-6. Session Task Mapping Overrides . . . 250
Unit 11 Lab: Load Employee Dimension Table . . . 253
Instructions . . . 257
Step 1: Copy the Mapping . . . 257
Step 2: Edit the Expression Transformation . . . 257
Step 3: Create a Router Transformation . . . 258
Step 4: Create an Update Strategy for INSERTS . . . 258
Step 5: Create Lookup to DIM_DATES . . . 259
Step 6: Link upd_INSERTS and lkp_DIM_DATES_INSERTS to Target DIM_EMPLOYEE_INSERTS . . . 260
Step 7: Create an Update Strategy for UPDATES . . . 260
Step 8: Create Second Lookup to DIM_DATES . . . 260
Step 9: Link upd_UPDATES and lkp_DIM_DATES_UPDATES to Target DIM_EMPLOYEE_UPDATES . . . 260
Step 10: Link ERRORS Router Group to DIM_EMPLOYEES_ERR . . . 261
Data Results . . . 264
Step 12: Prepare, Run, and Monitor the Second Run . . . 264
Unit 12: Dynamic Lookup and Error Logging . . . 269
Lesson 12-1. Dynamic Lookup Cache . . . 269
Lesson 12-2. Error Logging . . . 274
Error Log Types . . . 275
Log Row Data . . . 277
Log Source Row Data . . . 277
Unit 12 Lab: Load Customer Dimension Table . . . 279
Instructions . . . 282
Step 1: Create a Relational Source Definition . . . 282
Step 2: Create a Relational Target Definition . . . 282
Step 3: Create a Mapping . . . 282
Step 4: Create a Lookup Transformation . . . 282
Step 5: Create a Filter Transformation . . . 284
Step 6: Create an Update Strategy . . . 284
Step 7: Create and Run the Workflow . . . 285
Data Results . . . 287
Error Log Results . . . 288
Reference . . . 288
Unit 13: Unconnected Lookup, Parameters and Variables . . . 291
Lesson 13-1. Unconnected Lookup Transformations . . . 291
Connected versus Unconnected Lookup Transformations . . . 294
Joins versus Lookups . . . 295
Lesson 13-2. System Variables . . . 295
Lesson 13-3. Mapping Parameters and Variables . . . 297
Unit 13 Lab: Load Sales Fact Table . . . 303
Instructions . . . 307
Step 1: Create an Internal Relationship Between two Source Tables . . . 307
Step 2: Create a Mapping Parameter . . . 307
Step 3: Step Three: Create an Unconnected Lookup . . . 308
Step 4: Add Unconnected Lookup Test to Expression . . . 309
Step 5: Create Aggregator Transformation . . . 309
Step 6: Create and Run the Workflow . . . 312
Data Results . . . 314
Unit 14: Mapplets . . . 317
Mapplets . . . 317
Mapping Input Transformation . . . 319
Mapping Output Transformation . . . 321
Unit 14 Lab: Create a Mapplet . . . 325
Instructions . . . 326
Step 1: Create the Mapplet . . . 326
Step 2: Add Mapplet to Mapping . . . 327
Unit 15: Mapping Design . . . 331
Lesson 15-1. Designing Mappings . . . 331
High Level Process Overview . . . 332
Mapping Specifics. . . 332
Unit 15 Workshop: Load Promotions Daily Aggregate Table . . . 335
Workshop Details . . . 336
Sources and Targets . . . 336
Mapping Details . . . 336
Workflow Details . . . 339
Run Details . . . 340
Unit 16: Workflow Variables and Tasks . . . 343
Lesson 16-1. Link Conditions . . . 343
Lesson 16-2. Workflow Variables . . . 344
Lesson 16-3. Assignment Task . . . 346
Lesson 16-4. Decision Task . . . 347
Lesson 16-5. Email Task . . . 348
Unit 16 Lab: Load Product Weekly Aggregate Table . . . 351
Instructions . . . 353
Step 1: Copy the Mappings . . . 353
Step 2: Copy the Existing Workflow . . . 353
Step 3: Create the Assignment Task . . . 354
Step 4: Create the Decision Task . . . 356
Step 5: Create the Session Task . . . 356
Step 6: Create the Email Task . . . 357
Step 7: Start the Workflow and Monitor the Results . . . 358
Unit 17: More Tasks and Reusability . . . 363
Lesson 17-1. Event Wait Task . . . 363
User-Defined Event . . . 365
Lesson 17-2. Event Raise Task . . . 366
Lesson 17-3. Command Task . . . 367
Lesson 17-4. Reusable Tasks . . . 369
Lesson 17-5. Reusable Session Tasks . . . 369
Lesson 17-6. Reusable Session Configurations . . . 370
Lesson 17-7. pmcmd Utility . . . 371
Unit 18: Worklets and More Tasks . . . 373
Lesson 18-1. Worklets . . . 373
Lesson 18-2. Timer Task . . . 376
Lesson 18-3. Control Task . . . 377
Unit 18 Lab: Load Inventory Fact Table . . . 381
Instructions . . . 383
Step 1: Copy the Mappings . . . 383
Step 2: Create a Worklet . . . 383
Step 3: Create a Session Task . . . 383
Step 4: Create a Timer Task . . . 383
Step 5: Create an Email Task . . . 384
Step 6: Create a Control Task . . . 385
Step 7: Create the Workflow . . . 386
Step 8: Start the Workflow and Monitor the Results . . . 387
Unit 19: Workflow Design . . . 389
Lesson 19-1. Designing Workflows . . . 389
Workflow Overview . . . 390
Workflow Specifics . . . 390
Unit 19 Workshop: Load All Staging Tables in Single Workflow . . . 393
Workshop Details . . . 393
List of Figures
Figure 2-1. Navigator Window . . . 18
Figure 2-2. DEV_SHARED Folder and Subfolders . . . 18
Figure 2-3. Designer Tools . . . 19
Figure 2-4. DEV_SHARED Target subfolder . . . 21
Figure 2-5. Student folder with new objects . . . 22
Figure 2-6. Application Toolbar . . . 22
Figure 2-7. Task Toolbar Default Position . . . 23
Figure 2-8. Task Toolbar After Being Moved . . . 24
Figure 2-9. Relational Connection Browser . . . 25
Figure 3-1. Normal view of the payment flat file definition displayed in the Source Analyzer . . . 38
Figure 3-2. Mapping with Source and Target Definitions . . . 39
Figure 3-3. Normal view of the completed mapping . . . 40
Figure 3-4. Completed Session Task Target Properties . . . 42
Figure 3-5. Completed Workflow . . . 43
Figure 3-6. Successful Run of a Workflow Depicted in the Task View of the Workflow Monitor . . . 43
Figure 3-7. Properties for the Completed Session Run . . . 44
Figure 3-8. Source/Target Statistics for the Completed Session Run . . . 44
Figure 3-9. Data Preview of the STG_PAYMENT Target Table . . . 45
Figure 3-10. Source Definitions with a PK/FK Relationship Displayed in the Source Analyzer . . . 52
Figure 3-11. Normal View of the Completed Mapping . . . 54
Figure 3-12. Generated SQL for the m_Stage_Product Mapping . . . 54
Figure 3-13. Properties of the Completed Session Run . . . 56
Figure 3-14. Source/Target Statistics for the Completed Session Run . . . 56
Figure 3-15. Data Preview of the STG_PRODUCT Target Table . . . 56
Figure 3-16. Normal view of the promotions flat file definition displayed in the Source Analyzer . . . 62
Figure 3-17. Iconic View of the Completed Mapping . . . 63
Figure 3-18. Properties of the Completed Session Run . . . 63
Figure 3-19. Source/Target Statistics for the Completed Session Run . . . 64
Figure 3-20. Data Preview of the STG_DEALERSHIP Target Table . . . 64
Figure 3-21. Data Preview of the STG_PROMOTIONS Target Table . . . 65
Figure 4-1. Source Analyzer View of the customer_layout Flat File Definition . . . 82
Figure 4-2. Target Designer View of the STG_CUSTOMERS Table Relational Definition . . . 83
Figure 4-3. Mapping with Source and Target Definitions . . . 83
Figure 4-4. Mapping with Newly Added Filter Transformation . . . 84
Figure 4-5. Properties Tab of the Filter Transformation . . . 85
Figure 4-6. Completed Properties Tab of the Filter Transformation . . . 85
Figure 4-7. Filter Transformation Linked to the Expression Transformation . . . 86
Figure 4-9. Iconic View of the Completed Mapping . . . 88
Figure 4-8. Sample Expression . . . 88
Figure 4-10. Session Task Source Properties . . . 90
Figure 4-11. Contents of the customer_list.txt File List . . . 90
Figure 4-12. Properties for the Completed Session Run . . . 91
Figure 4-13. Source/Target Statistics for the Completed Session Run . . . 91
Figure 4-14. Data Preview of the STG_CUSTOMERS Target Table . . . 92
Figure 4-15. General Properties for the Workflow . . . 93
Figure 4-16. Customized Repeat Selections . . . 94
Figure 5-1. Normal View of the Heterogeneous Sources, Source Qualifiers and Target . . . 106
Figure 5-2. Joiner Transformation Button . . . 107
Figure 5-3. Normal View of Heterogeneous Sources Connected to a Joiner Transformation . . . 107
Figure 5-5. Edit View of the Condition Tab for Joiner Transformation Without a Condition . . . 108
Figure 5-4. Edit View of the Ports Tab for the Joiner Transformation . . . 108
Figure 5-6. Edit View of the Condition Tab for the Joiner Transformation with Completed Condition . . . . 109
Figure 5-7. Normal View of Completed Mapping Heterogeneous Sources Not Displayed . . . 110
Figure 5-8. Task Details of the Completed Session Run . . . 111
Figure 5-9. Source/Target Statistics for the Session Run . . . 111
Figure 5-10. Data Preview of the STG_TRANSACTIONS Table . . . 112
Figure 5-11. View of an Unorganized Mapping . . . 116
Figure 5-12. Arranged View of a Mapping . . . 117
Figure 5-13. Iconic View of an Arranged Mapping . . . 117
Figure 5-14. Selecting Multiple Links . . . 118
Figure 5-15. Designer Warning Box . . . 118
Figure 5-16. Selecting the forward link path . . . 119
Figure 5-17. Highlighted forward link path . . . 119
Figure 5-18. Highlighted link path going forward and backward . . . 120
Figure 5-19. Selecting to propagate the attributes . . . 120
Figure 5-20. Propagation attribute dialog box . . . 121
Figure 5-21. Autolink dialog box . . . 122
Figure 5-22. Defining a prefix in the autolink dialog box . . . 123
Figure 5-23. Expression after the AGE port has been moved . . . 124
Figure 5-24. Click and drag method of moving ports . . . 124
Figure 5-25. Creating a transformation using the menu . . . 125
Figure 5-26. Create Transformation dialog box . . . 125
Figure 5-27. Normal View of the Newly Created Aggregator Transformation . . . 125
Figure 5-28. Zoom options . . . 126
Figure 5-29. Navigator window in the Designer . . . 127
Figure 6-1. Source Analyzer view of the employees_layout flat file definition . . . 142
Figure 6-2. Target Designer view of the STG_EMPLOYEES relational table definition . . . 142
Figure 6-3. Transformation edit dialog box showing how to make a transformation reusable . . . 143
Figure 6-4. Question box letting you know the action is irreversible . . . 143
Figure 6-5. Transformation edit dialog box of a reusable transformation . . . 143
Figure 6-6. Navigator window depicting the Transformations node . . . 144
Figure 6-7. Partial mapping with source and target . . . 144
Figure 6-8. Transformation Toolbar . . . 145
Figure 6-9. Lookup Transformation table location dialog box . . . 145
Figure 6-10. Dialog box 1 of the 3 step Flat File Import Wizard . . . 145
Figure 6-11. Normal view of the newly created Lookup Transformation . . . 146
Figure 6-12. Lookup Transformation condition box . . . 147
Figure 6-13. Source properties for the employee_list file list . . . 148
Figure 6-14. Task Details of the completed session run . . . 149
Figure 6-15. Source/Target Statistics of the completed session run . . . 149
Figure 6-16. Data Preview of the STG_EMPLOYEES target table. . . 150
Figure 6-17. Mapping with Source and Target definitions . . . . 157
Figure 6-18. Completed Mapping . . . 158
Figure 6-19. Task Details of the completed session run . . . 159
Figure 6-20. Source/Target Statistics for the session run . . . 160
Figure 6-22. Data preview of the STG_DATES table - screen 2 scrolled right . . . 161
Figure 7-1. Debug Session creation dialog box . . . 171
Figure 7-2. Debug Session connections dialog box . . . 171
Figure 7-3. Designer while running a Debug Session . . . 172
Figure 7-4. Customize Toolbars Dialog Box . . . 173
Figure 7-5. Debugger Toolbar . . . 173
Figure 8-1. Expanded view of m-DIM_DATES_LOAD . . . 186
Figure 8-2. Sequence Generator Transformation icon . . . 186
Figure 8-3. Normal view of the sequence generator NEXTVAL port connected to a target column . . . 186
Figure 8-4. Normal view of connected ports to the target . . . 187
Figure 8-5. Task Details of the completed session run . . . 188
Figure 8-6. Source/Target statistics for the session run . . . 188
Figure 8-7. Data Preview of the DIM_DATES table . . . 189
Figure 9-1. m_DIM_PROMOTIONS_LOAD mapping . . . 198
Figure 9-2. m_DIM_DATES from the previous lab that populated the DIM_DATES table . . . 199
Figure 9-3. Select Lookup Table . . . 199
Figure 9-4. Lookup Condition . . . 200
Figure 9-5. m_DIM_POROMOTIONS_LOAD completed mapping . . . 201
Figure 9-6. Task Details of the completed session run . . . 202
Figure 9-7. Source/Target Statistics of the completed session run . . . . 202
Figure 9-8. Data Preview of the DIM_PROMOTIONS target table . . . 203
Figure 9-9. Preview files created when Persistent Cache is set on Lookup Transformation . . . 203
Figure 9-10. Find in workspace dialog box . . . 206
Figure 9-11. View Dependencies dialog box . . . 207
Figure 9-12. Transformation compare objects dialog box . . . 208
Figure 9-13. Compare Transformation objects Properties details . . . 209
Figure 9-14. Target comparison dialog box . . . 210
Figure 9-15. Column differences between two target tables . . . 210
Figure 10-1. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD mapping . . . 230
Figure 10-2. Employee_central.txt. . . 230
Figure 10-3. Renaming an instance of a Reusable Transformation . . . 231
Figure 10-4. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD after most links removed . . . 231
Figure 10-5. Sorter Transformation Icon on Toolbar . . . 231
Figure 10-6. Aggregator Transformation Icon on Toolbar . . . . 233
Figure 10-7. Partial mapping flow depicting the flow from the Sorter to the Filter to the Aggregator. . . 233
Figure 10-8. Split data stream joined back together . . . 234
Figure 10-9. Iconic view of the completed self-join mapping. . . . 236
Figure 10-10. Source properties for the employee_list.txt file list . . . 236
Figure 10-11. Task Details of the completed session run . . . 237
Figure 10-12. Source/Target Statistics of the completed session run . . . 237
Figure 10-13. Data preview of the self-join of Managers and Employees in the STG_EMPLOYEES target table - screen 1 . . . 238
Figure 10-14. Data preview of the STG_EMPLOYEES target table - screen 2 scrolled right . . . 238
Figure 11-1. Mapping copy Target Dependencies dialog box . . . 257
Figure 11-2. Iconic view of the m_DIM_EMPLOYEES_MAPPING . . . 257
Figure 11-3. Router Groups . . . 258
Figure 11-4. Update Strategy set to INSERT . . . 259
Figure 11-5. Iconic view of the completed mapping . . . 261
Figure 11-6. Source Filter Value . . . 262
Figure 11-8. Task Details of the completed session run . . . 263
Figure 11-9. Source/Target Statistics . . . 263
Figure 11-10. Data Results for DIM_EMPLOYEES . . . 264
Figure 11-11. Data Results for the Error Flat File (Located on the Machine Hosting the Integration Service Process . . . 264
Figure 11-12. Task Details tab results for second run . . . 265
Figure 11-13. Source/Target Statistics for second run . . . 265
Figure 11-14. Data preview showing updates to the target table . . . 266
Figure 12-1. Port tab view of a dynamic Lookup . . . 283
Figure 12-2. Port to Port Association . . . 284
Figure 12-3. Iconic View of the Completed Mapping . . . 285
Figure 12-4. Error Log Choice Screen . . . 285
Figure 12-5. Task Details of the Completed Session Run . . . 286
Figure 12-6. Source/Target Statistics for the Session Run . . . 286
Figure 12-7. Data preview of the DIM_CUSTOMERS table . . . 287
Figure 12-8. Flat file error log. . . 288
Figure 13-1. Source Analyzer view of the STG_TRANSACTIONS and STG_PAYMENT tables . . . 307
Figure 13-2. Declare Parameters and Variables screen. . . 308
Figure 13-3. Parameter entry . . . 308
Figure 13-4. Lookup Ports tab showing input, output and return ports checked/unchecked . . . 309
Figure 13-5. Aggregator ports with Group By ports checked . . . 310
Figure 13-6. Finished Aggregator . . . 311
Figure 13-7. Aggregator to Target Links . . . 312
Figure 13-8. Iconic view of the completed mapping . . . 312
Figure 13-9. Task Details of the completed session run . . . 313
Figure 13-10. Source/Target Statistics of the completed session run . . . 313
Figure 13-11. Data Preview of the FACT_SALES target table . . . 314
Figure 14-1. Mapplet Designer view of mplt_AGG_SALES . . . 326
Figure 14-2. Mapplet Designer view of MPLT_AGG_SALES with Input and Output transformations . . . 327
Figure 14-3. Iconic view of the m_FACT_SALES_LOAD_MAPPLET_xx mapping . . . 328
Figure 15-1. Source table definition . . . 336
Figure 15-2. Target table definition . . . 336
Figure 15-3. Task Details of the completed session run . . . 340
Figure 15-4. Source/Target Statistics of the completed session run . . . 340
Figure 15-5. Data Preview of the FACT_PROMOTIONS_AGG_DAILY table . . . 340
Figure 16-1. Workflow variable declaration . . . 354
Figure 16-2. Link condition testing if a session run was successful . . . . 355
Figure 16-3. Assignment Task expression declaration . . . 355
Figure 16-4. Decision Task Expression . . . 356
Figure 16-5. Link condition testing for a Decision Task condition of TRUE . . . 357
Figure 16-6. Email Task Properties . . . 358
Figure 16-7. Completed Workflow . . . 358
Figure 16-8. Gantt chart view of the completed workflow run . . . 359
Figure 16-9. View Workflow Variables . . . 359
Figure 16-10. Value of the $$WORKFLOW_RUNS variable after first run . . . 360
Figure 16-11. Gantt chart view of the completed workflow run after the weekly load runs . . . 360
Figure 16-12. Task Details of the completed session run . . . 360
Figure 18-1. Timer Task Relative time setting . . . 384
Figure 18-2. Email Task Properties Tab . . . 385
Figure 18-4. Completed Worklet. . . 386 Figure 18-5. Completed Workflow . . . 387 Figure 18-6. Gantt chart view of the completed workflow run. . . 387
Preface
Welcome to the PowerCenter 8 Level I Developer course. Data integration is a large undertaking with many potential areas of concern. The PowerCenter infrastructure will greatly assist you in your data integration efforts and alleviate much of your risk. This course will prepare the developers for that challenge by teaching you the most commonly used components of the product.
The students will build a small data warehouse using PowerCenter to extract from source tables and files, transform the data, load it into a staging area and finally into the data warehouse. The instructor will teach you about mappings, transformations, sources, targets, workflows, sessions, workflow tasks, connections and the Velocity methodology.
About This Guide
Purpose
Welcome to the PowerCenter 8 Level I Developer course. This course is designed to:
♦ Enable you to use PowerCenter developer tools to:
♦ Create and debug mappings
♦ Create, run, monitor and troubleshoot workflows
♦ Provide experience in designing mappings
Audience
This course is designed for data integration and data warehousing implementers. You should be familiar with data integration and data warehousing terminology and in using Microsoft Windows.
Document Conventions
This guide uses the following formatting conventions:
If you see… It means… Example
> Indicates a submenu to navigate to. Click Repository > Connect.
In this example, you should click the Repository menu or button and choose Connect.
boldfaced text Indicates text you need to type or enter. Click the Rename button and name the new source definition S_EMPLOYEE.
UPPERCASE Database tables and column names are
shown in all UPPERCASE. T_ITEM_SUMMARY
italicized text Indicates a variable you must replace with
specific information. Connect to the Repository using the assigned login_id.
Note: The following paragraph provides
additional facts. Note: You can select multiple objects to import by using the Ctrl key. Tip: The following paragraph provides
Other Informatica Resources
In addition to the student guides, Informatica provides these other resources:
♦ Informatica Documentation
♦ Informatica Customer Portal
♦ Informatica web site
♦ Informatica Developer Network
♦ Informatica Knowledge Base
♦ Informatica Professional Certification
♦ Informatica Technical Support
Obtaining Informatica Documentation
You can access Informatica documentation from the product CD or online help.
Visiting Informatica Customer Portal
As an Informatica customer, you can access the Informatica Customer Portal site at http://
my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica Knowledge Base, and access to the Informatica user community.
Visiting the Informatica Web Site
You can access Informatica’s corporate web site at http://www.informatica.com. The site contains
information about Informatica, its background, upcoming events, and locating your closest sales office. You will also find product information, as well as literature and partner information. The services area of the site includes important information on technical support, training and education, and
implementation services.
Visiting the Informatica Developer Network
The Informatica Developer Network is a web-based forum for third-party software developers. You can access the Informatica Developer Network at the following URL:
http://devnet.informatica.com
The site contains information on how to create, market, and support customer-oriented add-on solutions based on interoperability interfaces for Informatica products.
Visiting the Informatica Knowledge Base
As an Informatica customer, you can access the Informatica Knowledge Base at http://
my.informatica.com. The Knowledge Base lets you search for documented solutions to known technical issues about Informatica products. It also includes frequently asked questions, technical white papers, and technical tips.
Obtaining Informatica Professional Certification
You can take, and pass, exams provided by Informatica to obtain Informatica Professional Certification. For more information, go to:
Providing Feedback
Email any comments on this guide to [email protected].
Obtaining Technical Support
There are many ways to access Informatica Technical Support. You can call or email your nearest Technical Support Center listed in the following table, or you can use our WebSupport Service. Use the following email addresses to contact Informatica Technical Support:
♦ [email protected] for technical inquiries
♦ [email protected] for general customer service requests
WebSupport requires a user name and password. You can request a user name and password at http:// my.informatica.com.
North America / South
America Europe / Middle East / Africa Asia / Australia
Informatica Corporation
Headquarters 100 Cardinal Way Redwood City, California 94063 United States Toll Free 877 463 2435 Standard Rate United States: 650 385 5800 Informatica Software Ltd. 6 Waltham Park
Waltham Road, White Waltham Maidenhead, Berkshire SL6 3TN United Kingdom Toll Free 00 800 4632 4357 Standard Rate Belgium: +32 15 281 702 France: +33 1 41 38 92 26 Germany: +49 1805 702 702 Netherlands: +31 306 022 797 United Kingdom: +44 1628 511 445 Informatica Business Solutions Pvt. Ltd.
301 & 302 Prestige Poseidon 139 Residency Road Bangalore 560 025 India Toll Free Australia: 00 11 800 4632 4357 Singapore: 001 800 4632 4357 Standard Rate India: +91 80 5112 5738
Unit 1: Data Integration Concepts
In this unit you will learn about:
♦ Informatica
♦ Data Integration
♦ Mapping and Transformations
♦ Workflows and Tasks
♦ Metadata
Lesson 1-1. Introducing Informatica
Informatica is affiliated with many standards organizations, including:
♦ Integration Consortium. www.eaiindustry.org
♦ Object Management Group (OMG). www.omg.org.
♦ Common Warehouse Metamodel (CWM). www.omg.org/cwm
♦ Enterprise Grid Alliance. www.gridalliance.org
♦ XML.org. www.xml.org
♦ Web Services Interoperability Organization. www.ws-i.org
♦ Supply-Chain Council. www.supply-chain.org
♦ Carnegie-Mellon Software Engineering Institute (SEI). www.sei.cmu.edu
♦ APICS Educational and Research Foundation. www.apics.org
♦ Shared Services and Business Process Outsourcing Association (SBPOA). www.sharedxpertise.org.
Additional resources about Informatica can be found on the following websites:
♦ www.informatica.com—provides information on Professional Services and Education Services
♦ my.informatica.com—provides access to Technical Support, product documentation, Velocity methodology, knowledge base, and mapping templates
♦ devnet.informatica.com—the Informatica Developers Network offers discussion forums, web seminars, and technical papers.
Lesson 1-2. Data Integration
Traditionally, data integration is a batch process—to extract, transform and load (ETL) data from transactional systems to data warehouses.
The ETL process can be imagined as an assembly line.
Informatica PowerCenter is deployed for a variety of batch and real-time data integration purposes:
♦ Data Migration. ERP consolidation, legacy conversion, new application implementation, system
consolidation
♦ Data Warehousing. Business intelligence reporting, data marts, data mart consolidation, operational
data stores
♦ Data Hubs. master data management; reference data hubs; single view of customer, product, supplier,
employee, etc.
♦ Business Activity Monitoring. business process improvement, real-time reporting
Informatica partners with Composite Software for Enterprise Information Integration (EII): on-the-fly federated views and real-time reporting of information spread across multiple data sources, without moving the data into a centralized repository.
Lesson 1-3. Mappings and Transformations
Mappings
Transformations
PowerCenter includes the following types of transformations:
♦ Passive. The number of rows entering and exiting the transformation are the same.
♦ Active. The number of rows exiting the transformation may not be the same as the number of rows
Commonly used PowerCenter transformations include:
♦ Source Qualifier - reads sources
♦ Filter - filters data conditionally
♦ Sorter - sorts data
♦ Expression - performs logical/mathematical functions on data
♦ Aggregator - sums, averages, maximum, minimum
♦ Joiner - joins two data flows
♦ Lookup - looks up a corresponding value from a table or flat file
Lesson 1-4. Tasks and Workflows
Tasks
Workflows
A workflow is a set of ordered tasks that describe runtime ETL processes. Tasks can be sequenced serially, in parallel and conditionally. Each linked icon represents a task.
Lesson 1-5. Metadata
Metadata, which means “data about data,” is information that describes data. Common contents of metadata include the source or author of a dataset, how the dataset should be accessed, and its limitations.
Unit 2: PowerCenter Components and User Interface
In this unit you will learn about:
♦ PowerCenter Architecture
♦ PowerCenter Client Tools
Lesson 2-1. PowerCenter Architecture
The following screenshot shows the PowerCenter Architecture:
♦ Sources—Can be relational tables or heterogeneous files (flat files, VSAM files and XML)
♦ Targets—Can be relational tables or heterogeneous files
♦ Integration Service—The engine that performs all of the extract, transform and load logic
♦ Repository Service—Manages connectivity to the metadata repositories that contain mapping and workflow definitions
♦ Repository Service Process—Multi-threaded process that retrieves, inserts and updates repository metadata
♦ Repository—Contains all of the metadata needed to run ETL processes
♦ Client Tools—Desktop tools used to populate the repository with metadata, execute workflows on the Integration Service, monitor the workflows and manage the repository
Lesson 2-2. PowerCenter Client Tools
Client tools run on Microsoft Windows.
All tools access the repository through the Repository Service.
Workflow Manager and Workflow Monitor connect to Integration Service.
Each client application has its own interface. The interfaces have toolbars, a navigation window to the left, a workspace to the right, and an output window at the bottom.
Within the Designer, you can display transformations in the following views:
♦ Iconized. Shows the transformation in relation to the rest of the mapping. This also minimizes the
screen space needed to display a mapping.
♦ Normal. Shows the flow of data through the transformation. This view is typically used when
copying/linking ports to other objects.
♦ Edit. Shows transformation ports and properties; allows editing. This view is used to add, edit, or
In the Workflow Manager, you can display tasks in the following views:
♦ Iconized (Session task example)
Unit 2 Lab: Using the Designer and Workflow Manager
Business Purpose
You have been asked to learn how to use Informatica PowerCenter in order to more efficiently accomplish your ETL objectives and automate the development process. Because you have limited or no prior exposure to this software, this exercise will serve to orient you to the basic development interfaces.
Technical Description
PowerCenter includes two development applications: the Designer, which you will use to create
mappings, and the Workflow Manager, which you will use to create and start workflows. This exercise is designed to serve as your first hands-on experience with PowerCenter, and supplement the instructor demonstrations. You will import source and target definitions from a shortcut folder into your own folder.
Objectives
♦ Learn how to navigate the repository folder structure.
♦ Understand the purpose of the tools accessed from the Designer and Workflow Manager.
♦ Create and save source and target shortcuts.
♦ Learn how to access and edit the database connection objects
Duration
Instructions
Step 1: Launch the Designer and Log Into the Repository
1. Launch the Designer client application from the desktop icon. If no desktop icon is present, select Start > Programs > Informatica PowerCenter … > Client > PowerCenter Designer.
2. Maximize the Designer window.
Note: Notice the Navigator window on the left side, which should resemble Figure 2-1. However, you
may see additional or fewer repositories, depending on your classroom environment.
3. Log into the PC8_DEV repository with the user name studentxx, where xx represents your student number as assigned by the instructor. The password is the same. Passwords are always case-sensitive.
Step 2: Navigate Folders
1. Double-click the folder DEV_SHARED. This opens the folder and shows you the subfolders associated with it. Figure 2-2 shows the Navigator:
Figure 2-1. Navigator Window
Tip: The user name to log into the repository is an application-level user name—it allows
PowerCenter to admit you to the repository with a specific set of application privileges. It is not a database user name.
Note: Notice that the DEV_SHARED folder has a small blue arm holding it. This icon denotes that DEV_SHARED is a shortcut folder. As you will see later in this lab, objects dragged from a shortcut folder into an open folder create shortcuts to the object.
2. Expand some of the subfolders to see the objects they hold.
Note that some subfolders are empty. When a new object, such as a target definition, is created within a folder, it automatically goes into the appropriate subfolder.
Note: Notice that within the Sources subfolder, the source objects are organized under individual
“nodes” (branches in the hierarchy), such as FlatFile, ODBC_EDW, etc. These are based on the type of source and the name of the Data Source Name that was used to import the source definition (more on this later). Very Important: You will need to click on these source nodes to locate source
definitions that may be “hiding” from view.
Each PowerCenter application, such as the Designer, shows only subfolders related to the objects that can be created and modified by that application. For example, in the Designer you only see subfolders for sources, targets, mappings, etc.
3. Double-click on your individual student folder.
For the remainder of the class, you will create and modify objects in this folder. Some pre-made objects have been provided as well.
Note: Your student folder is now the “open” folder. Only one folder at a time can be open. The
DEV_SHARED folder is now “expanded.” This distinction is important, as you will see later in this lab.
Step 3: Navigating the Designer Tools
1. Select the menu option Tools > Source Analyzer. The workspace to the right of the Navigator window changes to an empty space.
Note: Note the small toolbar directly to the right of the Navigator window, at the top. These are the
five Designer tools. Each tool allows you to create and modify one specific type of object, such as sources. Figure 2-3 shows the Designer tools with the first tool (the Source Analyzer) selected.
2. With your left mouse button, alternately toggle between the five tools.
Tip: Technically, all folders are “shared” with all users who have the appropriate folder permissions,
regardless whether it has the blue arm or not. Do not confuse repository folders with the directories visible in Windows Explorer. The folders are PowerCenter repository objects and are not related to Windows directories.
Tip: Subfolders are created and managed automatically. Users cannot create, delete, nest, or rename
subfolders.
The name of each tool is displayed in the upper left corner of the workspace when that tool is active.
Note: The main menu bar (very top of your screen) changes depending on which tool is active.
Because these menus are context-sensitive to which tool is active, you must already be in the appropriate tool to create or modify a specific type of object.
♦ The Source Analyzer tool is used to create or modify source objects. They may be relational, flat file, XML or COBOL sources.
♦ The Target Designer tool is used to create or modify target objects. They may be relational, flat file, or XML. It does not matter whether these targets are part of an actual data warehouse.
♦ The Transformation Developer tool is used to create or modify reusable transformations. Non-reusable transformations are created directly in a mapping or mapplet. This distinction will be covered later in the class.
♦ The Mapplet Designer tool is used to create or modify mapplets.
♦ The Mapping Designer tool is used to create or modify mappings.
Step 4: Create and Save Shortcuts
1. Ensure that the Target Designer is active and that your student folder is open.
2. To help view which folder is active, choose View > Workbook to view the PowerCenter Client in Workfbook view.
The PowerCenter Client displays tabs for each folder at the bottom of the Main window:
Important: In order to copy/shortcut any object into a folder, the destination folder (the folder you
are adding to) must be the open folder. If the destination folder is not open, the copy/shortcut will not work.
3. In the DEV_SHARED folder, expand the Targets subfolder by clicking on the + sign to the left of the subfolder. Figure 2-4 shows the Navigator window:
4. Drag and drop the STG_PAYMENT target from the Navigator into the Target Designer workspace. You will see the confirmation message, “Create a shortcut to the target table STG_PAYMENT?”
5. Click Yes at the confirmation message.
6. Expand the Targets subfolder in your Student folder. Note that you have added a shortcut of the STG_PAYMENT staging target table to your own folder.
7. Open the Source Analyzer tool in your student folder.
8. In the DEV_SHARED folder, expand the Sources subfolder and expand the FlatFile container.
9. Add shortcuts to your folder to the two source definitions listed below.
♦ PROMOTIONS
♦ PAYMENT
Figure 2-4. DEV_SHARED Target subfolder
Tip: PowerCenter shortcuts are “pointers” to the original object. They can be used but they cannot
be modified as shortcuts. The original object can be modified, and any changes will immediately affect all shortcuts to that object.
10.Confirm that your student folder appears similar to Figure 2-5:
11.Use the menu option Repository > Save to save these objects in your student folder.
Step 5: Launch the Workflow Manager
1. Left-click the toolbar icon for the Workflow Manager shown in Figure 2-6. This toolbar is usually above the Navigator window.
2. Confirm that the Workflow Manager launches and you are automatically logged into the repository the same way as you were in the Designer.
3. Maximize the Workflow Manager application.
Figure 2-5. Student folder with new objects
Tip: You should periodically save changes to the repository when using the Designer or the
Workflow Manager. The keyboard shortcut Ctrl+S can also be used. There is no “auto-save” feature.
Figure 2-6. Application Toolbar
Tip: Avoid having two or more “instances” of the same PowerCenter application (such as the
Workflow Manager) running on a machine at the same time. There is no benefit in doing this, and it can result in confusion when editing objects.
4. Browse through the various folders and subfolders in the Workflow Manager Navigator window as you did in the Designer. Note that only subfolders for the objects that can be created with the Workflow Manager are present, Tasks, Sessions, Worklets, and Workflows.
Note: Although a session object is a type of task, it gets its own subfolder because you will typically
have many more sessions than the other types of tasks. Only reusable sessions will appear in the Sessions subfolder. Likewise, only reusable tasks (except for sessions) will appear in the Tasks subfolder.
Step 6: Navigating the Workflow Manager Tools
1. Select the menu option Tools > Task Developer.Just as in the Designer, you will see the workspace clear itself and a toolbar appear to the right of the Navigator window. The idea is the same as with the Designer, except there are three tools instead of five.
2. With your left mouse button, alternately toggle between the three tools.
Note that the name of each tool is displayed in the upper left corner of the workspace when that tool is active. Note also the context-sensitive menus, as we did in the Designer.
♦ The Task Developer tool is used to create or modify reusable tasks.
♦ The Worklet Designer tool is used to create or modify worklets.
♦ The Workflow Designer tool is used to create or modify workflows
Step 7: Workflow Manager Task Toolbar
The Workflow Manager is equipped with a toolbar that shows an icon for each type of task that can be created. This toolbar is visible by default, but the default location of is at the top right-hand corner of the screen. We will move the toolbar to a more central location.
1. Locate the “vertical stripe” at the far-left hand side of the task bar, as shown in Figure 2-7:
2. With your left mouse button, drag the toolbar toward the left and drop it in a convenient location so that all of the buttons are visible. The top of your Workflow Manager should appear similar to Figure 2-8:
Step 8: Database Connection Objects
Later in the class, we will create sessions that will read data from database source and target tables. In order to open a connection to the respective databases, PowerCenter needs the database log-in and the designation (i.e., connection string, database name or server name).
Instead of requiring the user to type this information each time a session is created, PowerCenter allows us to create reusable and sharable database connection objects. These objects contain properties
describing one database connection. The objects can be associated with multiple sessions to describe either source, target, or lookup connections.
1. In the Workflow Manager, select the menu option Connections > Relational.
You will see the Relational Connection Browser similar to Figure 2-9.
Note: Note that each connection object is organized under a database type.
2. Double-click on the NATIVE_TRANS connection object to display its properties.
3. You will not have write privileges. Click OK.
Note: Note that the connection NATIVE_TRANS will log into the database with the user name sdbu.
The connection object will be shared among the students in the class.
4. Double-click on any of the other objects that have your student number. The NATIVE_STG07 connection, for example, will have the user name tdbu07. These are the individual student connections to be used to read from and write to your individual staging tables and the enterprise data warehouse (EDW) tables.
It’s intuitive to create additional connection objects. Experiment if you have extra time.
Figure 2-9. Relational Connection Browser
Unit 3: Source Qualifier
In this unit you will learn about:
♦ Source Qualifier transformation
♦ Velocity Methodology
♦ Source Qualifier joins
♦ Source pipelines
Lesson 3-1. Source Qualifier Transformation
Source Qualifier Transformation
Type
Active.Description
A Source Qualifier transformation:
♦ Selects records from flat file and relational table sources. Only those fields or columns used in the mapping are selected, based on the output connections.
♦ Converts the data from the source’s native datatype to the most compatible PowerCenter transformation datatype.
♦ Generates a SQL query for relational sources.
Properties
The following table describes the Source Qualifier transformation properties:
Business Purpose
The use of a Source Qualifier is a product requirement; other types of sources require equivalent transformations (XML Source Qualifier, etc.). It provides an efficient way to filter input fields/columns and to perform homogeneous joins.
Property Description
Sql Query Allows you to override the default SQL query that PowerCenter creates at runtime. User Defined Join Allows you to specify a join that replaces the default join created by PowerCenter.
Source Filter Allows you to create a where clause that will be inserted into the SQL query that is generated at runtime. The “where” portion of the statement is not required. For example:
Table1.ID = Table2.ID
Number of Sorted Ports PowerCenter will insert an order by clause in the generated SQL query. The order by will be on the number of ports specified, from the top down. For example, in the sq_Product_Product_Cost Source Qualifier, if the number of sorted ports = 2, the order by will be:
ORDER BY PRODUCT.PRODUCT_ID, PRODUCT.GROUP_ID.
Tracing Level Specifies the amount of detail written to the session log. Select Distinct Allows you to select distinct values only.
Pre SQL Allows you to specify SQL that will be run prior to the pipeline being run. The SQL will be run using the connection specified in the session task.
Post SQL Allows you to specify SQL that will be run after the pipeline has been run. The SQL will be run using the connection specified in the session task.
Datatype Conversion
Data can be converted from one datatype to another by:
♦ Passing data between ports with different datatypes
♦ Passing data from an expression to a port
♦ Using transformation functions
♦ Using transformation arithmetic operators Supported conversions are:
♦ Numeric datatypes <=> Other numeric datatypes
♦ Numeric datatypes <=> String
♦ Date/Time <=> Date or String (to convert from string to date the string must be in the default PowerCenter data format MM/DD/YYYY HH24:MI:SS)
Similarly, when writing to a target the Integration Service converts the data to the target’s native datatype. For further information, see the PowerCenter Client Help > Index > port-to-port data conversion.
Lesson 3-2. Velocity Methodology
In labs, we will use Informatica's Velocity methodology.
This methodology includes:
♦ Templates
♦ Mapping specification templates
♦ Source to target field matrix
♦ Naming conventions
♦ Object type prefixes: m_, exp_, agg_, wf_, s_, …
♦ Best practices
Velocity covers the entire data integration project life cycles: Phase 1: Manage Phase 2: Architect Phase 3: Design Phase 4: Build Phase 5: Test Phase 6: Deploy Phase 7: Operate
For more information, see http://devnet.informatica.com (requires registration).
Lab Project
The Mersche Motors data model consists of the following star schemas. The labs predominately use the Sales star schema.
Data is moved first to the staging area and from there to the data warehouse and target flat files.
The labs can source from flat files and/or a relational database.
Source Tables and Files
The source system has the following relational tables:
PRODUCT
PRODUCT_COST
The source system has the following flat files:
customer_layout dates employees_layout inventory payment promotions sales_transactions
Staging Area
The staging area has the following tables:
STG_CUSTOMERS STG_DATES STG_DEALERSHIP STG_EMPLOYEES STG_INVENTORY STG_PAYMENT STG_PRODUCT STG_PROMOTIONS STG_TRANSACTIONS
Data Warehouse
The data warehouse has the following tables:
DIM_CUSTOMERS DIM_DATES DIM_DEALERSHIP DIM_EMPLOYEES DIM_PAYMENT DIM_PRODUCT DIM_PROMOTIONS FACT_INVENTORY FACT_PRODUCT_AGG_DAILY FACT_PRODUCT_AGG_WEEKLY FACT_PROMOTIONS_AGG_DAILY FACT_SALES
Architecture and Connectivity
Architecture
The labs use the following architecture and connections: Integration Service: PC_IService
Repository Name: PC8_DEV Folders: Student 01 - 20 User Names: student01 - 20 Passwords: student01 - 20
Connectivity
ODBC Connections:Native Connections:
Source Tables ODBC_TRANS
Staging Area ODBC_STG (01 - 20)
Data Warehouse ODBC_EDW (01 - 20)
Source Tables NATIVE_TRANS
Staging Area NATIVE_STG (01 - 20)
Data Warehouse NATIVE_EDW (01 - 20)
Relational Source sdbu with password sdbu
Relational Targets tdbu01 - 20
Unit 3 Lab A: Load Payment Staging Table
Section 1: Pass-Through Mapping
Business Purpose
The staging area of the Mersche Motors data warehouse contains a table that assigns payment type descriptions for each payment ID. Because these descriptions may change, the table must be synchronized daily with the corresponding data located in the operational system. The operational system administrator uses a simple flat file to record and edit these descriptions.
Technical Description
PowerCenter will source from a delimited flat file and insert the data into a database table without performing data transformations. In order to avoid duplicate records in subsequent loads, we will configure PowerCenter to truncate the target table before each load.
Objectives
♦ Open the Designer Tools and switch between Workspaces
♦ Import a flat file definition
♦ Import a table definition
♦ Create a simple pass-through mapping
♦ Create a Session task to run the mapping and configure connectivity
♦ Create a Workflow to run the Session task
♦ Run the Workflow and monitor the results
Duration
Velocity Deliverable: Mapping Specifications
SOURCES
TARGETS
HIGH LEVEL PROCESS OVERVIEW
PROCESSING DESCRIPTION (DETAIL)
This is a pass-through mapping with no data transformation.
Mapping Name m_Stage_Payment_Type_xx
Source System Flat file Target System Oracle Table
Initial Rows 5 Rows/Load 5
Short Description Simple pass-through mapping, comma-delimited flat-file to Oracle table
Load Frequency
Preprocessing Target truncate
Post Processing
Error Strategy Default
Reload Strategy Unique Source Fields
PAYMENT_ID, PAYMENT_TYPE_DESC
Tables
Table Name Schema/Owner Selection/Filter
Files
File Name File Location Fixed/Delimited Additional File Info
payment.txt C:\pmfiles\SrcFiles Delimited Comma delimiter
Tables Schema Owner TDBUxx
Table Name Update Delete Insert Unique Key
STG_PAYMENT X
S
O
URCE T
O
T
A
RGET
F
IEL
D M
A
T
R
IX
Ta rg et Ta bl e Ta rg et C olum n Dat at yp e S ou rce Ta bl e S ou rce Co lu m n Dat at yp e E xp res si on Def au lt Va lu e if N ulls D at a I ssu es / qu alit y S TG _PAY M EN T P ay m ent _i d Num ber (3 ,0 ) PAY M EN T P ay m ent _i d D ec im al (3, 0) S TG _PAY M EN T P ay m ent _t ype_d es c Va rc har 2( 20 ) PAY M EN T P ay m ent _t ype_d es c S tri ng (1 0) Ve lo ci ty B est P ract ice: Th is i s th e V el o ci ty Source to T arg et F iel d Ma tri x. I t is d is p la ye d h ere fo r you referen ce. I n fu ture la b s w e w ill b e us in g a s h or ten ed v ers io n of th e m atri x.Instructions
Step 1: Launch the Designer and Review the Source and Target Definitions
1. Launch the Designer application by selecting Start > Programs > Informatica PowerCenter … >Client > PowerCenter Designer.
2. Log into the PC8_DEV repository with the user name studentxx and password studentxx where xx represents your student number as assigned by the instructor.
3. Open your student folder by double-clicking on it.
4. Open the Source Analyzer selecting the menu option Tools > Source Analyzer.
5. Drag the Shortcut_to_payment source file from the Sources subfolder into the Source Analyzer workspace.
Confirm that your source definition appears the same as displayed in Figure 3-1. You may have to drag the box wider to see the Length column.
6. Open the Target Designer by clicking the respective icon in the toolbar. The icon is shown highlighted below:
7. Drag the Shortcut_to_STG_PAYMENT target table definition from the Targets subfolder into the Target Designer workspace.
8. Review the target definition.
Step 2: Create a Mapping
1. Open the Mapping Designer by clicking the respective icon in the toolbar. The icon is shown highlighted below:
Tip: If an instance of the Designer is already running on your workstation, do not launch another
instance. It is unnecessary and potentially confusing to run more than one instance per workstation.