• No results found

Informatica Student Guide

N/A
N/A
Protected

Academic year: 2021

Share "Informatica Student Guide"

Copied!
420
0
0

Loading.... (view fulltext now)

Full text

(1)

Informatica

®

PowerCenter

®

8

Level I Developer

Student Guide

(2)

Informatica PowerCenter 8 Level I Developer Student Guide Version 8.1

April 2006

Copyright (c) 1998–2006 Informatica Corporation. All rights reserved. Printed in the USA.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. Informatica Corporation does not warrant that this documentation is error free. Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX, and SuperGlue are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Portions of this software are copyrighted by DataDirect Technologies, 1999-2002.

Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.

Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration® is a registered trademark of Meta Integration Technology, Inc.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/). The Apache Software is Copyright (c) 1999-2005 The Apache Software Foundation. All rights reserved.

This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit and redistribution of this software is subject to terms available at http://www.openssl.org. Copyright 1998-2003 The OpenSSL Project. All Rights Reserved.

The zlib library included with this software is Copyright (c) 1995-2003 Jean-loup Gailly and Mark Adler.

The Curl license provided with this Software is Copyright 1996-200, Daniel Stenberg, <[email protected]>. All Rights Reserved.

The PCRE library included with this software is Copyright (c) 1997-2001 University of Cambridge Regular expression support is provided by the PCRE library package, which is open source software, written by Philip Hazel. The source for this library may be found at ftp://ftp.csx.cam.ac.uk/pub/software/programming/ pcre.

InstallAnywhere is Copyright 2005 Zero G Software, Inc. All Rights Reserved.

Portions of the Software are Copyright (c) 1998-2005 The OpenLDAP Foundation. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted only as authorized by the OpenLDAP Public License, available at http://www.openldap.org/software/release/license.html. This Software is protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775 and other U.S. Patents Pending. DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied,

including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or changes in the products described in this documentation at any time without notice.

(3)

Table of Contents

List of Figures . . . xiii

Preface . . . xix

About This Guide . . . .xx

Purpose . . . .xx

Audience . . . .xx

Document Conventions . . . .xx

Other Informatica Resources . . . xxi

Obtaining Informatica Documentation . . . xxi

Visiting Informatica Customer Portal . . . xxi

Visiting the Informatica Web Site . . . xxi

Visiting the Informatica Developer Network . . . xxi

Visiting the Informatica Knowledge Base . . . xxi

Obtaining Informatica Professional Certification . . . xxi

Providing Feedback . . . xxii

Obtaining Technical Support . . . xxii

Unit 1: Data Integration Concepts . . . 1

Lesson 1-1. Introducing Informatica . . . 1

Lesson 1-2. Data Integration . . . 3

Lesson 1-3. Mappings and Transformations . . . 5

Mappings . . . 5

Transformations . . . 5

Lesson 1-4. Tasks and Workflows . . . 7

Tasks . . . 7

Workflows . . . 8

Lesson 1-5. Metadata . . . 9

Unit 2: PowerCenter Components and User Interface . . . 11

Lesson 2-1. PowerCenter Architecture . . . 11

Lesson 2-2. PowerCenter Client Tools . . . 12

Designer . . . 13

Workflow Manager . . . 15

Unit 2 Lab: Using the Designer and Workflow Manager . . . 17

Instructions . . . 18

Step 1: Launch the Designer and Log Into the Repository . . . 18

Step 2: Navigate Folders . . . 18

(4)

Step 4: Create and Save Shortcuts . . . 20

Step 5: Launch the Workflow Manager . . . 22

Step 6: Navigating the Workflow Manager Tools . . . 23

Step 7: Workflow Manager Task Toolbar . . . 23

Step 8: Database Connection Objects . . . 24

Unit 3: Source Qualifier . . . 27

Lesson 3-1. Source Qualifier Transformation . . . 27

Source Qualifier Transformation . . . 27

Datatype Conversion . . . 29

Lesson 3-2. Velocity Methodology . . . 30

Lab Project . . . 30

Architecture and Connectivity . . . 33

Unit 3 Lab A: Load Payment Staging Table. . . 35

Section 1: Pass-Through Mapping . . . 35

Instructions . . . 38

Step 1: Launch the Designer and Review the Source and Target Definitions . . . 38

Step 2: Create a Mapping . . . 38

Step 3: Create a Workflow and a Session Task . . . 40

Step 4: Run the Workflow and Monitor the Results . . . 43

Data Results . . . 45

Lesson 3-3. Source Qualifier Joins . . . 46

Unit 3 Lab B: Load Product Staging Table . . . 49

Section 2: Homogeneous Join . . . 49

Instructions . . . 52

Step 1: Import the Source Definitions . . . 52

Step 2: Import the Relational Target Definition . . . 53

Step 3: Create the Mapping . . . 53

Step 4: Create the Session and Workflow . . . 55

Step 5: Run the Workflow and Monitor the Results . . . 55

Lesson 3-4. Source Pipelines . . . 57

Unit 3 Lab C: Load Dealership and Promotions Staging Table . . . 59

Section 3: Two Pipeline Mapping . . . 59

Instructions . . . 62

Step 1: Import the Source Definitions . . . 62

Step 2: Import the Target Definitions . . . 62

Step 3: Create the Mapping . . . 62

(5)

Unit 4: Expression, Filter, File Lists, and Workflow Scheduler . . . 67

Lesson 4-1. Expression Transformation . . . 67

Lesson 4-2. Filter Transformation . . . 73

Lesson 4-3. File Lists . . . 75

Lesson 4-4. Workflow Scheduler . . . 76

Run Options . . . 76

Unit 4 Lab: Load the Customer Staging Table . . . 79

Instructions . . . 82

Step 1: Create a Flat File Source Definition . . . 82

Step 2: Create a Relational Target Definition . . . 82

Step 3: Create a Mapping . . . 83

Step 4: Create a Filter Transformation . . . 83

Step 5: Create an Expression Transformation . . . 85

Step 6: Create and Run the Workflow . . . 88

Step 7: Schedule a Workflow . . . 92

References . . . 95

Unit 5: Joins, Features and Techniques . . . 97

Lesson 5-1. Joiner Transformation . . . 97

Joiner Properties . . . 99

Join Types . . . 100

Joiner Cache . . . 100

Lesson 5-2. Shortcuts . . . 101

Unit 5 Lab A: Load Sales Transaction Staging Table . . . 103

Instructions . . . 106

Step 1: Create a Flat File Source Definition . . . 106

Step 2: Create a Relational Source Definition . . . 106

Step 3: Create a Relational Target Definition . . . 106

Step 4: Create a Mapping . . . 106

Step 5: Create a Joiner Transformation . . . 107

Step 6: Link the Target Table . . . 109

Step 7: Create a Workflow and Session Task . . . 110

Step 8: Start the Workflow and View Results in the Workflow Monitor . . . 111

Data Results . . . 112

Unit 5 Lab B: Features and Techniques I . . . 115

Instructions . . . 116

Open a Mapping . . . 116

Feature 1: Auto Arrange . . . 116

(6)

Feature 3: Revert to Saved . . . 118

Feature 4: Link Path . . . 119

Feature 5: Propagating Ports . . . 120

Feature 6: Autolink by Name and Position . . . 121

Feature 7: Moving Ports . . . 123

Feature 8: Shortcut to Port Editing from Normal View . . . 124

Feature 9: Create Transformation Methods . . . 124

Feature 10: Scale-to-Fit . . . 125

Feature 11: Designer Options . . . 126

Feature 12: Object Shortcuts and Copies . . . 127

Feature 13: Copy Objects Within and Between Mappings . . . 128

Unit 6: Lookups and Reusable Transformations . . . 131

Lesson 6-1. Lookup Transformation (Connected) . . . 131

Lesson 6-2. Reusable Transformations . . . 134

Unit 6 Lab A: Load Employee Staging Table . . . 137

Instructions . . . 142

Step 1: Create a Flat File Source Definition . . . 142

Step 2: Create a Relational Target Definition . . . 142

Step 3: Step Three: Create a Reusable Transformation . . . 142

Step 4: Create a Mapping . . . 144

Step 5: Create a Lookup Transformation . . . 145

Step 6: Add a Reusable Expression Transformation . . . 147

Step 7: Link Transformations . . . 147

Step 8: Create and Run the Workflow . . . 148

Unit 6 Lab B: Load Date Staging Table . . . 153

Instructions . . . 157

Step 1: Create a Flat File Source Definition . . . 157

Step 2: Create a Relational Target Definition . . . 157

Step 3: Create a Mapping . . . 157

Step 4: Create a Workflow and a Session Task . . . 159

Step 5: Run the Workflow and Monitor the Results . . . 159

Data Results . . . 161

Unit 7: Debugger . . . 163

Lesson 7-1. Debugging Mappings . . . 163

Unit 7 Lab: Using the Debugger . . . 167

Instructions . . . 170

(7)

Step 2: Step Through the Debug Wizard . . . 170

Step 3: Use the Debugger to Locate the Error . . . 172

Step 4: Fix the Error and Confirm the Data is Correct . . . 174

Unit 8: Sequence Generator . . . 177

Lesson 8-1. Sequence Generator Transformation . . . 177

Unit 8 Lab: Load Date Dimension Table . . . 181

Instructions . . . 185

Step 1: Create a Shortcut to a Shared Relational Source Table . . . 185

Step 2: Create a Shortcut to a Shared Relational Target Table . . . 185

Step 3: Create a Mapping . . . 185

Step 4: Create a Sequence Generator Transformation . . . 186

Step 5: Link the Target Table . . . 187

Step 6: Create and Run the Workflow . . . 187

Data Results . . . 189

Unit 9: Lookup Caching, More Features and Techniques. . . 191

Lesson 9-1. Lookup Caching . . . 191

Lookup Cache . . . 192

Unit 9 Lab A: Load Promotions Dimension Table (Lookup and Persistent Cache) . . . . 195

Instructions . . . 198

Step 1: Create a Shortcut to a Shared Relational Source Table . . . 198

Step 2: Create a Shortcut to Shared Relational Target Table . . . 198

Step 3: Create a Mapping . . . 198

Step 4: Create Lookups for the Start and Expiry Date Keys . . . 198

Step 5: Create and Run the Workflow . . . 201

Data Results . . . 203

Unit 9 Lab B: Features and Techniques II . . . 205

Instructions . . . 206

Open a Mapping . . . 206

Feature 1: Find in Workspace . . . 206

Feature 2: View Object Dependencies . . . 207

Feature 3: Compare Objects . . . 208

Feature 4: Overview Window . . . 211

Unit 10: Sorter, Aggregator and Self-Join . . . 213

Lesson 10-1. Sorter Transformation . . . 213

(8)

Lesson 10-2. Aggregator Transformation . . . 216

Aggregator Cache . . . 219

Lesson 10-3. Active and Passive Transformations. . . 220

Lesson 10-4. Data Concatenation . . . 221

Lesson 10-5. Self-Join . . . 222

Unit 10 Lab: Reload the Employee Staging Table . . . 225

Instructions . . . 230

Step 1: Copy an Existing Mapping . . . 230

Step 2: Examine Source Data to Determine a Key for Self-Join . . . 230

Step 3: Prepare the New Mapping for Modification . . . 231

Step 4: Create a Sorter Transformation . . . 231

Step 5: Create a Filter Transformation . . . 232

Step 6: Create an Aggregator Transformation . . . 233

Step 7: Create a Joiner Transformation for the Self-Join . . . 233

Step 8: Get Salaries from the Lookup . . . 234

Step 9: Connect the Joiner and Lookup to the Target . . . 235

Step 10: Create and Run the Workflow . . . 236

Data Results . . . 238

Unit 11: Router, Update Strategy and Overrides . . . 241

Lesson 11-1. Router Transformation . . . 241

Lesson 11-2. Update Strategy Transformation . . . 244

Lesson 11-3. Expression Default Values . . . 246

Lesson 11-4. Source Qualifier Override . . . 247

Lesson 11-5. Target Override . . . 249

Lesson 11-6. Session Task Mapping Overrides . . . 250

Unit 11 Lab: Load Employee Dimension Table . . . 253

Instructions . . . 257

Step 1: Copy the Mapping . . . 257

Step 2: Edit the Expression Transformation . . . 257

Step 3: Create a Router Transformation . . . 258

Step 4: Create an Update Strategy for INSERTS . . . 258

Step 5: Create Lookup to DIM_DATES . . . 259

Step 6: Link upd_INSERTS and lkp_DIM_DATES_INSERTS to Target DIM_EMPLOYEE_INSERTS . . . 260

Step 7: Create an Update Strategy for UPDATES . . . 260

Step 8: Create Second Lookup to DIM_DATES . . . 260

Step 9: Link upd_UPDATES and lkp_DIM_DATES_UPDATES to Target DIM_EMPLOYEE_UPDATES . . . 260

Step 10: Link ERRORS Router Group to DIM_EMPLOYEES_ERR . . . 261

(9)

Data Results . . . 264

Step 12: Prepare, Run, and Monitor the Second Run . . . 264

Unit 12: Dynamic Lookup and Error Logging . . . 269

Lesson 12-1. Dynamic Lookup Cache . . . 269

Lesson 12-2. Error Logging . . . 274

Error Log Types . . . 275

Log Row Data . . . 277

Log Source Row Data . . . 277

Unit 12 Lab: Load Customer Dimension Table . . . 279

Instructions . . . 282

Step 1: Create a Relational Source Definition . . . 282

Step 2: Create a Relational Target Definition . . . 282

Step 3: Create a Mapping . . . 282

Step 4: Create a Lookup Transformation . . . 282

Step 5: Create a Filter Transformation . . . 284

Step 6: Create an Update Strategy . . . 284

Step 7: Create and Run the Workflow . . . 285

Data Results . . . 287

Error Log Results . . . 288

Reference . . . 288

Unit 13: Unconnected Lookup, Parameters and Variables . . . 291

Lesson 13-1. Unconnected Lookup Transformations . . . 291

Connected versus Unconnected Lookup Transformations . . . 294

Joins versus Lookups . . . 295

Lesson 13-2. System Variables . . . 295

Lesson 13-3. Mapping Parameters and Variables . . . 297

Unit 13 Lab: Load Sales Fact Table . . . 303

Instructions . . . 307

Step 1: Create an Internal Relationship Between two Source Tables . . . 307

Step 2: Create a Mapping Parameter . . . 307

Step 3: Step Three: Create an Unconnected Lookup . . . 308

Step 4: Add Unconnected Lookup Test to Expression . . . 309

Step 5: Create Aggregator Transformation . . . 309

Step 6: Create and Run the Workflow . . . 312

Data Results . . . 314

Unit 14: Mapplets . . . 317

(10)

Mapplets . . . 317

Mapping Input Transformation . . . 319

Mapping Output Transformation . . . 321

Unit 14 Lab: Create a Mapplet . . . 325

Instructions . . . 326

Step 1: Create the Mapplet . . . 326

Step 2: Add Mapplet to Mapping . . . 327

Unit 15: Mapping Design . . . 331

Lesson 15-1. Designing Mappings . . . 331

High Level Process Overview . . . 332

Mapping Specifics. . . 332

Unit 15 Workshop: Load Promotions Daily Aggregate Table . . . 335

Workshop Details . . . 336

Sources and Targets . . . 336

Mapping Details . . . 336

Workflow Details . . . 339

Run Details . . . 340

Unit 16: Workflow Variables and Tasks . . . 343

Lesson 16-1. Link Conditions . . . 343

Lesson 16-2. Workflow Variables . . . 344

Lesson 16-3. Assignment Task . . . 346

Lesson 16-4. Decision Task . . . 347

Lesson 16-5. Email Task . . . 348

Unit 16 Lab: Load Product Weekly Aggregate Table . . . 351

Instructions . . . 353

Step 1: Copy the Mappings . . . 353

Step 2: Copy the Existing Workflow . . . 353

Step 3: Create the Assignment Task . . . 354

Step 4: Create the Decision Task . . . 356

Step 5: Create the Session Task . . . 356

Step 6: Create the Email Task . . . 357

Step 7: Start the Workflow and Monitor the Results . . . 358

Unit 17: More Tasks and Reusability . . . 363

Lesson 17-1. Event Wait Task . . . 363

(11)

User-Defined Event . . . 365

Lesson 17-2. Event Raise Task . . . 366

Lesson 17-3. Command Task . . . 367

Lesson 17-4. Reusable Tasks . . . 369

Lesson 17-5. Reusable Session Tasks . . . 369

Lesson 17-6. Reusable Session Configurations . . . 370

Lesson 17-7. pmcmd Utility . . . 371

Unit 18: Worklets and More Tasks . . . 373

Lesson 18-1. Worklets . . . 373

Lesson 18-2. Timer Task . . . 376

Lesson 18-3. Control Task . . . 377

Unit 18 Lab: Load Inventory Fact Table . . . 381

Instructions . . . 383

Step 1: Copy the Mappings . . . 383

Step 2: Create a Worklet . . . 383

Step 3: Create a Session Task . . . 383

Step 4: Create a Timer Task . . . 383

Step 5: Create an Email Task . . . 384

Step 6: Create a Control Task . . . 385

Step 7: Create the Workflow . . . 386

Step 8: Start the Workflow and Monitor the Results . . . 387

Unit 19: Workflow Design . . . 389

Lesson 19-1. Designing Workflows . . . 389

Workflow Overview . . . 390

Workflow Specifics . . . 390

Unit 19 Workshop: Load All Staging Tables in Single Workflow . . . 393

Workshop Details . . . 393

(12)
(13)

List of Figures

Figure 2-1. Navigator Window . . . 18

Figure 2-2. DEV_SHARED Folder and Subfolders . . . 18

Figure 2-3. Designer Tools . . . 19

Figure 2-4. DEV_SHARED Target subfolder . . . 21

Figure 2-5. Student folder with new objects . . . 22

Figure 2-6. Application Toolbar . . . 22

Figure 2-7. Task Toolbar Default Position . . . 23

Figure 2-8. Task Toolbar After Being Moved . . . 24

Figure 2-9. Relational Connection Browser . . . 25

Figure 3-1. Normal view of the payment flat file definition displayed in the Source Analyzer . . . 38

Figure 3-2. Mapping with Source and Target Definitions . . . 39

Figure 3-3. Normal view of the completed mapping . . . 40

Figure 3-4. Completed Session Task Target Properties . . . 42

Figure 3-5. Completed Workflow . . . 43

Figure 3-6. Successful Run of a Workflow Depicted in the Task View of the Workflow Monitor . . . 43

Figure 3-7. Properties for the Completed Session Run . . . 44

Figure 3-8. Source/Target Statistics for the Completed Session Run . . . 44

Figure 3-9. Data Preview of the STG_PAYMENT Target Table . . . 45

Figure 3-10. Source Definitions with a PK/FK Relationship Displayed in the Source Analyzer . . . 52

Figure 3-11. Normal View of the Completed Mapping . . . 54

Figure 3-12. Generated SQL for the m_Stage_Product Mapping . . . 54

Figure 3-13. Properties of the Completed Session Run . . . 56

Figure 3-14. Source/Target Statistics for the Completed Session Run . . . 56

Figure 3-15. Data Preview of the STG_PRODUCT Target Table . . . 56

Figure 3-16. Normal view of the promotions flat file definition displayed in the Source Analyzer . . . 62

Figure 3-17. Iconic View of the Completed Mapping . . . 63

Figure 3-18. Properties of the Completed Session Run . . . 63

Figure 3-19. Source/Target Statistics for the Completed Session Run . . . 64

Figure 3-20. Data Preview of the STG_DEALERSHIP Target Table . . . 64

Figure 3-21. Data Preview of the STG_PROMOTIONS Target Table . . . 65

Figure 4-1. Source Analyzer View of the customer_layout Flat File Definition . . . 82

Figure 4-2. Target Designer View of the STG_CUSTOMERS Table Relational Definition . . . 83

Figure 4-3. Mapping with Source and Target Definitions . . . 83

Figure 4-4. Mapping with Newly Added Filter Transformation . . . 84

Figure 4-5. Properties Tab of the Filter Transformation . . . 85

Figure 4-6. Completed Properties Tab of the Filter Transformation . . . 85

Figure 4-7. Filter Transformation Linked to the Expression Transformation . . . 86

Figure 4-9. Iconic View of the Completed Mapping . . . 88

Figure 4-8. Sample Expression . . . 88

Figure 4-10. Session Task Source Properties . . . 90

Figure 4-11. Contents of the customer_list.txt File List . . . 90

Figure 4-12. Properties for the Completed Session Run . . . 91

Figure 4-13. Source/Target Statistics for the Completed Session Run . . . 91

Figure 4-14. Data Preview of the STG_CUSTOMERS Target Table . . . 92

Figure 4-15. General Properties for the Workflow . . . 93

Figure 4-16. Customized Repeat Selections . . . 94

(14)

Figure 5-1. Normal View of the Heterogeneous Sources, Source Qualifiers and Target . . . 106

Figure 5-2. Joiner Transformation Button . . . 107

Figure 5-3. Normal View of Heterogeneous Sources Connected to a Joiner Transformation . . . 107

Figure 5-5. Edit View of the Condition Tab for Joiner Transformation Without a Condition . . . 108

Figure 5-4. Edit View of the Ports Tab for the Joiner Transformation . . . 108

Figure 5-6. Edit View of the Condition Tab for the Joiner Transformation with Completed Condition . . . . 109

Figure 5-7. Normal View of Completed Mapping Heterogeneous Sources Not Displayed . . . 110

Figure 5-8. Task Details of the Completed Session Run . . . 111

Figure 5-9. Source/Target Statistics for the Session Run . . . 111

Figure 5-10. Data Preview of the STG_TRANSACTIONS Table . . . 112

Figure 5-11. View of an Unorganized Mapping . . . 116

Figure 5-12. Arranged View of a Mapping . . . 117

Figure 5-13. Iconic View of an Arranged Mapping . . . 117

Figure 5-14. Selecting Multiple Links . . . 118

Figure 5-15. Designer Warning Box . . . 118

Figure 5-16. Selecting the forward link path . . . 119

Figure 5-17. Highlighted forward link path . . . 119

Figure 5-18. Highlighted link path going forward and backward . . . 120

Figure 5-19. Selecting to propagate the attributes . . . 120

Figure 5-20. Propagation attribute dialog box . . . 121

Figure 5-21. Autolink dialog box . . . 122

Figure 5-22. Defining a prefix in the autolink dialog box . . . 123

Figure 5-23. Expression after the AGE port has been moved . . . 124

Figure 5-24. Click and drag method of moving ports . . . 124

Figure 5-25. Creating a transformation using the menu . . . 125

Figure 5-26. Create Transformation dialog box . . . 125

Figure 5-27. Normal View of the Newly Created Aggregator Transformation . . . 125

Figure 5-28. Zoom options . . . 126

Figure 5-29. Navigator window in the Designer . . . 127

Figure 6-1. Source Analyzer view of the employees_layout flat file definition . . . 142

Figure 6-2. Target Designer view of the STG_EMPLOYEES relational table definition . . . 142

Figure 6-3. Transformation edit dialog box showing how to make a transformation reusable . . . 143

Figure 6-4. Question box letting you know the action is irreversible . . . 143

Figure 6-5. Transformation edit dialog box of a reusable transformation . . . 143

Figure 6-6. Navigator window depicting the Transformations node . . . 144

Figure 6-7. Partial mapping with source and target . . . 144

Figure 6-8. Transformation Toolbar . . . 145

Figure 6-9. Lookup Transformation table location dialog box . . . 145

Figure 6-10. Dialog box 1 of the 3 step Flat File Import Wizard . . . 145

Figure 6-11. Normal view of the newly created Lookup Transformation . . . 146

Figure 6-12. Lookup Transformation condition box . . . 147

Figure 6-13. Source properties for the employee_list file list . . . 148

Figure 6-14. Task Details of the completed session run . . . 149

Figure 6-15. Source/Target Statistics of the completed session run . . . 149

Figure 6-16. Data Preview of the STG_EMPLOYEES target table. . . 150

Figure 6-17. Mapping with Source and Target definitions . . . . 157

Figure 6-18. Completed Mapping . . . 158

Figure 6-19. Task Details of the completed session run . . . 159

Figure 6-20. Source/Target Statistics for the session run . . . 160

(15)

Figure 6-22. Data preview of the STG_DATES table - screen 2 scrolled right . . . 161

Figure 7-1. Debug Session creation dialog box . . . 171

Figure 7-2. Debug Session connections dialog box . . . 171

Figure 7-3. Designer while running a Debug Session . . . 172

Figure 7-4. Customize Toolbars Dialog Box . . . 173

Figure 7-5. Debugger Toolbar . . . 173

Figure 8-1. Expanded view of m-DIM_DATES_LOAD . . . 186

Figure 8-2. Sequence Generator Transformation icon . . . 186

Figure 8-3. Normal view of the sequence generator NEXTVAL port connected to a target column . . . 186

Figure 8-4. Normal view of connected ports to the target . . . 187

Figure 8-5. Task Details of the completed session run . . . 188

Figure 8-6. Source/Target statistics for the session run . . . 188

Figure 8-7. Data Preview of the DIM_DATES table . . . 189

Figure 9-1. m_DIM_PROMOTIONS_LOAD mapping . . . 198

Figure 9-2. m_DIM_DATES from the previous lab that populated the DIM_DATES table . . . 199

Figure 9-3. Select Lookup Table . . . 199

Figure 9-4. Lookup Condition . . . 200

Figure 9-5. m_DIM_POROMOTIONS_LOAD completed mapping . . . 201

Figure 9-6. Task Details of the completed session run . . . 202

Figure 9-7. Source/Target Statistics of the completed session run . . . . 202

Figure 9-8. Data Preview of the DIM_PROMOTIONS target table . . . 203

Figure 9-9. Preview files created when Persistent Cache is set on Lookup Transformation . . . 203

Figure 9-10. Find in workspace dialog box . . . 206

Figure 9-11. View Dependencies dialog box . . . 207

Figure 9-12. Transformation compare objects dialog box . . . 208

Figure 9-13. Compare Transformation objects Properties details . . . 209

Figure 9-14. Target comparison dialog box . . . 210

Figure 9-15. Column differences between two target tables . . . 210

Figure 10-1. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD mapping . . . 230

Figure 10-2. Employee_central.txt. . . 230

Figure 10-3. Renaming an instance of a Reusable Transformation . . . 231

Figure 10-4. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD after most links removed . . . 231

Figure 10-5. Sorter Transformation Icon on Toolbar . . . 231

Figure 10-6. Aggregator Transformation Icon on Toolbar . . . . 233

Figure 10-7. Partial mapping flow depicting the flow from the Sorter to the Filter to the Aggregator. . . 233

Figure 10-8. Split data stream joined back together . . . 234

Figure 10-9. Iconic view of the completed self-join mapping. . . . 236

Figure 10-10. Source properties for the employee_list.txt file list . . . 236

Figure 10-11. Task Details of the completed session run . . . 237

Figure 10-12. Source/Target Statistics of the completed session run . . . 237

Figure 10-13. Data preview of the self-join of Managers and Employees in the STG_EMPLOYEES target table - screen 1 . . . 238

Figure 10-14. Data preview of the STG_EMPLOYEES target table - screen 2 scrolled right . . . 238

Figure 11-1. Mapping copy Target Dependencies dialog box . . . 257

Figure 11-2. Iconic view of the m_DIM_EMPLOYEES_MAPPING . . . 257

Figure 11-3. Router Groups . . . 258

Figure 11-4. Update Strategy set to INSERT . . . 259

Figure 11-5. Iconic view of the completed mapping . . . 261

Figure 11-6. Source Filter Value . . . 262

(16)

Figure 11-8. Task Details of the completed session run . . . 263

Figure 11-9. Source/Target Statistics . . . 263

Figure 11-10. Data Results for DIM_EMPLOYEES . . . 264

Figure 11-11. Data Results for the Error Flat File (Located on the Machine Hosting the Integration Service Process . . . 264

Figure 11-12. Task Details tab results for second run . . . 265

Figure 11-13. Source/Target Statistics for second run . . . 265

Figure 11-14. Data preview showing updates to the target table . . . 266

Figure 12-1. Port tab view of a dynamic Lookup . . . 283

Figure 12-2. Port to Port Association . . . 284

Figure 12-3. Iconic View of the Completed Mapping . . . 285

Figure 12-4. Error Log Choice Screen . . . 285

Figure 12-5. Task Details of the Completed Session Run . . . 286

Figure 12-6. Source/Target Statistics for the Session Run . . . 286

Figure 12-7. Data preview of the DIM_CUSTOMERS table . . . 287

Figure 12-8. Flat file error log. . . 288

Figure 13-1. Source Analyzer view of the STG_TRANSACTIONS and STG_PAYMENT tables . . . 307

Figure 13-2. Declare Parameters and Variables screen. . . 308

Figure 13-3. Parameter entry . . . 308

Figure 13-4. Lookup Ports tab showing input, output and return ports checked/unchecked . . . 309

Figure 13-5. Aggregator ports with Group By ports checked . . . 310

Figure 13-6. Finished Aggregator . . . 311

Figure 13-7. Aggregator to Target Links . . . 312

Figure 13-8. Iconic view of the completed mapping . . . 312

Figure 13-9. Task Details of the completed session run . . . 313

Figure 13-10. Source/Target Statistics of the completed session run . . . 313

Figure 13-11. Data Preview of the FACT_SALES target table . . . 314

Figure 14-1. Mapplet Designer view of mplt_AGG_SALES . . . 326

Figure 14-2. Mapplet Designer view of MPLT_AGG_SALES with Input and Output transformations . . . 327

Figure 14-3. Iconic view of the m_FACT_SALES_LOAD_MAPPLET_xx mapping . . . 328

Figure 15-1. Source table definition . . . 336

Figure 15-2. Target table definition . . . 336

Figure 15-3. Task Details of the completed session run . . . 340

Figure 15-4. Source/Target Statistics of the completed session run . . . 340

Figure 15-5. Data Preview of the FACT_PROMOTIONS_AGG_DAILY table . . . 340

Figure 16-1. Workflow variable declaration . . . 354

Figure 16-2. Link condition testing if a session run was successful . . . . 355

Figure 16-3. Assignment Task expression declaration . . . 355

Figure 16-4. Decision Task Expression . . . 356

Figure 16-5. Link condition testing for a Decision Task condition of TRUE . . . 357

Figure 16-6. Email Task Properties . . . 358

Figure 16-7. Completed Workflow . . . 358

Figure 16-8. Gantt chart view of the completed workflow run . . . 359

Figure 16-9. View Workflow Variables . . . 359

Figure 16-10. Value of the $$WORKFLOW_RUNS variable after first run . . . 360

Figure 16-11. Gantt chart view of the completed workflow run after the weekly load runs . . . 360

Figure 16-12. Task Details of the completed session run . . . 360

Figure 18-1. Timer Task Relative time setting . . . 384

Figure 18-2. Email Task Properties Tab . . . 385

(17)

Figure 18-4. Completed Worklet. . . 386 Figure 18-5. Completed Workflow . . . 387 Figure 18-6. Gantt chart view of the completed workflow run. . . 387

(18)
(19)

Preface

Welcome to the PowerCenter 8 Level I Developer course. Data integration is a large undertaking with many potential areas of concern. The PowerCenter infrastructure will greatly assist you in your data integration efforts and alleviate much of your risk. This course will prepare the developers for that challenge by teaching you the most commonly used components of the product.

The students will build a small data warehouse using PowerCenter to extract from source tables and files, transform the data, load it into a staging area and finally into the data warehouse. The instructor will teach you about mappings, transformations, sources, targets, workflows, sessions, workflow tasks, connections and the Velocity methodology.

(20)

About This Guide

Purpose

Welcome to the PowerCenter 8 Level I Developer course. This course is designed to:

♦ Enable you to use PowerCenter developer tools to:

♦ Create and debug mappings

♦ Create, run, monitor and troubleshoot workflows

♦ Provide experience in designing mappings

Audience

This course is designed for data integration and data warehousing implementers. You should be familiar with data integration and data warehousing terminology and in using Microsoft Windows.

Document Conventions

This guide uses the following formatting conventions:

If you see… It means… Example

> Indicates a submenu to navigate to. Click Repository > Connect.

In this example, you should click the Repository menu or button and choose Connect.

boldfaced text Indicates text you need to type or enter. Click the Rename button and name the new source definition S_EMPLOYEE.

UPPERCASE Database tables and column names are

shown in all UPPERCASE. T_ITEM_SUMMARY

italicized text Indicates a variable you must replace with

specific information. Connect to the Repository using the assigned login_id.

Note: The following paragraph provides

additional facts. Note: You can select multiple objects to import by using the Ctrl key. Tip: The following paragraph provides

(21)

Other Informatica Resources

In addition to the student guides, Informatica provides these other resources:

♦ Informatica Documentation

♦ Informatica Customer Portal

♦ Informatica web site

♦ Informatica Developer Network

♦ Informatica Knowledge Base

♦ Informatica Professional Certification

♦ Informatica Technical Support

Obtaining Informatica Documentation

You can access Informatica documentation from the product CD or online help.

Visiting Informatica Customer Portal

As an Informatica customer, you can access the Informatica Customer Portal site at http://

my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica Knowledge Base, and access to the Informatica user community.

Visiting the Informatica Web Site

You can access Informatica’s corporate web site at http://www.informatica.com. The site contains

information about Informatica, its background, upcoming events, and locating your closest sales office. You will also find product information, as well as literature and partner information. The services area of the site includes important information on technical support, training and education, and

implementation services.

Visiting the Informatica Developer Network

The Informatica Developer Network is a web-based forum for third-party software developers. You can access the Informatica Developer Network at the following URL:

http://devnet.informatica.com

The site contains information on how to create, market, and support customer-oriented add-on solutions based on interoperability interfaces for Informatica products.

Visiting the Informatica Knowledge Base

As an Informatica customer, you can access the Informatica Knowledge Base at http://

my.informatica.com. The Knowledge Base lets you search for documented solutions to known technical issues about Informatica products. It also includes frequently asked questions, technical white papers, and technical tips.

Obtaining Informatica Professional Certification

You can take, and pass, exams provided by Informatica to obtain Informatica Professional Certification. For more information, go to:

(22)

Providing Feedback

Email any comments on this guide to [email protected].

Obtaining Technical Support

There are many ways to access Informatica Technical Support. You can call or email your nearest Technical Support Center listed in the following table, or you can use our WebSupport Service. Use the following email addresses to contact Informatica Technical Support:

[email protected] for technical inquiries

[email protected] for general customer service requests

WebSupport requires a user name and password. You can request a user name and password at http:// my.informatica.com.

North America / South

America Europe / Middle East / Africa Asia / Australia

Informatica Corporation

Headquarters 100 Cardinal Way Redwood City, California 94063 United States Toll Free 877 463 2435 Standard Rate United States: 650 385 5800 Informatica Software Ltd. 6 Waltham Park

Waltham Road, White Waltham Maidenhead, Berkshire SL6 3TN United Kingdom Toll Free 00 800 4632 4357 Standard Rate Belgium: +32 15 281 702 France: +33 1 41 38 92 26 Germany: +49 1805 702 702 Netherlands: +31 306 022 797 United Kingdom: +44 1628 511 445 Informatica Business Solutions Pvt. Ltd.

301 & 302 Prestige Poseidon 139 Residency Road Bangalore 560 025 India Toll Free Australia: 00 11 800 4632 4357 Singapore: 001 800 4632 4357 Standard Rate India: +91 80 5112 5738

(23)

Unit 1: Data Integration Concepts

In this unit you will learn about:

♦ Informatica

♦ Data Integration

♦ Mapping and Transformations

♦ Workflows and Tasks

♦ Metadata

Lesson 1-1. Introducing Informatica

(24)

Informatica is affiliated with many standards organizations, including:

♦ Integration Consortium. www.eaiindustry.org

♦ Object Management Group (OMG). www.omg.org.

♦ Common Warehouse Metamodel (CWM). www.omg.org/cwm

♦ Enterprise Grid Alliance. www.gridalliance.org

(25)

♦ XML.org. www.xml.org

♦ Web Services Interoperability Organization. www.ws-i.org

♦ Supply-Chain Council. www.supply-chain.org

♦ Carnegie-Mellon Software Engineering Institute (SEI). www.sei.cmu.edu

♦ APICS Educational and Research Foundation. www.apics.org

♦ Shared Services and Business Process Outsourcing Association (SBPOA). www.sharedxpertise.org.

Additional resources about Informatica can be found on the following websites:

♦ www.informatica.com—provides information on Professional Services and Education Services

♦ my.informatica.com—provides access to Technical Support, product documentation, Velocity methodology, knowledge base, and mapping templates

♦ devnet.informatica.com—the Informatica Developers Network offers discussion forums, web seminars, and technical papers.

Lesson 1-2. Data Integration

Traditionally, data integration is a batch process—to extract, transform and load (ETL) data from transactional systems to data warehouses.

(26)

The ETL process can be imagined as an assembly line.

Informatica PowerCenter is deployed for a variety of batch and real-time data integration purposes:

Data Migration. ERP consolidation, legacy conversion, new application implementation, system

consolidation

(27)

Data Warehousing. Business intelligence reporting, data marts, data mart consolidation, operational

data stores

Data Hubs. master data management; reference data hubs; single view of customer, product, supplier,

employee, etc.

Business Activity Monitoring. business process improvement, real-time reporting

Informatica partners with Composite Software for Enterprise Information Integration (EII): on-the-fly federated views and real-time reporting of information spread across multiple data sources, without moving the data into a centralized repository.

Lesson 1-3. Mappings and Transformations

Mappings

Transformations

(28)

PowerCenter includes the following types of transformations:

Passive. The number of rows entering and exiting the transformation are the same.

Active. The number of rows exiting the transformation may not be the same as the number of rows

(29)

Commonly used PowerCenter transformations include:

♦ Source Qualifier - reads sources

♦ Filter - filters data conditionally

♦ Sorter - sorts data

♦ Expression - performs logical/mathematical functions on data

♦ Aggregator - sums, averages, maximum, minimum

♦ Joiner - joins two data flows

♦ Lookup - looks up a corresponding value from a table or flat file

Lesson 1-4. Tasks and Workflows

Tasks

(30)

Workflows

A workflow is a set of ordered tasks that describe runtime ETL processes. Tasks can be sequenced serially, in parallel and conditionally. Each linked icon represents a task.

(31)

Lesson 1-5. Metadata

Metadata, which means “data about data,” is information that describes data. Common contents of metadata include the source or author of a dataset, how the dataset should be accessed, and its limitations.

(32)
(33)

Unit 2: PowerCenter Components and User Interface

In this unit you will learn about:

♦ PowerCenter Architecture

♦ PowerCenter Client Tools

Lesson 2-1. PowerCenter Architecture

The following screenshot shows the PowerCenter Architecture:

♦ Sources—Can be relational tables or heterogeneous files (flat files, VSAM files and XML)

♦ Targets—Can be relational tables or heterogeneous files

♦ Integration Service—The engine that performs all of the extract, transform and load logic

♦ Repository Service—Manages connectivity to the metadata repositories that contain mapping and workflow definitions

♦ Repository Service Process—Multi-threaded process that retrieves, inserts and updates repository metadata

♦ Repository—Contains all of the metadata needed to run ETL processes

♦ Client Tools—Desktop tools used to populate the repository with metadata, execute workflows on the Integration Service, monitor the workflows and manage the repository

(34)

Lesson 2-2. PowerCenter Client Tools

Client tools run on Microsoft Windows.

All tools access the repository through the Repository Service.

Workflow Manager and Workflow Monitor connect to Integration Service.

Each client application has its own interface. The interfaces have toolbars, a navigation window to the left, a workspace to the right, and an output window at the bottom.

(35)
(36)

Within the Designer, you can display transformations in the following views:

Iconized. Shows the transformation in relation to the rest of the mapping. This also minimizes the

screen space needed to display a mapping.

Normal. Shows the flow of data through the transformation. This view is typically used when

copying/linking ports to other objects.

Edit. Shows transformation ports and properties; allows editing. This view is used to add, edit, or

(37)
(38)

In the Workflow Manager, you can display tasks in the following views:

♦ Iconized (Session task example)

(39)

Unit 2 Lab: Using the Designer and Workflow Manager

Business Purpose

You have been asked to learn how to use Informatica PowerCenter in order to more efficiently accomplish your ETL objectives and automate the development process. Because you have limited or no prior exposure to this software, this exercise will serve to orient you to the basic development interfaces.

Technical Description

PowerCenter includes two development applications: the Designer, which you will use to create

mappings, and the Workflow Manager, which you will use to create and start workflows. This exercise is designed to serve as your first hands-on experience with PowerCenter, and supplement the instructor demonstrations. You will import source and target definitions from a shortcut folder into your own folder.

Objectives

♦ Learn how to navigate the repository folder structure.

♦ Understand the purpose of the tools accessed from the Designer and Workflow Manager.

♦ Create and save source and target shortcuts.

♦ Learn how to access and edit the database connection objects

Duration

(40)

Instructions

Step 1: Launch the Designer and Log Into the Repository

1. Launch the Designer client application from the desktop icon. If no desktop icon is present, select Start > Programs > Informatica PowerCenter … > Client > PowerCenter Designer.

2. Maximize the Designer window.

Note: Notice the Navigator window on the left side, which should resemble Figure 2-1. However, you

may see additional or fewer repositories, depending on your classroom environment.

3. Log into the PC8_DEV repository with the user name studentxx, where xx represents your student number as assigned by the instructor. The password is the same. Passwords are always case-sensitive.

Step 2: Navigate Folders

1. Double-click the folder DEV_SHARED. This opens the folder and shows you the subfolders associated with it. Figure 2-2 shows the Navigator:

Figure 2-1. Navigator Window

Tip: The user name to log into the repository is an application-level user name—it allows

PowerCenter to admit you to the repository with a specific set of application privileges. It is not a database user name.

(41)

Note: Notice that the DEV_SHARED folder has a small blue arm holding it. This icon denotes that DEV_SHARED is a shortcut folder. As you will see later in this lab, objects dragged from a shortcut folder into an open folder create shortcuts to the object.

2. Expand some of the subfolders to see the objects they hold.

Note that some subfolders are empty. When a new object, such as a target definition, is created within a folder, it automatically goes into the appropriate subfolder.

Note: Notice that within the Sources subfolder, the source objects are organized under individual

“nodes” (branches in the hierarchy), such as FlatFile, ODBC_EDW, etc. These are based on the type of source and the name of the Data Source Name that was used to import the source definition (more on this later). Very Important: You will need to click on these source nodes to locate source

definitions that may be “hiding” from view.

Each PowerCenter application, such as the Designer, shows only subfolders related to the objects that can be created and modified by that application. For example, in the Designer you only see subfolders for sources, targets, mappings, etc.

3. Double-click on your individual student folder.

For the remainder of the class, you will create and modify objects in this folder. Some pre-made objects have been provided as well.

Note: Your student folder is now the “open” folder. Only one folder at a time can be open. The

DEV_SHARED folder is now “expanded.” This distinction is important, as you will see later in this lab.

Step 3: Navigating the Designer Tools

1. Select the menu option Tools > Source Analyzer. The workspace to the right of the Navigator window changes to an empty space.

Note: Note the small toolbar directly to the right of the Navigator window, at the top. These are the

five Designer tools. Each tool allows you to create and modify one specific type of object, such as sources. Figure 2-3 shows the Designer tools with the first tool (the Source Analyzer) selected.

2. With your left mouse button, alternately toggle between the five tools.

Tip: Technically, all folders are “shared” with all users who have the appropriate folder permissions,

regardless whether it has the blue arm or not. Do not confuse repository folders with the directories visible in Windows Explorer. The folders are PowerCenter repository objects and are not related to Windows directories.

Tip: Subfolders are created and managed automatically. Users cannot create, delete, nest, or rename

subfolders.

(42)

The name of each tool is displayed in the upper left corner of the workspace when that tool is active.

Note: The main menu bar (very top of your screen) changes depending on which tool is active.

Because these menus are context-sensitive to which tool is active, you must already be in the appropriate tool to create or modify a specific type of object.

♦ The Source Analyzer tool is used to create or modify source objects. They may be relational, flat file, XML or COBOL sources.

♦ The Target Designer tool is used to create or modify target objects. They may be relational, flat file, or XML. It does not matter whether these targets are part of an actual data warehouse.

♦ The Transformation Developer tool is used to create or modify reusable transformations. Non-reusable transformations are created directly in a mapping or mapplet. This distinction will be covered later in the class.

♦ The Mapplet Designer tool is used to create or modify mapplets.

♦ The Mapping Designer tool is used to create or modify mappings.

Step 4: Create and Save Shortcuts

1. Ensure that the Target Designer is active and that your student folder is open.

2. To help view which folder is active, choose View > Workbook to view the PowerCenter Client in Workfbook view.

The PowerCenter Client displays tabs for each folder at the bottom of the Main window:

Important: In order to copy/shortcut any object into a folder, the destination folder (the folder you

are adding to) must be the open folder. If the destination folder is not open, the copy/shortcut will not work.

(43)

3. In the DEV_SHARED folder, expand the Targets subfolder by clicking on the + sign to the left of the subfolder. Figure 2-4 shows the Navigator window:

4. Drag and drop the STG_PAYMENT target from the Navigator into the Target Designer workspace. You will see the confirmation message, “Create a shortcut to the target table STG_PAYMENT?”

5. Click Yes at the confirmation message.

6. Expand the Targets subfolder in your Student folder. Note that you have added a shortcut of the STG_PAYMENT staging target table to your own folder.

7. Open the Source Analyzer tool in your student folder.

8. In the DEV_SHARED folder, expand the Sources subfolder and expand the FlatFile container.

9. Add shortcuts to your folder to the two source definitions listed below.

♦ PROMOTIONS

♦ PAYMENT

Figure 2-4. DEV_SHARED Target subfolder

Tip: PowerCenter shortcuts are “pointers” to the original object. They can be used but they cannot

be modified as shortcuts. The original object can be modified, and any changes will immediately affect all shortcuts to that object.

(44)

10.Confirm that your student folder appears similar to Figure 2-5:

11.Use the menu option Repository > Save to save these objects in your student folder.

Step 5: Launch the Workflow Manager

1. Left-click the toolbar icon for the Workflow Manager shown in Figure 2-6. This toolbar is usually above the Navigator window.

2. Confirm that the Workflow Manager launches and you are automatically logged into the repository the same way as you were in the Designer.

3. Maximize the Workflow Manager application.

Figure 2-5. Student folder with new objects

Tip: You should periodically save changes to the repository when using the Designer or the

Workflow Manager. The keyboard shortcut Ctrl+S can also be used. There is no “auto-save” feature.

Figure 2-6. Application Toolbar

Tip: Avoid having two or more “instances” of the same PowerCenter application (such as the

Workflow Manager) running on a machine at the same time. There is no benefit in doing this, and it can result in confusion when editing objects.

(45)

4. Browse through the various folders and subfolders in the Workflow Manager Navigator window as you did in the Designer. Note that only subfolders for the objects that can be created with the Workflow Manager are present, Tasks, Sessions, Worklets, and Workflows.

Note: Although a session object is a type of task, it gets its own subfolder because you will typically

have many more sessions than the other types of tasks. Only reusable sessions will appear in the Sessions subfolder. Likewise, only reusable tasks (except for sessions) will appear in the Tasks subfolder.

Step 6: Navigating the Workflow Manager Tools

1. Select the menu option Tools > Task Developer.

Just as in the Designer, you will see the workspace clear itself and a toolbar appear to the right of the Navigator window. The idea is the same as with the Designer, except there are three tools instead of five.

2. With your left mouse button, alternately toggle between the three tools.

Note that the name of each tool is displayed in the upper left corner of the workspace when that tool is active. Note also the context-sensitive menus, as we did in the Designer.

♦ The Task Developer tool is used to create or modify reusable tasks.

♦ The Worklet Designer tool is used to create or modify worklets.

♦ The Workflow Designer tool is used to create or modify workflows

Step 7: Workflow Manager Task Toolbar

The Workflow Manager is equipped with a toolbar that shows an icon for each type of task that can be created. This toolbar is visible by default, but the default location of is at the top right-hand corner of the screen. We will move the toolbar to a more central location.

1. Locate the “vertical stripe” at the far-left hand side of the task bar, as shown in Figure 2-7:

(46)

2. With your left mouse button, drag the toolbar toward the left and drop it in a convenient location so that all of the buttons are visible. The top of your Workflow Manager should appear similar to Figure 2-8:

Step 8: Database Connection Objects

Later in the class, we will create sessions that will read data from database source and target tables. In order to open a connection to the respective databases, PowerCenter needs the database log-in and the designation (i.e., connection string, database name or server name).

Instead of requiring the user to type this information each time a session is created, PowerCenter allows us to create reusable and sharable database connection objects. These objects contain properties

describing one database connection. The objects can be associated with multiple sessions to describe either source, target, or lookup connections.

1. In the Workflow Manager, select the menu option Connections > Relational.

(47)

You will see the Relational Connection Browser similar to Figure 2-9.

Note: Note that each connection object is organized under a database type.

2. Double-click on the NATIVE_TRANS connection object to display its properties.

3. You will not have write privileges. Click OK.

Note: Note that the connection NATIVE_TRANS will log into the database with the user name sdbu.

The connection object will be shared among the students in the class.

4. Double-click on any of the other objects that have your student number. The NATIVE_STG07 connection, for example, will have the user name tdbu07. These are the individual student connections to be used to read from and write to your individual staging tables and the enterprise data warehouse (EDW) tables.

It’s intuitive to create additional connection objects. Experiment if you have extra time.

Figure 2-9. Relational Connection Browser

(48)
(49)

Unit 3: Source Qualifier

In this unit you will learn about:

♦ Source Qualifier transformation

♦ Velocity Methodology

♦ Source Qualifier joins

♦ Source pipelines

Lesson 3-1. Source Qualifier Transformation

Source Qualifier Transformation

Type

Active.

Description

A Source Qualifier transformation:

♦ Selects records from flat file and relational table sources. Only those fields or columns used in the mapping are selected, based on the output connections.

♦ Converts the data from the source’s native datatype to the most compatible PowerCenter transformation datatype.

♦ Generates a SQL query for relational sources.

(50)

Properties

The following table describes the Source Qualifier transformation properties:

Business Purpose

The use of a Source Qualifier is a product requirement; other types of sources require equivalent transformations (XML Source Qualifier, etc.). It provides an efficient way to filter input fields/columns and to perform homogeneous joins.

Property Description

Sql Query Allows you to override the default SQL query that PowerCenter creates at runtime. User Defined Join Allows you to specify a join that replaces the default join created by PowerCenter.

Source Filter Allows you to create a where clause that will be inserted into the SQL query that is generated at runtime. The “where” portion of the statement is not required. For example:

Table1.ID = Table2.ID

Number of Sorted Ports PowerCenter will insert an order by clause in the generated SQL query. The order by will be on the number of ports specified, from the top down. For example, in the sq_Product_Product_Cost Source Qualifier, if the number of sorted ports = 2, the order by will be:

ORDER BY PRODUCT.PRODUCT_ID, PRODUCT.GROUP_ID.

Tracing Level Specifies the amount of detail written to the session log. Select Distinct Allows you to select distinct values only.

Pre SQL Allows you to specify SQL that will be run prior to the pipeline being run. The SQL will be run using the connection specified in the session task.

Post SQL Allows you to specify SQL that will be run after the pipeline has been run. The SQL will be run using the connection specified in the session task.

(51)

Datatype Conversion

Data can be converted from one datatype to another by:

♦ Passing data between ports with different datatypes

♦ Passing data from an expression to a port

♦ Using transformation functions

♦ Using transformation arithmetic operators Supported conversions are:

♦ Numeric datatypes <=> Other numeric datatypes

♦ Numeric datatypes <=> String

♦ Date/Time <=> Date or String (to convert from string to date the string must be in the default PowerCenter data format MM/DD/YYYY HH24:MI:SS)

Similarly, when writing to a target the Integration Service converts the data to the target’s native datatype. For further information, see the PowerCenter Client Help > Index > port-to-port data conversion.

(52)

Lesson 3-2. Velocity Methodology

In labs, we will use Informatica's Velocity methodology.

This methodology includes:

♦ Templates

♦ Mapping specification templates

♦ Source to target field matrix

♦ Naming conventions

♦ Object type prefixes: m_, exp_, agg_, wf_, s_, …

♦ Best practices

Velocity covers the entire data integration project life cycles: Phase 1: Manage Phase 2: Architect Phase 3: Design Phase 4: Build Phase 5: Test Phase 6: Deploy Phase 7: Operate

For more information, see http://devnet.informatica.com (requires registration).

Lab Project

The Mersche Motors data model consists of the following star schemas. The labs predominately use the Sales star schema.

(53)

Data is moved first to the staging area and from there to the data warehouse and target flat files.

The labs can source from flat files and/or a relational database.

Source Tables and Files

The source system has the following relational tables:

(54)

PRODUCT

PRODUCT_COST

The source system has the following flat files:

customer_layout dates employees_layout inventory payment promotions sales_transactions

Staging Area

The staging area has the following tables:

STG_CUSTOMERS STG_DATES STG_DEALERSHIP STG_EMPLOYEES STG_INVENTORY STG_PAYMENT STG_PRODUCT STG_PROMOTIONS STG_TRANSACTIONS

Data Warehouse

The data warehouse has the following tables:

DIM_CUSTOMERS DIM_DATES DIM_DEALERSHIP DIM_EMPLOYEES DIM_PAYMENT DIM_PRODUCT DIM_PROMOTIONS FACT_INVENTORY FACT_PRODUCT_AGG_DAILY FACT_PRODUCT_AGG_WEEKLY FACT_PROMOTIONS_AGG_DAILY FACT_SALES

(55)

Architecture and Connectivity

Architecture

The labs use the following architecture and connections: Integration Service: PC_IService

Repository Name: PC8_DEV Folders: Student 01 - 20 User Names: student01 - 20 Passwords: student01 - 20

Connectivity

ODBC Connections:

Native Connections:

Source Tables ODBC_TRANS

Staging Area ODBC_STG (01 - 20)

Data Warehouse ODBC_EDW (01 - 20)

Source Tables NATIVE_TRANS

Staging Area NATIVE_STG (01 - 20)

Data Warehouse NATIVE_EDW (01 - 20)

Relational Source sdbu with password sdbu

Relational Targets tdbu01 - 20

(56)
(57)

Unit 3 Lab A: Load Payment Staging Table

Section 1: Pass-Through Mapping

Business Purpose

The staging area of the Mersche Motors data warehouse contains a table that assigns payment type descriptions for each payment ID. Because these descriptions may change, the table must be synchronized daily with the corresponding data located in the operational system. The operational system administrator uses a simple flat file to record and edit these descriptions.

Technical Description

PowerCenter will source from a delimited flat file and insert the data into a database table without performing data transformations. In order to avoid duplicate records in subsequent loads, we will configure PowerCenter to truncate the target table before each load.

Objectives

♦ Open the Designer Tools and switch between Workspaces

♦ Import a flat file definition

♦ Import a table definition

♦ Create a simple pass-through mapping

♦ Create a Session task to run the mapping and configure connectivity

♦ Create a Workflow to run the Session task

♦ Run the Workflow and monitor the results

Duration

(58)

Velocity Deliverable: Mapping Specifications

SOURCES

TARGETS

HIGH LEVEL PROCESS OVERVIEW

PROCESSING DESCRIPTION (DETAIL)

This is a pass-through mapping with no data transformation.

Mapping Name m_Stage_Payment_Type_xx

Source System Flat file Target System Oracle Table

Initial Rows 5 Rows/Load 5

Short Description Simple pass-through mapping, comma-delimited flat-file to Oracle table

Load Frequency

Preprocessing Target truncate

Post Processing

Error Strategy Default

Reload Strategy Unique Source Fields

PAYMENT_ID, PAYMENT_TYPE_DESC

Tables

Table Name Schema/Owner Selection/Filter

Files

File Name File Location Fixed/Delimited Additional File Info

payment.txt C:\pmfiles\SrcFiles Delimited Comma delimiter

Tables Schema Owner TDBUxx

Table Name Update Delete Insert Unique Key

STG_PAYMENT X

(59)

S

O

URCE T

O

T

A

RGET

F

IEL

D M

A

T

R

IX

Ta rg et Ta bl e Ta rg et C olum n Dat at yp e S ou rce Ta bl e S ou rce Co lu m n Dat at yp e E xp res si on Def au lt Va lu e if N ulls D at a I ssu es / qu alit y S TG _PAY M EN T P ay m ent _i d Num ber (3 ,0 ) PAY M EN T P ay m ent _i d D ec im al (3, 0) S TG _PAY M EN T P ay m ent _t ype_d es c Va rc har 2( 20 ) PAY M EN T P ay m ent _t ype_d es c S tri ng (1 0) Ve lo ci ty B est P ract ice: Th is i s th e V el o ci ty Source to T arg et F iel d Ma tri x. I t is d is p la ye d h ere fo r you referen ce. I n fu ture la b s w e w ill b e us in g a s h or ten ed v ers io n of th e m atri x.

(60)

Instructions

Step 1: Launch the Designer and Review the Source and Target Definitions

1. Launch the Designer application by selecting Start > Programs > Informatica PowerCenter … >

Client > PowerCenter Designer.

2. Log into the PC8_DEV repository with the user name studentxx and password studentxx where xx represents your student number as assigned by the instructor.

3. Open your student folder by double-clicking on it.

4. Open the Source Analyzer selecting the menu option Tools > Source Analyzer.

5. Drag the Shortcut_to_payment source file from the Sources subfolder into the Source Analyzer workspace.

Confirm that your source definition appears the same as displayed in Figure 3-1. You may have to drag the box wider to see the Length column.

6. Open the Target Designer by clicking the respective icon in the toolbar. The icon is shown highlighted below:

7. Drag the Shortcut_to_STG_PAYMENT target table definition from the Targets subfolder into the Target Designer workspace.

8. Review the target definition.

Step 2: Create a Mapping

1. Open the Mapping Designer by clicking the respective icon in the toolbar. The icon is shown highlighted below:

Tip: If an instance of the Designer is already running on your workstation, do not launch another

instance. It is unnecessary and potentially confusing to run more than one instance per workstation.

References

Related documents

There has been a lot of debate as to the botanical identity of Soma, but if we make the assumption that Amanita Muscaria was the original Soma plant of the Aryans, the composers of

The GETS specification that uses actual values of uncertain information is found to perform particularly well when it is able to explain big movements in the exchange rate, but

ROS nodes implemented on the virtual machines within the cloud provided cloud services to the robots, such as control systems to follow a black line or utilize the pose

The results indicate significant differences in species richness, assemblage structure based on abundance and biomass, as well as in the composition of dung beetles between remnant

Hacking a Bird in the Sky: Exploiting Satellite Trust Relationship Hack In The Box Security Conference 2008 Artificial Satellite ‣

DR: Dorper × Santa Inês; HD: Hampshire × Santa Inês; IF: Ile de France × Santa Inês; SI: Santa Inês; TX: Texel × Santa Inês. RP: rib proportion; LP: loin proportion; SP:

He provides medical writing which uses similar imagery to literary texts at the time (which chapter four deals with) and he writes on the theory of laughter.

Whether an excluded white applicant with a better academic record has a greater stake in obtaining relief in the form of a judicial invalidation of a challenged admissions policy