D W B I C o n c e p t s . c o m
Master Informatica Questions and
Answer Set
Version 2.5
Copyright Notice
Informatica Master Question and Answer Set is copyright © DWBIConcepts 2013.
All rights reserved. No part of this book shall be reproduced, stored in a retrieval system,
or transmitted by any means – electronic, mechanical, photocopying, recording, or
oth-erwise – without written permission from the publisher. No patent liability is assumed
with respect to the use of the information contained herein. Although every precaution
has been taken in the preparation of this book, the publisher and author assume no
re-sponsibility for errors or omissions. Neither is any liability assumed for damages
result-ing from the use of the information contained herein.
Trademarks
All terms mentioned in this book that are known to be trademarks or service marks have
been appropriately capitalized. New Riders Publishing cannot attest to the accuracy of
this information. Use of a term in this book should not be regarded as affecting the
valid-ity of any trademark or service mark.
Warning and disclaimer
Every effort has been made to make this book as complete and as accurate as possible,
but no warranty of fitness is implied. The information is provided on an “as is” basis. The
author and the publisher shall have neither liability nor responsibility to any person or
entity with respect to any loss or damages arising from the information contained in this
book
How this book should be used
This book contains various questions and answers pertaining to Informatica Power
Cen-ter™ and allied tools as commonly asked in Job Interviews. As such the book is written
for the candidates who are preparing for Job Interviews. It is suggested that the candidate
start preparing from the material at least one week in advance so that s/he can finish
reading the entire content before appearing for the interview. In case the candidate is
stuck with any question or answer, is not clear on something or has a doubt – s/he can
interact with the Experts by using DWBIConcepts forum.
For the help of the readers, we have tagged certain questions accordingly as shown
be-low:
Common / Frequently Asked Questions
Harder Questions
Table of Contents
COPYRIGHT NOICE 2
TRADEMARKS 2
WARNING AND DISCLAIMER 2
HOW THIS BOOK SHOULD BE USED 3
1. AGGREGATOR TRANSFORMATION 13
1. WHAT IS AN AGGREGATOR TRANSFORMATION? 13
2. HOW AN EXPRESSION TRANSFORMATION DIFFERS FROM AGGREGATOR TRANSFORMATION? 13 3. DOES AN AGGREGATOR TRANSFORMATION SUPPORT ONLY AGGREGATE EXPRESSIONS? 13 4. GIVE ONE EXAMPLE FOR EACH OF CONDITIONAL AGGREGATION,NON-AGGREGATE EXPRESSION AND NESTED AGGREGATION. 13
5. HOW DOES AGGREGATOR TRANSFORMATION HANDLE NULL VALUES? 13 6. WHAT ARE THE PERFORMANCE CONSIDERATIONS WHEN WORKING WITH AGGREGATOR TRANSFORMATION? 14
7. WHAT ARE THE USES OF INDEX AND DATA CACHE? 14
8. WHAT DIFFERS WHEN WE CHOOSE SORTED INPUT FOR AGGREGATOR TRANSFORMATION? 14 9. UNDER WHAT CONDITIONS SELECTING SORTED INPUT IN AGGREGATOR WILL STILL NOT BOOST SESSION PERFORMANCE? 15 10. UNDER WHAT CONDITION SELECTING SORTED INPUT IN AGGREGATOR MAY FAIL THE SESSION? 15 11. SUPPOSE WE DO NOT GROUP BY ON ANY PORTS OF THE AGGREGATOR WHAT WILL BE THE OUTPUT. 15 12. WHAT IS THE EXPECTED VALUE IF THE COLUMN IN AN AGGREGATOR TRANSFORMATION IS NEITHER A GROUP BY NOR AN
AGGREGATE EXPRESSION? 15
13. WHAT IS INCREMENTAL AGGREGATION? 15
14. SORTED INPUT FOR AGGREGATOR TRANSFORMATION WILL IMPROVE PERFORMANCE OF MAPPING.HOWEVER, IF SORTED INPUT IS USED FOR NESTED AGGREGATE EXPRESSION OR INCREMENTAL AGGREGATION, THEN THE MAPPING MAY RESULT IN SESSION FAILURE.
EXPLAIN WHY? 16
15. HOW CAN WE DELETE DUPLICATE RECORD USING INFORMATICA AGGREGATOR? 16
16. SCENARIO IMPLEMENTATION 1 16
17. SCENARIO IMPLEMENTATION 2 18
2. EXPRESSION TRANSFORMATION 19
1. WHAT IS AN EXPRESSION TRANSFORM? 19
2. HOW MANY TYPES OF PORTS ARE THERE IN EXPRESSION TRANSFORM? 19 3. WHAT IS THE EXECUTION ORDER OF THE PORTS IN AN EXPRESSION? 19 4. DESCRIBE THE APPROACH FOR THE REQUIREMENT.SUPPOSE THE INPUT IS: 19 5. HOW CAN WE IMPLEMENT AGGREGATION OPERATION WITHOUT USING AN AGGREGATOR TRANSFORMATION IN INFORMATICA? 20
6. SCENARIO IMPLEMENTATION 1 20 7. SCENARIO IMPLEMENTATION 2 21 8. SCENARIO IMPLEMENTATION 3 22 9. SCENARIO IMPLEMENTATION 4 22 10. SCENARIO IMPLEMENTATION 5 22 3. FILTER TRANSFORMATION 24
1. WHAT IS A FILTER TRANSFORMATION AND WHY IT IS AN ACTIVE ONE? 24
DWBIConcepts
DWBIConcepts
DWBIConcepts
4. JOINER TRANSFORMATION 25 1. WHAT IS A JOINER TRANSFORMATION AND WHY IT IS AN ACTIVE ONE? 25
2. STATE THE LIMITATIONS WHERE WE CANNOT USE JOINER IN THE MAPPING PIPELINE. 25
3. OUT OF THE TWO INPUT PIPELINES OF A JOINER, WHICH ONE WILL WE SET AS THE MASTER PIPELINE? 25 4. WHAT ARE THE DIFFERENT TYPES OF JOINS AVAILABLE IN JOINER TRANSFORMATION? 26 5. DEFINE THE VARIOUS JOIN TYPES OF JOINER TRANSFORMATION. 27 6. DESCRIBE THE IMPACT OF NUMBER OF JOIN CONDITIONS AND JOIN ORDER IN A JOINER. 27 7. HOW DOES JOINER TRANSFORMATION TREAT NULL VALUE MATCHING? 27
8. WHEN WE CONFIGURE THE JOIN CONDITION, WHAT ARE THE GUIDELINES WE NEED TO FOLLOW TO MAINTAIN THE SORT ORDER? 28 9. WHAT ARE THE TRANSFORMATIONS THAT CANNOT BE PLACED BETWEEN THE SORT ORIGIN AND THE JOINER TRANSFORMATION SO
THAT WE DO NOT LOSE THE INPUT SORT ORDER? 28
10. WHAT IS THE USE OF SORTED INPUT IN JOINER TRANSFORMATION? 28 11. CAN WE JOIN TWO TABLES BASED ON A JOIN COLUMN HAVING DIFFERENT DATA TYPE? 29
12. IMPLEMENTATION SCENARIO1-JOINER TRANSFORMATION IS JOINING TWO TABLES S1 AND S2. S1 HAS 10,000 ROWS AND S2
HAS 1000 ROWS .WHICH TABLE YOU WILL SET MASTER FOR BETTER PERFORMANCE OF JOINER TRANSFORMATION?WHY? 29
5. LOOKUP TRANSFORMATION 30
1. WHAT IS A LOOKUP TRANSFORM? 30
2. WHAT ARE THE DIFFERENCES BETWEEN CONNECTED AND UNCONNECTED LOOKUP? 30
3. WHAT ARE THE DIFFERENT LOOKUP CACHE(S)? 30
4. IS LOOKUP AN ACTIVE OR PASSIVE TRANSFORMATION? 31
5. WHAT IS THE DIFFERENCE BETWEEN STATIC AND DYNAMIC LOOKUP CACHE? 31
6. WHAT ARE THE USES OF INDEX AND DATA CACHES? 31
7. WHAT IS PERSISTENT LOOKUP CACHE? 31
8. WHAT TYPE OF JOIN DOES LOOKUP SUPPORT? 32
9. EXPLAIN HOW LOOKUP TRANSFORMATION WORKS LIKE SQLLEFT OUTER JOIN. 32
10. WHERE AND WHY DO WE USE UNCONNECTED LOOKUP INSTEAD OF CONNECTED LOOKUP? 32
11. HOW CAN WE IDENTIFY PERSISTENT CACHE FILES IN INFORMATICA SERVER? 33 12. HOW TO CONFIGURE A LOOKUP ON A FLAT FILE WITH HEADER? 33 13. WHAT IS THE DIFFERENCE BETWEEN PERSISTENT CACHE AND SHARED CACHE? 33 14. DESCRIBE HOW TO RETURN MULTIPLE PORT VALUES FROM UNCONNECTED LOOKUP IN INFORMATICA. 34 15. HOW TO MAKE THE PERSISTENT LOOKUP CACHE IN SYNC WITH LOOKUP TABLE? 34
16. IF WE USE PERSISTENT CACHE FOR A DYNAMIC LOOKUP, WILL THE CACHE FILE BE UPDATED OR INSERTED AS REQUIRED? 34
17. IS THERE ANYTHING WRONG IN SHARING A PERSISTENT CACHE BETWEEN STATIC AND DYNAMIC LOOKUP? 34 18. WHAT IS THE DIFFERENCE BETWEEN THE TWO UPDATE PROPERTIES - UPDATE ELSE INSERT, INSERT ELSE UPDATE IN DYNAMIC
LOOKUP CACHE? 35
19. IF THE DEFAULT VALUE FOR THE LOOKUP RETURN PORT IS NOT SET, WHAT WILL BE THE OUTPUT WHEN THE LOOKUP CONDITION FAILS? 35
20. HOW CAN WE ENSURE DATA IS NOT DUPLICATED IN THE TARGET WHEN THE SOURCE HAS DUPLICATE RECORDS, USING LOOKUP
TRANSFORMATION? 35
6. NORMALIZER TRANSFORMATION 36
1. WHAT IS A NORMALIZER TRANSFORMATION? 36
2. SCENARIO IMPLEMENTATION 1 36
3. WHAT ARE LEVELS IN NORMALIZER TRANSFORMATION? 36
DWBIConcepts
DWBIConcepts
DWBIConcepts
7. RANK TRANSFORMATION 38
1. WHAT IS A RANK TRANSFORM? 38
2. HOW DOES A RANK TRANSFORM DIFFER FROM AGGREGATOR TRANSFORM FUNCTIONS MAX AND MIN? 38
3. HOW DOES A RANK CACHE WORKS? 38
4. WHAT IS A RANK PORT AND RANKINDEX? 38
5. HOW CAN YOU GET RANKS BASED ON DIFFERENT GROUPS? 38
6. WHAT HAPPENS IF TWO RANK VALUES MATCH? 39
7. WHAT ARE THE RESTRICTIONS OF RANK TRANSFORMATION? 39
8. HOW DOES RANK TRANSFORMATION HANDLE STRING VALUES? 39 9. WHAT IS DENSE RANK AND DOES INFORMATICA SUPPORTS DENSE RANK? 39 10. HOW DO WE ACHIEVE DENSE_RANK IN INFORMATICA? 40 11. SOURCE TABLE HAS 5 ROWS.RANK IN RANK TRANSFORMATION IS SET TO 10.HOW MANY ROWS THE RANK TRANSFORMATION
WILL OUTPUT? 40
12. HOW YOU WILL LOAD UNIQUE RECORD INTO TARGET FLAT FILE FROM SOURCE FLAT FILES HAS DUPLICATE DATA? 40
8. ROUTER TRANSFORMATION 42
1. WHAT IS THE DIFFERENCE BETWEEN ROUTER AND FILTER? 42 2. WHAT IS THE MINIMUM NUMBER OF GROUPS WE CAN DECLARE IN A ROUTER TRANSFORMATION? 42
3. SCENARIO IMPLEMENTATION 1 42
4. SCENARIO IMPLEMENTATION 2 43
5. SCENARIO IMPLEMENTATION 3 44
9. SEQUENCE GENERATOR TRANSFORMATION 45
1. WHAT IS A SEQUENCE GENERATOR TRANSFORMATION? 45 2. DEFINE THE PROPERTIES AVAILABLE IN SEQUENCE GENERATOR TRANSFORMATION IN BRIEF. 45
3. SCENARIO IMPLEMENTATION 1 46
4. SCENARIO IMPLEMENTATION 2 46
5. WHAT ARE THE CHANGES WE OBSERVE WHEN WE PROMOTE A NON-REUSABLE SEQUENCE GENERATOR TO A REUSABLE ONE?AND WHAT HAPPENS IF WE SET THE NUMBER OF CACHED VALUES TO 0 FOR A REUSABLE TRANSFORMATION? 47 6. HOW SEQUENCE GENERATOR IN THE MAPPING IS HANDLED WHEN WE MIGRATE THE MAPPING FROM ONE ENVIRONMENT TO
ANOTHER? 47
7. SCENARIO IMPLEMENTATION 3 48
8. HOW DO I GET A SEQUENCE GENERATOR TO "PICK UP" WHERE ANOTHER "LEFT OFF"? 48
10. STORED PROCEDURE TRANSFORMATION 49
1. WHAT IS A STORED PROCEDURE TRANSFORMATION? 49 2. HOW MANY TYPES OF STORED PROCEDURE TRANSFORMATION ARE THERE? 49 3. HOW DO WE CALL AN UNCONNECTED STORED PROCEDURE TRANSFORMATION? 49 4. HOW DO WE SET THE EXECUTION ORDER OF PRE-POST LOAD STORED PROCEDURE? 49 5. HOW DO WE SET THE CALL TEXT FOR STORED PROCEDURE TRANSFORMATION? 49
6. HOW DO WE RECEIVE OUTPUT/RETURN PARAMETERS FROM UNCONNECTED STORED PROCEDURE? 50
11. SORTER TRANSFORMATION 51
1. WHAT IS A SORTER TRANSFORMATION? 51
DWBIConcepts
DWBIConcepts
DWBIConcepts
4. HOW DOES SORTER HANDLE NULL VALUES? 51
5. HOW DOES A SORTER CACHE WORKS? 51
6. HOW TO DELETE DUPLICATE RECORDS OR RATHER TO SELECT DISTINCT ROWS FOR FLAT FILE SOURCES? 52
12. UNION TRANSFORMATION 53
1. WHAT IS A UNION TRANSFORMATION? 53
2. WHAT ARE THE RESTRICTIONS OF UNION TRANSFORMATION? 53
3. HOW COME UNION TRANSFORMATION IS ACTIVE? 53
13. UPDATE STRATEGY TRANSFORMATION 54
1. WHAT IS UPDATE STRATEGY TRANSFORM? 54
2. WHAT ARE UPDATE STRATEGY CONSTANTS? 54
3. HOW CAN WE UPDATE A RECORD IN TARGET TABLE WITHOUT USING UPDATE STRATEGY? 54
4. WHAT IS DATA DRIVEN? 54
5. WHAT HAPPENS WHEN DD_UPDATE IS DEFINED IN UPDATE STRATEGY AND TREAT SOURCE ROWS AS INSERT IS SELECTED IN
SESSION? 55
6. WHAT ARE THE THREE AREAS WHERE THE ROWS CAN BE FLAGGED FOR PARTICULAR TREATMENT? 55 7. BY DEFAULT OPERATION CODE FOR ANY ROW IN INFORMATICA WITHOUT BEING ALTERED IS INSERT.THEN STATE WHEN DO WE
NEED DD_INSERT? 55
8. WHAT IS THE DIFFERENCE BETWEEN UPDATE STRATEGY AND FOLLOWING UPDATE OPTIONS IN TARGET? 55
9. WHAT IS THE USE OF FORWARD REJECT ROWS IN MAPPING? 56
10. SCENARIO IMPLEMENTATION 1 56
14. JAVA TRANSFORMATION 57
1. SCENARIO IMPLEMENTATION 1 57
2. SCENARIO IMPLEMENTATION 2 57
15. SOURCE QUALIFIER TRANSFORMATION 59
1. WHAT IS A SOURCE QUALIFIER?WHAT ARE THE TASKS WE CAN PERFORM USING A SOURCE QUALIFIER AND WHY IT IS AN ACTIVE
TRANSFORMATION? 59
2. WHAT HAPPENS TO A MAPPING IF WE ALTER THE DATA TYPES BETWEEN SOURCE AND ITS CORRESPONDING SOURCE QUALIFIER? 59 3. SUPPOSE WE HAVE USED THE SELECT DISTINCT AND THE NUMBER OF SORTED PORTS PROPERTY IN THE SOURCE QUALIFIER AND THEN WE ADD CUSTOM SQLQUERY.EXPLAIN WHAT WILL HAPPEN. 59 4. DESCRIBE THE SITUATIONS WHERE WE WILL USE THE SOURCE FILTER,SELECT DISTINCT AND NUMBER OF SORTED PORTS
PROPERTIES OF SOURCE QUALIFIER TRANSFORMATION. 60
5. WHAT WILL HAPPEN IF THE SELECT LIST COLUMNS IN THE CUSTOM OVERRIDE SQLQUERY AND THE OUTPUTPORTS ORDER IN SOURCE QUALIFIER TRANSFORMATION DO NOT MATCH? 60 6. WHAT HAPPENS IF IN THE SOURCE FILTER PROPERTY OF SQ TRANSFORMATION WE INCLUDE KEYWORD WHERE SAY,WHERE
CUSTOMERS.CUSTOMER_ID>1000. 60
7. DESCRIBE THE SCENARIOS WHERE WE GO FOR JOINER TRANSFORMATION INSTEAD OF SOURCE QUALIFIER TRANSFORMATION. 60 8. WHAT IS THE MAXIMUM NUMBER WE CAN USE IN NUMBER OF SORTED PORTS FOR SYBASE SOURCE SYSTEM? 61
9. WHAT IS USE OF SOURCE QUALIFIER IN INFORMATICA?CAN WE CREATE A MAPPING WITHOUT A SOURCE QUALIFIER? 61 10. SUPPOSE WE HAVE TWO TABLES OF SAME DATABASE TYPE, RESIDING IN DIFFERENT DATABASE INSTANCE.IF A DATABASE LINK IS AVAILABLE, HOW CAN WE JOIN THE TWO TABLES USING A SOURCE QUALIFIER IN INFORMATICA PROVIDED THERE ARE VALID JOIN
COLUMNS. 61
DWBIConcepts
DWBIConcepts
DWBIConcepts
12. SCENARIO IMPLEMENTATION 1 62
16. MISCELLANEOUS 63
1. WHAT ARE THE NEW FEATURES OF INFORMATICA 9.X IN DEVELOPER LEVEL? 63
2. NAME THE TRANSFORMATIONS WHICH CONVERTS ONE TO MANY ROWS I.E. INCREASES THE I/P:O/P ROW COUNT.ALSO WHAT IS
THE NAME OF ITS REVERSE TRANSFORMATION? 63
3. HOW MANY WAYS WE CAN FILTER RECORDS? 63
4. WHAT ARE THE TRANSFORMATIONS THAT USE CACHE FOR PERFORMANCE? 63 5. WHAT IS THE FORMULA FOR CALCULATION OF LOOKUP/RANK/AGGREGATOR INDEX & DATA CACHES? 64
6. WHAT IS THE DIFFERENCE BETWEEN INFORMATICA POWERCENTER AND EXCHANGE AND MART? 64 7. HOW DO WE HANDLE DELIMITER CHARACTER AS A PART OF THE DATA IN A DELIMITED SOURCE FILE? 65 8. WE HAVE JUST RECEIVED SOURCE FILES FROM UNIX.WE WANT TO STAGE THAT DATA TO ETL PROCESS.WHAT ARE THE POINTS
WE NEED TO LOOK FOR? 65
9. WHAT IS THE DIFFERENCE BETWEEN JOINER AND LOOKUP.PERFORMANCE WISE WHICH ONE IS BETTER TO USE. 65
10. WHAT IS THE B2B IN INFORMATICA?HOW CAN WE USE IT IN INFORMATICA? 66
11. WHAT IS CDC,SCD AND MD5 IN INFORMATICA? 66
12. HOW CAN WE IMPLEMENT AN SCDTYPE2 MAPPING WITHOUT USING A LOOKUP TRANSFORMATION? 67 13. HOW DOES JOINER AND LOOKUP TRANSFORMATION TREAT NULL VALUE MATCHING? 67 14. DOES MICROSOFT SQL SERVER SUPPORTS BULK LOADING?IF YES,WHAT HAPPENS WHEN YOU SPECIFY BULK MODE AND DATA
DRIVEN FOR SQL SERVER TARGET 67
15. HOW CAN YOU UTILIZE COM COMPONENTS IN INFORMATICA? 67
16. WHAT IS SQL TRANSFORMATION IN INFORMATICA? 67
17. WHAT IS A XML SOURCE QUALIFIER? 68
18. WHAT IS THE “METADATA EXTENSIONS” TAB IN INFORMATICA? 68
19. DESCRIBE SOME OF THE ETLBEST PRACTICES 69
20. IS THERE A SCOPE OF CLOUD COMPUTING IN DATA WAREHOUSING TECHNOLOGY? 69
17. MAPPING 71
1. SCENARIO IMPLEMENTATION 1 71
2. WHAT ARE MAPPING PARAMETERS AND VARIABLES? 71
4. WHAT ARE THE DEFAULT VALUES FOR VARIABLES? 72
5. WHAT DOES FIRST COLUMN OF BAD FILE (REJECTED ROWS) INDICATES? 72 6. OUT OF 100000 SOURCE ROWS SOME ROWS GET DISCARD AT TARGET, HOW WILL YOU TRACE THEM AND WHERE IT GETS LOADED?
72
7. WHAT IS REJECT LOADING? 72
8. WHY INFORMATICA WRITER THREAD MAY REJECT A RECORD? 74
9. WHY TARGET DATABASE CAN REJECT A RECORD? 74
10. DESCRIBE VARIOUS STEPS FOR LOADING REJECT FILE? 74 11. VARIABLE V1 HAS VALUES SET AS 5 IN DESIGNER (DEFAULT),10 IN PARAMETER FILE, AND 15 IN REPOSITORY.WHILE RUNNING
SESSION WHICH VALUE INFORMATICA WILL READ? 74
12. WHAT ARE SHORTCUTS?WHERE IT CAN BE USED?WHAT ARE THE ADVANTAGES? 74 13. CAN WE HAVE AN INFORMATICA MAPPING WITH TWO PIPELINES, WHERE ONE FLOW IS HAVING A TRANSACTION CONTROL
TRANSFORMATION AND ANOTHER NOT.EXPLAIN WHY? 75
14. HOW CAN WE IMPLEMENT REVERSE PIVOTING USING INFORMATICA TRANSFORMATIONS? 75
15. IS IT POSSIBLE TO UPDATE A TARGET TABLE WITHOUT ANY KEY COLUMN IN TARGET? 75
DWBIConcepts
DWBIConcepts
DWBIConcepts
1. WHAT IS A MAPPLET? 77
2. WHAT IS THE DIFFERENCE BETWEEN REUSABLE TRANSFORMATION AND MAPPLET? 77
3. WHAT ARE THE TRANSFORMATIONS THAT ARE NOT SUPPORTED IN MAPPLET? 77
4. IS IT POSSIBLE TO CONVERT REUSABLE TRANSFORMATION TO A NON-REUSABLE ONE? 77 5. WHAT IS THE USE OF MAPPLET &WORKLET IN PROJECT? 78 6. IS IT POSSIBLE TO HAVE A MAPPLET WITHIN A MAPPLET AND WORKLET WITHIN A WORKLET? 78
19. SESSION 79
1. WHAT IS SESSION AND BATCHES? 79
2. WHAT ARE VARIOUS SESSION TRACING LEVELS? 79
3. CAN WE COPY A SESSION TO NEW FOLDER OR NEW REPOSITORY? 79 4. IS IT POSSIBLE TO STORE ALL THE INFORMATICA SESSION LOG INFORMATION IN A DATABASE TABLE?NORMALLY THE SESSION LOG IS STORED AS A BINARY COMPRESSION .BIN FILE IN SESSLOGS DIRECTORY.CAN WE STORE THE SAME INFORMATION IN DATABASE TABLES
FOR FUTURE ANALYSIS? 79
5. CAN WE CALL A SHELL SCRIPT FROM SESSION PROPERTIES? 80 6. CAN WE CHANGE THE SOURCE AND TARGET TABLE NAMES IN SESSION LEVEL? 81
7. HOW TO WRITE FLAT FILE COLUMN NAMES IN TARGET? 81
8. WHAT ARE THE ERROR TABLES PRESENT IN INFORMATICA? 81 9. WHAT ARE THE ALTERNATE WAYS TO STOP A SESSION WITHOUT USING “STOPONERRORS” OPTION SET TO 1 IN SESSION
PROPERTIES? 81
10. SUPPOSE A SESSION FAILS AFTER LOADING OF 10,000 RECORDS IN THE TARGET.HOW CAN WE LOAD THE RECORDS FROM 10,001
WHEN WE RUN THE SESSION NEXT TIME? 82
11. DEFINE THE TYPES OF COMMIT INTERVALS APART FROM USER DEFINED? 82 12. SUPPOSE SESSION IS CONFIGURED WITH COMMIT INTERVAL OF 10,000 ROWS AND SOURCE HAS 50,000 ROWS EXPLAIN THE COMMIT POINTS FOR SOURCE BASED COMMIT & TARGET BASED COMMIT.ASSUME APPROPRIATE VALUE WHEREVER REQUIRED? 82 13. HOW TO CAPTURE PERFORMANCE STATISTICS OF INDIVIDUAL TRANSFORMATION IN THE MAPPING AND EXPLAIN SOME
IMPORTANT STATISTICS THAT CAN BE CAPTURED? 83
14. HOW CAN WE PARAMETERIZE SUCCESS OR FAILURE EMAIL LIST? 83 15. IS IT POSSIBLE THAT A SESSION FAILED BUT STILL THE WORKFLOW STATUS IS SHOWING SUCCESS? 83
16. WHAT IS BUSY PERCENTAGE? 83
17. CAN WE WRITE A PL/SQL BLOCK IN PRE AND POST SESSION OR IN TARGET QUERY OVERRIDE? 84 18. WHENEVER A SESSION RUNS DOES THE DATA GETS OVERWRITTEN IN A FLAT FILE TARGET?IS IT POSSIBLE TO KEEP THE EXISTING
DATA AND ADD THE NEW DATA TO THE TARGET FILE? 84
19. CAN WE USE THE SAME SESSION TO LOAD A TARGET TABLE IN DIFFERENT DATABASES HAVING SAME TARGET DEFINITION? 84 20. HOW DO YOU REMOVE THE CACHE FILES AFTER THE TRANSFORMATION? 84 21. WHY DOESN'T A RUNNING SESSION QUIT WHEN ORACLE OR SYBASE RETURN FATAL ERRORS? 84
20. WORKFLOW 86
1. WHAT IS THE DIFFERENCE BETWEEN STOP AND ABORT OPTIONS IN WORKFLOW? 86
2. RUNNING INFORMATICA WORKFLOW CONTINUOUSLY –HOW TO RUN A WORKFLOW CONTINUOUSLY UNTIL A CERTAIN CONDITION
IS MET? 86
3. HOW DO WE SEND EMAILS FROM INFORMATICA AFTER THE SUCCESSFUL COMPLETION OF ONE SESSION?THE EMAIL WILL CONTAIN THE JOB NAME/ SESSION START TIME AND SESSION END TIME IN THE MESSAGE BODY. 87
4. SCENARIO IMPLEMENTATION 1 87
5. HOW CAN WE SEND TWO SEPARATE EMAILS AFTER A SUCCESSFUL SESSION RUN? 87
DWBIConcepts
DWBIConcepts
DWBIConcepts
8. WE KNOW THERE ARE 3 OPTIONS FOR SESSION RECOVERY STRATEGY -RESTART TASK,FAIL TASK AND CONTINUE RUNNING THE WORKFLOW,RESUME FROM LAST CHECKPOINT WHENEVER A SESSION FAILS.HOW DO WE RESTART A WORKFLOW AUTOMATICALLY WITHOUT ANY MANUAL INTERVENTION IN THE EVENT OF SESSION FAILURE? 89
9. WHAT IS THE DIFFERENCE REAL-TIME AND CONTINUOUS WORKFLOWS? 89
11. SCENARIO IMPLEMENTATION 3 89
12. HOW DO WE SEND A SESSION FAILURE MAIL WITH THE WORKFLOW OR SESSION LOG AS ATTACHMENT? 90 13. EXPLAIN DEADLOCK IN INFORMATICA AND HOW DO WE RESOLVE IT? 90
14. SCENARIO IMPLEMENTATION 4 90
15. HOW CAN WE PASS A VALUE FROM ONE WORKFLOW TO ANOTHER? 91
21. ADMINISTRATION 92
1. WHAT IS LOAD MANAGER? 92
2. WHAT IS DTM PROCESS?HOW MANY THREADS IT CREATES TO PROCESS DATA, EXPLAIN EACH THREAD IN BRIEF? 92
3. CAN YOU CREATE A FOLDER WITHIN DESIGNER? 92
4. HOW DO YOU TAKE CARE OF SECURITY USING A REPOSITORY MANAGER? 93 5. WHAT ARE THE DIFFERENT USES OF A REPOSITORY MANAGER? 93 6. WHAT ARE 2 MODES OF DATA MOVEMENT IN INFORMATICA SERVER? 93
7. WHAT IS CODE PAGE USED FOR? 93
8. WHAT IS CODE PAGE COMPATIBILITY? 94
9. WHAT IS DEFAULT BLOCK BUFFER SIZE? 94
10. WHAT IS DEFAULT LM SHARED MEMORY SIZE? 94
11. DEFINE SERVER CONCEPTS WITH RESPECT TO MEMORY BUFFERS 94 12. WHAT ARE THE TWO PROGRAMS THAT COMMUNICATE WITH THE INFORMATICA SERVER? 95
22. COMMAND LINE ARGUMENTS 96
1. WHAT IS PMCMD COMMANDS? 96
2. WHAT IS PMREP COMMANDS? 96
3. HOW DO WE START & STOP SESSION FROM PMCMD COMMAND LINE? 96
23. METADATA REPOSITORY 97
1. IS THERE ANY METADATA QUERY TO FIND THE LIST OF INFORMATICA FOLDER NAME, WORKFLOW NAMES WHICH ARE MIGRATED IN
A PARTICULAR QUARTER? 97
3. WRITE A METADATA QUERY TO IDENTIFY THE SESSIONS HAVING TRUNCATE OPTION ENABLED 97 4. WHERE CAN I FIND A HISTORY / METRICS OF THE LOAD SESSIONS THAT HAVE OCCURRED IN INFORMATICA? 97
5. HOW TO EXTRACT THE WORKFLOW MONITOR RECORD INFORMATION FROM INFORMATICA METADATA REPOSITORY? 98
24. REPOSITORY MANAGER 100
1. DESCRIBE THE STEPS FOR EXPORT AND IMPORT? 100
2. WHAT ARE THE VARIOUS METHODS OF CODE MIGRATION OR WHICH IS THE BEST WAY OF DEPLOYMENT? 100 3. WHAT ARE THE VARIOUS OPTIONS FOR ETL CODE MIGRATION 101
4. WHAT IS LABELING IN INFORMATICA? 101
5. SUPPOSE HAVING INFORMATICA VERSION CONTROL IN PLACE, CAN WE REVERT BACK AN OBJECT TO A STATE OF TWO PREVIOUS
VERSION. 102
6. WHAT DO WE MEAN BY TEAM BASED DEVELOPMENT IN INFORMATICA? 102
DWBIConcepts
DWBIConcepts
DWBIConcepts
1. SUPPOSE WE HAVE TEN SOURCE FLAT FILES OF SAME STRUCTURE.HOW CAN WE LOAD ALL THE FILES IN TARGET DATABASE IN A
SINGLE BATCH RUN USING A SINGLE MAPPING? 104
2. SUPPOSE WE HAVE TWO SOURCE QUALIFIER TRANSFORMATIONS SQ1 AND SQ2 CONNECTED TO TARGET TABLES TGT1 AND TGT2 RESPECTIVELY.HOW DO YOU ENSURE TGT2 IS LOADED AFTER TGT1? 104 3. SUPPOSE WE HAVE A SOURCE QUALIFIER TRANSFORMATION THAT POPULATES TWO TARGET TABLES.HOW DO YOU ENSURE TGT2
IS LOADED AFTER TGT1? 106
4. SUPPOSE WE HAVE THE EMP TABLE AS OUR SOURCE.IN THE TARGET WE WANT TO VIEW THOSE EMPLOYEES WHOSE SALARY ARE GREATER THAN OR EQUAL TO THE AVERAGE SALARY FOR THEIR DEPARTMENTS.DESCRIBE YOUR MAPPING APPROACH. 106
5. HOW CAN WE PERFORM CHANGED DATA CAPTURE BASED ON LOAD SEQUENCE NUMBER (INTEGER) COLUMN PRESENT IN THE
SOURCE TABLE? 110
6. SCENARIO IMPLEMENTATION 1 111
7. HOW CAN WE LOAD ‘X’ RECORDS (USER DEFINED RECORD NUMBERS) OUT OF ‘N’ RECORDS FROM SOURCE DYNAMICALLY,
WITHOUT USING FILTER AND SEQUENCE GENERATOR TRANSFORMATION? 112
8. SUPPOSE WE HAVE ‘N’ NUMBER OF ROWS IN THE SOURCE AND WE HAVE TWO TARGET TABLES.HOW CAN WE LOAD ‘N/2’ I.E. FIRST HALF THE SOURCE DATA INTO ONE TARGET AND THE REMAINING HALF INTO THE NEXT TARGET? 112 9. SUPPOSE WE HAVE A FLAT FILE WHICH HAS A HEADER RECORD WITH ‘FILE CREATION DATE’, AND DETAILED DATA RECORDS.
DESCRIBE THE APPROACH TO LOAD THE 'FILE CREATION DATE' COLUMN ALONG WITH EACH AND EVERY DETAILED RECORD. 113
10. SCENARIO IMPLEMENTATION 2 113
11. SUPPOSE WE HAVE A FLAT FILE WHICH CONTAINS JUST A NUMERIC VALUE.WE NEED TO POPULATE THIS VALUE IN ONE COLUMN OF THE TARGET TABLE FOR EVERY SOURCE RECORD.HOW CAN WE ACHIEVE THIS? 113
12. HOW WILL YOU LOAD A SOURCE FLAT FILE INTO A STAGING TABLE WHEN THE FILE NAME IS NOT FIXED?THE FILE NAME IS LIKE SALES_2013_02_22.TXT, I.E. DATE IS APPENDED AT THE END OF THE FILE AS A PART OF FILE NAME. 114 13. SOLVE THE BELOW SCENARIO USING INFORMATICA AND DATABASE SQL. 114 14. SUPPOSE WE HAVE A COLUMN IN SOURCE WITH VALUES AS BELOW: 115 15. CAN WE PASS THE VALUE OF A MAPPING VARIABLE BETWEEN 2 PIPELINES UNDER THE SAME MAPPING?IF NOT HOW CAN WE
ACHIEVE THIS? 116
16. SCENARIO IMPLEMENTATION 3 116
17. SCENARIO IMPLEMENTATION 4 117
18. IMPLEMENT SLOWLY CHANGING DIMENSION OF TYPE 2 WHICH WILL LOAD CURRENT RECORD IN CURRENT TABLE AND OLD DATA
IN LOG TABLE. 118
26. PERFORMANCE TUNING 119
1. WHICH ONE IS FASTER CONNECTED OR UNCONNECTED LOOKUP? 119
2. HOW WE CAN IMPROVE PERFORMANCE OF INFORMATICA NORMALIZATION TRANSFORMATION. 119
3. HOW TO IMPROVE THE SESSION PERFORMANCE? 119
4. HOW DO YOU IDENTIFY THE BOTTLENECKS IN MAPPINGS? 120 5. HOW DO YOU HANDLE PERFORMANCE ISSUES IN INFORMATICA?WHERE CAN YOU MONITOR THE PERFORMANCE? 121
6. WHAT ARE PERFORMANCE COUNTERS? 122
7. HOW CAN WE INCREASE SESSION PERFORMANCE? 122
8. SCENARIO IMPLEMENTATION 1 124
DWBIConcepts
DWBIConcepts
DWBIConcepts
Topic Matrix:
Serial Number Topics Questions
1 Aggregator 17 2 Expression 10 3 Filter 2 4 Joiner 12 5 Lookup 20 6 Normalizer 4 7 Rank 12 8 Router 5 9 Sequence Generator 8 10 Stored Procedure 6 11 Sorter 6 12 Union 3 13 Update Strategy 10 14 Java 2 15 Source Qualifier 12 16 Miscellaneous 20 17 Mapping 12 18 Mapplet 6 19 Session 22 20 Workflow 15 21 Administration 12
22 Command Line Arguments 3
23 Metadata Repository 5 24 Repository Manager 6 25 Scenario Questions 18 26 Performance Tuning 8
DWBIConcepts
DWBIConcepts
DWBIConcepts
DWBIConcepts
1. Aggregator Transformation
1. What is an Aggregator Transformation?
Answer:
An aggregator is an Active, Connected transformation which performs aggregate calculations like AVG,
COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM and VARIANCE.
2. How an Expression Transformation differs from Aggregator Transformation?
Answer:
An Expression Transformation performs calculation on a row-by-row basis, whereas an Aggregator Trans-formation performs calculations on groups.
3. Does an Aggregator Transformation support only aggregate expressions?
Answer:
Apart from aggregate expressions, aggregator transformation supports non-aggregate expressions and
con-ditional clauses.
4. Give one example for each of Conditional Aggregation, Non-Aggregate expression and
Nested Aggregation.
Answer:
Use conditional clauses in the aggregate expression to reduce the number of rows used in the
ag-gregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.
SUM (SALARY, JOB = ‘CLERK’)
Use non-aggregate expressions in group by ports to modify or replace groups. IIF (PRODUCT = ‘Brown Bread’, ‘Bread’, PRODUCT)
Nested aggregation expression can include one aggregate function within another aggregate
func-tion.
MAX (COUNT (PRODUCT))
5. How does Aggregator Transformation handle NULL values?
DWBIConcepts
DWBIConcepts
DWBIConcepts
By default, the aggregator transformation treats null values as NULL in aggregate functions. But we can specify to treat null values in aggregate functions as NULL or zero.
6. What are the performance considerations when working with Aggregator
Transfor-mation?
Answer:
Filter the unnecessary data before aggregating it. Place a Filter transformation in the mapping be-fore the aggregator transformation to reduce unnecessary aggregation.
Improve performance by connecting only the necessary input/output ports to subsequent transfor-mations, thereby reducing the size of the data cache.
Use Sorted input which reduces the amount of data cached and improves session performance.
Aggregator performance improves dramatically if records are sorted before passing to the aggregator and “Sorted Input” option under aggregator properties is checked. The record set should be sorted on those col-umns that are used in Group By operation.
It is often a good idea to sort the record set in database level (click here to see why?) e.g. inside a source qualifier transformation, unless there is a chance that already sorted records from source qualifier can again become unsorted before reaching aggregator.
7. What are the uses of index and data cache?
Answer:
The group data is stored in index files whereas Row data stored in data files.
8. What differs when we choose Sorted Input for Aggregator Transformation?
Answer:
Integration Service creates the index and data caches files in memory to process the Aggregator transfor-mation. If the Integration Service requires more space as allocated for the index and data cache sizes in the transformation properties, it stores overflow values in cache files i.e. paging to disk.
One way to increase session performance is to increase the index and data cache sizes in the transformation properties.
But when we check Sorted Input the Integration Service uses memory to process an Aggregator transfor-mation it does not use cache files.
DWBIConcepts
DWBIConcepts
DWBIConcepts
9. Under what conditions selecting Sorted Input in aggregator will still not boost session
per-formance?
Answer:
Incremental Aggregation, session option is enabled.
The aggregate expression contains nested aggregate functions. When session property, Treat Source rows as is set to data driven.
10. Under what condition selecting Sorted Input in aggregator may fail the session?
Answer:
If the input data is not sorted correctly, the session will fail.
Also if the input data is properly sorted, the session may fail if the sort order by ports and the group
by ports of the aggregator are not in the same order.
11. Suppose we do not group by on any ports of the aggregator what will be the output.
Answer:
If we do not use an input port in group-by neither in aggregate expression, the Integration Ser-vice will return only the last row value of the column for the input rows.
For example, if we have 100 rows coming from source then aggregator will output only the last record (100th record)
12. What is the expected value if the column in an aggregator transformation is neither a
group by nor an aggregate expression?
Answer:
Integration Service produces one row for each group based on the group by ports. The columns which are neither part of the key nor aggregate expression will return the corresponding value of last record of the group received.
However, if we specify particularly the FIRST function, the Integration Service then returns the value of the specified first row of the group. So default is the LAST function.
13. What is Incremental Aggregation?
DWBIConcepts
DWBIConcepts
DWBIConcepts
Answer:
We can enable the session option, Incremental Aggregation for a session that includes an Aggregator Trans-formation. When the Integration Service performs incremental aggregation, it actually passes changed source data through the mapping and uses the historical cache data to perform aggregate calculations in-crementally.
14. Sorted input for aggregator transformation will improve performance of mapping.
How-ever, if sorted input is used for nested aggregate expression or incremental aggregation,
then the mapping may result in session failure. Explain why?
Answer:
In case of a nested aggregation, there are multiple levels of sorting associated as each aggregation function will require one sorting pass, and after the first level of aggregation, the sort order of the group by column may get jumbled up, so before the second level of aggregation, Informatica must internally sort it again. However, if we already indicate that input is sorted, Informatica will not do this sorting - resulting into fail-ure.
In incremental aggregation, the aggregate calculations are stored in historical cache on the server. In this his-torical cache the data may not be in sorted order. If we give sorted input, the records come as presorted for that particular run but in the historical cache the data may not be in the sorted order.
15. How can we delete duplicate record using Informatica Aggregator?
Answer:
One way to handle duplicate records in source batch run is to use an Aggregator Transformation and using the Group By checkbox on the ports having duplicate occurring data. Here you can have the flexibility to se-lect the last or the first of the duplicate column value records.
16. Scenario Implementation 1
Suppose in our Source Table we have data as given below:
Student Name Subject Name Marks
Sam Maths 100
Tom Maths 80
Sam Physical Science 80
John Maths 75
Sam Life Science 70
John Life Science 100
John Physical Science 85
Tom Life Science 100
Tom Physical Science 85
DWBIConcepts
DWBIConcepts
DWBIConcepts
We want to load our Target Table as:
Student Name Maths Life Science Physical Science
Sam 100 70 80
John 75 100 85
Tom 80 100 85
Describe your approach.
Answer:
Here our scenario is to convert many rows to one row, and the transformation which will help us to achieve this is Aggregator.
Our Mapping will look like this:
We will sort the source data based on STUDENT_NAME ascending followed by SUBJECT ascending.
Now based on STUDENT_NAME in GROUP BY clause the following output subject columns are populated as MATHS: MAX( MARKS, SUBJECT = ’Maths’ )
LIFE_SC: MAX( MARKS, SUBJECT = ’Life Science’ ) PHY_SC: MAX( MARKS, SUBJECT = ’Physical Science’ )
DWBIConcepts
DWBIConcepts
DWBIConcepts
17. Scenario Implementation 2
Source:
100 XYZ AAA 100 XYZ BBB 100 XYZ CCC
The expected output data: 100 XYZ AAA BBB CCC
Which transformations are used for this?
Answer:
Use an Aggregator transformation with variable.
DWBIConcepts
DWBIConcepts
DWBIConcepts
2. Expression Transformation
1. What is an Expression Transform?
Answer:
Expression is a Passive connected transformation used to calculate values in a single row before you write to the target. We can use the Expression transformation to perform any non-aggregate calculations. We can al-so use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.
For example, we might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers.
2. How many types of ports are there in Expression transform?
Answer:
There are three types of ports- INPUT, OUTPUT, and VARIABLE
3. What is the execution order of the ports in an expression?
Answer:
All ports are executed TOP TO BOTTOM in a serial physical ordering fashion, but they are done in the following groups:
All input ports are pushed values first.
Then all variables are executed (top to bottom physical ordering in the expression). Last - all output expressions are executed to push values to output ports
You can utilize this to your advantage, by placing lookups in to variables, then using the variables "later" in the execution cycle.
4. Describe the approach for the requirement. Suppose the input is:
Col1 Col2 10 a 20 b 30 c
DWBIConcepts
DWBIConcepts
DWBIConcepts
DWBIConcepts
50 d
The desired output is: Col1 Col2 10 a 20 a,b 30 a,b,c 40 a,b,c 50 a,b,c,d
Answer: Use an Expression transformation:-
Port Name Port Type Expression
Col1 I/O
Col2 I
V_Seq V CUME(1)
V_Col2 V IIF (V_Seq = 1, Col2, IIF ( ISNULL (Col2), Prev_Col2, Prev_Col2 || ',' || Col2)) Prev_Col2 V V_Col2
Out_Col2 O Prev_Col2
Keep in mind the string length of the variable and output ports.
CUME function is used to calculate the cumulative amount based on the argument of the cumulative func-tion. This means, if we call CUME with argument 1, e.g. CUME(1); then on the first call it will re-turn 1; on the second call, it will rere-turn 2; on the third call, it will rere-turn 3 and so on. Since Informatica process data row by row, this means that when the first row is processed CUME(1) will return 1; for the next row, it will return 2 and so on.
5. How can we implement aggregation operation without using an Aggregator
Transfor-mation in Informatica?
Answer:
We will use the very basic concept of the Expression Transformation, that at a time we can
ac-cess the previous row data as well as the currently proac-cessed data in an expression
transfor-mation. What we need is simple Sorter, Expression and Filter transformation to achieve aggre-gation at Informatica level.
For detailed understanding visit Aggregation without Aggregator.
6. Scenario Implementation 1
Source Col1 Col2 A W B R C EDWBIConcepts
DWBIConcepts
DWBIConcepts
DWBIConcepts
B E
Target
Col1 READ WRITE EXECUTE A 1 1 0
B 1 0 1 C 0 0 1
In this scenario Source values in Col2 W, R, E means read write and execute.
Answer:
Take an Expression transformation followed by Aggregator transformation. In Expression Transformation:
Port Name Port Type Expression
Col1 I/O Col2 I/O
Read O IIF ( Col2 = 'R', 1, 0 ) Write O IIF ( Col2 = 'W', 1, 0 ) Execute O IIF ( Col2 = 'E', 1, 0 ) In Aggregator Transform:
Col 1 I/O GROUP BY
Read I/O MAX (Read) Write I/O MAX (Write) Execute I/O MAX (Execute)
7. Scenario Implementation 2
Source data is like below:
Id name1 name2 10 A B 10 C D 20 E F
Desired Target data is like below
Id name 10 AB 10 CD
DWBIConcepts
DWBIConcepts
DWBIConcepts
DWBIConcepts
Answer:
Use Expression Transformation to concatenate both values as- name = name1 || name2
8. Scenario Implementation 3
Suppose we have a field in source file named as DATA. We need to mark those records having 9 characters such that the first 2 characters must be alphabets i.e.(A-Z) and the rest 7 characters must be alphanumeric i.e.(A-Z) or (0-9) for the DATA field as output. And the records which don’t match the condition should be marked as “Invalid”. How do we implement this?
E.g. DATA OUTPUT AB345GH6756 AB345GH67 CD56789PJ CD56789PJ 56CHJK97889 Invalid DG//*67DF Invalid Answer:
Use the below logic in an output port of an Expression Transformation in Informatica:- IIF( REG_MATCH( SUBSTR(DATA,1,2), '[[:alpha:]]{2}' ) = 1
ANDREG_MATCH( SUBSTR(DATA,3,7), '[[:alnum:]]{7}' ) = 1, SUBSTR(DATA, 1, 9), 'Invalid' )
9. Scenario Implementation 4
How do we convert a Date field coming as data type string from a flat file?
Answer:
Use Date Conversion Functions:-
IIF( IS_DATE( Column1 ) = 1, TO_DATE( Column1 , 'YYYYMMDD' ), NULL )
In the above example, we have assumed the format of the date field is ‘YYYYMMDD’. If the format is some-thing else (e.g. YYYY-MM-DD), we need to specify the same
10. Scenario Implementation 5
Source: Col1 Col2 1 BDWBIConcepts
DWBIConcepts
DWBIConcepts
DWBIConcepts
3 D 4 E
Target
Col1 Col2 Col3 Col4 1 B 2 C 3 D 4 E
Describe the approach to the above scenario where the source 1st record loaded to target col1,col2 then 2nd record loaded to col3,col4 again 3rd record to col1,col2 and so on.
Answer:
Use an Expression transformation:
Port Name
Port Type Expression
Col1 I Col2 I
V_ID V 1 – MOD (Col1, 2)
O_ID O V_ID O_Col1 O V_Col1 O_Col2 O V_Col2 O_Col3 O Col1 O_Col4 O Col2 V_Col1 V Col1 V_Col2 V Col2
Next use a Filter transformation with condition O_ID = 1
Next map O_Col1, O_Col2, O_Col3, O_Col4 to Col1, Col2, Col3, Col4 of the target respectively.
DWBIConcepts
DWBIConcepts
DWBIConcepts
3. Filter Transformation
1. What is a Filter Transformation and why it is an Active one?
Answer:
A Filter transformation is an Active and Connected transformation that can filter rows in a mapping. Only the rows that meet the Filter Condition pass through the Filter transformation to the next transfor-mation in the pipeline. TRUE and FALSE are the implicit return values from any filter condition we set. If the filter condition evaluates to NULL, the row is assumed to be FALSE. The numeric equivalent of FALSE is zero (0) and any non-zero value is the equivalent of TRUE.
As an ACTIVE transformation, the Filter transformation may change the number of rows passed through it. A filter condition returns TRUE or FALSE for each row that passes through the transformation, de-pending on whether a row meets the specified condition. Only rows that return TRUE pass through this transformation. Discarded rows do not appear in the session log or reject files.
2. What is the difference between Source Qualifier transformations Source filter option and
filter transformation?
Answer:
SQ Source Filter Filter Transformation
Source Qualifier transformation filters rows when read from a source.
Filter transformation filters rows from within a mapping
Source Qualifier transformation can only filter rows from relational sources.
Filter transformation filters rows coming from any type of source system in the map-ping level.
Source Qualifier limits the row set extracted from a source.
Filter transformation limits the row set sent to a target.
Source Qualifier reduces the number of rows used throughout the mapping and hence it provides better performance.
To maximize session performance, in-clude the Filter transformation as close to the sources in the mapping as possible to filter out unwanted data early in the flow of data from sources to targets.
The filter condition in the Source Qualifier transfor-mation only uses standard SQL as it runs in the database.
Filter Transformation can define a condi-tion using any statement or transformacondi-tion function that returns either a TRUE or FALSE
value.
DWBIConcepts
DWBIConcepts
DWBIConcepts
4. Joiner Transformation
1. What is a Joiner Transformation and why it is an Active one?
Answer:
A Joiner is an Active and Connected transformation used to join two source data streams coming from same or heterogeneous databases or files.
The Joiner transformation joins sources with at least one matching column. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.
In the Joiner transformation, we must configure the transformation properties namely Join Condition, Join
Type and optionally Sorted Input option to improve Integration Service performance.
The join condition contains ports from both input sources that must match for the Integration Service to join two rows. Depending on the join condition and the type of join selected, the Integration Service either adds the row to the result set or discards the row. Because of this reason, the number of rows in Joiner output may not be equal to the number of rows in Joiner Input. This is why Joiner is considered an Active transformation.
2. State the limitations where we cannot use Joiner in the mapping pipeline.
Answer:
The Joiner transformation accepts input from most transformations. However, following are the limitations:
Joiner transformation cannot be used when either of the input pipelines contains an Update
Strate-gy transformation.
Joiner transformation cannot be used if we connect a Sequence Generator transformation directly
before the Joiner transformation.
3. Out of the two input pipelines of a joiner, which one will we set as the master pipeline?
Answer:
During a session run, the Integration Service compares each row of the master source against the detail source. The master and detail sources need to be configured for optimal performance. When the Integration Service processes an unsorted Joiner transformation, it blocks the detail source while it caches rows from the master source. Once the Integration Service finishes reading and caching all master
DWBIConcepts
DWBIConcepts
DWBIConcepts
For a Sorted Joiner transformation, use the source with fewer duplicate key values as the master source for optimal performance and disk storage. When the Integration Service processes a sorted Joiner transfor-mation, it caches rows for one hundred keys at a time. If the master source contains many rows with the same key value, the Integration Service must cache more rows, and performance can be slowed.
Blocking logic is possible if master and detail input to the Joiner transformation originate from dif-ferent sources. Otherwise, it does not use blocking logic. Instead, it stores more rows in the cache.
4. What are the different types of Joins available in Joiner Transformation?
Answer:
In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation is similar to an SQL join except that data can originate from different types of sources. The Joiner transformation supports the following types of joins:
Normal Master Outer Detail Outer Full Outer
A normal or master outer join performs faster than a full outer or detail outer join.
DWBIConcepts
DWBIConcepts
DWBIConcepts
5. Define the various Join Types of Joiner Transformation.
Answer:
In a normal join, the Integration Service discards all rows of data from the master and detail source that do not match, based on the join condition.
A master outer join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
A full outer join keeps all rows of data from both the master and detail sources.
6. Describe the impact of number of join conditions and join order in a Joiner.
Answer:
We can define one or more conditions based on equality between the specified master and detail sources. Both ports in a condition must have the same data type.
If we need to use two ports in the join condition with non-matching data types we must convert the data types so that they match. The Designer validates data types in a join condition.
Additional ports in the join condition, increases the time necessary to join two sources.
The order of the ports in the join condition can impact the performance of the Joiner transformation. If we use multiple ports in the join condition, the Integration Service compares the ports in the order we specified.
Only equality operator is available in joiner join condition.
7. How does Joiner transformation treat NULL value matching?
Answer:
The Joiner transformation does not match null values.
For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not consider them a match and does not join the two rows.
To join rows with null values, replace null input with default values in the Ports tab of the joiner, and then
join on the default values.
DWBIConcepts
DWBIConcepts
DWBIConcepts
If a result set includes fields that do not contain data in either of the sources, the Joiner transfor-mation populates the empty fields with null values. If we know that a field will return a NULL and we do not want to insert NULLs in the target, set a default value on the Ports tab for the corre-sponding port.
8. When we configure the join condition, what are the guidelines we need to follow to
main-tain the sort order?
Suppose we configure Sorter transformations in the master and detail pipelines with the following sorted ports in order: ITEM_NO, ITEM_NAME and PRICE.
Answer:
If we have sorted both the master and detail pipelines in order of the ports say ITEM_NO, ITEM_NAME and PRICE we must ensure that:
Use ITEM_NO in the First Join Condition.
If we add a Second Join Condition, we must use ITEM_NAME.
If we want to use PRICE as a Join Condition apart from ITEM_NO, we must also use ITEM_NAME in the Second Join Condition.
If we skip ITEM_NAME and join on ITEM_NO and PRICE, we will lose the input sort order and the In-tegration Service fails the session.
9. What are the transformations that cannot be placed between the sort origin and the
Join-er transformation so that we do not lose the input sort ordJoin-er?
Answer:
The best option is to place the Joiner transformation directly after the sort origin to maintain sorted data. However do not place any of the following transformations between the sort origin and the Joiner transfor-mation: Custom Unsorted Aggregator Normalizer Rank Union transformation XML Parser transformation XML Generator transformation
Mapplet [if it contains any one of the above mentioned transformations]
10. What is the use of sorted input in joiner transformation?
Answer:
It is recommended to Join sorted data when possible. We can improve session performance by con-figuring the Joiner transformation to use sorted input. When we configure the Joiner transformation
DWBIConcepts
DWBIConcepts
DWBIConcepts
great performance improvement when we work with large data sets.
For an unsorted Joiner transformation, designate as the master source the source with fewer rows. For optimal performance and disk storage, designate the master source as the source with the fewer rows. During a session, the Joiner transformation compares each row of the master source against the de-tail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
11. Can we join two tables based on a join column having different data type?
For example table 1 EMPNO (string) and table 2 EMPNUM (number)
Answer:
Yes possible in this case. If we are using Joiner, we should be able to do this explicit conversion in an expres-sion transformation before joining the tables.
12. Implementation Scenario1 - Joiner transformation is joining two tables s1 and s2. s1 has
10,000 rows and s2 has 1000 rows . Which table you will set master for better
perfor-mance of joiner transformation? Why?
Answer:
Set table S2 as Master table because informatica server has to keep master table in the cache so if it is 1000 in cache will get performance instead of having 10000 rows in cache.
DWBIConcepts
DWBIConcepts
DWBIConcepts
5. Lookup Transformation
1. What is a Lookup transform?
Answer:
The transform is used to look up data in a flat file, relational table, views, or synonym. The informatica server queries the lookup table based on the lookup ports in the transformation. It compares lookup transfor-mation port values to lookup table column values based on the lookup condition. The result is passed to other transformations and the target.
Uses:
Get related value Perform a calculation
Update slowly changing dimension tables.
2. What are the differences between Connected and Unconnected Lookup?
Answer:
The differences are illustrated in the below table:
Connected Lookup Unconnected Lookup
Connected lookup participates in dataflow and re-ceives input directly from the pipeline
Unconnected lookup receives input values from the result of a LKP: expression in an-other transformation
Connected lookup can use both dynamic and static cache
Unconnected Lookup cache can NOT be dynamic
Connected lookup can return more than one col-umn value ( output port )
Unconnected Lookup can return only one column value i.e. output port
Connected lookup caches all lookup columns Unconnected lookup caches only the lookup output ports in the lookup condi-tions and the return port
Supports user-defined default values (i.e. value to return when lookup conditions are not satisfied)
Does not support user defined default val-ues
3. What are the different lookup cache(s)?
Answer:
Informatica Lookups can be cached or un-cached (No cache). And Cached lookup can be either static or
dy-DWBIConcepts
DWBIConcepts
DWBIConcepts
A static cache is one which does not modify the cache once it is built and the data remains same during the session run.
On the other hand, a dynamic cache is refreshed during the session run by inserting or updating the records
in cache based on the incoming source data. By default, Informatica cache is static cache.
A lookup cache can also be divided as persistent or non-persistent based on whether Informatica retains the cache even after the completion of session run or deletes it.
4. Is lookup an active or passive transformation?
Answer:
From Informatica 9x, Lookup transformation can be configured as an "Active" transformation. Find out How to configure lookup as active transformation.
However, in the earlier versions of Informatica, lookup is a passive transformation.
5. What is the difference between Static and Dynamic Lookup Cache?
Answer:
We can configure a Lookup transformation to cache the underlying lookup table. In case of static or
read-only lookup cache the Integration Service caches the lookup table at the beginning of the session and does
not update the lookup cache while it processes the Lookup transformation. Rows are not added dynamically in the cache.
In case of dynamic lookup cache the Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target. It basically, caches the rows as and when it is passed.
In case you are wondering why we need to make lookup cache dynamic, read this article on dynamic lookup.
6. What are the uses of index and data caches?
Answer:
The conditions are stored in index cache and records from the lookup are stored in data cache
7. What is Persistent Lookup Cache?
Answer:
DWBIConcepts
DWBIConcepts
DWBIConcepts
If the cache generated for a Lookup needs to be preserved for subsequent use then persistent cache is used. It will not delete the index and data files. It is useful only if the lookup table remains constant.
Lookups are cached by default in Informatica. Lookup cache can be either non-persistent or persistent. The Integration Service saves or deletes lookup cache files after a successful session run based on, whether the Lookup cache is checked as persistent or not.
8. What type of join does Lookup support?
Answer:
Lookup is just similar like SQL LEFT OUTER JOIN.
9. Explain how lookup transformation works like SQL Left Outer Join.
Answer:
Lookup means if the source input column value matches the lookup table comparison column value then it will Return valid values from the lookup table else it will return NULL.
Let’s consider the EMP table as Source and DEPT table as lookup. We want to extract the location of each employee based on his or her department number. So if the Location details are not available in the DEPT table, still we want to have all the other information of the employee coming from the source EMP table, apart from NULL as location and load in our target table.
So the equivalent SQL query looks like below:- SELECT EMP.*, DEPT.LOC
FROM EMP LEFT OUTER JOIN DEPT ON EMP.DEPTNO = DEPT.DEPTNO
Hence Lookup is associated with the Source table as Left Outer Join.
10. Where and why do we use Unconnected Lookup instead of Connected Lookup?
Answer:
The best part of unconnected lookup is that, we can call the lookup based on some condition and not every time. I.e. based on some condition met we can invoke the unconnected lookup in an expression transformation else not. By this we may optimize the performance of a flow.
We may consider unconnected lookup as a function in any procedural language. It takes multiple parameters as input and returns one values, and can be used repeatedly. Same way unconnected lookup can be used in any scenario where we need to use the lookup repeatedly either in single or multiple transformation. With the unconnected lookup, we get the performance benefit of not caching the same data multiple times. Also it is a good coding practice.
DWBIConcepts
DWBIConcepts
DWBIConcepts
11. How can we Identify Persistent Cache Files in Informatica Server?
Answer:
Cache files are generated in the Cache directory of the Informatica Server for transformations like Aggregator, Joiner, Lookup, Rank & Sorter.
Two types of cache files are generated i.e. the data and index files exception being Sorter transfor-mation.
Most Important point is that Informatica automatically deletes all the generated .dat and .idx cache files after a session run is finished.
So the files that are present in the Cache directory are basically the Persistent Cache files of Lookup transformation, Aggregator Cache files of Incremental Aggregation sessions or if the session run was not successfully completed.
Informatica generated cache files are named as:
PMAGG*.idx, PMAGG*.dat, PMJNR*.idx, PMJNR*.dat, PMLKP*.idx, PMLKP*.dat.
Often while handling big data cache Informatica creates multiple index and data files due to paging and appends a number to the end of the files e.g. PMAGG*.dat0, PMAGG*.idx0, PMAGG*.dat1, PMAGG*.idx1.
So if we have followed any particular naming convention for Lookup Persistent Cache Name e.g. ta-ble_name_PC or the table names have a convention like GDW_ then use shell commands accordingly to identify the cache files in server.
In this context you can revisit Lookup Persistent Cache and Incremental Aggregation article
12. How to configure a Lookup on a flat file with header?
Answer:
When we try to create a lookup transformation, we have the option to select the location of the Lookup Ta-ble from any of Source, Target, Source Qualifier, Import from Relational TaTa-ble or Import from Flat File. So after selecting the flat file as lookup from the desired location, the edit Transformation tab of the lookup will have the Flat file information to choose between Delimited or Fixed width and advanced properties to modify like Column Delimiters, Code Page and obviously Number of initial rows to skip.
Set Number of initial rows to skip as 1. Set the Lookup condition as required.
Apart from that go to the Mapping tab of the corresponding session and select the lookup transformation to configure the Lookup source file directory and filename and Lookup source file type i.e. Direct or Indirect.
13. What is the difference between persistent cache and shared cache?
Answer:
Persistent cache is a type of Informatica lookup cache in which the cache file is stored in disk. We can configure the session to re-cache if necessary. It will be used only if we are sure that lookup table will not change between sessions.
It will be used if your mapping uses any static tables as lookup mostly.
DWBIConcepts
DWBIConcepts
DWBIConcepts
If the persistent cache is shared across mappings, we call it as shared cache (named). We will provide a name for this cache file.
If the lookup table is used in more than one transformation/mapping then the cache built for the first lookup can be used for the others. It can be used across mappings.
For Shared cache we have to give the name in cache file name prefix property. Use the same name it in dif-ferent lookup where we want to use the cache.
Unshared cache: Within the mapping if the lookup table is used in more than one transformation then the cache built for the first lookup can be used for the others. It cannot be used across mappings.
14. Describe how to return multiple port values from unconnected lookup in Informatica.
Answer:
Informatica Unconnected Lookup by default supports only one return port.
So alternatively we can write a Lookup SQL override with the required ports values concatenated into a sin-gle string as return port value.
Call the Unconnected lookup from the expression transformation and use various output ports to retrieve the lookup values based on the concatenated return value. Use SUBSTR, INSTR functions to extract the col-umn values from the concatenated return field.
15. How to make the persistent lookup cache in sync with lookup table?
Answer:
To make the persistent cache in sync with the lookup table simply enable Re-cache option of the lookup transformation to rebuild the lookup cache from lookup table again. While loading the target dimension ta-ble we can choose to make the lookup cache dynamic and recache-persistent so that once dimension is loaded the persistent cache file is in sync and available during Fact table loading.
16. If we use persistent cache for a dynamic lookup, will the cache file be updated or inserted
as required?
Answer:
Having persistent cache will not impact the dynamic cache anyway in doing insert & updates to the cache file. Just that cache file will have a proper name assigned using persistent named cache and it can be reused later.