informatica


Informatica Question and Answers

What is rank transformation? Where can we use this ...

Rank transformation is used to select the top or bottom rows of a group. For example, if a sales table has many employees selling the same product and we need to find the top 5 or 10 employees who sell the most products, we can use a Rank transformation.

Where is the cache stored in informatica?

The cache is stored on the Informatica server.

If you want to create indexes after the load process, which transformation do you choose?

Stored procedure transformation.

In a joiner transformation, you should specify the source with fewer rows as the master source. Why?

In a Joiner transformation, the Informatica server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins. With fewer rows in the master source, the caches are smaller and are built faster.
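To make the master/detail idea concrete, here is a minimal Python sketch of a cache-then-probe join. It is only an illustration of the principle, not Informatica's actual implementation; the function name `joiner` and the sample `products` and `sales` rows are made up for this example.

```python
def joiner(master_rows, detail_rows, key):
    """Join detail rows to master rows on `key`, caching only the master side."""
    # Build the cache from the master source first (ideally the smaller input).
    cache = {}
    for row in master_rows:
        cache.setdefault(row[key], []).append(row)

    # Stream the detail source and probe the cache row by row.
    joined = []
    for row in detail_rows:
        for match in cache.get(row[key], []):
            joined.append({**match, **row})
    return joined, len(cache)


# Hypothetical sample data: 2 master rows, 3 detail rows.
products = [{"product_id": 1, "name": "pen"}, {"product_id": 2, "name": "ink"}]
sales = [
    {"product_id": 1, "qty": 5},
    {"product_id": 2, "qty": 3},
    {"product_id": 1, "qty": 7},
]

rows, cached_keys = joiner(products, sales, "product_id")
print(cached_keys, len(rows))
```

Only the master side is held in memory, so choosing the smaller source as master keeps the cache small.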

What happens if you try to create a shortcut to a non-shared folder?

It only creates a copy of it.

What is a transaction?

A transaction can be defined as a DML operation, meaning an insertion, modification, or deletion of data performed by users, analysts, or applications.

Can anybody write a session parameter file which will change the source and targets for every session, i.e. different source and targets for each session run?

You need to define a parameter file. In the parameter file you can define two parameters, one for the source and one for the target.

For example:

$Src_file = c:\program files\informatica\server\bin\abc_source.txt
$tgt_file = c:\targets\abc_targets.txt

Then define the parameter file, with the parameters under a heading that names the folder, workflow, and session:

[folder_name.WF:workflow_name.ST:s_session_name]
$Src_file = c:\program files\informatica\server\bin\abc_source.txt
$tgt_file = c:\targets\abc_targets.txt

If it is a relational database, you can even give an overridden SQL statement at the session level as a parameter. Make sure the SQL is on a single line.
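If it helps to see the layout of such a file, here is a small Python sketch that reads the example above into a dictionary. It only illustrates the heading-plus-parameters structure; it is not how the Informatica server parses the file, and `read_params` is a hypothetical helper name.

```python
import configparser

# The example parameter file from the answer above, as a raw string.
PARAM_TEXT = r"""
[folder_name.WF:workflow_name.ST:s_session_name]
$Src_file = c:\program files\informatica\server\bin\abc_source.txt
$tgt_file = c:\targets\abc_targets.txt
"""


def read_params(text):
    """Return {section heading: {parameter name: value}} for a parameter file."""
    # Only "=" separates names from values, so the ":" in paths is left alone.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep parameter names case-sensitive
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


params = read_params(PARAM_TEXT)
session = params["folder_name.WF:workflow_name.ST:s_session_name"]
print(session["$Src_file"])
print(session["$tgt_file"])
```

Pointing a different session run at a different source and target is then just a matter of editing the two values under the session heading.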

Informatica Live Interview Questions

Here are some of the interview questions I could not answer. Anybody who can help, please give answers, for the benefit of others also.

Thanks in advance.

Explain grouped cross tab?
Explain reference cursor
What is metadata and system catalog
What is factless fact schema
What is conformed dimension
Which kind of index is preferred in DWH
Why do we use DSS database for OLAP tools

Conformed dimension: a dimension that is shared by two or more fact tables.

Factless fact table: a fact table without measures that contains only foreign keys. There are two types of factless table: one is event tracking and the other is the coverage table.

Bitmap indexes are preferred in data warehousing.

Metadata is data about data. Everything is stored there, for example mappings, sessions, privileges, and other data. In Informatica we can see the metadata in the repository.

System catalog: this is what we use in Cognos. It also contains data, tables, privileges, predefined filters, etc. Using this catalog we generate reports.

Grouped cross tab is a type of report in Cognos, where we have to assign 3 measures to get the result.

What is meant by Junk Attribute in Informatica?

Junk dimension: A dimension is called a junk dimension if it contains attributes, typically flags and indicators, which are rarely changed or modified. For example, in the banking domain we can fetch four attributes belonging to a junk dimension from the Overall_Transaction_master table: tput flag, tcmp flag, del flag, and advance flag. All these attributes can be part of a junk dimension.

Can anyone explain incremental aggregation with an example?

When you use an Aggregator transformation, it creates index and data caches to store the data: 1. of the group-by columns, 2. of the aggregate columns.

Incremental aggregation is used when historical data is already in place and will be used in the aggregation. It uses the cache that contains the historical data. For each group-by column value already present in the cache, it adds the incoming data value to the corresponding data cache value and outputs the row. If an incoming value has no match in the index cache, the new values for the group-by and output ports are inserted into the cache.
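A rough Python sketch of that cache behaviour, assuming a simple SUM aggregate: a plain dictionary stands in for the index and data caches, and `incremental_aggregate` and the sample region totals are made up for illustration.

```python
def incremental_aggregate(cache, new_rows):
    """Fold new rows into the saved totals; `cache` maps group key -> running sum."""
    changed = {}
    for group, amount in new_rows:
        # Existing group: add to the cached total. New group: start from zero.
        cache[group] = cache.get(group, 0) + amount
        changed[group] = cache[group]
    return changed  # only the groups touched by this run are output


cache = {"east": 100, "west": 250}  # totals saved by earlier runs
output = incremental_aggregate(cache, [("east", 20), ("north", 5), ("east", 10)])
print(output)
print(cache)
```

Only the new source rows are read on each run; the historical totals come from the cache instead of being recomputed.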

Difference between Rank and Dense Rank?

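Rank: 1, 2 (2nd position), 2 (3rd position), 4, 5

The same rank is assigned to equal totals/numbers, and the next rank follows the position, so rank 3 is skipped. Golf usually ranks this way; this is usually called golf ranking.

Dense Rank: 1, 2 (2nd position), 2 (3rd position), 3, 4

The same rank is assigned to equal totals/numbers, but the next rank continues without a gap.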
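The two sequences can be reproduced with a short Python sketch. The function names and the sample `totals` are invented for this illustration; higher totals rank first.

```python
def rank(values):
    """Ties share a rank; the next rank follows the position (gaps appear)."""
    ordered = sorted(values, reverse=True)
    # index() finds the first position of a value, so ties get the same rank.
    return [ordered.index(v) + 1 for v in ordered]


def dense_rank(values):
    """Ties share a rank; the next rank is always one higher (no gaps)."""
    ordered = sorted(values, reverse=True)
    distinct = sorted(set(values), reverse=True)
    return [distinct.index(v) + 1 for v in ordered]


totals = [90, 80, 80, 70, 60]
print(rank(totals))        # [1, 2, 2, 4, 5]
print(dense_rank(totals))  # [1, 2, 2, 3, 4]
```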


About Informatica Power Center 7:

1) I want to know which mapping properties can be overridden at the Session Task level.

2) I want to know what types of permissions are needed to run and schedule workflows.

1) You can override any properties other than the source and targets. Make sure the source and targets exist in your database if it is relational. If it is a flat file, you can override its properties. You can override the SQL if it is a relational database, as well as the session log, DTM buffer size, cache sizes, etc.

2) You need execute permission on the folder to run or schedule a workflow. You may have read and write permissions, but you need execute permission as well.

Can anyone explain real-time complex mappings or complex transformations in Informatica, especially in the sales domain?

The most complex logic we use is denormalization. We don't have any Denormalizer transformation in Informatica, so we have to use an Aggregator followed by an Expression. Apart from this, most of the complex logic goes in Expression transformations involving a lot of nested IIF and DECODE statements. Others are the Union and Joiner transformations.

How do you create a mapping using multiple lookup transformation?

Use unconnected lookup if same lookup repeats multiple times.

If the source has duplicate records and we have 2 targets, T1 for unique values and T2 only for duplicate values, how do we pass the unique values to T1 and the duplicate values to T2 from the source to these 2 different targets in a single mapping?

Soln1:

source ---> sq ---> exp ---> sorter (with the Select Distinct check box enabled) ---> t1
---> aggregator (with group by enabled and a count function) ---> t2

If you want only duplicates in t2, you can follow this sequence:

---> agg (with group by enabled, writing this code: decode(count(col),1,1,0)) ---> Filter (condition is 0) ---> t2.

Soln2:

Take two source instances. In the first one, use distinct in the Source Qualifier and connect it to the target t1.

In the second source instance, just write a query to fetch the duplicate records and connect it to the target t2.

<< If you use an aggregator as suggested by my friend, you will get duplicate as well as distinct records in the second target. >>

Soln3:

Use a Sorter transformation. Sort on the key fields by which you want to find the duplicates, then use an Expression transformation.

Example:

--> field1
--> field2

SORTER:
field1 -- ascending/descending
field2 -- ascending/descending

Expression:
--> field1
--> field2
<--> v_field1_curr = field1
<--> v_field2_curr = field2
v_dup_flag = IIF(v_field1_curr = v_field1_prev, true, false)
o_dup_flag = IIF(v_dup_flag = true, 'Duplicate', 'Not Duplicate')
<--> v_field1_prev = v_field1_curr
<--> v_field2_prev = v_field2_curr

Use a Router transformation to send rows with o_dup_flag = 'Duplicate' to T2 and 'Not Duplicate' to T1.

Informatica evaluates row by row. Because the rows arrive sorted, each row can be evaluated against the previous row.
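The sort-then-compare-with-previous logic of Soln3 can be sketched in Python as follows. This is only an illustration under the assumption of a single key field; `flag_duplicates` and the sample rows are hypothetical. As in Soln3, the first occurrence of each key goes to T1 and every repeat goes to T2.

```python
_NO_PREVIOUS = object()  # sentinel: no row has been seen yet


def flag_duplicates(rows, key):
    """Split rows into (first occurrences, repeats) after sorting on `key`."""
    ordered = sorted(rows, key=lambda r: r[key])  # plays the role of the Sorter
    unique, duplicates = [], []
    prev = _NO_PREVIOUS
    for row in ordered:
        curr = row[key]
        # Plays the role of v_dup_flag: compare with the previous row's key.
        if curr == prev:
            duplicates.append(row)   # Router group for T2
        else:
            unique.append(row)       # Router group for T1
        prev = curr                  # like v_field1_prev = v_field1_curr
    return unique, duplicates


source = [{"field1": k} for k in ["b", "a", "b", "a", "c"]]
t1, t2 = flag_duplicates(source, "field1")
print([r["field1"] for r in t1])  # ['a', 'b', 'c']
print([r["field1"] for r in t2])  # ['a', 'b']
```

The comparison only works because the rows are sorted first; without the sort, equal keys would not be adjacent.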

What are the enhancements made to Informatica 7.1.1 version when compared to 6.2.2 version?

In 7+ versions:

- We can look up a flat file.
- Union and Custom transformations are available.
- There is a propagate option, i.e., if we change the data type of a field, all the linked columns will reflect that change.
- We can write to an XML target.
- We can use up to 64 partitions.

What is the difference between Power Centre and Power Mart?

What is the procedure for creating Independent Data Marts from Informatica 7.1?

Power Centre has multiple repositories, whereas Power Mart has a single repository (desktop repository). Power Centre can also be linked to a global repository to share between users.

                      Power Center      Power Mart
No. of repositories   n No.             n No.
Applicability         high-end WH       low and mid range WH
Global repository     supported         not supported
Local repository      supported         supported
ERP support           available         not available

What is lookup transformation and update strategy transformation? Explain with an example.

Lookup transformation is used to look up data in a relational table, view, synonym, or flat file. The Informatica server queries the lookup table based on the lookup ports used in the transformation. It compares the lookup transformation port values to lookup table column values based on the lookup condition.

By using a lookup we can get a related value, perform a calculation, and update a slowly changing dimension (SCD). There are two types of lookups:

Connected
Unconnected

Update strategy transformation

This is used to control how rows are flagged for insert, update, delete, or reject. At the session level, the flagging of rows can be set to Insert, Delete, Update, or Data Driven. For Update we have three options:

Update as Update
Update as Insert
Update else Insert

What logic will you implement to load the data into one fact table from 'n' number of dimension tables?

To load data into one fact table from more than one dimension table, first create the fact table and the dimension tables. Then load data into the individual dimensions by using sources and transformations (Aggregator, Sequence Generator, Lookup) in the Mapping Designer. Then, for the fact table, connect the surrogate keys to the foreign keys and the columns from the dimensions to the fact.

We load the data into the fact table after loading the dimension tables, because the dimension tables contain the data related to the fact table.

Loading the data from the dimension tables to the fact table is simple: treat the dimension tables as source tables and the fact table as the target. That is all.

Can I use the session bulk loading option and still recover the session?

If the session is configured to use bulk mode, it will not write recovery information to the recovery tables, so a bulk-loaded session cannot be recovered as required.

No, because a bulk load does not write to the redo log file, whereas a normal load does. In return, session performance increases with bulk load.

How do you configure a mapping in Informatica?

You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible. You should minimize the amount of data moved by deleting unnecessary links between transformations.

For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache. You can also perform the following tasks to optimize the mapping:

• Configure single-pass reading.

• Optimize datatype conversions.

• Eliminate transformation errors.

• Optimize transformations.

• Optimize expressions.

What is the difference between a dimension table and a fact table, and what are the different dimension tables and fact tables?

A fact table contains measurable data, with fewer columns and many rows. It contains a primary key.

Different types of facts (measures): additive, non-additive, semi-additive.

A dimension table contains textual descriptions of the data, with many columns and fewer rows. It contains a primary key.

What are worklets, what is their use, and in which situations can we use them?

A worklet is a set of tasks. If a certain set of tasks has to be reused in many workflows, we use worklets. To execute a worklet, it has to be placed inside a workflow.

The use of a worklet in a workflow is similar to the use of a mapplet in a mapping.

What are mapping parameters and variables, and in which situations can we use them?

If we need to change certain attributes of a mapping every time the session is run, it is very difficult to edit the mapping and change the attribute each time. So we use mapping parameters and variables and define the values in a parameter file. Then we can edit the parameter file to change the attribute values. This makes the process simple.

Mapping parameter values remain constant during a session run. If we need to change the parameter values, we need to edit the parameter file.

The value of a mapping variable, however, can be changed by using variable functions. If we need to increment the attribute value by 1 after every session run, we can use mapping variables.

With a mapping parameter, we need to manually edit the attribute value in the parameter file after every session run.

Explain the use of update strategy transformation.

It is used to maintain the history data and to maintain the most recently changed data.

What is meant by complex mapping?

A complex mapping is one that involves more logic and more business rules. In my bank project, I was involved in constructing a data warehouse. The bank has many customers, and some relocated to another place after taking loans. I found it difficult to maintain both the previous and the current addresses, so I used SCD Type 2. This is a simple example of a complex mapping.

I have a requirement wherein the column names in a table (Table A) should appear in rows of the target table (Table B), i.e. converting columns to rows. Is it possible through Informatica? If so, how?

If the data in the tables is as follows:

Table A
key_1 char(3);

Table A values
_______
1
2
3

Table B
bkey_a char(3);
bcode char(1);

Table B values
1 T
1 A
1 G
2 A
2 T
2 L
3 A

and the output required is

1, T, A
2, A, T, L
3, A

the SQL query in the Source Qualifier should be

select key_1,
  max(decode( bcode, 'T', bcode, null )) t_code,
  max(decode( bcode, 'A', bcode, null )) a_code,
  max(decode( bcode, 'L', bcode, null )) l_code
from a, b
where a.key_1 = b.bkey_a
group by key_1
/
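To see what the max(decode(...)) pattern does with the Table B values, here is a minimal Python sketch of the same grouping. It is an illustration only; `pivot_codes` is a made-up name, and like the query it keeps only the codes T, A, and L, so the G row for key 1 is dropped.

```python
def pivot_codes(rows, wanted=("T", "A", "L")):
    """Group (key, code) rows by key, one slot per wanted code (None if absent)."""
    result = {}
    for key, code in rows:
        # One output row per key, like "group by key_1".
        slots = result.setdefault(key, {c: None for c in wanted})
        # Like decode(bcode, 'T', bcode, null): keep the code only in its own slot.
        if code in slots:
            slots[code] = code
    return result


b_rows = [("1", "T"), ("1", "A"), ("1", "G"),
          ("2", "A"), ("2", "T"), ("2", "L"),
          ("3", "A")]
pivoted = pivot_codes(b_rows)
for key, slots in pivoted.items():
    print(key, slots["T"], slots["A"], slots["L"])
```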

If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session next time in Informatica 6.1?

A simple solution: use the Perform Recovery option.

Can we run a group of sessions without using workflow manager?

Yes, it is possible to run the group of sessions with the pmcmd command, without using the Workflow Manager.

What is the difference between stop and abort?

The Power Center Server handles the abort command for the Session task like the stop command, except it has a timeout period of 60 seconds. If the Power Center Server cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

Stop: If the session you want to stop is part of a batch, you must stop the batch. If the batch is part of a nested batch, stop the outermost batch.

Abort: You can issue the abort command. It is similar to the stop command except that it has a 60 second timeout. If the server cannot finish processing and committing data within 60 seconds, it kills the DTM process and terminates the session.


What is the difference between a cached lookup and an uncached lookup?

Can I run the mapping without starting the Informatica server?

When you configure the Lookup transformation as a cached lookup, it stores all the lookup table data in the cache when the first input record enters the Lookup transformation. In a cached lookup the select statement executes only once, and the values of each input record are compared with the values in the cache. In an uncached lookup the select statement executes for each input record entering the Lookup transformation, so it has to connect to the database each time a new record enters.
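The difference in the number of select statements can be shown with a toy Python sketch. The `LookupTable` class merely stands in for a database table and counts queries; it and the two lookup functions are hypothetical names used only for this illustration.

```python
class LookupTable:
    """Stand-in for a database lookup table that counts how often it is queried."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.queries = 0

    def select_all(self):
        self.queries += 1
        return dict(self._rows)

    def select_one(self, key):
        self.queries += 1
        return self._rows.get(key)


def cached_lookup(table, input_keys):
    cache = None
    results = []
    for key in input_keys:
        if cache is None:  # cache is built when the first input row arrives
            cache = table.select_all()
        results.append(cache.get(key))
    return results


def uncached_lookup(table, input_keys):
    # One query per input row.
    return [table.select_one(key) for key in input_keys]


keys = [10, 20, 10, 30]
cached_table = LookupTable({10: "pen", 20: "ink"})
uncached_table = LookupTable({10: "pen", 20: "ink"})
cached_result = cached_lookup(cached_table, keys)
uncached_result = uncached_lookup(uncached_table, keys)
print(cached_table.queries, uncached_table.queries)  # 1 4
```

Both variants return the same values; the cached one pays for a single large read, while the uncached one pays one round trip per input row.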

I want to prepare a questionnaire. The details about it are as follows:

1. Identify a large company/organization that is a prime candidate for a DWH project. (For example, telecommunication companies, insurance companies, and banks may be prime candidates for this.)

2. Give at least four reasons for selecting the organization.

3. Prepare a questionnaire consisting of at least 15 non-trivial questions to collect requirements/information about the organization. This information is required to build the data warehouse.

Can you please tell me what those 15 questions should be for a company, say a telecom company?

First of all, meet your sponsors and make a BRD (business requirement document) about their expectations from this data warehouse (the main aim comes from them). For example, they need the customer billing process. Then go to the business management team; they can ask for metrics out of the billing process for their use. Management people may need monthly usage, billing metrics, sales organization, and rate plan to perform sales rep and channel performance analysis and rate plan analysis.

So your dimension tables can be:

Customer: customer id, name, city, state, etc.
Sales rep: sales rep number, name, id
Sales org: sales org id
Bill dimension: bill #, bill date, number
Rate plan: rate plan code

And the fact table can be:

Billing details: bill #, customer id, minutes used, call details, etc.

You can follow a star or snowflake schema in this case, depending on the granularity of your data.

Can I start and stop a single session in a concurrent batch?

Just right-click on the particular session and go to the recovery option, or use Event-Wait and Event-Raise tasks.

What is MicroStrategy? What is it used for? Can anyone explain it in detail?

MicroStrategy is another BI tool, which is HOLAP. You can create two-dimensional reports and also cubes in it. It is basically a reporting tool, with a full range of reporting on the web and also on Windows.

What is the difference between Informatica 7.1 and Ab Initio?

There are many differences between Informatica and Ab Initio:

In Ab Initio we use 3 types of parallelism, but Informatica uses 1 type of parallelism.

Ab Initio has no scheduling option; we can schedule manually or through a PL/SQL script. Informatica contains 4 scheduling options.

Ab Initio contains the Co>Operating System, but Informatica does not.

Ramp-up time is much quicker in Ab Initio compared to Informatica.

Ab Initio is more user-friendly than Informatica.

What is mystery dimension?

Also known as Junk Dimensions

Making sense of the rogue fields in your fact table.

What are the cost-based and rule-based approaches, and what is the difference?

Cost-based and rule-based approaches are the optimization techniques used in databases when we need to optimize a SQL query.

Basically Oracle provides two types of optimizers (indeed 3, but we use only these two techniques because the third has some disadvantages).

Whenever you process any SQL query in Oracle, the Oracle engine internally reads the query and decides the best possible way of executing it. In this process, Oracle follows these optimization techniques.

1. Cost-based optimizer (CBO): If a SQL query can be executed in 2 different ways (say path 1 and path 2 for the same query), the CBO calculates the cost of each path, determines which path has the lower cost of execution, and then executes that path so that it can optimize the query execution.

2. Rule-based optimizer (RBO): This follows a set of rules for executing a query. Depending on the rules that apply, the optimizer runs the query.

Use:

If the table you are trying to query is already analysed, Oracle will go with the CBO. If the table is not analysed, Oracle follows the RBO.

The first time, if the table is not analysed, Oracle will go with a full table scan.

what are partition points?

Partition points mark the thread boundaries in a source pipeline and divide the pipeline into stages.

How to append records to a flat file in Informatica? In DataStage we have the options

i) Overwrite the existing file

ii) Append to the existing file

This is not available in Informatica v7. But I heard that it is included in the latest version, 8.0, where you can append to a flat file. It is about to ship.

If you had to split the source-level key going into two separate tables, one as surrogate and the other as primary, since Informatica does not guarantee keys are loaded properly (order!) into those tables, what are the different ways you could handle this type of situation?

Foreign key.

What is the best way to show metadata (number of rows at source, target, and each transformation level, error-related data) in a report format?

When your workflow completes, go to the Workflow Monitor and right-click the session. Then go to the transformation statistics; there we can see the number of rows in the source and target. In the session properties we can see errors related to data.

You can also select these details from the repository tables. You can use the view REP_SESS_LOG to get this data.

Two relational tables are connected to an SQ transformation. What possible errors will be thrown?

We can connect two relational tables to one SQ transformation. No errors will be thrown.

Without using an Update Strategy and session options, how can we update our target table?

Soln1: You can do this by using "update override" in the target properties.

Soln2: In the session properties, there are options such as

insert
update
insert as update
update as update

By using these we can easily solve it.

Soln3: By default all the rows in the session are flagged as insert. You can change this in the session general properties: Treat source rows as: Update. Then all the incoming rows will be set with the update flag, and you can update the rows in the target table.

Could anyone please tell me what steps are required for a type 2 dimension/version data mapping? How can we implement it?

Go to the Mapping Designer. From the Mapping menu select Wizards, then Slowly Changing Dimension.

In the new window, give the mapping name, source table, target table, and type of slowly changing dimension. When you select Finish, the slowly changing dimension type 2 mapping is created.

Go to the Warehouse Designer and generate the table. Then validate the mapping in the Mapping Designer, save it to the repository, and run the session in the Workflow Manager.

Later, update the source table and run the session again. You will find the difference in the target table.

How to import an Oracle sequence into Informatica?

Create a procedure and declare the sequence inside the procedure. Finally, call the procedure in Informatica with the help of a Stored Procedure transformation.

What is data merging, data cleansing, sampling?

Cleansing: to identify and remove redundancy and inconsistency.

Sampling: send just a sample of the data from source to target.

What is an IQD file?

IQD stands for Impromptu Query Definition. This file is mainly used in the Cognos Impromptu tool. After creating an imr (report), we save the imr as an IQD file, which is used while creating a cube in PowerPlay Transformer. In the data source type we select Impromptu Query Definition.

Differences between Normalization and Normalizer transformation.

Normalizer: It is a transformation mainly used for COBOL sources. It changes rows into columns and columns into rows.

Normalization: to remove redundancy and inconsistency.

In the Mapping Designer we have a direct option to import files from VSAM. Navigation: Sources => Import from file => file from COBOL

What is the procedure or steps for implementing versioning if you are already on version 7.X? Any gotchas or precautions?

For version control in the ETL layer using Informatica, after doing anything in the Designer or Workflow Manager, do the following steps:

1> First save the changes or new implementations.

2> Then, from the navigator window, right-click on the specific object you are currently in. There will be a pop-up window. At the lower end of that window you will find Versioning -> Check In. A window will open. Enter a comment about what you have done, like "modified this mapping", then click the OK button.

Can anyone explain error handling in Informatica with examples, so that it will be easy to explain the same in the interview?

Go to the session log file. There we will find information regarding the session initiation process, errors encountered, and the load summary.

By seeing the errors encountered while the session was running, we can resolve the errors.

If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?

There are many ways to improve a mapping which has multiple lookups.

1) We can create an index for the lookup table if we have permissions (staging area).

2) Divide the lookup mapping into two: (a) Dedicate one to inserts (source - target): these are new rows. Only the new rows will come to the mapping, and the process will be fast. (b) Dedicate the second one to updates (source = target): these are existing rows. Only the rows which already exist will come into the mapping.

3) We can increase the cache size of the lookup.

If your workflow is running slow in Informatica, where do you start troubleshooting and what are the steps you follow?

SOLN1: When the workflow is running slowly, you have to find the bottlenecks in this order: target, source, mapping, session, system.

SOLN2: A workflow may be slow for different reasons. One is alpha characters in decimal data; check for this. Another is insufficient length of strings; check with the SQL override.

How do you handle decimal places while importing a flat file into Informatica?

While importing the flat file, the Flat File Wizard helps in configuring the properties of the file. Select the numeric column and enter the precision and the scale. Precision includes the scale. For example, if the number is 98888.654, enter precision as 8 and scale as 3, and width as 10 for a fixed-width flat file.
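To check precision and scale for a sample value before typing them into the wizard, a small Python sketch like the one below can help. It assumes a finite number written in plain decimal notation (no exponent); `precision_and_scale` is a hypothetical helper, not part of Informatica.

```python
from decimal import Decimal


def precision_and_scale(text):
    """Return (precision, scale) for a plain decimal string such as '98888.654'."""
    _sign, digits, exponent = Decimal(text).as_tuple()
    scale = max(-exponent, 0)              # digits after the decimal point
    precision = max(len(digits), scale)    # total digits, scale included
    return precision, scale


print(precision_and_scale("98888.654"))  # (8, 3)
```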


We have a task called Event-Wait; using that we can stop. We start using Event-Raise.

Why are dimension tables denormalized in nature?

Because in data warehousing historical data should be maintained. Maintaining historical data means, for example, keeping an employee's details, such as where he worked previously and where he is working now, all in one table. If you keep the employee id as the primary key, it won't allow multiple records with the same employee id. So to maintain historical data in a data warehouse we use surrogate keys (using an Oracle sequence for the critical column).

So all the dimensions maintain historical data and are denormalized. A "duplicate" entry here does not mean an exactly duplicate record; it means another record with the same employee number is maintained in the table.

Can we use an Aggregator/active transformation after an Update Strategy transformation?

We can, but the update flag will not be retained. We can, however, use a passive transformation.

Can anyone comment on the significance of Oracle 9i in Informatica when compared to Oracle 8 or 8i? I mean, how is Oracle 9i advantageous when compared to Oracle 8 or 8i when used in Informatica?

It's very easy:

Oracle 8i does not allow user-defined data types, but 9i does.

BLOB and LOB are allowed only in 9i, not in 8i.

Moreover, list partitioning is available only in 9i.

In the concept of mapping parameters and variables, the variable value is saved to the repository after the completion of the session, and the next time you run the session, the server takes the saved variable value from the repository and starts assigning the next value after the saved value. For example, I ran a session and in the end it stored a value of 50 in the repository. The next time I run the session, it should start with the value of 70, not with the value of 51. How to do this?

SOLN1: After running the mapping, in the Workflow Manager go to start ---> session. Right-click on the session and you will get a menu; in that go to persistent values. There you will find the last value stored in the repository for the mapping variable. Remove it, put your desired one, and run the session. I hope your task will be done.

SOLN2: It takes the value of 51, but you can override the saved variable in the repository by defining the value in the parameter file. If there is a parameter file for the mapping variable, it uses the value in the parameter file, not the value + 1 in the repository. For example, assign the value of the mapping variable as 70. In other words, higher preference is given to the value in the parameter file.

Mapping parameters and variables make the use of mappings more flexible and avoid creating multiple mappings. They help in adding incremental data. Mapping parameters and variables have to be created in the Mapping Designer by choosing the menu option Mapping ----> Parameters and Variables. Enter the name for the variable or parameter, which has to be preceded by $$, and choose the type (parameter/variable) and data type. Once defined, the variable/parameter can be used in any expression, for example in the SQ transformation in the source filter on the properties tab: just enter the filter condition. Finally, create a parameter file to assign the value for the variable/parameter and configure the session properties. However, this final step is optional. If the parameter is not present in a parameter file, it uses the initial value which was assigned at the time of creating the variable.

How to delete duplicate rows in a flat file source? Is there any option in Informatica?

Use a Sorter transformation. It has a "distinct" option; make use of it.

What is the use of incremental aggregation? Explain in brief with an example.

It is a session option. When the Informatica server performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform the new aggregation calculations incrementally. We use it for performance.

What is the procedure to load the fact table? Give in detail.

SOLN1: We use the 2 wizards (i.e. the Getting Started wizard and the Slowly Changing Dimension wizard) to load the fact and dimension tables. By using these 2 wizards we can create different types of mappings according to the business requirements and load into the star schema (fact and dimension tables).

SOLN2: First the dimension tables need to be loaded; then, according to the specifications, the fact tables should be loaded. Don't think that fact tables are different in terms of loading; it is a general mapping as we do for other tables. The specifications play an important role in loading the fact.

How to look up data on multiple tables?

If you want to look up data on multiple tables at a time, you can join the tables you want and then look up that joined table. Informatica provides lookups on joined tables.

How to retrieve the records from a rejected file? Explain with syntax or example.

SOLN1: There is a utility called "reject Loader" where we can find the rejected records, refine them, and reload them.

SOLN2: During the execution of a workflow, all the rejected rows are stored in bad files (where your Informatica server is installed: C:\Program Files\Informatica Power Center 7.1\Server). These bad files can be imported as flat files in the source; then through a direct mapping we can load these files in the desired format.

How does the server recognise the source and target databases?

Through an ODBC connection if the source or target is relational, or an FTP connection if it is a flat file. We can verify the connections for both sources and targets in the session properties.

What are variable ports and list two situations when they can be used?

There are three main kinds of ports: input, output, and variable. An input port carries data flowing into the transformation. An output port is used when data is passed to the next transformation. A variable port is used when calculations are required inside the transformation.

For example, with price and quantity as inputs and total_amt as a variable port, we can sum the total by giving

sum (total_amt)

A variable port is used to break a complex expression into simpler ones, and to store intermediate values.
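A small Python sketch of the "intermediate value" use: the local name v_total stands in for a variable port, and the two returned values stand in for output ports. The names and the tax rate are assumptions made for the example.

```python
def expression_row(price, quantity, tax_rate=0.25):
    # v_total plays the role of a variable port: computed once, reused below.
    v_total = price * quantity
    out_tax = v_total * tax_rate
    out_total_with_tax = v_total + out_tax
    return {"tax": out_tax, "total_with_tax": out_total_with_tax}

result = expression_row(price=10.0, quantity=3)
print(result)
```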


You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is zero or negative:

IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )

You can use DECODE instead of IIF in many cases. DECODE may improve readability. The following shows how you can use DECODE instead of IIF :

DECODE( TRUE,

SALES > 0 AND SALES < 50, SALARY1,

SALES > 49 AND SALES < 100, SALARY2,

SALES > 99 AND SALES < 200, SALARY3,

SALES > 199, BONUS)
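To compare the two forms, here is a Python sketch that mimics both. The helpers iif and decode_true are stand-ins written for this example, not Informatica functions. One difference worth noticing: with no default value, the DECODE form returns nothing (None here) when sales is zero or negative, while the nested IIF returns 0.

```python
def iif(condition, when_true, when_false):
    return when_true if condition else when_false

def decode_true(*pairs, default=None):
    """Return the value paired with the first condition that is true."""
    for condition, value in pairs:
        if condition:
            return value
    return default

def pay_nested_iif(sales, salary1, salary2, salary3, bonus):
    return iif(sales > 0,
               iif(sales < 50, salary1,
                   iif(sales < 100, salary2,
                       iif(sales < 200, salary3, bonus))),
               0)

def pay_decode(sales, salary1, salary2, salary3, bonus):
    return decode_true(
        (sales > 0 and sales < 50, salary1),
        (sales > 49 and sales < 100, salary2),
        (sales > 99 and sales < 200, salary3),
        (sales > 199, bonus),
    )

for sales in (1, 49, 50, 99, 100, 199, 200, 500):
    assert pay_nested_iif(sales, 1, 2, 3, 4) == pay_decode(sales, 1, 2, 3, 4)
```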

In dimensional modeling, is the fact table normalized or denormalized, in a star schema and in a snowflake schema?

Normalization is not applied to the dimensions in a star schema, but in a snowflake schema the dimension tables must be normalized.

Star schema -- denormalized dimensions

Snowflake schema -- normalized dimensions

Which is better, connected or unconnected lookup transformations, in Informatica or any other ETL tool?

Comparing the two: a connected lookup can return multiple values, while an unconnected lookup returns one value. A connected lookup sits in the same pipeline as the source and supports dynamic caching; an unconnected lookup does not. In some special cases we can still use an unconnected lookup: when the output of one lookup goes as input to another lookup, unconnected lookups are favorable.

I think the better one is the connected lookup, because we can use a dynamic cache with it. A connected lookup can also return multiple columns in a single row, whereas an unconnected lookup has a single return port (as far as Informatica is concerned).

What is the limit to the number of sources and targets you can have in a mapping?

As far as I know, there is no restriction on the number of sources or targets inside a mapping.

The real question is: if you make N tables participate in processing at the same time, what happens to your database? From an organization's point of view, using N tables at a time is never encouraged, because it reduces database and Informatica server performance.

The restriction is only on the database side: how many concurrent threads are you allowed to run on the database server?

Which objects are required by the debugger to create a valid debug session?

First, the session should be a valid session.

The source, target, lookups, and expressions should be available, and at least one breakpoint should be set for the debugger to debug your session.

An Informatica server object is a must.

How do you write a query to list the three highest salaries of employees?

SELECT sal

FROM (SELECT sal FROM my_table ORDER BY sal DESC)

WHERE ROWNUM < 4;

Since this is Informatica, you might as well use the Rank transformation. Check the help file on how to use it.
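The same top-N selection can be sketched in Python with the standard heapq module. The employee rows below are made-up sample data, and this only illustrates the ranking logic, not the Rank transformation itself.

```python
import heapq

def top_n_salaries(rows, n=3):
    """Return the n highest salaries, highest first."""
    return heapq.nlargest(n, (row["sal"] for row in rows))

# Hypothetical employee rows.
employees = [
    {"ename": "A", "sal": 3000},
    {"ename": "B", "sal": 5000},
    {"ename": "C", "sal": 1250},
    {"ename": "D", "sal": 2975},
]
top_three = top_n_salaries(employees)
print(top_three)
```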


We are using an Update Strategy transformation in a mapping. How can we know whether the insert, update, reject, or delete option was selected while the session was running in Informatica?

In Designer, while creating the Update Strategy transformation, uncheck "Forward Rejected Rows". Any rejected rows are then written to the session log file automatically.

Inserted or updated rows can be identified only by checking the target file or table.

Suppose a session is configured with a commit interval of 10,000 rows and the source has 50,000 rows. Explain the commit points for source-based commit and target-based commit. Assume appropriate values wherever required.

Source-based commit commits data to the target based on the number of source rows, so it commits after every 10,000 rows: at 10,000, 20,000, 30,000, 40,000, and 50,000.

Target-based commit depends on the commit interval and on the buffer size of the target. After the commit interval is reached, the server keeps writing until the current writer buffer fills, and commits then. Let us assume the buffer holds 6,000 rows: the buffer fills at 6,000, 12,000, 18,000, and so on, so the commits fall at 12,000, 24,000, 30,000, and 42,000 rows, with a final commit at 50,000 when the session ends.
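The commit points can be worked out with a small Python simulation. This is a simplified model, assuming the writer buffer holds a fixed number of rows and that a target-based commit is issued at the first buffer fill at or after each multiple of the commit interval; the function names are made up for the example.

```python
def source_based_commits(total_rows, interval):
    """Commit after every `interval` source rows, plus once at the end."""
    points = list(range(interval, total_rows + 1, interval))
    if not points or points[-1] != total_rows:
        points.append(total_rows)
    return points

def target_based_commits(total_rows, interval, buffer_rows):
    """Commit at the first buffer fill at or after each interval multiple."""
    points = []
    next_threshold = interval
    for filled in range(buffer_rows, total_rows + 1, buffer_rows):
        if filled >= next_threshold:
            points.append(filled)
            # Move to the next multiple of the interval above this commit.
            next_threshold = (filled // interval + 1) * interval
    if not points or points[-1] != total_rows:
        points.append(total_rows)  # final commit when the session ends
    return points

source_points = source_based_commits(50_000, 10_000)
target_points = target_based_commits(50_000, 10_000, 6_000)
print(source_points)
print(target_points)
```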

How do we estimate the number of partitions that a mapping really requires? Is it dependent on the machine configuration?

It depends on the Informatica version we are using. For example, Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.

Can Informatica be used as a cleansing tool? If yes, give examples of transformations that can implement a data cleansing routine.

Yes, we can use Informatica for cleansing data. Sometimes we use staging steps to cleanse the data, depending on performance; otherwise we can use an Expression transformation.

For example, suppose field X has some values and some nulls and is mapped to a target column that is NOT NULL. Inside an expression we can assign a space or some constant value to avoid session failure.

If the input data is in one format and the target is in another, we can change the format in an expression. We can also assign default values in the target to represent a complete set of data.
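Both cleansing steps (a default for nulls, and a format change) can be sketched in Python. The field names and the date formats below are assumptions chosen for the example.

```python
from datetime import datetime

def cleanse(row, default_name=" "):
    """Replace a missing name with a default and reformat the date."""
    # A null name would violate a NOT NULL target column, so substitute a space.
    name = row["name"] if row["name"] is not None else default_name
    # Convert an assumed dd-mm-yyyy input into yyyymmdd for the target.
    order_date = datetime.strptime(row["order_date"], "%d-%m-%Y").strftime("%Y%m%d")
    return {"name": name, "order_date": order_date}

cleaned = cleanse({"name": None, "order_date": "31-12-2006"})
print(cleaned)
```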

How do you decide whether to do aggregations at the database level or at the Informatica level?

It depends on the requirement. If you have a database with good processing power, you can create an aggregate table or view at the database level; otherwise it is better to use Informatica. Here is why we might use Informatica.

Informatica is a third-party tool, so it takes more time to process an aggregation than the database does. But Informatica has an option called "incremental aggregation", which updates the stored values with the stored values plus the new values, so there is no need to process all the data again and again. This works only as long as nobody has deleted the cache files. If they are deleted, the total aggregation has to be executed in Informatica again.

The database does not have an incremental aggregation facility.

Identifying bottlenecks in various components of Informatica and resolving them.

The best way to find bottlenecks is to write to a flat file and see where the bottleneck is.

How do you join two tables without using the Joiner transformation?

SOLN1:

It is possible to join two or more tables by using a Source Qualifier, provided the tables have a relationship.

When you drag and drop the tables, you get a Source Qualifier for each table. Delete all the Source Qualifiers and add one common Source Qualifier for all of them. Right-click the Source Qualifier and click Edit. On the Properties tab you will find the SQL query, where you can write your SQL.


SOLN2: The Joiner transformation is used to join n (n>1) tables from the same or different databases, but the Source Qualifier transformation can join n tables only from the same database.

SOLN3: Use the Source Qualifier transformation to join tables on the SAME database. Under its Properties tab you can specify a user-defined join. Any SELECT statement you can run on the database, you can also run in the Source Qualifier.

Note: a single Joiner transformation joins only two tables, but the two tables can come from different databases.

In a filter expression we want to compare one date field with the DB2 system field CURRENT DATE.

Our syntax: datefield = CURRENT DATE (we didn't define it as a port; it is a system field), but this is not valid (PMParser: Missing Operator).

Can someone help us?

The DB2 date format is "yyyymmdd", whereas sysdate in Oracle gives "dd-mm-yy", so converting the DB2 date format to the local database date format is compulsory; otherwise you will get that type of error. Use SYSDATE, or use TO_DATE for the current date.

What do the Expression and Filter transformations do in the Informatica Slowly Growing Target wizard?

The Expression transformation detects and flags the rows from the source.

The Filter transformation filters out the rows that are not flagged and passes the flagged rows to the Update Strategy transformation.

How do you create the staging area in your database?

A staging area in a data warehouse is used as temporary space to hold all the records from the source system. So it should be more or less an exact replica of the source systems, except for the load strategy, where we use truncate-and-reload options.

So create it using the same layout as your source tables, or use the Generate SQL option in the Warehouse Designer tab.

What is the difference between the Informatica PowerCenter server, the repository server, and the repository?

The PowerCenter server holds the scheduled runs that determine when data is loaded from source to target. The repository contains all the definitions of the mappings built in Designer.

What are the differences between Informatica PowerCenter versions 6.2 and 7.1, and between versions 6.2 and 5.1?

The main difference between Informatica 5.1 and 6.1 is that 6.1 introduced the repository server, and replaced the Server Manager of 5.1 with the Workflow Manager and Workflow Monitor.

In version 7.x you have the option of looking up on a flat file, and you can write to an XML target.

Versioning

LDAP authentication

Support of 64 bit architectures

Differences between Informatica 6.2 and Informatica 7.0

Features in 7.1 are :


2. Lookup on flat file

3. Grid servers working on different operating systems can coexist on the same server

4. We can use pmcmdrep

5. We can export independent and dependent repository objects

6. We can move mappings into any web application

7. Version control

8. Data profiling

What is the difference between connected and unconnected stored procedures?

Run a stored procedure before or after your session -- Unconnected

Run a stored procedure once during your mapping, such as pre- or post-session -- Unconnected

Run a stored procedure every time a row passes through the Stored Procedure transformation -- Connected or Unconnected

Run a stored procedure based on data that passes through the mapping, such as when a specific port does not contain a null value -- Unconnected

Pass parameters to the stored procedure and receive a single output parameter -- Connected or Unconnected

Pass parameters to the stored procedure and receive multiple output parameters -- Connected or Unconnected

Note: To get multiple output parameters from an unconnected Stored Procedure transformation, you must create variables for each output parameter. For details, see Calling a Stored Procedure From an Expression.

Run nested stored procedures -- Unconnected

Call multiple times within a mapping -- Unconnected

Discuss which is better among incremental load, normal load, and bulk load.

If the database supports the bulk load option from Informatica, then using bulk load for the initial loading of the tables is recommended.

Depending on the requirement, we should choose between normal and incremental loading strategies. If the database supports it, bulk load can load faster than normal load. (Incremental load is a different concept; don't mix it up with bulk load and normal load.)

Compare the data warehousing top-down approach with the bottom-up approach.

In the top-down approach, we first build the data warehouse and then the data marts. This needs more cross-functional skills, and it is a time-consuming and costly process.

In the bottom-up approach, we first build the data marts and then the data warehouse. The data mart that is built first serves as a proof of concept for the others. It takes less time and costs less than the top-down approach.

What is the difference between a summary filter and a detail filter?

A summary filter is applied to a group of rows that share a common value, whereas a detail filter is applied to each and every record in the database.


Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data. E.g. to construct a data warehouse.

A materialized view provides indirect access to table data by storing the results of a query in a separate schema object. Unlike an ordinary view, which does not take up any storage space or contain any data, a materialized view contains the rows resulting from its query.

Can we modify the data in a flat file?

Just open the text file with Notepad and change whatever you want (but the datatypes should stay the same).

How do you get the first 100 rows from a flat file into the target?

SOLN1: task ---> (link) session (Workflow Manager)

Double-click the link and type $$source success rows (a parameter in the session variables) = 100. It should stop the session automatically.

SOLN2: 1. Use the test load option if you want it for testing. 2. Put a counter or Sequence Generator in the mapping and use it to limit the rows.

Can we look up a table from a Source Qualifier transformation (unconnected lookup)?

No, we can't.

Here is why:

1) Unless you connect the output of the Source Qualifier to another transformation or to a target, there is no way it will include the field in the query.

2) A Source Qualifier does not have any variable fields to use in an expression.

What is a junk dimension?

A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.

What is the difference between Normal load and Bul...

Normal load: Normal load writes information to the database log file, so that it is helpful if any recovery is needed. When the source file is a text file and we are loading data to a table, we should use normal load only; otherwise the session will fail.

Bulk mode: Bulk load does not write information to the database log file, so if any recovery is needed we can't do anything. Comparatively, bulk load is considerably faster than normal load.

What is the maximum number of transformations that can be used in a mapping?

There is no limit on the number of transformations. But from a performance point of view, using too many transformations reduces session performance.

My idea is: "if a mapping needs many transformations, it is better to go for a stored procedure."

What are the main advantages and purpose of using the Normalizer transformation in Informatica?

The Normalizer transformation is used mainly with COBOL sources, where most of the time data is stored in denormalized format. The Normalizer transformation can also be used to create multiple rows from a single row of data.
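Creating multiple rows from a single row can be pictured with this Python sketch. The record layout (one store row with four quarterly columns) and the output column names are assumptions made for the example.

```python
def normalize(record, repeating_fields, value_name="sales"):
    """Emit one output row per repeating column of a denormalized record."""
    keys = {k: v for k, v in record.items() if k not in repeating_fields}
    rows = []
    for occurrence, field in enumerate(repeating_fields, start=1):
        rows.append({**keys, "occurrence": occurrence, value_name: record[field]})
    return rows

# Hypothetical denormalized record: four quarterly amounts in one row.
record = {"store": "S1", "q1": 100, "q2": 120, "q3": 90, "q4": 150}
normalized_rows = normalize(record, ["q1", "q2", "q3", "q4"])
print(normalized_rows)
```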


How do you convert rows to columns in a Normalizer? Could you explain?

Normally it is used to convert columns to rows. For converting rows to columns we need an Aggregator and an Expression, and a little coding effort. Denormalization is not possible with a Normalizer transformation.
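The rows-to-columns direction amounts to grouping by a key and spreading one field's values into columns, which is the job the Aggregator and Expression would share. A minimal Python sketch, with made-up field names:

```python
def pivot(rows, key, column_field, value_field):
    """Group rows by key and spread column_field values into columns."""
    result = {}
    for row in rows:
        out = result.setdefault(row[key], {key: row[key]})
        out[row[column_field]] = row[value_field]
    return list(result.values())

# Hypothetical normalized rows: one row per store and quarter.
sales_rows = [
    {"store": "S1", "quarter": "q1", "sales": 100},
    {"store": "S1", "quarter": "q2", "sales": 120},
    {"store": "S2", "quarter": "q1", "sales": 80},
]
pivoted = pivot(sales_rows, "store", "quarter", "sales")
print(pivoted)
```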

Discuss the advantages and disadvantages of star and snowflake schemas.

In a star schema every dimension has a primary key.

In a star schema, a dimension table does not have any parent table.

In a snowflake schema, a dimension table has one or more parent tables. In a star schema, the hierarchies for the dimensions are stored in the dimension table itself.

In a snowflake schema, the hierarchies are broken into separate tables. These hierarchies help to drill down the data from the topmost level to the lowermost level.

A star schema consists of a single fact table surrounded by dimension tables. In a snowflake schema the dimension tables are connected to sub-dimension tables.

In a star schema the dimension tables are denormalized; in a snowflake schema they are normalized. A star schema is used for report generation; a snowflake schema is used for cubes.

The advantage of a snowflake schema is that the normalized tables are easier to maintain. It also saves storage space.

The disadvantage of a snowflake schema is that it reduces the effectiveness of navigation across the tables, because of the large number of joins between them.

What is a time dimension? Give an example.

The time dimension is one of the most important dimensions in a data warehouse. Whenever you generate a report, you access the data through the time dimension.

E.g. an employee time dimension

Fields: date key, full date, day of week, day, month, quarter, fiscal year
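The fields listed above can be generated for a date range with a short Python sketch. The fiscal year rule here is an assumption for illustration (a fiscal year that starts in April and is labeled by the calendar year in which it starts); real warehouses define this per business.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def time_dimension(start, end, fiscal_year_start_month=4):
    """Build one time-dimension row per calendar day from start to end."""
    rows = []
    day = start
    while day <= end:
        # Assumed rule: months before the fiscal start belong to the prior year.
        fiscal_year = day.year if day.month >= fiscal_year_start_month else day.year - 1
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),
            "full_date": day.isoformat(),
            "day_of_week": DAY_NAMES[day.weekday()],
            "day": day.day,
            "month": day.month,
            "quarter": (day.month - 1) // 3 + 1,
            "fiscal_year": fiscal_year,
        })
        day += timedelta(days=1)
    return rows

dim_rows = time_dimension(date(2007, 3, 31), date(2007, 4, 1))
print(dim_rows)
```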

What are connected and unconnected transformations?

A connected transformation is part of your data flow in the pipeline, while an unconnected transformation is not.

It is much like calling a program by name versus by reference.

Use unconnected transformations when you want to call the same transformation many times in a single mapping. An unconnected transformation can't be connected to another transformation, but it can be called from inside another transformation.

Unconnected transformations can be used from as many other transformations as needed. If you are using a transformation several times, use an unconnected one. You get better performance.

How can you create or import a flat file definition into the Warehouse Designer?

You can create a flat file definition in the Warehouse Designer. In the Warehouse Designer, create a new target and select the type as flat file. Save it, then enter the columns for the target by editing its properties. Once the target is created, save it. You can then import it from the Mapping Designer.

You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica server runs the session, it creates and loads the flat file.


Manages session and batch scheduling: When you start the Informatica server, the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica server. When you configure a session, the Load Manager maintains a list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform validations and verifications before starting the DTM process.

Locking and reading the session: When the Informatica server starts a session, the Load Manager locks the session in the repository. Locking prevents you from starting the session again and again.

Reading the parameter file: If the session uses a parameter file, the Load Manager reads the parameter file and verifies that the session-level parameters are declared in the file.

Verifying permissions and privileges: When the session starts, the Load Manager checks whether the user has the privileges to run the session.

Creating log files: The Load Manager creates a log file that contains the status of the session.

How do you transfer the data from a data warehouse to a flat file?

You can write a mapping with the flat file as a target using a DUMMY_CONNECTION. A flat file target is built by pulling a source into target space using Warehouse Designer tool.

Difference between the Informatica repository server and the Informatica server

Informatica repository server: It manages connections to the repository from client applications.

Informatica server: It extracts the source data, performs the data transformation, and loads the transformed data into the target.

Router transformation

A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
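The difference can be sketched in Python. In this illustration a row is sent to every group whose condition it meets, and rows that meet none go to a default group; the group names and sample rows are made up.

```python
def filter_rows(rows, condition):
    """Filter: keep rows that meet one condition, drop the rest."""
    return [row for row in rows if condition(row)]

def route_rows(rows, groups):
    """Router: test each row against every group; unmatched rows go to DEFAULT."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed

# Hypothetical rows and group conditions.
rows = [
    {"region": "east", "sales": 10},
    {"region": "west", "sales": 500},
    {"region": "north", "sales": 5},
]
groups = {
    "EAST": lambda r: r["region"] == "east",
    "BIG": lambda r: r["sales"] > 100,
}
routed = route_rows(rows, groups)
kept = filter_rows(rows, groups["BIG"])
print(routed)
```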

What are the 2 modes of data movement in the Informatica Server?

The data movement mode depends on whether the Informatica Server should process single-byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.

a) Unicode - IS allows 2 bytes for each character and uses additional byte for each non-ascii character (such as Japanese characters)

b) ASCII - IS holds all data in a single byte

The IS data movement mode can be changed in the Informatica Server configuration parameters. This comes into effect once you restart the Informatica Server.

How do you read rejected or bad data from the bad file and reload it to the target?

Correct the rejected data and send it to the target relational tables using the reject loader utility. Find the rejected data by using the column indicator and row indicator.

Explain the Informatica architecture in detail.

The Informatica server connects to the source data and target data using native or ODBC drivers.

It also connects to the repository to run sessions and retrieve metadata information.

source--->informatica server--->target

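[Architecture diagram: the repository is reached through the Repository Server, which is managed from the Repository Server Administration Console. The Informatica server sits between source and target and connects to the repository. The client tools shown are Designer, Workflow Manager, and Workflow Monitor.]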

how can we partition a session in Informatica?

The Informatica® PowerCenter® Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning.

GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

What is Load Manager?

While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:

1. Locks the workflow and reads workflow properties.

2. Reads the parameter file and expands workflow variables.

3. Creates the workflow log file.

4. Runs workflow tasks.

5. Distributes sessions to worker servers.

6. Starts the DTM to run sessions.

7. Runs sessions from master servers.

8. Sends post-session email if the DTM terminates abnormally.

When the PowerCenter Server runs a session, the DTM performs the following tasks:

1. Fetches session and mapping metadata from the repository.

2. Creates and expands session variables.

3. Creates the session log file.

4. Validates session code pages if data code page validation is enabled. Checks query conversions if data code page validation is disabled.

5. Verifies connection object permissions.

6. Runs pre-session shell commands.

7. Runs pre-session stored procedures and SQL.

8. Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.

9. Runs post-session stored procedures and SQL.

10. Runs post-session shell commands.

11. Sends post-session email.

What is Data cleansing..?


The process of finding and removing or correcting data that is incorrect, out-of-date, redundant, incomplete, or formatted incorrectly.

This is nothing but polishing the data. For example, one subsystem may store gender as M and F, while another stores it as MALE and FEMALE. So we need to polish this data and clean it before it is added to the data warehouse. Another typical example is addresses: the subsystems that maintain customer addresses may each store them differently. We might need an address cleansing tool to get the customers' addresses into a clean and consistent form.
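The gender example can be sketched as a small lookup in Python. The mapping table and the "U" code for unknown values are assumptions for the illustration.

```python
# Hypothetical mapping from the spellings used by different subsystems.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}

def standardize_gender(value, unknown="U"):
    """Map the various source spellings to one warehouse code."""
    if value is None:
        return unknown
    return GENDER_CODES.get(value.strip().lower(), unknown)

print([standardize_gender(v) for v in ["M", "FEMALE", " f ", None]])
```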

To provide support for mainframe source data, which files are used as source definitions?

COBOL Copy-book files

Where should you place the flat file to import the flat file definition into the Designer?

There is no restriction on where to place the source file. From a performance point of view, it is better to place the file in the server's local src folder. If you need the path, check the server properties available in the Workflow Manager.

This doesn't mean we cannot place it in any other folder. If we place it in the server src folder, src is selected by default at session creation time.

How many ways can you update a relational source definition, and what are they?

Two ways

1. Edit the definition

2. Reimport the definition

Which transformation do you need when using COBOL sources as source definitions?

The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.

What is a mapplet?

A mapplet is a set of transformations that you can use in multiple mappings. For example: suppose we have several fact tables that require a series of dimension keys. We can create a mapplet that contains a series of Lookup transformations to find each dimension key, and use it in each fact table mapping instead of creating the same lookup logic in each mapping.

What is a transformation?

It is a repository object that generates, modifies, or passes data. A transformation passes data to the next stage (i.e. to the next transformation or target), with or without modifying the data.

What are active and passive transformations?

An active transformation can change the number of rows that pass through it. A passive transformation does not change the number of rows that pass through it.

Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.

What are reusable transformations?

Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. Later, if you change the definition of the transformation, all instances of it inherit the changes. Since an instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect the changes. This feature can save you a great deal of work.

What are the methods for creating reusable transformations?

Two methods:

1. Design it in the Transformation Developer.

2. Promote a standard transformation from the Mapping Designer. After you add a transformation to the mapping, you can promote it to the status of reusable transformation.

Once you promote a standard transformation to reusable status, you can demote it to a standard transformation at any time.

If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button.

What are the unsupported repository objects for a mapplet?

COBOL source definitions

Joiner transformations

Normalizer transformations

Non-reusable Sequence Generator transformations

Pre- or post-session stored procedures

Target definitions

PowerMart 3.5-style LOOKUP functions

XML source definitions

IBM MQ source definitions

• Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.

• Target definitions. Definitions of database objects or files that contain the target data.

• Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.

• Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.

• Reusable transformations. Transformations that you can use in multiple mappings.

• Mapplets. A set of transformations that you can use in multiple mappings.

• Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

What are mapping parameters and mapping variables?

A mapping parameter represents a constant value that you can define before running a session. A mapping parameter retains the same value throughout the entire session.

When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, then define the value of the parameter in a parameter file for the session.

Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica server saves the value of the mapping variable to the repository at the end of the session run and uses that value the next time you run the session.
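The contrast can be pictured with a Python sketch in which a dictionary stands in for the repository and another for the parameter file. The names $$LOAD_REGION and $$LAST_ID, and the incremental-load logic, are hypothetical and chosen only to show a constant parameter next to a variable that is saved between runs.

```python
def run_session(repository, parameter_file, source_rows):
    region = parameter_file["$$LOAD_REGION"]    # parameter: fixed for the whole run
    start_id = repository.get("$$LAST_ID", 0)   # variable: value saved by the last run
    last_id = start_id
    loaded = []
    for row in source_rows:
        if row["region"] == region and row["id"] > start_id:
            loaded.append(row)
            last_id = max(last_id, row["id"])   # the variable changes during the run
    repository["$$LAST_ID"] = last_id           # saved at the end of the session
    return loaded

repository = {}
parameter_file = {"$$LOAD_REGION": "east"}
source = [
    {"id": 1, "region": "east"},
    {"id": 2, "region": "west"},
    {"id": 3, "region": "east"},
]
first_run = run_session(repository, parameter_file, source)
source.append({"id": 4, "region": "east"})
second_run = run_session(repository, parameter_file, source)
print(first_run, second_run, repository)
```

The second run picks up only the row added after the first run, because the variable's saved value carries over.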

Can you use the mapping parameters or variables created in one mapping in another mapping?

No.

We can use mapping parameters or variables in any transformation of the same mapping or mapplet in which they were created.

Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?

Yes, because a reusable transformation is not contained within any mapplet or mapping.

How can you improve session performance in an Aggregator transformation?

Use sorted input:

1. Use a Sorter before the Aggregator.

2. Do not forget to check the option on the Aggregator that tells it the input is sorted on the same keys as the group by.

The key order is also very important.
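Why sorted input helps can be seen in a Python sketch: when rows arrive already ordered by the group key, each group can be totaled and released as soon as the key changes, so only one group is held at a time. This uses itertools.groupby as an analogy; the field names are made up.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(rows, key="dept", value="sal"):
    """Sum value per key, assuming rows already arrive sorted by key."""
    return [
        (group_key, sum(row[value] for row in group))
        for group_key, group in groupby(rows, key=itemgetter(key))
    ]

# Rows must be sorted on the group-by key, in the same key order.
sorted_rows = [
    {"dept": "a", "sal": 10},
    {"dept": "a", "sal": 20},
    {"dept": "b", "sal": 5},
]
dept_totals = aggregate_sorted(sorted_rows)
print(dept_totals)
```

If the rows were not sorted on the group key, groupby would emit the same key more than once, which is the reason the key order matters.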

What is the aggregate cache in an Aggregator transformation?

The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation. If the Informatica server requires more space, it stores overflow values in cache files.

What is the difference between the Joiner transformation and the Source Qualifier transformation?

You can join heterogeneous data sources in a Joiner transformation, which we cannot achieve in a Source Qualifier transformation.

You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner transformation. A Joiner can also join sources that come from different data sources.

Under which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?

Both pipelines begin with the same original data source.

Both input pipelines originate from the same Source Qualifier transformation.

Both input pipelines originate from the same Normalizer transformation.

Both input pipelines originate from the same Joiner transformation.

Either input pipeline contains an Update Strategy transformation.

Either input pipeline contains a connected or unconnected Sequence Generator transformation.

What are the settings that you use to configure the Joiner transformation?

• Master and detail source

• Type of join

• Condition of the join

The Joiner transformation supports the following join types, which you set in the Properties tab:

• Normal (Default)

• Master Outer

• Detail Outer

• Full Outer

What are the join types in the Joiner transformation?

Normal (Default) -- only matching rows from both master and detail

Master outer -- all detail rows and only matching rows from master

Detail outer -- all master rows and only matching rows from detail

Full outer -- all rows from both master and detail (matching or non-matching) follow this