Informatica Questions and Answers
What is the Rank transformation? Where can we use it?
The Rank transformation is used to select the top or bottom ranked rows. For example, if a sales table has many employees selling the same product and we need to find the top 5 or 10 employees who sold the most products, we can use a Rank transformation.
Where is the cache stored in Informatica?
The cache is stored on the Informatica server.
If you want to create indexes after the load process, which transformation do you choose?
Stored Procedure transformation.
In a Joiner transformation, you should specify the source with fewer rows as the master source. Why?
In a Joiner transformation, the Informatica server reads all the records from the master source and builds the index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins. A smaller master source therefore means smaller caches.
What happens if you try to create a shortcut to a non-shared folder?
It only creates a copy of it.
What is a transaction?
A transaction can be defined as a DML operation, that is, an insertion, modification, or deletion of data performed by users, analysts, or applications.
Can anybody write a session parameter file which will change the source and targets for every session, i.e. different source and targets for each session run?
You are supposed to define a parameter file. In the parameter file you can define two parameters, one for the source and one for the target. For example:
$Src_file = c:\program files\informatica\server\bin\abc_source.txt
$tgt_file = c:\targets\abc_targets.txt
Then define the parameter file with a section header that names the session:
[folder_name.WF:workflow_name.ST:s_session_name]
$Src_file =c:\program files\informatica\server\bin\abc_source.txt
$tgt_file = c:\targets\abc_targets.txt
If it is a relational database, you can even supply an overridden SQL at the session level as a parameter. Make sure the SQL is on a single line.
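To get a different source and target on every run, the parameter file can be regenerated before each session starts. The following is only a minimal sketch of that idea: the function name render_parameter_file and the second pair of file paths are made up for illustration, while the section header and the parameter names $Src_file and $tgt_file are taken from the answer above.

```python
def render_parameter_file(folder, workflow, session, src_file, tgt_file):
    """Build the text of a parameter file for one session run."""
    header = f"[{folder}.WF:{workflow}.ST:{session}]"
    return "\n".join([
        header,
        f"$Src_file={src_file}",
        f"$tgt_file={tgt_file}",
    ]) + "\n"


# Hypothetical per-run values; only the first pair comes from the answer above.
runs = [
    (r"c:\program files\informatica\server\bin\abc_source.txt",
     r"c:\targets\abc_targets.txt"),
    (r"c:\sources\xyz_source.txt", r"c:\targets\xyz_targets.txt"),
]

# One parameter-file text per session run.
param_files = [
    render_parameter_file("folder_name", "workflow_name", "s_session_name", src, tgt)
    for src, tgt in runs
]
```

Each element of param_files would be written to disk before the corresponding run, so the mapping itself never has to change.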
Informatica Live Interview Questions
Here are some of the interview questions I could not answer. Can anybody help by giving answers, for the benefit of others also?
Thanks in advance.
Explain grouped cross tab.
Explain reference cursor.
What is metadata and system catalog?
What is a factless fact schema?
What is a conformed dimension?
Which kind of index is preferred in DWH?
Why do we use a DSS database for OLAP tools?
Conformed dimension: one dimension that is shared by two or more fact tables.
Factless fact table: a fact table without measures that contains only foreign keys. There are two types of factless fact table: event tracking tables and coverage tables.
Bitmap indexes are preferred in data warehousing.
Metadata is data about data. In Informatica we can see the metadata in the repository, where everything is stored, for example mappings, sessions, and privileges.
The system catalog is what we used in Cognos. It also contains data, tables, privileges, predefined filters, etc. Using this catalog we generate reports.
Grouped cross tab is a type of report in Cognos where we have to assign 3 measures to get the result.
What is meant by junk attribute in Informatica?
Junk dimension: a dimension is called a junk dimension if it contains attributes which are rarely changed or modified, typically miscellaneous flags and indicators. For example, in the banking domain we can assign four attributes from the Overall_Transaction_master table to a junk dimension: tput flag, tcmp flag, del flag, and advance flag. All these attributes can be part of a junk dimension.
Can anyone explain incremental aggregation with an example?
When you use an Aggregator transformation, it creates index and data caches to store the data:
1. The index cache stores the group by columns.
2. The data cache stores the aggregate columns.
Incremental aggregation is used when historical data is already in place and will be used in the aggregation. It uses the cache that contains the historical data. For each incoming group by value already present in the index cache, it adds the data value to the corresponding data cache value and outputs the row. For an incoming value that has no match in the index cache, the new group by and output port values are inserted into the cache.
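The cache behaviour described above can be illustrated outside Informatica with a dictionary that survives between runs. This is only a rough sketch, not how the server implements it: the function name, the region keys, and the amounts are all made up, and a single dict stands in for both the index cache (the keys) and the data cache (the running totals).

```python
def incremental_aggregate(cache, new_rows):
    """Fold new (group_key, amount) rows into a cache of running totals.

    Returns the updated total for each incoming row, in arrival order.
    """
    output = []
    for group_key, amount in new_rows:
        # Existing group: add to the stored total. New group: start from zero.
        cache[group_key] = cache.get(group_key, 0) + amount
        output.append((group_key, cache[group_key]))
    return output


# Cache left behind by an earlier run (hypothetical historical totals).
cache = {"east": 100, "west": 250}

# Only the rows that arrived since the last run are processed.
result = incremental_aggregate(cache, [("east", 20), ("north", 5)])
```

The "west" group is never re-read from the source, which is where the saving comes from.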
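Difference between Rank and Dense Rank?
Rank: 1, 2 (2nd position), 2 (3rd position), 4, 5
The same rank is assigned to equal totals/numbers, and the next rank follows the position, so a rank number is skipped after a tie. Golf usually ranks this way, so this is often called golf ranking.
Dense Rank: 1, 2 (2nd position), 2 (3rd position), 3, 4
The same rank is assigned to equal totals/numbers, but no rank number is skipped after a tie.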
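The two sequences above can be reproduced with a few lines of code. This is a small sketch with made-up sales totals; the function name rank_and_dense_rank is only for illustration.

```python
def rank_and_dense_rank(values):
    """Return (value, rank, dense_rank) tuples, highest value first."""
    ordered = sorted(values, reverse=True)
    rows = []
    rank = dense = 0
    previous = None
    for position, value in enumerate(ordered, start=1):
        if value != previous:
            rank = position   # rank jumps to the row's position
            dense += 1        # dense rank advances by exactly one
            previous = value
        rows.append((value, rank, dense))
    return rows


# Hypothetical totals with one tie in 2nd/3rd position.
rows = rank_and_dense_rank([500, 400, 400, 300, 200])
ranks = [r for _, r, _ in rows]
dense_ranks = [d for _, _, d in rows]
```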
About Informatica PowerCenter 7:
1) I want to know which mapping properties can be overridden at the Session task level.
2) What types of permissions are needed to run and schedule workflows?
1) You can override any properties other than the source and targets. Make sure the source and targets exist in your database if it is a relational database. If it is a flat file, you can override its properties. You can also override the SQL if it is a relational database, the session log, DTM buffer size, cache sizes, etc.
2) You need execute permission on the folder to run or schedule a workflow. You may have read and write permissions, but you need execute permission as well.
Can anyone explain real-time complex mappings or complex transformations in Informatica, especially in the sales domain?
The most complex logic we use is denormalization. We don't have a Denormalizer transformation in Informatica, so we have to use an Aggregator followed by an Expression. Apart from this, most of the complexity is in Expression transformations involving a lot of nested IIF and DECODE statements. Others are the Union transformation and the Joiner.
How do you create a mapping using multiple Lookup transformations?
Use an unconnected lookup if the same lookup repeats multiple times.
In the source, if we also have duplicate records and we have 2 targets, T1 for unique values and T2 only for duplicate values, how do we pass the unique values to T1 and the duplicate values to T2 from the source to these 2 different targets in a single mapping?
Soln1:
source ---> sq ---> exp ---> sorter (with the select distinct check box enabled) ---> t1, and ---> aggregator (with group by enabled and a count function) ---> t2. If you want only duplicates in t2, follow this sequence: ---> agg (with group by enabled and the expression decode(count(col),1,1,0)) ---> Filter (condition is 0) ---> t2.
Soln2:
Take two source instances. In the first one, embed DISTINCT in the source qualifier and connect it to target t1. In the second source instance, write a query that fetches only the duplicate records and connect it to target t2.
<< If you use an aggregator as suggested by my friend, you will get duplicate as well as distinct records in the second target. >>
Soln3:
Use a Sorter transformation. Sort on the key fields by which you want to find the duplicates. Then use an Expression transformation. Example:
Sorter:
field1 -- ascending/descending
field2 -- ascending/descending
Expression:
--> field1
--> field2
<--> v_field1_curr = field1
<--> v_field2_curr = field2
v_dup_flag = IIF(v_field1_curr = v_field1_prev, true, false)
o_dup_flag = IIF(v_dup_flag = true, 'Duplicate', 'Not Duplicate')
<--> v_field1_prev = v_field1_curr
<--> v_field2_prev = v_field2_curr
Use a Router transformation and send o_dup_flag = 'Duplicate' to T2 and 'Not Duplicate' to T1.
Informatica evaluates row by row. Because the rows arrive sorted, each row can be compared with the previous one, and the flag is set based on the previous and current rows.
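The previous-row comparison in Soln3 can be sketched in a few lines. This is only an illustration of the logic, not Informatica code: the function name and the sample rows are made up, and as in the expression above, the first row of each key goes to T1 while every later row with the same key goes to T2.

```python
def route_duplicates(rows, key):
    """Split rows into (t1_unique, t2_duplicate) after sorting on key."""
    t1_unique, t2_duplicate = [], []
    has_prev = False
    prev_key = None
    for row in sorted(rows, key=key):           # plays the role of the Sorter
        curr_key = key(row)
        # Same comparison as v_dup_flag: current key equals previous key.
        is_dup = has_prev and curr_key == prev_key
        (t2_duplicate if is_dup else t1_unique).append(row)   # the Router
        prev_key, has_prev = curr_key, True     # remember for the next row
    return t1_unique, t2_duplicate


# Hypothetical rows: (key field, payload).
rows = [("b", 2), ("a", 1), ("b", 3), ("a", 4), ("c", 5)]
t1, t2 = route_duplicates(rows, key=lambda r: r[0])
```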
What are the enhancements made in Informatica version 7.1.1 when compared to version 6.2.2?
In 7+ versions:
- We can look up a flat file.
- Union and Custom transformations are available.
- There is a propagate option, i.e. if we change the data type of a field, all the linked columns will reflect that change.
- We can write to an XML target.
- We can use up to 64 partitions.
What is the difference between PowerCenter and PowerMart?
What is the procedure for creating independent data marts from Informatica 7.1?
PowerCenter has multiple repositories, whereas PowerMart has a single repository (desktop repository). PowerCenter can also be linked to a global repository to share between users.
                     PowerCenter    PowerMart
No. of repositories  n              n
Applicability        high-end WH    low and mid-range WH
Global repository    supported      not supported
Local repository     supported      supported
ERP support          available      not available
What are the Lookup transformation and the Update Strategy transformation? Explain with an example.
The Lookup transformation is used to look up data in a relational table, view, synonym, or flat file. The Informatica server queries the lookup table based on the lookup ports used in the transformation. It compares the Lookup transformation port values to the lookup table column values based on the lookup condition.
By using a lookup we can get a related value, perform a calculation, and update slowly changing dimensions (SCD). There are two types of lookups:
Connected
Unconnected
Update Strategy transformation
This is used to control how rows are flagged for insert, update, delete, or reject. At the session level, the treatment of rows can be set to Insert, Delete, Update, or Data driven. For Update we have three options:
Update as update
Update as insert
Update else insert
What logic will you implement to load data into one fact table from 'n' number of dimension tables?
To load data into one fact table from more than one dimension table, you first need to create the fact table and the dimension tables. Then load data into the individual dimensions by using sources and transformations (Aggregator, Sequence Generator, Lookup) in the Mapping Designer. Finally, for the fact table, connect the surrogate keys to the foreign keys and the columns from the dimensions to the fact.
After loading the data into the dimension tables, we load the data into the fact table. The reason is that the dimension tables contain the data that the fact table refers to.
Loading the data from the dimension tables into the fact table is simple: treat the dimension tables as source tables and the fact table as the target. That is all.
Can I use the session bulk loading option and still recover the session?
If the session is configured to run in bulk mode, it will not write recovery information to the recovery tables, so a bulk-loaded session cannot be recovered as required.
No. In bulk load no redo log is written, whereas in normal load the redo log is written. That is why recovery is not possible in bulk mode, although session performance increases.
How do you configure a mapping in Informatica?
You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible. You should minimize the amount of data moved by deleting unnecessary links between transformations.
For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache. You can also perform the following tasks to optimize the mapping:
• Configure single-pass reading.
• Optimize datatype conversions.
• Eliminate transformation errors.
• Optimize transformations.
• Optimize expressions.
What is the difference between a dimension table and a fact table, and what are the different types of dimension tables and fact tables?
A fact table contains measurable data. It has fewer columns and many rows, and it contains a primary key.
Different types of facts: additive, non-additive, semi-additive.
A dimension table contains textual descriptions of the data. It has many columns and fewer rows, and it contains a primary key.
What is a worklet, what is it used for, and in which situations can we use it?
A worklet is a set of tasks. If a certain set of tasks has to be reused in many workflows, we use a worklet. To execute a worklet, it has to be placed inside a workflow.
The use of a worklet in a workflow is similar to the use of a mapplet in a mapping.
What are mapping parameters and variables, and in which situations can we use them?
If we need to change certain attributes of a mapping every time the session is run, it would be very difficult to edit the mapping each time. So we use mapping parameters and variables and define their values in a parameter file. Then we only edit the parameter file to change the attribute values, which makes the process simple.
Mapping parameter values remain constant during a session run. If we need to change a parameter value, we have to edit the parameter file manually before the next session run.
The value of a mapping variable, however, can be changed by using variable functions. If we need to increment an attribute value by 1 after every session run, we can use a mapping variable.
Explain the use of the Update Strategy transformation.
It is used to maintain the history data and the most recent changes to the data.
What is meant by complex mapping?
A complex mapping is one that involves more logic and more business rules. For example, in my bank project I was involved in constructing a data warehouse. The bank has many customers, and some of them relocate to another place after taking loans. It was difficult to maintain both the previous and current addresses, so I used SCD Type 2. This is a simple example of a complex mapping.
I have a requirement wherein the column names in a table (Table A) should appear in rows of the target table (Table B), i.e. converting columns to rows. Is it possible through Informatica? If so, how?
If the data in the tables is as follows:
Table A
key_1 char(3);
Table A values
1
2
3
Table B
bkey_a char(3);
bcode char(1);
Table B values
1 T
1 A
1 G
2 A
2 T
2 L
3 A
and the output required is
1, T, A
2, A, T, L
3, A
then the SQL query in the source qualifier should be
select key_1,
max(decode( bcode, 'T', bcode, null )) t_code,
max(decode( bcode, 'A', bcode, null )) a_code,
max(decode( bcode, 'L', bcode, null )) l_code
from a, b
where a.key_1 = b.bkey_a
group by key_1
/
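The query above can be tried outside Oracle with an in-memory SQLite database. This is only a sketch for checking the pivot logic: SQLite has no DECODE, so CASE is swapped in for it, and the table contents are the sample values from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (key_1 TEXT);
    CREATE TABLE b (bkey_a TEXT, bcode TEXT);
    INSERT INTO a VALUES ('1'), ('2'), ('3');
    INSERT INTO b VALUES ('1','T'), ('1','A'), ('1','G'),
                         ('2','A'), ('2','T'), ('2','L'),
                         ('3','A');
""")

# CASE stands in for Oracle's DECODE, which SQLite does not provide.
pivot_sql = """
    SELECT a.key_1,
           MAX(CASE WHEN b.bcode = 'T' THEN b.bcode END) AS t_code,
           MAX(CASE WHEN b.bcode = 'A' THEN b.bcode END) AS a_code,
           MAX(CASE WHEN b.bcode = 'L' THEN b.bcode END) AS l_code
    FROM a JOIN b ON a.key_1 = b.bkey_a
    GROUP BY a.key_1
    ORDER BY a.key_1
"""
# One row per key, one column per code; codes that are absent come back as None.
pivoted = conn.execute(pivot_sql).fetchall()
```

Note that the code 'G' for key 1 has no column in the query, so it does not appear in the result.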
If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session the next time in Informatica 6.1?
Simple solution: use the perform recovery option.
Can we run a group of sessions without using the Workflow Manager?
Yes, it is possible. Using the pmcmd command we can run the group of sessions without using the Workflow Manager.
What is the difference between stop and abort?
The PowerCenter Server handles the abort command for the Session task like the stop command, except that it has a timeout period of 60 seconds. If the PowerCenter Server cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.
Stop: if the session you want to stop is part of a batch, you must stop the batch. If the batch is part of a nested batch, stop the outermost batch.
Abort: you can issue the abort command. It is similar to the stop command except that it has a 60 second timeout. If the server cannot finish processing and committing data within 60 seconds, it kills the DTM process and terminates the session.
What is the difference between a cached lookup and an uncached lookup?
Can I run the mapping without starting the Informatica server?
When you configure the Lookup transformation as a cached lookup, it stores all the lookup table data in the cache when the first input record enters the Lookup transformation. In a cached lookup the SELECT statement executes only once, and the values of each input record are compared with the values in the cache. In an uncached lookup the SELECT statement executes for each input record entering the Lookup transformation, so it has to connect to the database for every new record.
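The difference in database round trips can be made concrete with a small simulation. This is a sketch only: the class LookupTable is a stand-in for a real database table that simply counts queries, and all names and values are made up.

```python
class LookupTable:
    """Stand-in for a database table that counts SELECT round trips."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.queries = 0

    def select_all(self):
        self.queries += 1
        return dict(self._rows)

    def select_one(self, key):
        self.queries += 1
        return self._rows.get(key)


def uncached_lookup(table, keys):
    # One query per input row.
    return [table.select_one(k) for k in keys]


def cached_lookup(table, keys):
    cache = None
    results = []
    for k in keys:
        if cache is None:              # built when the first input row arrives
            cache = table.select_all()
        results.append(cache.get(k))
    return results


keys = [10, 20, 10, 30]
t_uncached = LookupTable({10: "A", 20: "B"})
t_cached = LookupTable({10: "A", 20: "B"})
out_uncached = uncached_lookup(t_uncached, keys)
out_cached = cached_lookup(t_cached, keys)
```

Both calls return the same values, but the uncached version issues one query per input row while the cached version issues a single query.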
I want to prepare a questionnaire. The details are as follows:
1. Identify a large company/organization that is a prime candidate for a DWH project. (For example, a telecommunication company, an insurance company, or a bank may be a prime candidate for this.)
2. Give at least four reasons for selecting the organization.
3. Prepare a questionnaire consisting of at least 15 non-trivial questions to collect requirements/information about the organization. This information is required to build the data warehouse.
Can you please tell me what those 15 questions should be for a company, say a telecom company?
First of all, meet your sponsors and make a BRD (business requirement document) about their expectations from this data warehouse (the main aim comes from them). For example, they need the customer billing process. Now go to the business management team; they can ask for metrics out of the billing process for their use. The management people need monthly usage, billing metrics, sales organization, and rate plan data to perform sales rep and channel performance analysis and rate plan analysis. So your dimension tables can be:
Customer (customer id, name, city, state, etc.)
Sales rep (sales rep number, name, id)
Sales org (sales org id)
Bill dimension (bill #, bill date, number)
Rate plan (rate plan code)
And the fact table can be:
Billing details (bill #, customer id, minutes used, call details, etc.)
You can follow a star or snowflake schema in this case, depending on the granularity of your data.
Can I start and stop a single session in a concurrent batch?
Just right-click on the particular session and go to the recovery option, or use Event Wait and Event Raise tasks.
What is MicroStrategy? What is it used for? Can anyone explain it in detail?
MicroStrategy is a BI tool which is HOLAP. You can create 2-dimensional reports and also cubes in it. It is basically a reporting tool, with a full range of reporting on the web and also on Windows.
What is the difference between Informatica 7.1 and Ab Initio?
There are a lot of differences between Informatica and Ab Initio:
In Ab Initio we use 3 kinds of parallelism, but Informatica uses 1 kind of parallelism.
Ab Initio has no scheduling option; we schedule manually or with a PL/SQL script. Informatica has 4 scheduling options.
Ab Initio contains the Co>Operating System, but Informatica does not.
Ramp-up time is very quick in Ab Initio compared to Informatica.
Ab Initio is more user-friendly than Informatica.
What is a mystery dimension?
It is also known as a junk dimension. It is a way of making sense of the rogue fields in your fact table.
What are the cost based and rule based approaches, and what is the difference?
Cost based and rule based approaches are optimization techniques used in databases when we need to optimize a SQL query.
Basically Oracle provides two types of optimizers (there are actually three, but we use only these two because the third has some disadvantages).
Whenever you process a SQL query in Oracle, the Oracle engine internally reads the query and decides which is the best possible way to execute it. In this process, Oracle follows these optimization techniques.
1. Cost based optimizer (CBO): if a SQL query can be executed in 2 different ways (say path 1 and path 2 for the same query), the CBO calculates the cost of each path, determines which path has the lower execution cost, and then executes that path so that query execution is optimized.
2. Rule based optimizer (RBO): this follows a set of rules for executing a query. Depending on the rules that apply, the optimizer runs the query.
Use:
If the table you are trying to query is already analysed, Oracle will go with the CBO. If the table is not analysed, Oracle follows the RBO.
The first time, if the table is not analysed, Oracle will go with a full table scan.
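The choice described under "Use" can be sketched as a tiny decision function. This is a toy illustration and not Oracle's optimizer: the plan names, cost estimates, and rule ranks are all invented, and the rule based branch assumes that each access path carries a fixed rank where a lower rank is preferred.

```python
def choose_plan(table_analyzed, plans):
    """Pick a plan name from (name, estimated_cost, rule_rank) tuples.

    Analysed table  -> cost based: lowest estimated cost wins.
    Otherwise       -> rule based: best (lowest) fixed rank wins, cost ignored.
    """
    if table_analyzed:
        return min(plans, key=lambda p: p[1])[0]
    return min(plans, key=lambda p: p[2])[0]


# Hypothetical candidate paths for the same query.
plans = [
    ("index_range_scan", 900, 9),    # ranked well by the rules, but costly here
    ("full_table_scan", 120, 15),    # ranked poorly by the rules, but cheap here
]
cbo_choice = choose_plan(True, plans)
rbo_choice = choose_plan(False, plans)
```

With these made-up numbers the two approaches disagree, which is the practical difference between them: only the cost based approach uses statistics gathered by analysing the table.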
What are partition points?
Partition points mark the thread boundaries in a source pipeline and divide the pipeline into stages.
How do you append records to a flat file in Informatica? In DataStage we have the options
i) overwrite the existing file
ii) append to the existing file
This is not there in Informatica v7, but I heard that it is included in the latest version, 8.0, where you can append to a flat file. It is about to be shipped to the market.
Suppose you had to split the source-level key going into two separate tables, one as surrogate key and the other as primary key. Since Informatica does not guarantee that keys are loaded in the proper order into those tables, what are the different ways you could handle this type of situation?
Use a foreign key.
What is the best way to show metadata (number of rows at source, target, and each transformation level, and error-related data) in a report format?
When your workflow has completed, go to the Workflow Monitor, right-click the session, and go to the transformation statistics. There we can see the number of rows in the source and target. In the session properties we can see the errors related to data.
You can also select these details from the repository tables. You can use the view REP_SESS_LOG to get this data.
Two relational tables are connected to one SQ transformation. What are the possible errors it will throw?
We can connect two relational tables to one SQ transformation. No errors will be thrown.
Without using an Update Strategy transformation and session options, how can we update our target table?
Soln1: You can do this by using "update override" in the target properties.
Soln2: In the session properties there are options such as
insert
update
insert as update
update as update
By using these we can easily solve it.
Soln3: By default all the rows in the session are flagged as insert. You can change this in the session general properties: Treat source rows as: Update.
Then all the incoming rows will be flagged as update, and you can update the rows in the target table.
Could anyone please tell me the steps required for a type 2 dimension/version data mapping? How can we implement it?
Go to the Mapping Designer, open the Mappings menu, select Wizards, and choose Slowly Changing Dimension.
A new window opens. There you need to give the mapping name, source table, target table, and type of slowly changing dimension. When you select Finish, the slowly changing dimension type 2 mapping is created.
Go to the Warehouse Designer and generate the table. Then validate the mapping in the Mapping Designer, save it to the repository, and run the session in the Workflow Manager.
Later, update the source table and run the session again. You will find the difference in the target table.
How do you import an Oracle sequence into Informatica?
Create a procedure and use the sequence inside the procedure. Then call the procedure in Informatica with the help of a Stored Procedure transformation.
What is data merging, data cleansing, sampling?
Cleansing: to identify and remove redundancy and inconsistency.
Sampling: just sample the data by sending part of the data from source to target.
What is an IQD file?
IQD stands for Impromptu Query Definition. This file is mainly used in the Cognos Impromptu tool. After creating an imr (report), we save the imr as an IQD file, which is then used while creating a cube in PowerPlay Transformer. As the data source type we select Impromptu Query Definition.
Differences between normalization and the Normalizer transformation.
Normalizer: a transformation mainly used for COBOL sources. It changes rows into columns and columns into rows.
Normalization: to remove redundancy and inconsistency.
In the Mapping Designer we have a direct option to import files from VSAM. Navigation: Sources => Import from file => file from COBOL.
What is the procedure or steps for implementing versioning if you are already on version 7.x? Any gotchas or precautions?
For version control in the ETL layer using Informatica, after doing anything in the Designer or the Workflow Manager, follow these steps:
1> First save the changes or new implementations.
2> Then, in the Navigator window, right-click on the specific object you are currently working on. A pop-up menu appears. Towards the lower end of it you will find Versioning -> Check In. A window opens. Enter a comment describing what you have done, such as "modified this mapping", and then click the OK button.
Can anyone explain error handling in Informatica with examples, so that it will be easy to explain the same in an interview?
Go to the session log file. There we find information about the session initiation process, the errors encountered, and the load summary.
By looking at the errors encountered while the session was running, we can resolve them.
If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?
There are many ways to improve a mapping which has multiple lookups.
1) We can create an index on the lookup table if we have permissions (staging area).
2) Divide the lookup mapping into two. (a) Dedicate one to inserts: only the new rows, which are in the source but not in the target, come into this mapping, so the process is fast. (b) Dedicate the second one to updates: only the rows which already exist in the target come into this mapping.
3) We can increase the cache size of the lookup.
If your workflow is running slow in Informatica, where do you start troubleshooting and what steps do you follow?
SOLN1: When the workflow is running slowly, you have to find the bottlenecks in this order: target, source, mapping, session, system.
SOLN2: A workflow may be slow for different reasons. One is alpha characters in decimal data, so check for this. Another is insufficient length of strings, so check with the SQL override.
How do you handle decimal places while importing a flat file into Informatica?
While importing the flat file, the Flat File Wizard helps in configuring the properties of the file. Select the numeric column and enter the precision value and the scale. Precision includes the scale. For example, if the number is 98888.654, enter precision as 8 and scale as 3, and width as 10 for a fixed-width flat file.
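If you are unsure what precision and scale to enter for a sample value, they can be worked out from the digits. The following is a small sketch using Python's decimal module; the function name precision_and_scale is made up for illustration.

```python
from decimal import Decimal


def precision_and_scale(text):
    """Return (precision, scale) for a decimal number given as text.

    Precision counts all significant digits, including those after the point.
    Scale counts only the digits after the point.
    """
    _sign, digits, exponent = Decimal(text).as_tuple()
    scale = max(-exponent, 0)
    precision = max(len(digits), scale)
    return precision, scale


precision, scale = precision_and_scale("98888.654")
```

For the value from the answer above this gives precision 8 and scale 3, matching what is entered in the wizard.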
We have a task called Event Wait; using that we can stop. We start using Event Raise.
Why are dimension tables denormalized in nature?
Because in data warehousing historical data should be maintained. Suppose we keep one employee's details, such as where he worked previously and where he is working now. All these details should be maintained in one table. If you use the employee id as the primary key, it won't allow multiple records with the same employee id. So to maintain historical data we use surrogate keys (for example, an Oracle sequence for the key column).
Since all the dimensions maintain historical data, they are denormalized. The "duplicate" entry is not an exact duplicate record; it is another record with the same employee number maintained in the table.
Can we use an Aggregator or other active transformation after an Update Strategy transformation?
We can, but the update flag will not be retained. We can, however, use a passive transformation.
Can anyone comment on the significance of Oracle 9i in Informatica when compared to Oracle 8 or 8i? I mean, how is Oracle 9i advantageous when compared to Oracle 8 or 8i when used with Informatica?
It's very easy.
Oracle 8i does not allow user-defined data types, but 9i does.
BLOB and LOB types are allowed only in 9i, not in 8i.
Moreover, list partitioning is available only in 9i.
With mapping parameters and variables, the variable value is saved to the repository after the session completes. The next time you run the session, the server takes the saved value from the repository and starts assigning from the next value. For example, I ran a session and at the end it stored a value of 50 in the repository. The next time I run the session, it should start with the value 70, not with the value 51. How do I do this?
SOLN1: After running the mapping, go to the session in the Workflow Manager. Right-click on the session and you will get a menu. In it, go to the persistent values. There you will find the last value stored in the repository for the mapping variable. Remove it, put in your desired value, and run the session. I hope your task will be done.
SOLN2: It takes the value 51, but you can override the variable value saved in the repository by defining the value in the parameter file. If there is a parameter file entry for the mapping variable, the server uses the value in the parameter file, not the value + 1 from the repository. For example, assign the value 70 to the mapping variable. In other words, higher preference is given to the value in the parameter file.
Mapping parameters and variables make the use of mappings more flexible and avoid creating multiple mappings. They also help in adding incremental data. Mapping parameters and variables have to be created in the Mapping Designer by choosing the menu option Mappings ----> Parameters and Variables. Enter the name for the variable or parameter, which has to be preceded by $$, and choose the type (parameter/variable) and the data type. Once defined, the variable/parameter can be used in any expression, for example in the source filter on the properties tab of the SQ transformation: just enter the filter condition. Finally, create a parameter file to assign the value for the variable/parameter and configure the session properties. This final step is optional, however. If the parameter is not present in a parameter file, the server uses the initial value which was assigned at the time of creating the variable.
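The order of preference described in SOLN2 can be summarised as a short lookup chain. This is only a sketch of the precedence as stated in the answers above: the function name, the variable name $$counter, and the dictionaries standing in for the parameter file and the repository are all made up for illustration.

```python
def resolve_start_value(name, parameter_file, repository, initial_values):
    """Return the value a mapping variable starts the session with.

    Preference: parameter file, then the value saved in the repository,
    then the initial value given when the variable was created.
    """
    if name in parameter_file:
        return parameter_file[name]
    if name in repository:
        return repository[name]
    return initial_values[name]


initial_values = {"$$counter": 0}
repository = {"$$counter": 50}     # value saved after the previous run

# Without a parameter file entry, the saved value is used.
from_repository = resolve_start_value("$$counter", {}, repository, initial_values)

# A parameter file entry overrides the saved value.
from_param_file = resolve_start_value(
    "$$counter", {"$$counter": 70}, repository, initial_values
)

# With neither, the initial value is used.
from_initial = resolve_start_value("$$counter", {}, {}, initial_values)
```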
How do you delete duplicate rows in a flat file source? Is there any option in Informatica?
Use a Sorter transformation. It has a "distinct" option; make use of it.
What is the use of incremental aggregation? Explain briefly with an example.
It is a session option. When the Informatica server performs incremental aggregation, it passes only the new source data through the mapping and uses the historical cache data to perform the new aggregation calculations incrementally. We use it for performance.
What is the procedure to load the fact table? Give details.
SOLN1: We use the 2 wizards (i.e. the Getting Started wizard and the Slowly Changing Dimension wizard) to load the fact and dimension tables. By using these 2 wizards we can create different types of mappings according to the business requirements and load into the star schema (fact and dimension tables).
SOLN2: First the dimension tables need to be loaded, and then the fact tables should be loaded according to the specifications. Don't think that fact tables are different when it comes to loading; it is a general mapping as we do for other tables. The specifications play an important role in loading the fact.
How do you look up data in multiple tables?
If you want to look up data in multiple tables at a time, join the tables you want and then look up the joined result. Informatica provides lookups on joined tables.
How do you retrieve the records from a reject file? Explain with syntax or an example.
SOLN1: There is a utility called "reject loader" with which we can find the rejected records, refine them, and reload them.
SOLN2: During the execution of the workflow, all the rejected rows are stored in bad files (where your Informatica server is installed, e.g. C:\Program Files\Informatica Power Center 7.1\Server). These bad files can be imported as a flat file source, and then through a direct mapping we can load them in the desired format.
How does the server recognise the source and target databases?
By using an ODBC connection if it is relational, or an FTP connection if it is a flat file. We can verify the connections in the session properties for both sources and targets.
What are variable ports and list two situations when they can be used?
We have mainly three kinds of ports: input, output, and variable. An input port represents data flowing into the transformation. An output port is used when data is passed on to the next transformation. A variable port is used when intermediate calculations are required.
For example, with price and quantity as inputs and total_amt as a variable port, we can total the amount with
sum (total_amt)
A variable port is used to break a complex expression into simpler ones, and also to store intermediate values.
You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is zero or negative:
IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )
You can use DECODE instead of IIF in many cases. DECODE may improve readability. The following shows how you can use DECODE instead of IIF :
DECODE( TRUE,
SALES > 0 AND SALES < 50, SALARY1,
SALES > 49 AND SALES < 100, SALARY2,
SALES > 99 AND SALES < 200, SALARY3,
SALES > 199, BONUS)
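To make the difference between the nested IIF and the DECODE form concrete, here is a minimal Python sketch. The helper names (iif, decode_true, pay_with_iif, pay_with_decode) are hypothetical and only mimic the behaviour described above; they are not part of Informatica.

```python
def iif(condition, when_true, when_false):
    """Mimic IIF(condition, value_if_true, value_if_false)."""
    return when_true if condition else when_false


def decode_true(*pairs, default=None):
    """Mimic DECODE(TRUE, cond1, val1, cond2, val2, ...): the first true condition wins."""
    for condition, value in pairs:
        if condition:
            return value
    return default  # no condition matched and no default was supplied


def pay_with_iif(sales, salary1, salary2, salary3, bonus):
    # Same nesting as the IIF example: the outer test returns 0 for sales <= 0.
    return iif(sales > 0,
               iif(sales < 50, salary1,
                   iif(sales < 100, salary2,
                       iif(sales < 200, salary3, bonus))),
               0)


def pay_with_decode(sales, salary1, salary2, salary3, bonus):
    # Same search/result pairs as the DECODE example, read top to bottom.
    return decode_true(
        (sales > 0 and sales < 50, salary1),
        (sales > 49 and sales < 100, salary2),
        (sales > 99 and sales < 200, salary3),
        (sales > 199, bonus),
    )
```

Note that in this sketch the two forms differ for zero or negative sales: the IIF version returns 0, while the DECODE version has no default and returns nothing (None here).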
In dimensional modeling, is the fact table normalized or denormalized, in the case of a star schema and in the case of a snowflake schema?
There is no concept of normalization in a star schema, but in a snowflake schema the dimension tables must be normalized.
Star schema: denormalized dimensions. Snowflake schema: normalized dimensions.
Which is better, connected or unconnected lookup transformations, in Informatica or any other ETL tool?
When you compare the two, a connected lookup can return multiple values, while an unconnected lookup returns one value. A connected lookup sits in the same pipeline as the source and supports dynamic caching. An unconnected lookup does not have that facility, but it is useful in some special cases: if the output of one lookup goes as input to another lookup, unconnected lookups are favorable.
I think the better one is the connected lookup, because we can use a dynamic cache with it. Also, a connected lookup can return multiple columns in a single row, whereas an unconnected lookup has a single return port (as far as Informatica is concerned).
What is the limit to the number of sources and targets you can have in a mapping?
As far as I know, there is no restriction on the number of sources or targets inside a mapping. The real question is: if you make N tables participate in processing at the same time, what happens to your database? From an organization's point of view it is never encouraged to use that many tables at once, because it reduces database and Informatica server performance.
The restriction is only on the database side: how many concurrent threads are you allowed to run on the database server?
Which objects are required by the debugger to create a valid debug session?
First, the session should be a valid session. The source, target, lookups, and expressions should be available, and at least one breakpoint should be set for the debugger to debug your session.
An Informatica server object is a must.
What is the procedure to write the query to list the three highest salaries of employees?
SELECT sal FROM (SELECT sal FROM my_table ORDER BY sal DESC) WHERE ROWNUM < 4;
Since this is Informatica, you might as well use the Rank transformation. Check the help file on how to use it.
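The top-three query can be tried out with a small, self-contained Python sketch using the standard sqlite3 module. This is only an illustration under stated assumptions: SQLite has no ROWNUM, so the sketch swaps in ORDER BY ... LIMIT instead of the Oracle-style subquery, and the function name top_salaries and the ename column are hypothetical.

```python
import sqlite3


def top_salaries(rows, n=3):
    """Return the n highest salaries from (ename, sal) rows, highest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (ename TEXT, sal REAL)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    # LIMIT plays the role of the ROWNUM < 4 filter in the Oracle query.
    result = conn.execute(
        "SELECT sal FROM my_table ORDER BY sal DESC LIMIT ?", (n,)
    ).fetchall()
    conn.close()
    return [sal for (sal,) in result]
```

For example, top_salaries([("a", 100), ("b", 300), ("c", 200), ("d", 50)]) keeps only the three largest values, in descending order.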
We are using an Update Strategy transformation in a mapping. How can we know whether the insert, update, reject, or delete option was selected while the session was running in Informatica?
In the Designer, while creating the Update Strategy transformation, uncheck "Forward Rejected Rows". Any rejected rows are then automatically written to the session log file.
Whether rows were inserted or updated can be known only by checking the target file or table.
Suppose session is configured with commit interval of 10,000 rows and source has
50,000 rows. Explain the commit points for Source based commit and Target based
commit. Assume appropriate value wherever required.
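Source-based commit commits data to the target based on the commit interval counted in source rows, so it commits after every 10,000 rows: at 10,000, 20,000, 30,000, 40,000 and 50,000 rows.
Target-based commit also depends on the buffer size of the target. After the commit interval is reached, the server keeps filling the writer buffer and commits when the buffer is full. Let us assume that the buffer fills every 6,000 rows. The interval of 10,000 rows is reached while the second buffer is filling, so the first commit is issued when that buffer fills at 12,000 rows. Each commit therefore generally covers more rows than the commit interval.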
How do we estimate the number of partitions that a mapping really requires? Is it
dependent on the machine configuration?
It depends on the Informatica version we are using: Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.
Can Informatica be used as a Cleansing Tool? If yes give example of transformations
that can implement a data cleansing routine.
Yes, we can use Informatica for cleansing data. Sometimes we use staging steps to cleanse the data; it depends on performance. Otherwise we can use an Expression transformation to cleanse the data.
For example, suppose field X has some values and some nulls and is mapped to a target column that is defined as NOT NULL. Inside an expression we can assign a space or some constant value to avoid session failure.
If the input data is in one format and the target is in another, we can change the format in an expression. We can also assign default values so that the target holds a complete set of data.
How do you decide whether you need it do aggregations at database level or at
Informatica level?
It depends on the requirement. If you have a database with good processing power, you can create an aggregate table or view at the database level; otherwise it is better to use Informatica. Here is why we may need to use Informatica.
Informatica is a third-party tool, so it will take more time to process an aggregation than the database does. But Informatica has an option called "Incremental aggregation", which updates the stored values with the stored values plus the new values. There is no need to process the entire data set again and again, as long as nobody has deleted the cache files. If the cache files are deleted, the total aggregation has to be run again in Informatica as well.
In the database we do not have an incremental aggregation facility.
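The idea behind incremental aggregation can be sketched in a few lines of Python. This is a simplified illustration, not Informatica's actual cache format: the dictionary stands in for the historical cache, and the function name incremental_aggregate and the sample keys are hypothetical.

```python
def incremental_aggregate(cache, new_rows):
    """Fold new (key, amount) rows into cached running totals and return the updated cache."""
    updated = dict(cache)  # keep the previous cache untouched
    for key, amount in new_rows:
        # current value + new value, without re-reading old source rows
        updated[key] = updated.get(key, 0) + amount
    return updated


# First run: the cache is empty, so every row is aggregated.
cache = incremental_aggregate({}, [("east", 100), ("west", 50)])
# Second run: only the new rows pass through; earlier totals come from the cache.
cache = incremental_aggregate(cache, [("east", 25), ("north", 10)])
```

If the cache were lost (an empty dictionary on the second run), all source rows would have to be aggregated again, which matches the caveat about deleted cache files.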
Identifying bottlenecks in various components of Informatica and resolving them.
The best way to find bottlenecks is to write to a flat file and see where the bottleneck is.
How to join two tables without using the Joiner transformation?
SOLN1:
It is possible to join two or more tables by using a Source Qualifier, provided the tables have a relationship. When you drag and drop the tables you get a Source Qualifier for each table. Delete all the Source Qualifiers and add a common Source Qualifier for all of them. Right-click the Source Qualifier and choose Edit. On the Properties tab you will find SQL Query, where you can write your SQL.
SOLN2: The Joiner transformation is used to join n (n>1) tables from the same or different databases, but the Source Qualifier transformation is used to join n tables from the same database only.
SOLN3: Use a Source Qualifier transformation to join tables on the SAME database. Under its Properties tab you can specify a user-defined join. Any select statement you can run on the database can also be used in the Source Qualifier.
Note: A single Joiner transformation can join only two sources, but they can come from different databases.
In a filter expression we want to compare one date field with a db2 system field
CURRENT DATE.
Our Syntax: datefield = CURRENT DATE (we didn't define it by ports, its a system
field ), but this is not valid (PMParser: Missing Operator)..
Can someone help us.
The DB2 date format is "yyyymmdd", whereas SYSDATE in Oracle gives "dd-mm-yy", so converting the DB2 date format to the local database date format is compulsory; otherwise you will get that type of error. Use SYSDATE, or use TO_DATE for the current date.
What do the Expression and Filter transformations do in the Informatica Slowly Growing Target wizard?
The Expression transformation detects and flags the rows from the source.
The Filter transformation filters out the rows that are not flagged and passes the flagged rows to the Update Strategy transformation.
How to create the staging area in your database?
A staging area in a DW is used as temporary space to hold all the records from the source system. So it should be more or less an exact replica of the source systems, except for the load strategy, where we use truncate and reload options.
So create it using the same layout as your source tables, or use the Generate SQL option in the Warehouse Designer tab.
What is the difference between the Informatica PowerCenter server, the repository server, and the repository?
The PowerCenter server holds the scheduled runs, that is, the times at which data should be loaded from source to target. The repository contains all the definitions of the mappings created in the Designer.
What are the differences between Informatica PowerCenter versions 6.2 and 7.1, and also between versions 6.2 and 5.1?
The main difference between Informatica 5.1 and 6.1 is that 6.1 introduced the repository server and replaced the Server Manager (5.1) with the Workflow Manager and Workflow Monitor.
In version 7.x you have the option of looking up (Lookup) on a flat file, and you can write to an XML target. Other additions:
Versioning
LDAP authentication
Support of 64-bit architectures
Differences between Informatica 6.2 and Informatica 7.0
Features in 7.1 are:
2. Lookup on flat file
3. Grid servers working on different operating systems can coexist on the same server
4. We can use pmcmdrep
5. We can export independent and dependent repository objects
6. We can move mappings into any web application
7. Version control
8. Data profiling
What is the difference between connected and unconnected stored procedures?
The mode to use depends on what you want to do:
Run a stored procedure before or after your session: Unconnected
Run a stored procedure once during your mapping, such as pre- or post-session: Unconnected
Run a stored procedure every time a row passes through the Stored Procedure transformation: Connected or Unconnected
Run a stored procedure based on data that passes through the mapping, such as when a specific port does not contain a null value: Unconnected
Pass parameters to the stored procedure and receive a single output parameter: Connected or Unconnected
Pass parameters to the stored procedure and receive multiple output parameters: Connected or Unconnected
Note: To get multiple output parameters from an unconnected Stored Procedure transformation, you must create variables for each output parameter. For details, see Calling a Stored Procedure From an Expression.
Run nested stored procedures: Unconnected
Call multiple times within a mapping: Unconnected
Discuss which is better among incremental load, normal load, and bulk load.
If the database supports the bulk load option from Informatica, then using bulk load for the initial loading of the tables is recommended.
Depending on the requirement, we should choose between normal and incremental loading strategies. If supported by the database, bulk load can load faster than normal load. (Incremental load is a different concept; do not confuse it with bulk load and normal load.)
Compare Data Warehousing Top-Down approach with Bottom-up approach
In the top-down approach, we first build the data warehouse and then build the data marts. This needs more cross-functional skills, is a time-consuming process, and is also costly.
In the bottom-up approach, we first build the data marts and then the data warehouse. The data mart that is built first serves as a proof of concept for the others. It takes less time and costs less than the top-down approach.
What is the difference between a summary filter and a detail filter?
A summary filter can be applied to a group of rows that share a common value, whereas a detail filter can be applied to each and every record of the database.
Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data, for example to construct a data warehouse.
A materialized view provides indirect access to table data by storing the results of a query in a separate schema object. Unlike an ordinary view, which does not take up any storage space or contain any data, a materialized view contains the rows resulting from its query.
Can we modify the data in a flat file?
Just open the text file with Notepad and change whatever you want (but the datatype should stay the same).
How to get the first 100 rows from the flat file into the target?
SOLN1: task ---> (link) session (Workflow Manager)
Double-click the link and type $$source success rows (a parameter in the session variables) = 100. It should automatically stop the session.
SOLN2:
1. Use the test load option if you want to use it for testing.
2. Put a counter or Sequence Generator in the mapping and use it to limit the rows.
Can we look up a table from a Source Qualifier transformation (unconnected lookup)?
No, we cannot. Here is why.
1) Unless you connect the output of the Source Qualifier to another transformation or to a target, the field will not be included in the query.
2) The Source Qualifier does not have any variable fields to use in an expression.
What is a junk dimension?
A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.
What is the difference between Normal load and Bul...
Normal load: Normal load writes information to the database log file, so it is helpful if any recovery is needed. When the source file is a text file and we are loading data into a table, we should use normal load only; otherwise the session will fail.
Bulk mode: Bulk load does not write information to the database log file, so if any recovery is needed we cannot do anything. Comparatively, bulk load is considerably faster than normal load.
At most, how many transformations can be used in a mapping?
There is no limit on the number of transformations. But from a performance point of view, using too many transformations will reduce session performance.
My idea is: if a mapping needs that many transformations, it is better to go for a stored procedure.
What are the main advantages and purpose of using the Normalizer transformation in Informatica?
The Normalizer transformation is used mainly with COBOL sources, where most of the time data is stored in a denormalized format. Also, the Normalizer transformation can be used to create multiple rows from a single row of data.
How do you convert rows to columns in a Normalizer? Could you explain?
Normally it is used to convert columns to rows. For converting rows to columns we need an Aggregator and an Expression, and a little coding effort. Denormalization is not possible with a Normalizer transformation.
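The two directions can be illustrated with a short Python sketch. It only shows the data reshaping involved; the function names normalize and denormalize, and the "occurrence" and "value" field names, are hypothetical and are not Informatica identifiers.

```python
def normalize(record, key_field, repeated_fields):
    """Columns to rows: emit one output row per repeated column (what a Normalizer does)."""
    return [
        {key_field: record[key_field], "occurrence": i, "value": record[field]}
        for i, field in enumerate(repeated_fields, start=1)
    ]


def denormalize(rows, key_field):
    """Rows to columns: group rows by key and spread values into columns (Aggregator-style)."""
    grouped = {}
    for row in rows:
        target = grouped.setdefault(row[key_field], {key_field: row[key_field]})
        target[f"value_{row['occurrence']}"] = row["value"]
    return list(grouped.values())
```

For a record such as {"store": "S1", "q1": 10, "q2": 20}, normalize produces two rows (one per quarter column), and denormalize folds those rows back into a single row per store.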
Discuss the advantages & Disadvantages of star & snowflake schema?
In a star schema every dimension has a primary key, and a dimension table does not have any parent table.
In a snowflake schema, by contrast, a dimension table has one or more parent tables. In a star schema, hierarchies for the dimensions are stored in the dimension table itself.
In a snowflake schema the hierarchies are broken into separate tables. These hierarchies help to drill down the data from the topmost level to the lowermost level.
A star schema consists of a single fact table surrounded by dimension tables. In a snowflake schema the dimension tables are connected to sub-dimension tables.
In a star schema the dimension tables are denormalized; in a snowflake schema they are normalized. A star schema is used for report generation; a snowflake schema is used for cubes.
The advantage of the snowflake schema is that the normalized tables are easier to maintain. It also saves storage space.
The disadvantage of the snowflake schema is that it reduces the effectiveness of navigation across the tables because of the large number of joins between them.
What is a time dimension? Give an example.
The time dimension is one of the most important dimensions in a data warehouse. Whenever you generate a report, you access the data through the time dimension.
E.g. employee time dimension
Fields: date key, full date, day of week, day, month, quarter, fiscal year
What are connected and unconnected transformations?
A connected transformation is part of your data flow in the pipeline, while an unconnected transformation is not.
It is much like calling a program by name versus by reference.
Use unconnected transformations when you want to call the same transformation many times in a single mapping. An unconnected transformation cannot be connected to another transformation, but it can be called from inside another transformation.
Unconnected transformations are not directly connected to the pipeline and can be called from as many other transformations as needed. If you are using a transformation several times, use an unconnected one; you may get better performance.
How can you create or import a flat file definition into the Warehouse Designer?
You can create a flat file definition in the Warehouse Designer. In the Warehouse Designer, create a new target and select the type as flat file. Save it, then enter the columns for the target by editing its properties. Once the target is created, save it. You can then use it from the Mapping Designer.
You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer and then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica server runs the session, it creates and loads the flat file.
Manages the session and batch scheduling: When you start the Informatica server, the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica server. When you configure a session, the Load Manager maintains a list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform the validations and verifications prior to starting the DTM process.
Locking and reading the session: When the Informatica server starts a session, the Load Manager locks the session in the repository. Locking prevents you from starting the same session again while it is running.
Reading the parameter file: If the session uses a parameter file, the Load Manager reads the parameter file and verifies that the session-level parameters are declared in the file.
Verifies permissions and privileges: When the session starts, the Load Manager checks whether or not the user has the privileges to run the session.
Creating log files: The Load Manager creates a log file that contains the status of the session.
How do you transfer the data from a data warehouse to a flat file?
You can write a mapping with the flat file as a target using a DUMMY_CONNECTION. A flat file target is built by pulling a source into the target space using the Warehouse Designer tool.
Difference between the Informatica Repository Server and the Informatica Server
Informatica Repository Server: It manages connections to the repository from client applications.
Informatica Server: It extracts the source data, performs the data transformation, and loads the transformed data into the target.
Router transformation
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
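The contrast between the two transformations can be sketched in Python. This is only an illustration of the behaviour described above; the names filter_rows, route_rows, and the "DEFAULT" group key are hypothetical, and the sketch assumes that a row meeting several group conditions is passed to each matching group.

```python
def filter_rows(rows, condition):
    """Filter: one condition; rows that fail it are dropped."""
    return [row for row in rows if condition(row)]


def route_rows(rows, groups):
    """Router: several named conditions; rows matching none go to a default group."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)  # a row may land in more than one group
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)  # kept instead of being dropped
    return routed
```

With a single Filter, the rows that fail the test are gone; with the Router sketch they remain available in the default group for further processing.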
What are the 2 modes of data movement in the Informatica Server?
The data movement mode depends on whether the Informatica Server should process single-byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.
a) Unicode - IS allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters)
b) ASCII - IS holds all data in a single byte
The IS data movement mode can be changed in the Informatica Server configuration parameters. The change takes effect once you restart the Informatica Server.
How to read rejected data or bad data from the bad file and reload it to the target?
Correct the rejected data and send it to the target relational tables using the reject loader utility. Find the rejected data by using the column indicator and row indicator.
Explain the Informatica architecture in detail.
The Informatica server connects to source data and target data using native ODBC drivers.
It also connects to the repository for running sessions and retrieving metadata information.
source ---> Informatica server ---> target
                   |
              REPOSITORY
[Diagram: the repository is accessed through the Repository Server, which has its own administration console; the Informatica server moves data from source to target; the client tools are the Designer, Workflow Manager, and Workflow Monitor.]
how can we partition a session in Informatica?
The Informatica® PowerCenter® Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning.
GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option
enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.
What is Load Manager?
While running a Workflow,the PowerCenter Server uses the Load Manager process and the Data
Transformation Manager Process (DTM) to run the workflow and carry out workflow
tasks.When the PowerCenter Server runs a workflow, the Load Manager performs the following
tasks:
1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.
When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled. Checks query
conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer, and transformation threads to extract,transform, and
load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.
What is data cleansing?
The process of finding and removing or correcting data that is incorrect, out-of-date, redundant, incomplete, or formatted incorrectly.
This is nothing but polishing of data. For example, one subsystem may store gender as M and F, while another stores it as MALE and FEMALE. So we need to polish this data and clean it before it is added to the data warehouse. Another typical example is addresses. The subsystems that maintain customer addresses can all differ, so we might need an address cleansing tool to get the customers' addresses into a clean and neat form.
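The gender example can be written as a small Python sketch. The mapping table GENDER_CODES, the function name clean_gender, and the "U" (unknown) default are assumptions made for illustration; a real cleansing rule would follow the warehouse's own standards.

```python
# Hypothetical standard codes: every known spelling maps to M or F.
GENDER_CODES = {"M": "M", "MALE": "M", "F": "F", "FEMALE": "F"}


def clean_gender(raw, default="U"):
    """Standardize a gender value from any subsystem; unknown or missing values get the default."""
    if raw is None:
        return default  # avoids loading a null into a NOT NULL target column
    return GENDER_CODES.get(raw.strip().upper(), default)
```

Applied to values from different subsystems, " male " and "M" both end up as "M", so the warehouse holds one consistent code.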
To provide support for mainframe source data, which files are used as source definitions?
COBOL copybook files
Where should you place the flat file to import the flat file definition into the Designer?
There is no restriction on where to place the source file. From a performance point of view, it is better to place the file in the server's local source folder. If you need the path, check the server properties available in the Workflow Manager.
This does not mean we cannot place it in any other folder. If we place it in the server source folder, that folder is selected by default at session creation time.
In how many ways can you update a relational source definition, and what are they?
Two ways:
1. Edit the definition
2. Reimport the definition
Which transformation do you need while using COBOL sources as source definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.
What is a mapplet?
A mapplet is a set of transformations that you can use in multiple mappings. For example, suppose we have several fact tables that require a series of dimension keys. We can create a mapplet that contains a series of Lookup transformations to find each dimension key, and use it in each fact table mapping instead of creating the same lookup logic in each mapping.
What is a transformation?
It is a repository object that generates, modifies, or passes data. A transformation passes data to the next stage (that is, to the next transformation or target) with or without modifying the data.
What are active and passive transformations?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.
What are reusable transformations?
Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. Later, if you change the definition of the transformation, all instances of it inherit the changes. Since an instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect these changes. This feature can save you a great deal of work.
What are the methods for creating reusable transformations?
Two methods:
1. Design it in the Transformation Developer.
2. Promote a standard transformation from the Mapping Designer. After you add a transformation to the mapping, you can promote it to the status of reusable transformation.
Once you promote a standard transformation to reusable status, you can demote it to a standard transformation at any time.
If you change the properties of a reusable transformation in a mapping, you can revert it to the original reusable transformation properties by clicking the Revert button.
What are the unsupported repository objects for a mapplet?
COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session stored procedures
Target definitions
PowerMart 3.5-style LOOKUP functions
XML source definitions
IBM MQ source definitions
• Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
• Target definitions. Definitions of database objects or files that contain the target data.
• Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
• Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
• Reusable transformations. Transformations that you can use in multiple mappings.
• Mapplets. A set of transformations that you can use in multiple mappings.
• Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.
What are mapping parameters and mapping variables?
A mapping parameter represents a constant value that you can define before running a session. A mapping parameter retains the same value throughout the entire session. When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, and then define the value of the parameter in a parameter file for the session.
Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica server saves the value of the mapping variable to the repository at the end of the session run and uses that value the next time you run the session.
Can you use the mapping parameters or variables created in one mapping in another mapping?
No. We can use mapping parameters or variables in any transformation of the same mapping or mapplet in which they were created.
Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?
Yes, because a reusable transformation is not contained within any mapplet or mapping.
How can you improve session performance in an Aggregator transformation?
Use sorted input:
1. Use a Sorter before the Aggregator.
2. Do not forget to check the option on the Aggregator that tells it the input is sorted on the same keys as the group by.
The key order is also very important.
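Why sorted input helps can be seen in a minimal Python sketch using itertools.groupby from the standard library. It is an analogy rather than Informatica's implementation: when rows arrive sorted on the group-by key, each group can be totalled and emitted as soon as the key changes, so only one group is held at a time. The function name aggregate_sorted and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows):
    """Sum amounts per key for (key, amount) rows that are already sorted by key."""
    # groupby only merges adjacent rows with equal keys, so the input must be sorted
    # on the same key (and in the same key order) as the group by.
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(amount for _, amount in group)


# The sorted() call plays the role of the Sorter placed before the Aggregator.
rows = sorted([("b", 2), ("a", 1), ("b", 3), ("a", 4)], key=itemgetter(0))
totals = list(aggregate_sorted(rows))
```

If the rows were not sorted, the same key would show up in several separate groups, which is why the sort keys and their order must match the group-by ports.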
What is the aggregate cache in an Aggregator transformation?
The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation. If the Informatica server requires more space, it stores overflow values in cache files.
What is the difference between the Joiner transformation and the Source Qualifier transformation?
You can join heterogeneous data sources in a Joiner transformation, which cannot be achieved in a Source Qualifier transformation. You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner transformation, even when they come from different sources.
In which conditions can we not use the Joiner transformation (limitations of the Joiner transformation)?
Both pipelines begin with the same original data source.
Both input pipelines originate from the same Source Qualifier transformation.
Both input pipelines originate from the same Normalizer transformation.
Both input pipelines originate from the same Joiner transformation.
Either input pipeline contains an Update Strategy transformation.
Either input pipeline contains a connected or unconnected Sequence Generator transformation.
What are the settings that you use to configure the Joiner transformation?
• Master and detail source
• Type of join
• Condition of the join
The Joiner transformation supports the following join types, which you set in the Properties tab:
• Normal (Default)
• Master Outer
• Detail Outer
• Full Outer
What are the join types in the Joiner transformation?
Normal (Default) -- only matching rows from both master and detail
Master outer -- all detail rows and only matching rows from master
Detail outer -- all master rows and only matching rows from detail
Full outer -- all rows from both master and detail (matching or non-matching)
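The four join types can be illustrated with a short Python sketch. It is not Informatica's implementation: the function name joiner, the join_type strings, and the use of None for a missing side are assumptions for illustration. The master rows are indexed first and the detail rows are then streamed against that index, which loosely mirrors the master-side caching described earlier.

```python
def joiner(master, detail, key, join_type="normal"):
    """Join lists of dicts on `key`; returns (master_row, detail_row) pairs, None for a missing side."""
    # Build the lookup from the master rows first, then stream the detail rows.
    by_key = {}
    for m in master:
        by_key.setdefault(m[key], []).append(m)

    pairs = []
    matched_master_keys = set()
    for d in detail:
        matches = by_key.get(d[key], [])
        if matches:
            matched_master_keys.add(d[key])
            pairs.extend((m, d) for m in matches)
        elif join_type in ("master_outer", "full_outer"):
            pairs.append((None, d))  # master outer keeps every detail row

    if join_type in ("detail_outer", "full_outer"):
        # detail outer keeps every master row, matched or not
        pairs.extend((m, None) for m in master if m[key] not in matched_master_keys)
    return pairs
```

With master ids 1 and 2 and detail ids 2 and 3, a normal join returns one pair, each outer variant returns two, and the full outer join returns three.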