informatica


informatica: What are the kinds of lookup?

You can configure the Lookup transformation to perform different types of lookups. You can configure the transformation to be connected or unconnected, cached or uncached:

Connected or unconnected.

Connected and unconnected transformations receive input and send output in different ways.

Cached or uncached.

Sometimes you can improve session performance by caching the lookup table. If you cache the lookup table, you can choose to use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session. With a dynamic cache, the Informatica Server inserts or updates rows in the cache during the session. When you cache the target table as the lookup, you can look up values in the target and insert them if they do not exist, or update them if they do.

informatica: Persistent cache and non persistent.

PERSISTENT CACHE - If you want to save and reuse the cache files, you can configure the transformation to use a persistent cache. Use a persistent cache when you know the lookup table does not change between session runs.

The first time the Informatica Server runs a session using a persistent lookup cache, it saves the cache files to disk instead of deleting them. The next time the Informatica Server runs the session, it builds the memory cache from the cache files. If the lookup table changes occasionally, you can override session properties to recache the lookup from the database.

NON-PERSISTENT CACHE - By default, the Informatica Server uses a non-persistent cache when you enable caching in a Lookup transformation. The Informatica Server deletes the cache files at the end of a session. The next time you run the session, the Informatica Server builds the memory cache from the database.

informatica: Dynamic cache?

You might want to configure the transformation to use a dynamic cache when the target table is also the lookup table. When you use a dynamic cache, the Informatica Server updates the lookup cache as it passes rows to the target.

The Informatica Server builds the cache when it processes the first lookup request. It queries the cache based on the lookup condition for each row that passes into the transformation.

When the Informatica Server reads a row from the source, it updates the lookup cache by performing one of the following actions:

Inserts the row into the cache.

Updates the row in the cache.

Makes no change to the cache.
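The three possible actions can be sketched in Python. This is only an illustration of the idea, not how the Informatica Server is implemented: the class DynamicLookupCache, its process_row method, and the sample keys and values are all hypothetical names made up for this example.

```python
from typing import Dict, Hashable, Tuple


class DynamicLookupCache:
    """Hypothetical in-memory model of a dynamic lookup cache."""

    def __init__(self, initial_rows: Dict[Hashable, Tuple]):
        # Built once, from the lookup (target) table, on the first lookup request.
        self.rows = dict(initial_rows)

    def process_row(self, key: Hashable, values: Tuple) -> str:
        """Apply one source row and report which action was taken."""
        if key not in self.rows:
            self.rows[key] = values   # row is new: insert it into the cache
            return "insert"
        if self.rows[key] != values:
            self.rows[key] = values   # row exists but differs: update the cache
            return "update"
        return "no change"            # row exists and is identical


cache = DynamicLookupCache({101: ("Ann", "Pune")})
actions = [
    cache.process_row(101, ("Ann", "Pune")),    # already cached, same values
    cache.process_row(101, ("Ann", "Mumbai")),  # cached, values differ
    cache.process_row(102, ("Raj", "Delhi")),   # not cached yet
]
print(actions)
```

Each call returns the action that a dynamic cache would take for that row, so the cache stays in step with the target while rows are still being processed.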


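informatica: Difference between Filter and Source Qualifier?

The Filter transformation filters rows inside the mapping, after the Informatica Server has read them, while the Source Qualifier acts when the Informatica Server reads source data. You can use the Source Qualifier to perform the following tasks: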

Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.

Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.

Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.

Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.

Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.

Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.
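As a rough sketch of how the filter, sorted ports, and Select Distinct options change the default query, the Python below assembles a SELECT statement from those settings. The function build_source_query, its parameters, and the EMPLOYEES table are hypothetical, and the choice to sort on the first N columns is an assumption of this sketch, not a statement of the exact SQL the Informatica Server emits.

```python
from typing import List, Optional


def build_source_query(
    table: str,
    columns: List[str],
    source_filter: Optional[str] = None,
    sorted_ports: int = 0,
    select_distinct: bool = False,
) -> str:
    """Compose a SELECT the way the options above modify a default query."""
    keyword = "SELECT DISTINCT" if select_distinct else "SELECT"
    query = f"{keyword} {', '.join(columns)} FROM {table}"
    if source_filter:
        # A filter condition becomes a WHERE clause.
        query += f" WHERE {source_filter}"
    if sorted_ports > 0:
        # Assumption: sort on the first N columns, in port order.
        query += " ORDER BY " + ", ".join(columns[:sorted_ports])
    return query


query = build_source_query(
    "EMPLOYEES",
    ["EMP_ID", "DEPT_ID", "SALARY"],
    source_filter="SALARY > 50000",
    sorted_ports=2,
    select_distinct=True,
)
print(query)
```

Because the filtering happens in the generated SQL, rows that fail the condition are never read into the mapping at all.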

informatica: What is the Data Transformation Manager process? How many threads does it create to process data? Explain each thread in brief.

When the workflow reaches a session, the Load Manager starts the DTM process. The DTM process is the process associated with the session task. The Load Manager creates one DTM process for each session in the workflow. The DTM process performs the following tasks:

Reads session information from the repository.

Expands the server and session variables and parameters.

Creates the session log file.

Validates source and target code pages.

Verifies connection object permissions.

Runs pre-session shell commands, stored procedures and SQL.

Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.

Runs post-session stored procedures, SQL, and shell commands.

Sends post-session email.

The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. The default memory allocation is 12,000,000 bytes. The DTM uses multiple threads to process data. The main DTM thread is called the master thread.

The master thread creates and manages other threads. The master thread for a session can create mapping, pre-session, post-session, reader, transformation, and writer threads.

Mapping Thread - One thread for each session. Fetches session and mapping information. Compiles the mapping. Cleans up after session execution.

Pre- and Post-Session Threads - One thread each to perform pre- and post-session operations.

Reader Thread - One thread for each partition for each source pipeline. Reads from sources. Relational sources use relational reader threads, and file sources use file reader threads.

Transformation Thread - One or more transformation threads for each partition. Processes data according to the transformation logic in the mapping.

Writer Thread - One thread for each partition, if a target exists in the source pipeline. Writes to targets. Relational targets use relational writer threads, and file targets use file writer threads.

What are indicator files?


Ans: If you use a flat file as a target, you can configure the Informatica Server to create an indicator file for target row type information. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject. The Informatica Server names this file target_name.ind and stores it in the same directory as the target file.

To configure it, go to Informatica Server Setup, open the Configuration tab, and click Indicator File Settings.

Suppose a session is configured with a commit interval of 10,000 rows and the source has 50,000 rows. Explain the commit points for source-based commit and target-based commit. Assume appropriate values wherever required.

a) For example, a session is configured with a target-based commit interval of 10,000. The writer buffers fill every 7,500 rows. When the Informatica Server reaches the commit interval of 10,000, it continues processing data until the writer buffer is filled. The second buffer fills at 15,000 rows, and the Informatica Server issues a commit to the target. If the session completes successfully, the Informatica Server issues commits after 15,000, 22,500, 30,000, and 40,000 rows. The final commit at 40,000 rows does not fall on a 7,500-row buffer boundary, which implies that this example assumes a 40,000-row source and that the last commit is issued at the end of the data.

b) The Informatica Server might commit fewer rows to the target than the number of rows produced by the active source. For example, you have a source-based commit session that passes 10,000 rows through an active source, and 3,000 rows are dropped due to transformation logic. The Informatica Server issues a commit to the target when the 7,000 remaining rows reach the target.

The number of rows held in the writer buffers does not affect the commit point for a source-based commit session. For example, you have a source-based commit session that passes 10,000 rows through an active source. When those 10,000 rows reach the targets, the Informatica Server issues a commit. If the session completes successfully, the Informatica Server issues commits after 10,000, 20,000, 30,000, and 40,000 source rows.
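The two commit rules can be compared with a small simulation. This is a sketch under stated assumptions, not the server's actual algorithm: the function names are hypothetical, the writer buffer is assumed to fill every 7,500 rows as in the example, commit thresholds are assumed to be cumulative multiples of the interval, and a final commit is assumed at the end of the data.

```python
from typing import List


def target_based_commit_points(total_rows: int, commit_interval: int,
                               buffer_rows: int) -> List[int]:
    """Commit at the first full writer buffer at or past each interval multiple."""
    points: List[int] = []
    next_threshold = commit_interval
    written = 0
    while written + buffer_rows <= total_rows:
        written += buffer_rows                 # one more writer buffer has filled
        if written >= next_threshold:
            points.append(written)             # commit when the buffer is full
            while next_threshold <= written:   # move to the next interval multiple
                next_threshold += commit_interval
    if not points or points[-1] != total_rows:
        points.append(total_rows)              # assumed final commit at end of data
    return points


def source_based_commit_points(total_rows: int, commit_interval: int) -> List[int]:
    """Commit every commit_interval source rows, regardless of buffer size."""
    points = list(range(commit_interval, total_rows + 1, commit_interval))
    if not points or points[-1] != total_rows:
        points.append(total_rows)              # assumed final commit at end of data
    return points


target_points = target_based_commit_points(50_000, 10_000, 7_500)
source_points = source_based_commit_points(50_000, 10_000)
print("target-based:", target_points)
print("source-based:", source_points)
```

Under these assumptions, the 50,000-row source in the question gives target-based commits after 15,000, 22,500, 30,000, 45,000, and 50,000 rows, and source-based commits after every 10,000 source rows up to 50,000. With a 40,000-row source the same functions reproduce the commit points quoted in the examples.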

How to capture performance statistics of individual transformation in the mapping and explain some important statistics that can be captured?


Ans: a) Before using performance details to improve session performance you must do the following:

Enable monitoring.

Increase Load Manager shared memory.

Understand performance counters.

To view performance details in the Workflow Monitor:

While the session is running, right-click the session in the Workflow Monitor and choose Properties.

Click the Performance tab in the Properties dialog box.

Click OK.

To view the performance details file:

Locate the performance details file. The Informatica Server names the file session_name.perf, and stores it in the same directory as the session log. If there is no session-specific directory for the session log, the Informatica Server saves the file in the default log files directory.

Open the file in any text editor.

b) Source Qualifier and Normalizer Transformations.

BufferInput_efficiency - Percentage reflecting how seldom the reader waited for a free buffer when passing data to the DTM.

BufferOutput_efficiency - Percentage reflecting how seldom the DTM waited for a full buffer of data from the reader.

Target

BufferInput_efficiency - Percentage reflecting how seldom the DTM waited for a free buffer when passing data to the writer.

BufferOutput_efficiency - Percentage reflecting how seldom the Informatica Server waited for a full buffer of data from the writer.

For Source Qualifiers and targets, a high value is considered 80-100 percent. Low is considered 0-20 percent. However, any dramatic difference in a given set of BufferInput_efficiency and BufferOutput_efficiency counters indicates inefficiencies that may benefit from tuning.

Posted by Emmanuel at 4:31 PM

informatica:

Ans: Load manager is the primary Informatica server process. It performs the following tasks:

a. Manages sessions and batch scheduling.

b. Locks the sessions and reads properties.

c. Reads parameter files.

d. Expands the server and session variables and parameters.

e. Verifies permissions and privileges.

f. Validates sources and targets code pages.

g. Creates session log files.

h. Creates Data Transformation Manager (DTM) process, which executes the session.

Assume you have access to server.

When you run a session, the Informatica Server writes a message in the session log indicating the cache file name and the transformation name. When a session completes, the Informatica Server typically deletes index and data cache files. However, you may find index and data files in the cache directory under the following circumstances:

The session performs incremental aggregation.

You configure the Lookup transformation to use a persistent cache.

The session does not complete successfully.

Table 21-2 shows the naming convention for cache files that the Informatica Server creates:

Table 21-2. Cache File Naming Convention

Transformation Type    Index File Name    Data File Name
Aggregator             PMAGG*.idx         PMAGG*.dat
Rank                   PMAGG*.idx         PMAGG*.dat
Joiner                 PMJNR*.idx         PMJNR*.dat
Lookup                 PMLKP*.idx         PMLKP*.dat

If a cache file handles more than 2 GB of data, the Informatica Server creates multiple index and data files. When creating these files, the Informatica Server appends a number to the end of the filename, such as PMAGG*.idx1 and PMAGG*.idx2. The number of index and data files is limited only by the amount of disk space available in the cache directory.

How to achieve referential integrity through Informatica?

Using the Normalizer transformation, you break out repeated data within a record into separate records. For each new record it creates, the Normalizer transformation generates a unique identifier. You can use this key value to join the normalized records.

It is also possible in the Source Analyzer:

Source Analyzer - table1 (PK table) - Edit - Ports - Key Type - select Primary Key.

table2 (FK table) - Edit - Ports - Key Type - select Foreign Key, then select the table name and column name from the options below.

What is Incremental Aggregation and how should it be used?

If the source changes only incrementally and you can capture changes, you can configure the session to process only those changes. This allows the Informatica Server to update your target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time you run the session. Therefore, only use incremental aggregation if:

Your mapping includes an aggregate function.

The source changes only incrementally.

You can capture incremental changes. You might do this by filtering source data by timestamp.

Before implementing incremental aggregation, consider the following issues:

Whether it is appropriate for the session

When to reinitialize the aggregate caches
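The idea of processing only the captured changes can be illustrated with a short Python sketch. It is a simplified model, not the Informatica Server's cache format: the function incremental_aggregate, the Row layout, and the region names are hypothetical, and a sum per group stands in for whatever aggregate function the mapping uses.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

Row = Tuple[str, float, datetime]  # (group key, amount, load timestamp)


def incremental_aggregate(
    cache: Dict[str, float],
    source_rows: Iterable[Row],
    last_run: datetime,
) -> Dict[str, float]:
    """Add only rows newer than last_run to the saved per-group totals."""
    totals = defaultdict(float, cache)
    for key, amount, loaded_at in source_rows:
        if loaded_at > last_run:       # timestamp filter captures the changes
            totals[key] += amount
    return dict(totals)


saved_cache = {"EAST": 100.0, "WEST": 40.0}   # totals kept from earlier runs
rows = [
    ("EAST", 25.0, datetime(2024, 1, 2)),     # new since the last run
    ("WEST", 40.0, datetime(2023, 12, 30)),   # already aggregated earlier
    ("NORTH", 5.0, datetime(2024, 1, 3)),     # new group
]
updated = incremental_aggregate(saved_cache, rows, last_run=datetime(2024, 1, 1))
print(updated)
```

Reinitializing the aggregate caches corresponds here to starting again from an empty cache and reading the whole source.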

Scenario: The Informatica Server and Client are on different machines. You run a session from the Server Manager by specifying the source and target databases. It displays an error. You are confident that everything is correct. Then why is it displaying the error?

The connect strings for the source and target databases are not configured on the workstation containing the server, though they may be on the client machine.

Have you created parallel sessions? How do you create parallel sessions?

You can improve performance by creating a concurrent batch to run several sessions in parallel on one Informatica Server. If you have several independent sessions using separate sources and separate mappings to populate different targets, you can place them in a concurrent batch and run them at the same time. If you have a complex mapping with multiple sources, you can separate the mapping into several simpler mappings with separate sources. Similarly, if you have a session performing a minimal number of transformations on large amounts of data, such as moving flat files to a staging area, you can separate the session into multiple sessions and run them concurrently in a batch, cutting the total run time dramatically.

What is Data Transformation Manager?

Ans. After the load manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run. The primary purpose of the DTM process is to create and manage threads that carry out the session tasks.

The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. It creates the main thread, which is called the master thread. The master thread creates and manages all other threads.

If we partition a session, the DTM creates a set of threads for each partition to allow concurrent processing. When the Informatica Server writes messages to the session log, it includes the thread type and thread ID. Following are the types of threads that the DTM creates:

• MASTER THREAD - Main thread of the DTM process. Creates and manages all other threads.

• MAPPING THREAD - One thread for each session. Fetches session and mapping information.

• PRE- AND POST-SESSION THREADS - One thread each to perform pre- and post-session operations.

• READER THREAD - One thread for each partition for each source pipeline.

• WRITER THREAD - One thread for each partition, if a target exists in the source pipeline. Writes to the target.

• TRANSFORMATION THREAD - One or more transformation threads for each partition.

How is the Sequence Generator transformation different from other transformations?

Ans: The Sequence Generator is unique among all transformations because we cannot add, edit, or delete its default ports (NEXTVAL and CURRVAL).

Unlike other transformations we cannot override the Sequence Generator transformation properties at the session level. This protects the integrity of the sequence values generated.

What are the advantages of Sequence generator? Is it necessary, if so why?

Ans: We can make a Sequence Generator reusable, and use it in multiple mappings. We might reuse a Sequence Generator when we perform multiple loads to a single target.

For example, if we have a large input file that we separate into three sessions running in parallel, we can use a Sequence Generator to generate primary key values. If we use different Sequence Generators, the Informatica Server might accidentally generate duplicate key values. Instead, we can use the same reusable Se

What are the uses of a Sequence Generator transformation?

Ans:

We can perform the following tasks with a Sequence Generator transformation:

o Create keys

o Replace missing values

o Cycle through a sequential range of numbers
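The three uses can be mimicked with a Python generator. This is an analogy only: sequence_generator and its start, increment, end, and cycle parameters are hypothetical names for this sketch and are not the transformation's actual properties.

```python
from typing import Iterator, Optional


def sequence_generator(start: int = 1, increment: int = 1,
                       end: Optional[int] = None,
                       cycle: bool = False) -> Iterator[int]:
    """Yield sequential values, optionally wrapping back to start after end."""
    value = start
    while True:
        if end is not None and value > end:
            if not cycle:
                return          # range exhausted and cycling is off
            value = start       # wrap around to the start of the range
        yield value
        value += increment


# Create keys: one shared generator feeds two loads, so keys never collide.
shared = sequence_generator()
session_a = [next(shared) for _ in range(3)]
session_b = [next(shared) for _ in range(3)]

# Replace missing values: fill gaps in an ID column from a separate range.
filler = sequence_generator(start=1000)
ids = [7, None, 9, None]
filled = [i if i is not None else next(filler) for i in ids]

# Cycle through a sequential range of numbers.
cycling = sequence_generator(start=1, end=3, cycle=True)
cycled = [next(cycling) for _ in range(7)]

print(session_a, session_b, filled, cycled)
```

Sharing one generator between the two loads is what keeps the keys unique; two independent generators started from the same value would hand out the same keys twice.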

What are connected and unconnected Lookup transformations?

Ans: We can configure a connected Lookup transformation to receive input directly from the mapping pipeline, or we can configure an unconnected Lookup transformation to receive input from the result of an expression in another transformation.

An unconnected Lookup transformation exists separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation.

A common use for unconnected Lookup transformations is to update slowly changing dimension tables.

What is the difference between connected lookup and unconnected lookup?

Ans:

Differences between Connected and Unconnected Lookups:

Connected Lookup | Unconnected Lookup

Receives input values directly from the pipeline. | Receives input values from the result of a :LKP expression in another transformation.

We can use a dynamic or static cache. | We can use a static cache.

Supports user-defined default values. | Does not support user-defined default values.

What is a Lookup transformation and what are its uses?

Ans: We use a Lookup transformation in our mapping to look up data in a relational table, view or synonym.

We can use the Lookup transformation for the following purposes:

Get a related value. For example, our source table includes the employee ID, but we want to include the employee name in our target table to make our summary data easier to read.

Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).

Update slowly changing dimension tables. We can use a Lookup transformation to determine whether records already exist in the target.

What is a lookup table? (KPIT Infotech, Pune)

Ans: The lookup table can be a single table, or we can join multiple tables in the same database using a lookup query override. The Informatica Server queries the lookup table or an in-memory cache of the table for all incoming rows into the Lookup transformation.

If our mapping includes heterogeneous joins, we can use any of the mapping sources or mapping targets as the lookup table.


informatica : Where do you define update strategy?

Ans: We can set the Update strategy at two different levels:

• Within a session. When you configure a session, you can instruct the Informatica Server to either treat all records in the same way (for example, treat all records as inserts), or use instructions coded into the session mapping to flag records for different database operations.

• Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.

What is Update Strategy?

When we design our data warehouse, we need to decide what type of information to store in targets. As part of our target table design, we need to determine whether to maintain all the historic data or just the most recent changes.

The model we choose constitutes our update strategy: how to handle changes to existing records.

Update strategy flags a record for update, insert, delete, or reject. We use this transformation when we want to exert fine control over updates to a target, based on some condition we apply. For example, we might use the Update Strategy transformation to flag all customer records for update when the mailing address has changed, or flag all employee records for reject for people no longer working for the company.
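The flagging logic in these examples might look like the following Python sketch. The RowFlag names mirror the DD_INSERT, DD_UPDATE, DD_DELETE, and DD_REJECT constants used in Update Strategy expressions, but the function flag_customer, the record layout, and the decision to reject unchanged rows are assumptions made only for this illustration.

```python
from enum import IntEnum
from typing import Dict


class RowFlag(IntEnum):
    """Row flags, named after the Update Strategy constants."""
    DD_INSERT = 0
    DD_UPDATE = 1
    DD_DELETE = 2
    DD_REJECT = 3


def flag_customer(row: Dict[str, object],
                  known_addresses: Dict[int, str]) -> RowFlag:
    """Decide how one incoming customer row should be applied to the target."""
    if not row["active"]:
        return RowFlag.DD_REJECT     # no longer with the company: reject
    customer_id = row["id"]
    if customer_id not in known_addresses:
        return RowFlag.DD_INSERT     # not in the target yet: insert
    if known_addresses[customer_id] != row["address"]:
        return RowFlag.DD_UPDATE     # mailing address has changed: update
    return RowFlag.DD_REJECT         # unchanged: nothing to write (sketch's choice)


target_addresses = {1: "12 Hill Rd", 2: "9 Lake St"}
incoming = [
    {"id": 1, "address": "40 Park Ave", "active": True},
    {"id": 2, "address": "9 Lake St", "active": False},
    {"id": 3, "address": "5 Mill Ln", "active": True},
]
flags = [flag_customer(row, target_addresses) for row in incoming]
print([flag.name for flag in flags])
```

The point of flagging per row is that a single load can mix inserts, updates, and rejects, which a session-level "treat all records as inserts" setting cannot do.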

What are the different types of Transformations? (Mascot)

Ans: a) Aggregator transformation: The Aggregator transformation allows you to perform aggregate calculations, such as averages and sums. The Aggregator transformation is unlike the Expression transformation, in that you can use the Aggregator transformation to perform calculations on groups. The Expression transformation permits you to perform calculations on a row-by-row basis only.

b) Expression transformation: You can use the Expression transformations to calculate values in a single row before you write to the target. For example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers. You can use the Expression transformation to perform any non-aggregate calculations. You can also use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.

c) Filter transformation: The Filter transformation provides the means for filtering rows in a mapping. You pass all the rows from a source transformation through the Filter transformation, and then enter a filter condition for the transformation. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through the Filter transformation.

d) Joiner transformation: While a Source Qualifier transformation can join data originating from a common source database, the Joiner transformation joins two related heterogeneous sources residing in different locations or file systems.

e) Lookup transformation: Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. Import a lookup definition from any relational database to which both the Informatica Client and Server can connect. You can use multiple Lookup transformations in a mapping.

The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. Use the result of the lookup to pass to other transformations and the target.

What is a transformation?

A transformation is a repository object that generates, modifies, or passes data. You configure logic in a transformation that the Informatica Server uses to transform data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.

Each transformation has rules for configuring and connecting in a mapping. For more information about working with a specific transformation, refer to the chapter in this book that discusses that particular transformation.

You can create transformations to use once in a mapping, or you can create reusable transformations to use in multiple mappings.

What are the tools provided by Designer?

Ans: The Designer provides the following tools:

• Source Analyzer. Use to import or create source definitions for flat file, XML, Cobol, ERP, and relational sources.

• Warehouse Designer. Use to import or create target definitions.

• Transformation Developer. Use to create reusable transformations.

• Mapplet Designer. Use to create mapplets.

• Mapping Designer. Use to create mappings.

What are the different types of Commit intervals?

Ans: The different commit intervals are:

• Target-based commit. The Informatica Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.

• Source-based commit. The Informatica Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.

What is Event-Based Scheduling?


Ans:

When you use event-based scheduling, the Informatica Server starts a session when it locates the specified indicator file. To use event-based scheduling, you need a shell command, script, or batch file to create an indicator file when all sources are available. The file must be created or sent to a directory local to the Informatica Server. The file can be of any format recognized by the Informatica Server operating system. The Informatica Server deletes the indicator file once the session starts.
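The waiting behaviour can be pictured with a small polling loop in Python. This is a stand-in for the idea, not the Informatica Server's scheduler: the function wait_for_indicator, the file name sources_ready.ind, the session name, and the polling and timeout values are all hypothetical.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable


def wait_for_indicator(indicator: Path, start_session: Callable[[], None],
                       poll_seconds: float = 0.1,
                       timeout_seconds: float = 5.0) -> bool:
    """Poll for the indicator file; start the session and delete the file."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if indicator.exists():
            start_session()      # the session starts when the file is found
            indicator.unlink()   # the indicator file is deleted once it starts
            return True
        time.sleep(poll_seconds)
    return False                 # gave up: the file never appeared


started = []
with tempfile.TemporaryDirectory() as folder:
    flag = Path(folder) / "sources_ready.ind"
    flag.touch()                 # stands in for the upstream shell script
    ran = wait_for_indicator(flag, lambda: started.append("s_load_orders"))
    still_there = flag.exists()

print(ran, started, still_there)
```

In practice the upstream job would create the file only after every source it is responsible for has arrived, so the session never starts on partial data.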

Use the following syntax to ping the Informatica Server on a UNIX system:

pmcmd ping [{user_name | %user_env_var} {password | %password_env_var}] [hostname:]portno

Use the following syntax to start a session or batch on a UNIX system:

pmcmd start {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno [folder_name:]{session_name | batch_name} [:pf=param_file] session_flag wait_flag

Use the following syntax to stop a session or batch on a UNIX system:

pmcmd stop {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno [folder_name:]{session_name | batch_name} session_flag

Use the following syntax to stop the Informatica Server on a UNIX system:

pmcmd stopserver {user_name | %user_env_var} {password | %password_env_var} [hostname:]portno

What are the different types of locks?

There are five kinds of locks on repository objects:

• Read lock. Created when you open a repository object in a folder for which you do not have write permission. Also created when you open an object with an existing write lock.

• Write lock. Created when you create or edit a repository object in a folder for which you have write permission.

• Execute lock. Created when you start a session or batch, or when the Informatica Server starts a scheduled session or batch.

• Fetch lock. Created when the repository reads information about repository objects from the database.

• Save lock. Created when you save information to the repository.

What is Dynamic Data Store?

The need to share data is just as pressing as the need to share metadata. Often, several data marts in the same organization need the same information. For example, several data marts may need to read the same product data from operational sources, perform the same profitability calculations, and format this information to make it easy to review.

If each data mart reads, transforms, and writes this product data separately, the throughput for the entire organization is lower than it could be. A more efficient approach would be to read, transform, and write the data to one central data store shared by all data marts. Transformation is a processing-intensive task, so performing the profitability calculations once saves time.

Therefore, this kind of dynamic data store (DDS) improves throughput at the level of the entire organization, including all data marts. To improve performance further, you might want to capture incremental changes to sources. For example, rather than reading all the product data each time you update the DDS, you can improve performance by capturing only the inserts, deletes, and updates that have occurred in the PRODUCTS table since the last time you updated the DDS.

The DDS has one additional advantage beyond performance: when you move data into the DDS, you can format it in a standard fashion. For example, you can prune sensitive employee data that should not be stored in any data mart. Or you can display date and time values in a standard format. You can perform these and other data cleansing tasks when you move data into the DDS instead of performing them repeatedly in separate data marts.

What are Target definitions?

Detailed descriptions for database objects, flat files, Cobol files, or XML files to receive transformed data. During a session, the Informatica Server writes the resulting data to session targets. Use the Warehouse Designer tool in the Designer to import or create target definitions.

What are Source definitions?

Detailed descriptions of database objects (tables, views, synonyms), flat files, XML files, or Cobol files that provide source data. For example, a source definition might be the complete structure of the EMPLOYEES table, including the table name, column names and datatypes, and any constraints applied to these columns, such as NOT NULL or PRIMARY KEY. Use the Source Analyzer tool in the Designer to import and create source definitions.

What are fact tables and dimension tables?

Data in a warehouse comes from transactions. A fact table in a data warehouse consists of facts and/or measures. The data in a fact table is usually numerical.

On the other hand, a dimension table in a data warehouse contains fields used to describe the data in fact tables. A dimension table provides additional descriptive information (a dimension) about a field of a fact table.

For example, if I want to know the number of resources used for a task, my fact table will store the actual measure (of resources), while my dimension table will store the task and resource details. Hence, the relation between a dimension table and a fact table is one to many: one dimension row can describe many fact rows.
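The task and resource example can be laid out as a tiny fact and dimension pair in Python. The table contents, column names, and task names below are invented purely to show the one-to-many relationship.

```python
from collections import defaultdict

# Dimension table: one descriptive row per task.
task_dim = {
    1: {"task_name": "Design", "owner": "Ann"},
    2: {"task_name": "Build", "owner": "Raj"},
}

# Fact table: numeric measures, each row pointing at one dimension row.
resource_facts = [
    {"task_id": 1, "resources_used": 3},
    {"task_id": 1, "resources_used": 2},
    {"task_id": 2, "resources_used": 5},
]

# Answer "how many resources were used per task?" by joining fact to dimension.
resources_per_task = defaultdict(int)
for fact in resource_facts:
    task_name = task_dim[fact["task_id"]]["task_name"]
    resources_per_task[task_name] += fact["resources_used"]

summary = dict(resources_per_task)
print(summary)
```

Task 1 appears once in the dimension but twice in the fact table, which is the one-to-many relationship described above.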

When should you create the dynamic data store? Do you need a DDS at all?

To decide whether you should create a dynamic data store (DDS), consider the following issues:

• How much data do you need to store in the DDS? One principal advantage of data marts is the selectivity of the information they include. Instead of a copy of everything potentially relevant from the OLTP database and flat files, data marts contain only the information needed to answer specific questions for a specific audience (for example, sales performance data used by the sales division). A dynamic data store is a hybrid of the galactic warehouse and the individual data mart, since it includes all the data needed for all the data marts it supplies. If the dynamic data store contains nearly as much information as the OLTP source, you might not need the intermediate step of the dynamic data store. However, if the dynamic data store includes substantially less than all the data in the source databases and flat files, you should consider creating a DDS staging area.

• What kind of standards do you need to enforce in your data marts? Creating a DDS is an important technique in enforcing standards. If data marts depend on the DDS for information, you can provide that data in the range and format you want everyone to use. For example, if you want all data marts to include the same information on customers, you can put all the data needed for this standard customer profile in the DDS. Any data mart that reads customer data from the DDS should include all the information in this profile.


• How often do you update the contents of the DDS? If you plan to frequently update data in data marts, you need to update the contents of the DDS at least as often as you update the individual data marts that the DDS feeds. You may find it easier to read data directly from source databases and flat file systems if it becomes burdensome to update the DDS fast enough to keep up with the needs of individual data marts. Or, if particular data marts need updates significantly faster than others, you can bypass the DDS for these fast update data marts.

• Is the data in the DDS simply a copy of data from source systems, or do you plan to reformat this information before storing it in the DDS? One advantage of the dynamic data store is that, if you plan on reformatting information in the same fashion for several data marts, you only need to format it once for the dynamic data store. Part of this question is whether you keep the data normalized when you copy it to the DDS.

• How often do you need to join data from different systems? On occasion, you may need to join records queried from different databases or read from different flat file systems. The more frequently you need to perform this type of heterogeneous join, the more advantageous it would be to perform all such joins within the DDS, then make the results available to all data marts that use the DDS as a source.

What is the difference between PowerCenter and PowerMart?

With PowerCenter, you receive all product functionality, including the ability to register multiple servers, share metadata across repositories, and partition data.

A PowerCenter license lets you create a single repository that you can configure as a global repository, the core component of a data warehouse.

PowerMart includes all features except distributed metadata, multiple registered servers, and data partitioning. Also, the various options available with PowerCenter (such as PowerCenter Integration Server for BW, PowerConnect for IBM DB2, PowerConnect for IBM MQSeries, PowerConnect for SAP R/3, PowerConnect for Siebel, and PowerConnect for PeopleSoft) are not available with PowerMart.

What are Shortcuts?

Informatica: What are Shortcuts?

We can create shortcuts to objects in shared folders. Shortcuts provide the easiest way to reuse objects. We use a shortcut as if it were the actual object, and when we make a change to the original object, all shortcuts inherit the change.

repository are called global shortcuts. We use the Designer to create shortcuts.

What are Sessions and Batches?

informatica: What are Sessions and Batches?

Sessions and batches store information about how and when the Informatica Server moves data through mappings. You create a session for each mapping you want to run. You can group several sessions together in a batch. Use the Server Manager to create sessions and batches.

What are Reusable transformations?

Informatica: What are Reusable transformations?

You can design a transformation to be reused in multiple mappings within a folder, a repository, or a domain. Rather than recreate the same transformation each time, you can make the transformation reusable, then add instances of the transformation to individual mappings. Use the Transformation Developer tool in the Designer to create reusable transformations.

What is metadata?

Designing a data mart involves writing and storing a complex set of instructions. You need to know where to get data (sources), how to change it, and where to write the information (targets). PowerMart and PowerCenter call this set of instructions metadata. Each piece of metadata (for example, the description of a source table in an operational database) can contain comments about it.

In summary, metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

What is an ER Diagram?

ER stands for entity-relationship. An ER diagram is the first step in designing a data model, which later leads to a physical database design for, possibly, an OLTP or OLAP database.

What are Data Marts?

A data mart is a segment of a data warehouse that can provide data for reporting and analysis on a section, unit, department or operation in the company, e.g. sales, payroll, production. Data marts are sometimes complete individual data warehouses, which are usually smaller than the corporate data warehouse.

Datawarehousing informatica interview questions 1

17. What is inline view?

18. What is Dataware house key?

19. Is it possible to execute work flows in different repositories at the same time using the same informatica

20. Which transformation replaces the look up transformation?

21. How to parse characters using functions in the expression transformation. For example if a column has character like mgr=a. I have to parse the character 'mgr='. Which function should I use?

22. How to use incremental aggregation in real time?

23. How do you import 500 files from Tally software into Informatica?

24. What steps do you follow in performance tuning?

25. What is an ODS? What is the purpose of an ODS? Is that a logical database that stores extracted data from source

26. What is session recovery?

27. We can insert or update rows without using the Update Strategy. Then why is the Update Strategy necessary?

28. What is an associated port in a lookup?

generally used with informatica.

30. Which one performs better, Joiner or Lookup?

A short description of materialized view

A Materialized View is effectively a database table that contains the results of a query. The power of materialized views comes from the fact that, once created, Oracle can automatically synchronize a materialized view's data with its source information as required with little or no programming effort.

Materialized views can be used for many purposes, including:

• Denormalization

• Validation

• Data Warehousing

• Replication.

Starting with Oracle 8.1.5, introduced in March 1999, you can have a materialized view, also known as a summary. Like a regular view, a materialized view can be used to build a black-box abstraction for the programmer. In other words, the view might be created with a complicated JOIN, or an expensive GROUP BY with sums and averages. With a regular view, this expensive operation would be done every time you issued a query. With a materialized view, the expensive operation is done when the view is created and thus an individual query need not involve substantial computation.

Materialized views consume space because Oracle is keeping a copy of the data or at least a copy of information derivable from the data. More importantly, a materialized view does not contain up-to-the-minute information. When you query a regular view, your results include changes made up to the last committed transaction before your SELECT. When you query a materialized view, you're getting results as of the time that the view was created or refreshed. Note that Oracle lets you specify a refresh interval at which the materialized view will automatically be refreshed.

At this point, you'd expect an experienced Oracle user to say "Hey, these aren't new. This is the old CREATE SNAPSHOT facility that we used to keep semi-up-to-date copies of tables on machines across the network!" What is new with materialized views is that you can create them with the ENABLE QUERY REWRITE option. This authorizes the SQL parser to look at a query involving aggregates or JOINs and go to the materialized view instead. Consider the following query, from the ArsDigita Community System's /admin/users/registration-history.tcl page:

select
  to_char(registration_date,'YYYYMM') as sort_key,
  rtrim(to_char(registration_date,'Month')) as pretty_month,
  to_char(registration_date,'YYYY') as pretty_year,
  count(*) as n_new
from users
group by
  to_char(registration_date,'YYYYMM'),
  to_char(registration_date,'Month'),
  to_char(registration_date,'YYYY')
order by 1;

SORT_K PRETTY_MO PRET      N_NEW
------ --------- ---- ----------
199805 May       1998        898
199806 June      1998        806
199807 July      1998        972
199808 August    1998        849
199809 September 1998       1023
199810 October   1998       1089
199811 November  1998       1005
199812 December  1998       1059
199901 January   1999       1488
199902 February  1999       2148

For each month, we have a count of how many users registered at photo.net. To execute the query, Oracle must sequentially scan the users table. If the users table grew large and you wanted the query to be instant, you'd sacrifice some timeliness in the stats with

create materialized view users_by_month
enable query rewrite
refresh complete
start with 1999-03-28
next sysdate + 1
as
select
  to_char(registration_date,'YYYYMM') as sort_key,
  rtrim(to_char(registration_date,'Month')) as pretty_month,
  to_char(registration_date,'YYYY') as pretty_year,
  count(*) as n_new
from users
group by
  to_char(registration_date,'YYYYMM'),
  to_char(registration_date,'Month'),
  to_char(registration_date,'YYYY')
order by 1

Oracle will build this view just after midnight on March 28, 1999. The view will be refreshed every 24 hours after that. Because of the enable query rewrite clause, Oracle will feel free to grab data from the view even when a user's query does not mention the view. For example, given the query

select count(*)
from users
where rtrim(to_char(registration_date,'Month')) = 'January'
and to_char(registration_date,'YYYY') = '1999'

Oracle would ignore the users table altogether and pull information from users_by_month. This would give the same result with much less work. Suppose that the current month is March 1999, though. The query

select count(*)
from users
where rtrim(to_char(registration_date,'Month')) = 'March'
and to_char(registration_date,'YYYY') = '1999'

will also hit the materialized view rather than the users table and hence will miss anyone who has registered since midnight (i.e., the query rewriting will cause a different result to be returned).
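The staleness described above can be reproduced in miniature. The sketch below is an illustration only: it uses Python's built-in sqlite3 module, which has no materialized views or query rewrite, so the summary is emulated with an ordinary table that is rebuilt on demand. The function names, the sample dates, and the use of strftime in place of Oracle's to_char are all assumptions made for the example.

```python
import sqlite3

# Query that the summary table snapshots (SQLite strftime stands in for to_char).
MONTHLY_COUNTS = """
    select strftime('%Y%m', registration_date) as sort_key, count(*) as n_new
    from users
    group by strftime('%Y%m', registration_date)
"""

def build_demo():
    """Create an in-memory users table with a few hypothetical registrations."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table users (user_id integer primary key, registration_date text)"
    )
    conn.executemany(
        "insert into users (registration_date) values (?)",
        [("1999-01-05",), ("1999-01-20",), ("1999-03-02",)],
    )
    return conn

def refresh_summary(conn):
    """Emulate 'refresh complete': drop the summary and rebuild it from users."""
    conn.execute("drop table if exists users_by_month")
    conn.execute("create table users_by_month as " + MONTHLY_COUNTS)

def count_from_summary(conn, sort_key):
    """Answer from the summary table, as a rewritten query would."""
    row = conn.execute(
        "select n_new from users_by_month where sort_key = ?", (sort_key,)
    ).fetchone()
    return row[0] if row else 0

def count_from_base(conn, sort_key):
    """Answer from the users table directly."""
    return conn.execute(
        "select count(*) from users where strftime('%Y%m', registration_date) = ?",
        (sort_key,),
    ).fetchone()[0]
```

If a user registers after the last refresh, count_from_summary and count_from_base disagree for the current month until refresh_summary runs again, which is the trade-off the passage describes.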

informatica questions and answers new 2

Suppose there are 100,000 rows in the source and the session stops after loading 20,000 rows to the target. How will you load the remaining rows?

The Informatica Server has 3 methods to recover sessions:

(1) Run the session again if the Informatica Server has not issued a commit.

(2) Truncate the target tables and run the session again if the session is not recoverable.

(3) Consider performing recovery if the Informatica Server has issued at least one commit.

So for this question, use Perform Recovery to load the records from the point where the session failed.

Why is Sorter an active transformation? What happens when you uncheck the DISTINCT option in Sorter?

Sorter is an active transformation because it eliminates duplicate records when the Distinct option is checked, which changes the number of rows that pass through it.

When the Distinct option is unchecked, duplicate records are passed through.

Why do we use Lookup transformations?

Lookup Transformations can access data from relational tables that are not sources in mapping. With Lookup transformation, we can accomplish the following tasks:

Get a related value - Get the Employee Name from the Employee table based on the Employee ID.

Perform a calculation.

Update slowly changing dimension tables - We can use an unconnected Lookup transformation to determine whether the records already exist in the target or not.

What is lookup transformation in informatica?

Lookup is a transformation that looks up values from a relational table, view, or flat file. The developer defines the lookup match criteria. There are two types of lookups in PowerCenter Designer: 1) Connected Lookup 2) Unconnected Lookup.

What is the difference between a session and a task?

Session: a set of instructions to run a mapping.

Task: a session is one type of task. Informatica has several other types of task, such as Assignment, Command, Control, Decision, Email, Event-Raise, Event-Wait, and Timer.

What is the method of loading 5 flat files having the same structure to a single target, and which transformations will you use?

This can be handled by using a file list in Informatica. Suppose we have 5 files in different locations on the server and need to load them into a single target table. In the session properties, we need to change the source file type to Indirect.

(Choose Direct if the source file contains the source data. Choose Indirect if the source file contains a list of files.

When you select Indirect, the PowerCenter Server finds the file list and then reads each listed file when it executes the session.)

For example, I open Notepad, enter the following paths and file names, and save the file as emp_source.txt in the directory /ftp_data/webrep/:

/ftp_data/webrep/SrcFiles/abc.txt
/ftp_data/webrep/bcd.txt
/ftp_data/webrep/srcfilesforsessions/xyz.txt
/ftp_data/webrep/SrcFiles/uvw.txt
/ftp_data/webrep/pqr.txt

In the session properties, I give /ftp_data/webrep/ as the directory path, emp_source.txt as the file name, and Indirect as the file type.
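To make the Indirect file type concrete, here is a minimal Python sketch of what reading through a file list amounts to. It is not how the PowerCenter Server is implemented; the function names are hypothetical, and it assumes the list holds one path per line and that all listed files share the same row layout.

```python
from pathlib import Path

def read_file_list(list_path):
    """Return the data-file paths named in a file list, one per non-blank line."""
    paths = []
    for line in Path(list_path).read_text().splitlines():
        line = line.strip()
        if line:
            paths.append(Path(line))
    return paths

def read_indirect_source(list_path):
    """Yield every row of every listed file, in the order the files are listed."""
    for data_file in read_file_list(list_path):
        with data_file.open() as handle:
            for row in handle:
                yield row.rstrip("\n")
```

Calling read_indirect_source on a list such as emp_source.txt would produce the rows of all the listed files as a single stream, which is why the five files can feed one target.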

Informatica new interview questions and answers

How do you identify existing rows of data in the target table using a Lookup transformation?

There are two ways to look up the target table to verify whether a row exists:

1. Use a connected lookup with a dynamic cache, and then check the value of the NewLookupRow output port to decide whether the incoming record already exists in the table or cache.

2. Use an unconnected lookup, call it from an Expression transformation, and check the value returned by the lookup (null or not null) to decide whether the incoming record already exists in the table.
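The first approach can be pictured with a small Python sketch. It is not Informatica code: the function name and the dictionary used as the cache are hypothetical, and the NewLookupRow values used here (0 = no change, 1 = row inserted into the cache, 2 = row updated in the cache) are stated as an assumption about how the port is encoded.

```python
# Assumed NewLookupRow values (labelled as an assumption in the text above).
NO_CHANGE, INSERTED, UPDATED = 0, 1, 2

def dynamic_lookup(cache, key, values):
    """Look up `key` in the cache, keep the cache in sync, and report what happened."""
    if key not in cache:
        cache[key] = values      # new row: insert it into the cache
        return INSERTED
    if cache[key] != values:
        cache[key] = values      # known row with changed values: update the cache
        return UPDATED
    return NO_CHANGE             # known row, nothing changed
```

A downstream step could then route rows flagged INSERTED to an insert and rows flagged UPDATED to an update, which is the decision the NewLookupRow port supports.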

What is a Source Qualifier?

A Source Qualifier is used to convert the source data types to an Informatica-readable format. We can create a mapping without a Source Qualifier; in that case, the data types of the source columns should be the same as those that would be defined in the Source Qualifier.

which one faster, and which one is best in Informatica Power Center 8.1/8.5?

I guess you are asking about the tracing level. When you configure a transformation, you can set the amount of detail the Integration Service writes in the session log.

PowerCenter 8.x supports 4 tracing levels:

1. Normal: Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.

2. Terse: Integration Service logs initialization information and error messages and notification of rejected data.

3. Verbose Initialization: In addition to normal tracing, Integration Service logs additional initialization details, names of index and data files used, and detailed transformation statistics.

4. Verbose Data: In addition to verbose initialization tracing, Integration Service logs each row that passes into the mapping. Also notes where the Integration Service truncates string data to fit the precision of a column and provides detailed transformation statistics. Allows the Integration Service to write errors to both the session log and error log when you enable row error logging. When you configure the tracing level to verbose data, the Integration Service writes row data for all rows in a block when it processes a transformation.

By default, the tracing level for every transformation is Normal.

In which situation do we use unconnected lookup?

An unconnected lookup should be used when we need to call the same lookup multiple times in one mapping. For example, in a parent-child relationship you need to pass multiple child IDs to get the respective parent IDs.

One can argue that this can be achieved by creating a reusable lookup as well. That's true, but reusable components are created when the need is across mappings, not within one mapping. Also, if we use a connected lookup multiple times in a mapping, by default the cache would be persistent.

How do you handle error logic in Informatica? What are the transformations that you used while handling errors? How did you reload those error records in target?

Bad files contain a column indicator and a row indicator.

Row indicator: It is generally relevant when working with the Update Strategy transformation. The writer or target rejects the rows going to the target.

Column indicator:

D - valid
O - overflow
N - null
T - truncate

When the data contains nulls or overflows, it is rejected instead of being written to the target. The rejected data is stored in reject files. You can check the data and reload it into the target using the reject reload utility.

What happens if you turn off versioning?

You would not be able to track the changes done to the respective mappings/sessions/workflows.

What are the DTM buffer size and the default buffer block size? If a session has a performance issue, which one should we increase and which one should we decrease?

The DTM buffer size is the memory you allocate to the DTM process (12 MB by default).

The buffer block size is the size of the heaviest source or target row multiplied by the number of rows that can be moved at a time (a block should hold at least 20 rows; the default is 64 KB).

By default, Informatica allocates enough buffer memory for 83 sources and targets, so we should increase or decrease the sizes accordingly.

If there are more than 83 sources and targets, we should increase the DTM buffer size; if the sources or targets are heavy, we should increase the buffer block size.

What is the difference between source-based and target-based commit?

Suppose we set a target-based commit interval of 1000. The Informatica Server then applies a commit for every 1000 rows written to the target table.

Suppose we set a source-based commit interval of 1000 and, because of the transformation logic, 500 rows are dropped. Then only 500 rows are inserted into the target table, and the Informatica Server applies the commit on those 500 rows.
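The difference can be simulated in a few lines of Python. This is a simplified sketch rather than the server's actual behaviour (a real target-based commit also depends on how the writer buffers fill); the function names and the keep predicate, which stands in for transformation logic that drops rows, are hypothetical.

```python
def target_based_commits(rows, keep, interval):
    """Commit each time `interval` rows have been written to the target."""
    commits, pending = [], 0
    for row in rows:
        if keep(row):                # row survives the transformation logic
            pending += 1
            if pending == interval:
                commits.append(pending)
                pending = 0
    if pending:
        commits.append(pending)      # final commit at end of session
    return commits

def source_based_commits(rows, keep, interval):
    """Commit each time `interval` source rows have been read, however many survived."""
    commits, read, pending = [], 0, 0
    for row in rows:
        read += 1
        if keep(row):
            pending += 1
        if read == interval:
            commits.append(pending)
            read, pending = 0, 0
    if read:
        commits.append(pending)      # final commit at end of session
    return commits
```

With 2000 source rows, a rule that drops every second row, and an interval of 1000, the target-based version commits once with 1000 rows, while the source-based version commits twice with 500 rows each, matching the example in the answer.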

A mapplet can't be used in another mapplet. If you try to drag and drop a mapplet from the mapplet subfolder on the left-hand side into the Mapplet Designer workspace, it won't allow you to do so. But if you drag and drop a mapplet into a mapping, i.e., in the Mapping Designer, it appears in the workspace. This means a mapplet can only be used in a mapping and not in another mapplet. That's why a mapplet is known as the reusable form of a mapping.

For the SQ transformation, when I am writing a custom query, do I need to have all the FROM tables as part of the mapping? That is, say I have 3 FROM tables in the custom query; do I need to import all 3 tables into the mapping? All 3 tables are from the same database schema.

Please assist.

No need to import all the tables. Just take care of the field names, field order, lengths, and datatypes, and define the join condition between the tables properly as part of the custom query.

informatica question and answers2

Q. What is a mapplet?

A. A mapplet is a reusable object that is created using the Mapplet Designer. The mapplet contains a set of transformations and allows us to reuse that transformation logic in multiple mappings.

Q. What does reusable transformation mean?

A. Reusable transformations can be used multiple times in a mapping. The reusable transformation is stored as metadata, separate from any mapping that uses the transformation. Whenever any changes to a reusable transformation are made, all the mappings where the transformation is used will be invalidated.

Q. What is update strategy and what are the options for update strategy?

A. Informatica processes the source data row-by-row. By default every row is marked to be inserted in the target table. If the row has to be updated/inserted based on some logic Update Strategy transformation is used. The condition can be specified in Update Strategy to mark the processed row for update or insert.

Following options are available for update strategy :

• DD_INSERT : If this is used the Update Strategy flags the row for insertion. Equivalent numeric value of DD_INSERT is 0.

• DD_UPDATE : If this is used the Update Strategy flags the row for update. Equivalent numeric value of DD_UPDATE is 1.

• DD_DELETE : If this is used the Update Strategy flags the row for deletion. Equivalent numeric value of DD_DELETE is 2.

• DD_REJECT : If this is used the Update Strategy flags the row for rejection. Equivalent numeric value of DD_REJECT is 3.
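As an illustration of how these constants are used, the Python sketch below flags rows with the same four numeric values. It is not Informatica expression syntax; the RowFlag enum, the flag_row function, and the rule it applies (reject rows without a key, update keys already in the target, insert the rest) are hypothetical.

```python
from enum import IntEnum

class RowFlag(IntEnum):
    """Numeric values as listed above for the update strategy constants."""
    DD_INSERT = 0
    DD_UPDATE = 1
    DD_DELETE = 2
    DD_REJECT = 3

def flag_row(row, existing_keys):
    """Hypothetical rule: reject rows with no key, update known keys, insert the rest."""
    key = row.get("id")
    if key is None:
        return RowFlag.DD_REJECT
    if key in existing_keys:
        return RowFlag.DD_UPDATE
    return RowFlag.DD_INSERT
```

For example, flag_row({"id": 7}, {7}) returns RowFlag.DD_UPDATE, which compares equal to 1.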

informatica repository manager questions and answers

Q. What type of repositories can be created using Informatica Repository Manager?

A. Informatica PowerCenter includes the following types of repositories:

• Standalone Repository : A repository that functions individually and this is unrelated to any other repositories.

shared objects across the repositories in a domain. The objects are shared through global shortcuts.

• Local Repository : A local repository is within a domain and is not a global repository. A local repository can connect to a global repository using global shortcuts and can use objects in its shared folders.

• Versioned Repository : This can be either a local or a global repository, but it allows version control for the repository. A versioned repository can store multiple copies, or versions, of an object. This feature allows you to efficiently develop, test, and deploy metadata in the production environment.

Q. What is a code page?

A. A code page contains encoding to specify characters in a set of one or more languages. The code page is selected based on source of the data. For example if source contains Japanese text then the code page should be selected to support Japanese text.

When a code page is chosen, the program or application for which the code page is set, refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.

Q. Which databases can PowerCenter Server on Windows connect to?

A. PowerCenter Server on Windows can connect to the following databases:

• IBM DB2

• Informix

• Microsoft Access

• Microsoft Excel

• Microsoft SQL Server

• Oracle

• Sybase

• Teradata

Q. Which databases can PowerCenter Server on UNIX connect to?

A. PowerCenter Server on UNIX can connect to the following databases:

• IBM DB2

• Informix

• Oracle

• Sybase

• Teradata

Informatica Mapping Designer

Q. How to execute PL/SQL script from Informatica mapping?

A. Stored Procedure (SP) transformation can be used to execute PL/SQL Scripts. In SP Transformation PL/SQL procedure name can be specified. Whenever the session is executed, the session will call the pl/sql procedure.

Q. How can you define a transformation? What are different types of transformations available in Informatica?

A. A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data. Below are the various transformations available in Informatica:

• Aggregator

• Application Source Qualifier

• Custom

• Expression

• External Procedure

• Filter

• Input

• Joiner

• Lookup

• Normalizer

• Output

• Rank

• Router

• Sequence Generator

• Sorter

• Source Qualifier

• Stored Procedure

• Transaction Control

• Union

• Update Strategy

• XML Generator

• XML Parser

• XML Source Qualifier

Q. What is a source qualifier? What is meant by Query Override?

A. Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or a flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation.

PowerCenter Server generates a query for each Source Qualifier transformation whenever it runs the session. The default query is a SELECT statement containing all the source columns. The Source Qualifier can override this default query by changing the default settings of the transformation properties. The list of selected ports and the order in which they appear in the default query should not be changed in the overridden query.

Q. What is aggregator transformation?

A. The Aggregator transformation allows performing aggregate calculations, such as averages and sums. Unlike Expression Transformation, the Aggregator transformation can only be used to perform calculations on groups. The Expression transformation permits calculations on a row-by-row basis only.

Aggregator Transformation contains group by ports that indicate how to group the data. While grouping the data, the aggregator transformation outputs the last row of each group unless otherwise specified in the transformation properties.

Various group by functions available in Informatica are : AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.
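The grouping behaviour described above, where non-aggregated columns come from the last row of each group, can be sketched in Python. This only illustrates the idea; the function name, the SUM_ output column naming, and the sample column names are hypothetical.

```python
def aggregate_sum(rows, group_by, value):
    """Sum `value` per `group_by` key; other columns come from the group's last row."""
    groups = {}
    for row in rows:
        key = row[group_by]
        running = groups[key]["total"] + row[value] if key in groups else row[value]
        # Overwrite last_row on every row, so the final row seen for a group wins.
        groups[key] = {"last_row": row, "total": running}
    return [
        dict(group["last_row"], **{"SUM_" + value: group["total"]})
        for group in groups.values()
    ]
```

Given three rows for departments A, B, and A again, the output has one row per department, and the row for A carries the name and salary from the second A row together with the summed salary.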

Q. What is Incremental Aggregation?

A. Whenever a session is created for a mapping that contains an Aggregator transformation, the session option for Incremental Aggregation can be enabled. When PowerCenter performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally.
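A minimal Python sketch of the idea follows. It assumes the historical cache can be modelled as a dictionary of running totals per group; the function name and the sample data are hypothetical, and a real session would keep the cache in files between runs.

```python
def incremental_sum(cache, new_rows):
    """Fold new (group, amount) rows into a cache of running totals and return it."""
    for group, amount in new_rows:
        cache[group] = cache.get(group, 0) + amount
    return cache
```

The first run builds the cache from the initial load; each later run passes only the new rows, so totals are updated without re-reading the history.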

Q. How is the Union transformation used?

A. The union transformation is a multiple input group transformation that can be used to merge data from various sources (or pipelines). This transformation works just like UNION ALL statement in SQL, that is used to combine result set of two SELECT statements.
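The UNION ALL comparison can be checked with a short Python sketch using the built-in sqlite3 module. The table names pipeline_a and pipeline_b and their contents are made up for the example; the point is only that UNION ALL keeps duplicate rows while UNION removes them.

```python
import sqlite3

def union_demo():
    """Return (UNION ALL result, UNION result) for two small overlapping tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table pipeline_a (id integer)")
    conn.execute("create table pipeline_b (id integer)")
    conn.executemany("insert into pipeline_a values (?)", [(1,), (2,)])
    conn.executemany("insert into pipeline_b values (?)", [(2,), (3,)])
    union_all = conn.execute(
        "select id from pipeline_a union all select id from pipeline_b order by id"
    ).fetchall()
    union = conn.execute(
        "select id from pipeline_a union select id from pipeline_b order by id"
    ).fetchall()
    conn.close()
    return [r[0] for r in union_all], [r[0] for r in union]
```

The value 2 appears in both inputs, so it shows up twice in the UNION ALL result and once in the UNION result.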

Q. Can two flat files be joined with Joiner Transformation?

A. Yes, the Joiner transformation can be used to join data from two flat file sources.

Q. What is a look up transformation?

A. This transformation is used to lookup data in a flat file or a relational table, view or synonym. It compares lookup transformation ports (input ports) to the source column values based on the lookup condition. Later returned values can be passed to other transformations.

Q. Can a lookup be done on Flat Files? A. Yes.

Q. What is the difference between a connected look up and unconnected look up?

A. A connected lookup takes input values directly from other transformations in the pipeline. An unconnected lookup doesn't take inputs directly from any other transformation, but it can be used in any transformation (like Expression) and can be invoked as a function using a :LKP expression. So, an unconnected lookup can be called multiple times in a mapping.
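The "invoked as a function" idea can be pictured in Python as a lookup function built once and called wherever it is needed. This is an analogy, not :LKP syntax; make_lookup, the column names, and the sample table are hypothetical.

```python
def make_lookup(table, key_column, return_column):
    """Build a cached lookup function from a list of row dictionaries."""
    cache = {row[key_column]: row[return_column] for row in table}

    def lookup(value):
        # Returns None when there is no match, like a lookup that returns NULL.
        return cache.get(value)

    return lookup
```

A single lookup built over a child-to-parent table can then be called several times for one input row, for example once per child ID, which is the multiple-call pattern described above.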

Q. What is the main difference between Data Warehousing and Business Intelligence?

The differentials are:

DW - is a way of storing data and creating information through leveraging data marts. DMs are segments or categories of information and/or data that are grouped together to provide 'information' into that segment or category. DW does not require BI to work. Reporting tools can generate reports from the DW.

BI - is the leveraging of DW to help make business decisions and recommendations. Information and data rules engines are leveraged here to help make these decisions along with statistical analysis tools and data mining tools.

Q. What is data modeling?

Q. What are the different steps for data modeling?

Q. What are the data modeling tools you have used? (Polaris)

Q. What is a Physical data model?

During the physical design process, you convert the data gathered during the logical design phase into a description of the physical database, including tables and constraints.

Q. What is a Logical data model?

A logical design is a conceptual and abstract design. We do not deal with the physical implementation details yet; we deal only with defining the types of information that we need. The process of logical design involves arranging data into a series of logical relationships called entities and attributes.

Q. What are an Entity, Attribute and Relationship?

An entity represents a chunk of information. In relational databases, an entity often maps to a table.

An attribute is a component of an entity and helps define the uniqueness of the entity. In relational databases, an attribute maps to a column.

The entities are linked together using relationships.

Q. What are the different types of Relationships?

Entity-Relationship.

Q. What is the difference between Cardinality and Nullability?

Q. What is Forward, Reverse and Re-engineering?

Q. What is meant by Normalization and De-normalization?

Q. What are the different forms of Normalization?

Q. What is an ETL or ETT? And what are the different types?

ETL is the Data Warehouse acquisition processes of Extracting, Transforming (or Transporting) and Loading (ETL) data from source systems into the data warehouse.

E.g. Oracle Warehouse Builder, Powermart.

Q. Explain the Extraction process? (Polaris, Mascot)

Q. How do you extract data from different data sources? Explain with an example. (Polaris)

Q. What are the reporting tools you have used? What is the difference between them? (Polaris)

Q. How do you automate the Extraction process? (Polaris)

Q. Without using an ETL tool, can you prepare and maintain a Data Warehouse? (Polaris)

Q. How do you identify the changed records in operational data? (Polaris)

Q. What is a Star Schema?

A star schema is a set of tables comprised of a single, central fact table surrounded by de-normalized dimensions. Each dimension is represented in a single table. A star schema implements dimensional data structures with de-normalized dimensions. Snowflake schema is an alternative to star schema. A star schema is a relational database schema for representing multidimensional data. The data is stored in a central fact table, with one or more tables holding information on each dimension. Dimensions have levels, and all levels are usually shown as columns in each dimension table.

Q. What is a Snowflake Schema?

A snowflake schema is a set of tables comprised of a single, central fact table surrounded by normalized dimension hierarchies. Each dimension level is represented in a table. Snowflake schema implements dimensional data structures with fully normalized dimensions. Star schema is an alternative to snowflake schema.

An example would be to break down the Time dimension and create tables for each level: years, quarters, months, weeks, days… These additional branches on the ERD create more of a snowflake shape than a star.

Q. What is Very Large Database?

Q. What are SMP and MPP?

Symmetric multi-processors (SMP)

Q. What is data mining?

Data Mining is the process of automated extraction of predictive information from large databases. It predicts future trends and finds behaviour that the experts may miss as it lies beyond their expectations. Data Mining is part of a larger process called knowledge discovery; specifically, the step in which advanced statistical analysis and modeling techniques are applied to the data to find useful patterns and relationships.

Data mining can be defined as "a decision support process in which we search for patterns of information in data." This search may be done just by the user, i.e. just by performing queries, in which case it is quite hard and in most of the cases not comprehensive enough to reveal intricate patterns. Data mining uses sophisticated statistical analysis and modeling techniques to uncover such patterns and relationships hidden in organizational databases – patterns that ordinary methods might miss. Once found, the information needs to be presented in a suitable form, with graphs, reports, etc.

Q. What is an OLAP? (Mascot)

OLAP is software for manipulating multidimensional data from a variety of sources. The data is often stored in a data warehouse. OLAP software helps a user create queries, views, representations and reports. OLAP tools can provide a "front-end" for a data-driven DSS.

On-Line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user.

OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data supporting end user analytical and navigational activities

Q. What are the Different types of OLAP's? What are their differences? (Mascot) OLAP - Desktop OLAP(Cognos), ROLAP, MOLAP(Oracle Discoverer)

ROLAP, MOLAP and HOLAP are specialized OLAP (Online Analytical Processing) applications. ROLAP stands for Relational OLAP. Users see their data organized in cubes with dimensions, but the data is really stored in a relational database (RDBMS) like Oracle. Because the RDBMS stores data at a fine-grained level, response times are usually slow.

MOLAP stands for Multidimensional OLAP. Users see their data organized in cubes with dimensions, but the data is stored in a multidimensional database (MDBMS) like Oracle Express Server. In a MOLAP system, many queries have a finite answer, performance is usually critical, and response times are fast.

HOLAP stands for Hybrid OLAP. It is a combination of both worlds. Seagate Software's "Holos" is an example of a HOLAP environment. In a HOLAP system one will find queries on aggregated data as well as on detailed data.
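The storage difference between ROLAP and MOLAP can be sketched in a few lines. This is a simplified illustration, not how any particular product works: the sales rows, the table name, and the functions rolap_total and molap_total are all hypothetical. The relational side aggregates detail rows at query time, while the multidimensional side looks up a total that was computed in advance.

```python
import sqlite3

# Hypothetical fact rows: (region, product, amount).
sales = [
    ("East", "Widget", 100),
    ("East", "Gadget", 50),
    ("West", "Widget", 70),
    ("West", "Widget", 30),
]

# ROLAP style: detail rows stay in a relational table and each
# question is answered by aggregating at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)

def rolap_total(region, product):
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ? AND product = ?",
        (region, product),
    ).fetchone()
    return row[0]  # None when no rows match

# MOLAP style: the total for every (region, product) cell is computed
# once up front, so answering a query is a plain lookup.
cube = {}
for region, product, amount in sales:
    cube[(region, product)] = cube.get((region, product), 0) + amount

def molap_total(region, product):
    return cube.get((region, product))  # None when the cell is empty
```

Both functions return the same totals. The trade-off is that the cube must be rebuilt when the detail data changes, whereas the relational query always reads the current rows.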

DOLAP

Q. What is the difference between data warehousing and OLAP?

The terms data warehousing and OLAP are often used interchangeably. As the definitions suggest, warehousing refers to the organization and storage of data from a variety of sources so that it can be analyzed and retrieved easily. OLAP deals with the software and the process of analyzing data, managing aggregations, and partitioning information into cubes for in-depth analysis, retrieval and visualization. Some vendors are replacing the term OLAP with the terms analytical software and business intelligence.

Q. What are the facilities provided by data warehouse to analytical users?

Q. What are the facilities provided by OLAP to analytical users?

Q. What is a Histogram? How to generate statistics?

Q. In Erwin what are the different types of models (Honeywell)

Q. Many Suppliers – Many Products Model the above scenario in Erwin. How many tables and what do they contain (Honeywell)
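One common way to model a many-to-many relationship like this is with three tables: one for suppliers, one for products, and an association table holding one row per supplier-product pair. The sketch below shows that layout in plain SQL run through SQLite rather than in Erwin; all table names, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Association table: one row per (supplier, product) pair.
CREATE TABLE supplier_product (
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    PRIMARY KEY (supplier_id, product_id)
);
INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO product VALUES (10, 'Bolt'), (20, 'Nut');
INSERT INTO supplier_product VALUES (1, 10), (1, 20), (2, 10);
""")

def suppliers_of(product_name):
    """Return the names of all suppliers that supply the given product."""
    rows = conn.execute(
        """SELECT s.name FROM supplier s
           JOIN supplier_product sp ON sp.supplier_id = s.supplier_id
           JOIN product p ON p.product_id = sp.product_id
           WHERE p.name = ? ORDER BY s.name""",
        (product_name,),
    ).fetchall()
    return [r[0] for r in rows]
```

The composite primary key on the association table prevents the same supplier-product pair from being recorded twice.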

Q. What are the options available in Erwin Tool box (Honeywell)

Q. Aggregate navigation

Q. What are the Data Warehouse Center administration functions?

The functions of Visual Warehouse administration are:

Creating Data Warehouse Center security groups.

Defining Data Warehouse Center privileges for that group.

Registering Data Warehouse Center users.

Adding Data Warehouse Center users to security groups.

Registering data sources.

Registering warehouses (targets).

Creating subjects.

Registering agents.

Registering Data Warehouse Center programs.

Q. How do I set the log level higher for more detailed information within Data Warehouse Center 7.2?

Within the Data Warehouse Center (DWC), the log level can be set from 0 to 4. A log level 5 also exists, but it cannot be turned on using the GUI and must be turned on manually. A command line trace can be used for any trace level, and it is the only way to turn on a level 5 trace:

Go to Start, Programs, IBM DB2, Command Line Processor.

Connect to the control database:

db2 => connect to Control_Database_name

Update the configuration table:

db2 => update iwh.configuration set value_int = 5 where name = 'TRACELVL' and (component = '')

Valid components are:

Logger trace = log

Agent trace = agent

Server trace = RTK

DDD = DDD

ODBC = VWOdbc

For multiple traces the format is:

db2 => update iwh.configuration set value_int = 5 where name = 'TRACELVL' and (component = '' or component = '')

Reset the connection:

db2 => connect reset

Stop and restart the Warehouse server and logger.

Perform the failing operation.

Be sure to reset the trace level to 0 using the command line when you are done:

db2 => update iwh.configuration set value_int = 0 where name = 'TRACELVL' and (component = '')
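The update statements above all follow one pattern, so they can be generated instead of typed by hand. The sketch below is a hypothetical helper, not part of the Data Warehouse Center. It assumes that the component values listed above (log, agent, RTK, DDD, VWOdbc) are what belong between the quotes in the component condition, which the text does not state outright.

```python
# Component values listed in the text above.
VALID_COMPONENTS = {"log", "agent", "RTK", "DDD", "VWOdbc"}

def build_trace_update(level, components):
    """Build the UPDATE statement for iwh.configuration (hypothetical helper)."""
    if not 0 <= level <= 5:
        raise ValueError("trace level must be between 0 and 5")
    if not components:
        raise ValueError("at least one component is required")
    unknown = [c for c in components if c not in VALID_COMPONENTS]
    if unknown:
        raise ValueError(f"unknown component(s): {unknown}")
    # Assumption: each component value goes inside the quotes of the
    # component condition, joined with "or" for multiple traces.
    condition = " or ".join(f"component = '{c}'" for c in components)
    return (
        f"update iwh.configuration set value_int = {level} "
        f"where name = 'TRACELVL' and ({condition})"
    )
```

Calling build_trace_update(5, ["log", "agent"]) produces the multiple-trace form, and calling it again with level 0 for the same components produces the matching reset statement.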

When you run a trace, the Data Warehouse Center writes information to text files in the directory specified by the VWS_LOGGING environment variable. Data Warehouse Center programs that are called from steps also write any trace information to this directory.

The default value of VWS_LOGGING is:

Windows and OS/2 = x:\sqllib\logging

UNIX = /var/IWH

AS/400 = /QIBM/UserData/IWH

administration guide.

Q. What types of data sources does Data Warehouse Center support?

The Data Warehouse Center supports a wide variety of relational and non-relational data sources. You can populate your Data Warehouse Center warehouse with data from the following databases and files:

Any DB2 family database

Oracle

Sybase

Informix

Microsoft SQL Server

IBM DataJoiner

Multiple Virtual Storage (OS/390), Virtual Machine (VM), and local area network (LAN) files

IMS and Virtual Storage Access Method (VSAM) (with DataJoiner Classic Connect)

Q. What is the Data Warehouse Center control database?

When you install the warehouse server, the warehouse control database that you specify during installation is initialized. Initialization is the process in which the Data Warehouse Center creates the control tables that are required to store Data Warehouse Center metadata. If you have more than one warehouse control database, you can use the Data Warehouse Center --> Control Database Management window to initialize the second warehouse control database. However, only one warehouse control database can be active at a time.

Q. What databases need to be registered as system ODBC data sources for the Data Warehouse Center?

The Data Warehouse Center databases that need to be registered as system ODBC data sources are:

Source databases

Target databases

Control databases

1. What was the original business problem that led you to do this project?

Whether the consultant is being hired to gather requirements or to customize an OLAP application, this question indicates that she’s interested in the big picture. She’ll keep the answer in mind as she does her work, which is a measure of quality assurance.

2. Where are you in your current implementation process?

A consultant who asks this question knows not to make any assumptions about how much progress you’ve made. She probably also understands that you might be wrong. There are plenty of clients who have begun application development without having gathered requirements. Understanding where the client thinks he is is just as important as understanding where he wants to be. It also helps the consultant in making improvement suggestions or recommendations for additional skills or technologies.

3. How long do you see this position being filled by an external resource?

While the question might seem self-serving at first, a good consultant is ever mindful of his responsibility to render himself dispensable over time. Your answer will give him a good idea of how much time he has to perform the work as well as to cross train permanent staff within your organization. A variation on this question is: "Is there a dedicated person or group targeted for knowledge transfer in this area?"

4. What deliverables do you expect from this engagement?

The consultant who doesn’t ask about deliverables is the consultant who expects to sit around giving advice. Beware of the "ivory tower" consultants, who are too light for heavy work and too heavy for light work. Every consultant you talk to should expect to produce some sort of