SQL Generation for Multisource Reports - Advanced Project Design.pdf

If you have a report that uses objects from multiple data sources, the SQL generation for the report is handled a little differently than for single-sourced reports.

With reports that use multiple data sources, much of the work of the SQL Engine remains the same. However, the Engine performs two additional tasks:

• It determines the optimal database instance for each pass of SQL.

• It identifies when joins need to occur across database instances.

Designating Primary and Secondary Tables

Every project table has a primary database instance. The primary database instance is the first one to which it is mapped. If you have duplicate tables, the same table can have both primary and secondary database instances. The primary table is the one that exists in the primary database instance. The secondary table is the one that exists in the secondary database instance.

You can change which database instance is the primary one for a table.

For information on changing the primary database instance for a table, see “Changing the Primary Database Instance for a Table” starting on page 95.

You can have multiple secondary tables if the table is mapped to more than one secondary database instance.

For example, consider the scenario introduced previously:

Report with Objects from Multiple Data Sources

In this example, the two fact tables each map to only one database. The primary database instance for the REGION_SALES table is the Sales Data Warehouse. The primary database instance for the FORECAST_SALES table is the Forecast Data Warehouse.

However, the LU_REGION table exists in both data warehouses, so you can

If a table is available in a single data source, that source is the only one the Engine can use in the report SQL to obtain the necessary data. However, if a table is available in multiple data sources, the Engine uses specific logic to select the optimal data source.

Selecting the Optimal Data Source for Fact Tables

SQL generation for reports is focused on metric data. When the Engine needs to calculate a metric, it first has to determine the best source for the underlying fact. After taking into account attributes in the template, the metric's

dimensionality, and report or metric filters, the Engine uses the following logic to select the optimal data source for a fact:

• If the fact comes from a fact table that is available in the primary database instance for the project, the Engine calculates the metric using the primary database instance for the project.

• If the fact comes from a fact table that is not available in the primary database instance for the project, the Engine calculates the metric using a secondary database instance. If the fact table is available in more than one secondary database instance, the Engine selects the database instance with the smallest GUID (alphabetically).

In essence, the Engine considers only the primary and secondary database instance designation at the project level when selecting the data source for a fact. When you have a fact table available in multiple sources, it does not matter which sources have primary versus secondary designation for the table.

Selecting the Optimal Data Source for Lookup Tables

After selecting the optimal data source for a fact, the Engine also has to select the best source for any corresponding attributes. The Engine uses the following logic to select the optimal data source for an attribute:

• If the attribute comes from a lookup table that exists in the same data source as the one selected for the fact, the Engine obtains the attribute data from this same database instance.

• If the attribute comes from a lookup table that does not exist in the same data source as the one selected for the fact, the Engine obtains the attribute data from the primary database instance for the lookup table and moves it to the database instance used as the fact source.

In essence, when you have a lookup table available in multiple sources, it can matter which sources have primary versus secondary designation for the table.

This same logic also applies if the Engine has to retrieve attribute information from a relationship table.

However, if you are just browsing attribute elements, the Engine treats lookup tables for attributes like fact tables. If the lookup table exists in the primary database instance for a project, the Engine queries that database instance.

Otherwise, it uses the secondary database instance with the smallest GUID (alphabetically).

Joining Data from Different Data Sources

When the Engine needs to join data from different data sources, it selects data from the first data source to the memory of the Intelligence Server. Then, it creates a temporary table in the second data source and inserts the data into this table to continue processing the result set.

You may have a data source that either does not support the creation of temporary tables or in which you do not want to create temporary tables. If so, you can configure the CREATE and INSERT support

To work with tables from different data sources, the Engine joins table columns based on specific data type compatibility rules. The following table lists these rules:

If two columns do not have compatible data types, the Engine cannot join data using those columns.

Data Type Compatibility Rules

Data Type Compatible Data Type

BigDecimal BigDecimal

Binary Binary

CellFormatData CellFormatData

Char Char, VarChar

Date Date, TimeStamp

Decimal Decimal, Integer with scale of 0, Numeric

Double Double, Float, Real

Float Double, Float, Real

Integer Decimal with scale of 0, Integer, Numeric with scale of 0

LongVarBin LongVarBin

LongVarChar LongVarChar

NChar NChar

Numeric Decimal, Numeric, Integer with scale of 0

The data type of a column is based on the Engine data type definition, not the data type definition in the physical database.

These two data type definitions may not always be the same.

Often, different data sources are optimal for different passes in the SQL for a report. For such queries, the Engine follows the data type compatibility rules and moves data between the different data sources until it finishes processing the result set.

In document Advanced Project Design.pdf (Page 78-83)