• No results found

OLAP. Business Intelligence OLAP definition & application Multidimensional data representation

N/A
N/A
Protected

Academic year: 2021

Share "OLAP. Business Intelligence OLAP definition & application Multidimensional data representation"

Copied!
31
0
0

Loading.... (view fulltext now)

Full text

(1)

OLAP

„

Business Intelligence

„

OLAP definition & application

„

Multidimensional data representation

(2)

Business Intelligence

„ Accompanying the growth in data warehousing is an ever-increasing demand by users for more powerful access tools that provide advanced analytical capabilities.

„ There are two main types of access tools available to meet this demand, namely Online Analytical Processing

(3)

Business Intelligence

„ OLAP and Data Mining differ in what they offer the user and because of this they are complementary technologies.

„ An environment that includes a data warehouse (or more commonly one or more data marts) together with tools such as OLAP and /or data mining are collectively referred to as Business

Intelligence (BI) technologies.

„ BI = DW + OLAP BI = DW + DM

(4)

OLAP definition & application

„ The dynamic synthesis, analysis, and

consolidation of large volumes of multi- dimensional data, Codd (1993).

„ Describes a technology that uses a multi-dimensional view of aggregate

data to provide quick access to strategic information for the purposes of

(5)

OLAP definition & application

„ Can easily answer ‘who?’ and ‘what?’

questions, however, ability to answer

‘what if?’ and ‘why?’ type questions distinguishes OLAP from general-

purpose query tools.

„ Types of analysis ranges from basic navigation and browsing (slicing and

dicing) to calculations, to more complex analyses such as time series and

complex modeling.

(6)

OLAP definition & application

„ Examples of OLAP applications

(7)

OLAP definition & application

„

Although OLAP applications are

found in widely divergent

functional areas, they all have the

following key features:

„ multi-dimensional views of data

„ support for complex calculations

„ time intelligence

(8)

OLAP - multi-dimensional

views of data

„ Core requirement of building a ‘realistic’

business model.

„ Provides basis for analytical processing through flexible access to corporate

data.

„ The underlying database design that

provides the multi-dimensional view of data should treat all dimensions equally.

(9)

OLAP - support for complex

calculations

„ Must provide a range of powerful

computational methods such as that required by sales forecasting, which uses trend algorithms such as moving averages and percentage growth.

„ Mechanisms for implementing

computational methods should be clear and non-procedural.

(10)

OLAP – time intelligence

„ Key feature of almost any analytical application as performance is almost always judged over time.

„ Time hierarchy is not always used in the same manner as other hierarchies.

„ Concepts such as year-to-date and

period-over-period comparisons should be easily defined.

(11)

Representation of Multi-

dimensional Data

„

Example of two-dimensional query.

„ What is the total revenue generated by

property sales in each city, in each quarter of 2004?’

„

Choice of representation is based on

types of queries end-user may ask.

„

Compare representation - three-field

relational table versus two-

dimensional matrix.

(12)

Multi-dimensional Data as

Three-field table versus Two-

dimensional Matrix

(13)

Representation of Multi-

dimensional Data

„

Example of three-dimensional

query.

„ ‘What is the total revenue generated by property sales for each type of property (Flat or House) in each city, in each

quarter of 2004?’

„

Compare representation - four-

field relational table versus

three-dimensional cube.

(14)

Multi-dimensional Data as

Four-field Table versus Three-

dimensional Cube

(15)

Representation of Multi-

dimensional Data

„

A sample data cube

(16)

Representation of Multi-

dimensional Data

„ Cube represents data as cells in an array.

„ Relational table only represents multi- dimensional data in two dimensions.

„ Multi-dimensional structures are best visualized as cubes of data, and cubes

within cubes of data. Each side of a cube is a dimension.

„ A cube can be expanded to include other dimensions.

(17)

Representation of Multi-

dimensional Data

„

As number of dimensions increases,

number of the cube’s cells

increases exponentially.

„

However, majority of multi-

dimensional queries use

summarized, high-level data.

„

Solution is to pre-aggregate

(consolidate) all logical subtotals

and totals along all dimensions.

(18)

Representation of Multi-

dimensional Data

„ Pre-aggregation is valuable, as typical dimensions are hierarchical in nature.

„ (e.g. Time dimension hierarchy - years, quarters, months, weeks, and days)

„ Predefined hierarchy allows logical pre- aggregation and, conversely, allows for a logical ‘drill-down’.

(19)

Representation of Multi-

dimensional Data

„

Supports common analytical

operations

„ Consolidation

„ Drill-down

„ Slicing and dicing

(20)

Representation of Multi-

dimensional Data

„ Pivoting

„ select two dimensions used to aggregate a measure

„ e.g., the Sales for a Product in a particular City

Product City Date Sales

Juice NY 10 NY CA MN Cola CA 20 Juice 40 x x Juice NY 30 Cola 10 20 x Cola NY 10 Cream x x 10 Cream MN 10

(21)

Representation of Multi-

dimensional Data

„ Rollup

„ taking a current data and group-by further on another dimension

„ e.g., rollup the total sales on city by product

„ Drill-down

(22)
(23)
(24)

Representation of Multi-

dimensional Data

„ Slice_and_dice

„ reducing the dimensionality of data

„ the table (city, date) of a specific product

(25)
(26)
(27)

Representation of Multi-

dimensional Data

„

Consolidation - aggregation of data

such as simple ‘roll-ups’ or complex

expressions involving inter-related

data.

„

Drill-Down - is the reverse of

consolidation and involves displaying

the detailed data that comprises the

consolidated data.

(28)

Representation of Multi-

dimensional Data

„

Slicing and Dicing - (also called

pivoting) refers to the ability to

look at the data from different

viewpoints.

(29)

Representation of Multi-

dimensional Data

„ Can store data in a compressed form by dynamically selecting physical storage

organizations and compression techniques that maximize space utilization.

„ Dense data (that is, data that exists for a high percentage of cells) can be stored separately from sparse data (that is, a

significant percentage of cells are empty).

(30)

Representation of Multi-

dimensional Data

„

Ability to omit empty or repetitive

cells can greatly reduce the size of

the cube and the amount of

processing.

„

Allows analysis of exceptionally large

amounts of data.

(31)

Representation of Multi-

dimensional Data

„ In summary, pre-aggregation, dimensional hierarchy, and sparse data management

can significantly reduce the size of the

cube and the need to calculate values ‘on- the-fly’.

„ Removes need for multi-table joins and provides quick and direct access to arrays of data, thus significantly speeding up

execution of multi-dimensional queries.

References

Related documents

Such tools include Data Warehouse (DW), Extract-Transform and Load (ETL), On-Line Analytical Processing (OLAP), Data Mining (DM), Text Mining, Web Mining, Data Visualization,

– OLAP and related via a dimensional data warehouse (currently only Microsoft Analysis Services) shared with dedicated analytical tools. – “ TrueTime, ” combining behavioral

We describe back end tools for extracting, cleaning and loading data into a data warehouse; multidimensional data models typical of OLAP; front end client

Typically, any Data Warehouse and Mining (DWM) application includes large data warehouse, decision support systems, query and reporting, On Line Analytical Process

Besides enhancing classic OLAP with a clustering method, this operator also couples OLAP and data mining in order to deal with complex data in multidimensional context.. We have

 ROLAP accesses data stored in a data warehouse (relational database) to provide OLAP analyses.  OLAP is a three-tier,

We describe back end tools for extracting, cleaning and loading data into a data warehouse; multidimensional data models typical of OLAP; front end client tools for

The technological components used to support the business processes comprise tools that include online analytical processing (OLAP), data warehouses (DW), data mining (DM),