• No results found

Data Warehouses & OLAP

N/A
N/A
Protected

Academic year: 2021

Share "Data Warehouses & OLAP"

Copied!
48
0
0

Loading.... (view fulltext now)

Full text

(1)
(2)

1.

The Big Picture

2.

Data Warehouse Philosophy

3.

Data Warehouse Concepts

4.

Warehousing Applications

5.

Warehouse Schema Design

6.

Business Intelligence Reporting

7.

On-Line Analytical Processing

8.

OLAP Applications

9.

Data Warehouse Implementation

(3)

1.

The Big Picture

2.

Data Warehouse Philosophy

3.

Data Warehouse Concepts

4.

Warehousing Applications

5.

Warehouse Schema Design

6.

Business Intelligence Reporting

7.

On-Line Analytical Processing

8.

OLAP Applications

9.

Data Warehouse Implementation

10.

Warehousing Software

(4)

Data Warehouses & OLAP 4

What is OLAP?

On-Line Analytical Processing is not a

definition…

It gives no help in deciding if a product is an

OLAP tool or not!

Since late 1994, many vendors claim to have

OLAP compliant products

It is not possible to rely on the vendors’ own

description

Membership of the OLAP council is not a good

(5)

Data Warehouses & OLAP 5

What is OLAP?

Researchers were forced to create

their own

definition

It had to be

simple

,

memorable

and

product-independent

The

FASMI

test is one of the most converging

definition efforts for detecting OLAP

compliance

It defines the

characteristics of an OLAP

application

in a specific way…

FASMI

F

ast

A

nalysis of

S

hared

M

ultidimensional

(6)

FAST

The system is targeted to deliver responses to

users within about 5 seconds

◦ Simplest analysis ~ no more than 1 second

◦ Most complicated analysis ~ no more than 20 seconds

End-users assume that a process has failed if

results are not received within 30 seconds

Unless the system warns that the report will

(7)

Data Warehouses & OLAP 7

FAST

The OLAP response speed is not easy to

achieve…

… especially when on-the-fly and ad hoc

calculations are required

Vendors resort to many techniques to achieve

this goal:

◦ Specialized forms of data storage, ◦ Extensive pre-calculations,

(8)

FAST

None of the existent products is fully

optimized

… an area of developing technology

◦ The full pre-calculation approach fails with large and sparse data

◦ Doing everything on-the-fly is much too slow with large data

According to surveys, slow query response is

(9)

Data Warehouses & OLAP 9

ANALYSIS

The system can cope with any business

logic and statistical analysis relevant for the

application and the user

In some OLAP product some

pre-programming may be needed…

Without having to program, it is necessary to allow the user to define new ad hoc

(10)

ANALYSIS

Analysis could include specific features like:

◦ Time series analysis,

◦ Cost allocations,

◦ Currency translation, ◦ Goal seeking,

◦ Ad hoc multidimensional structural changes, ◦ Non-procedural modeling,

◦ Exception alerting, ◦ Data mining.

These capabilities differ between products,

(11)

Data Warehouses & OLAP 11

SHARED

The system implements all the

security

requirements for confidentiality

If multiple write access is needed,

concurrent update locking at an appropriate

level should be implemented

The system should be able to handle multiple

updates in a timely and secure manner

This is a major area of weakness in many

OLAP products

… assuming that OLAP applications will be

read-only

Even products with multi-user read-write have

(12)

MULTIDIMENSIONAL

Is the

key requirement

for all OLAP

applications

The system must provide a multidimensional

conceptual view including:

◦ Full support for hierarchies ◦ Multiple hierarchies

… This is the most

logical way

to analyze

(13)

Data Warehouses & OLAP 13

INFORMATION

Is all of the

data and derived information

needed

, wherever it is and however much is

relevant for the application

The capacity if handling data differ between

OLAP products

◦ The largest OLAP products can hold at least a thousand times as much as the smallest

Many considerations must be taking:

(14)

The

FASMI

test is a reasonable and

understandable definition of the

goals OLAP is meant to achieve

Researches encourage users and

vendors to adopt this definition, which

(15)

Data Warehouses & OLAP 15

The Codd rules

In 1993, Codd et al. published a white paper

“Providing OLAP to User-Analysts: An IT

Mandate”

Codd was very well known as a respected

database researcher from the 1960s till the

late 1980s

He is credited with being the

inventor of the

relational database model in 1969

Unfortunately

, his OLAP rules proved to be

controversial due to being

(16)

The Codd rules

The OLAP white paper included 12 rules,

which are now well known

They were followed by another 6 rules in

1995

Codd restructured the rules into four groups,

calling them “features”

Basic Features

Special Features

(17)

Data Warehouses & OLAP 17

The Codd rules

Basic Features

1. Multidimensional Conceptual View

Few would argue with this feature

◦ Codd believes this to be the central core of OLAP

(18)

The Codd rules

Basic Features

2. Intuitive Data Manipulation

◦ Data

manipulation through direct actions on

cells in the view

Without recourse to menus or multiple

actions, we assume that this is by using a

mouse

(19)

Data Warehouses & OLAP 19

The Codd rules

Basic Features

3. Accessibility: OLAP as a Mediator

◦ OLAP engines are considered as middleware, sitting between heterogeneous data sources and an OLAP front-end

Most products can achieve this, but often with

(20)

The Codd rules

Basic Features

4. Batch Extraction vs Interpretive

◦ This rule effectively required that products offer

both their own staging database for OLAP data as well as offering live access to external data

(21)

Data Warehouses & OLAP 21

The Codd rules

Basic Features

5. OLAP Analysis Models

◦ Codd required that OLAP products should support all four analysis models :

Categorical: parameterized static reporting ~ All OLAP tools

Exegetical: slicing and dicing with drill down ~ All OLAP tools

Contemplative: « what if? » analysis ~ Most OLAP

tools

Formulaic: goal seeking models ~ Very few OLAP

(22)

The Codd rules

Basic Features

6. Client/Server Architecture

◦ The OLAP server component of an OLAP product

should be sufficiently intelligent that various clients could be attached with minimum effort and

programming for integration

◦ Relatively few OLAP products are qualified for this test

(23)

Data Warehouses & OLAP 23

The Codd rules

Basic Features

7. Transparency

This test, dealing with openness, is also a tough

but valid one

◦ A spreadsheet user should be able to get full

values from an OLAP engine and not even be aware of where the data comes from

◦ OLAP products must allow live access to

heterogeneous data sources from a full function

spreadsheet add-in, with the OLAP server engine in between

(24)

The Codd rules

Basic Features

8. Multi-User Support

◦ OLAP tools must provide concurrent access (retrieval and update), integrity and security

Many OLAP applications are still read-only

(25)

Data Warehouses & OLAP 25

The Codd rules

Special Features

9. Treatment of Non-Normalized Data

◦ Refers to the integration between an OLAP engine and denormalized source data

◦ Any data updates performed in the OLAP

environment should not be allowed to alter stored denormalized data in feeder systems

(26)

The Codd rules

Special Features

10. Storing OLAP Results: Keeping them

Separate from Source Data

◦ This is really an implementation rather than a product issue

◦ But few would disagree with it

◦ Read-write OLAP applications should not be implemented directly on live transaction data

◦ OLAP data changes should be kept distinct from transaction data

(27)

Data Warehouses & OLAP 27

The Codd rules

Special Features

11. Extraction of Missing Values

◦ All missing values are cast in the uniform

representation defined by the Relational Model

Missing values are to be distinguished from

zero values

(28)

The Codd rules

Special Features

12. Treatment of Missing Values

◦ All missing values are to be ignored by the OLAP analyzer regardless of their source

(29)

Data Warehouses & OLAP 29

The Codd rules

Reporting Features

13. Flexible Reporting

◦ The dimensions can be laid out in any way that the user requires in reports

Most products are capable of this in their

formal report writers

◦ It is preferable that analysis and reporting facilities

(30)

The Codd rules

Reporting Features

14. Uniform Reporting Performance

◦ Reporting performance be not significantly

degraded by increasing the number of dimensions or database size

There are differences between products

◦ The principal factor that affects performance is the

degree to which the calculations are performed

(31)

Data Warehouses & OLAP 31

The Codd rules

Reporting Features

15. Automatic Adjustment of Physical

Level

◦ OLAP system must adjust its physical schema

automatically to adapt to the type of model, data volumes and sparsity

Most vendors fall far short of this noble ideal

Since 1996, users can benefit from it in

(32)

The Codd rules

Dimension Control

16. Generic Dimensionality

◦ Each dimension must be equivalent in both its structure and operational capabilities

◦ This has proven to be one of the most controversial Codd’s rules

(33)

Data Warehouses & OLAP 33

The Codd rules

Dimension Control

17. Unlimited Dimensions & Aggregation

Levels

Technically, no product can possibly comply with this feature

◦ There is no such thing as an unlimited entity on a limited computer

Few applications need more than about eight or

ten dimensions

Few hierarchies have more than about six

consolidation levels

(34)

The Codd rules

Dimension Control

18. Unrestricted Cross-dimensional

Operations

◦ All forms of calculation must be allowed across all dimensions, not just the “measures” dimension

Many products which use only relational storage are weak in this area

(35)

Data Warehouses & OLAP 35

(36)
(37)

Data Warehouses & OLAP 37

(38)
(39)

Data Warehouses & OLAP 39

(40)
(41)

Data Warehouses & OLAP 41

(42)
(43)

Data Warehouses & OLAP 43

(44)
(45)

Data Warehouses & OLAP 45

(46)
(47)

Data Warehouses & OLAP 47

(48)

References

Related documents

• Enables advanced analyses on both operational and OLAP data sources Custom BI Applications OLAP Excel Add-in OracleAS Discoverer.. Discoverer 10g –

Access to All OLAP Data from Excel OLAP API OLAP Catalog Database Warehouse Builder Enterprise Manager AW Manager. Oracle Business

by mass spectrometry, by the shape of their spectral peak in in more detail in the Example Section. time-of-?ight mass spectrometry and by their binding char- 25 The biomarkers

The Client also authorizes AFC to deduct all other applicable commissions and fees (including, but not limited to, inactivity fees, delivery charges, phone order

Business Information Warehouse Server BAPI BAPI 3rd party OLAP client 3rd party OLAP client OLAP Processor OLAP Processor Staging Engine Staging Engine InfoCubes InfoCubes Meta

And while a handful of advanced OLAP features are reserved for Enterprise Edition, all data mining features and the core OLAP functionality are included in SQL Server 2000

SSL VPN solutions provide secure, granular access controls, ensuring that users gain access only to the designated resources or applications specific to their needs and according

Cole submitted, however, that such a reading of the paragraph ignored the references therein to section 2(a)(iv) and (2)(a)(ii) of the Act which, so he submitted, made it clear