• No results found

Data Warehousing Data Warehouses, OLAP and Visualization

N/A
N/A
Protected

Academic year: 2021

Share "Data Warehousing Data Warehouses, OLAP and Visualization"

Copied!
58
0
0

Loading.... (view fulltext now)

Full text

(1)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 1

Data Warehousing

Data Warehouses, OLAP and Visualization

By Kristian Staalø Knudsen and David Flenstrup, Netcompany A/S

Aalborg, 4th of April 2011

Contact information:

David Flenstrup, Senior Consultant dfl@netcompany.com

(2)

About us (David and Kristian) - David

• M.Sc. in Computer Science from the Technical University of

Denmark

• Senior Consultant at Netcompany A/S since 2009

• Microsoft Certified Business Intelligence Developer

• Worked with the following clients:

– Bankdata

– Maersk

– Sundhedsstyrelsen

– TDC

(3)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 3

About us (David and Kristian) - Kristian

• M.Sc. in Computer Science from the Technical University of

Denmark

• Senior Consultant at Netcompany A/S since 2007

• Microsoft Certified Business Intelligence Developer

• Worked with the following clients:

– Flügger

– Forca

– PFA

– Unipension

(4)

Netcompany facts

Netcompany delivers IT and business consulting services

Dedicated to ”old deeds” with high professional standards and focus on solutions which create

both technical and business value for our customers

Deep knowledge and experience within business, technology, process optimization, change

management and project management

Customer base consists of well-established Danish and international companies and

organizations

Awarded by Børsen as “The best IT consultants in Denmark” in 2005, 2006, 2007 and 2008

In 2005, 2006 and 2007 respectively awarded ”System Integrator of the year”, ”Dynamics CRM

partner of the year” and ”Information Worker partner of the year” by Microsoft Denmark.

Microsoft Gold Certified Business Intelligence partner (among other areas)

Average employee age 31

Etablished in 1999

(5)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 5

Please interrupt us!

(6)

Agenda

Analysis Chaos

Locating performance bottlenecks

KISS architecture

Visualization – Representing the data

Case – Maersk Drilling

Future technologies – potential in the DSS and BI realm

Microsoft Business Intelligence

(7)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 7

Analysis Chaos

Introduction

• What happened

• Lessons learned

(8)

Pension Administration Company

Pension Administration Company

Pension for

newly educated

computer

geeks

Pension for

redheaded

highschool

teachers

Pension for Ill

tempered

cleaning ladies

Customer Service Economysystems IT Development IT Maintenance

(9)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 9

New ERP system with reports

Employee

Employee

Manager

Enterprise Resource Planning

(ERP)

(10)

Analysis Chaos

• Introduction

What happened

(11)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 11

Økonomi m

realtid

HR

Projekt

Løn

Analysis

Design

Implementation

Test

Scope at first

Netcompany Consultant

(12)

Analysis

Design

Implementation

Test

Requirements

specification

Positive

Negative

People asked

30-40 People

Netcompany Consultant Netcompany Consultant

Scope Explosion

(13)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 13

Kreditor

Core

Analysis

Design

Implementation

Test

HR

Projekt

Kreditor

Debitor

Pension

Løn

Requirements

specification

Project plan

Scope (8 cubes)

Ressourcer (1700)

Deadline (1/6-2011)

Limitations

Assumptions/Prereq

uisites

Final Scope

Signed off by the business

Signed off by the business

(14)

Name Account Amount Jensen Wage 50

Analysis

Design

Implementation

Test

< 5 min

Real-time requirement at

Monthly balancing

Near Real-time requirement

Manager

Employee

Name Account Amount Jensen Wage 100

(15)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 15

Design

Implementation

Test

Analysis

(16)

Design

Implementation

Test

ERP

Nightly Extract-Transform-Load

F

ull

Proc

es

s

in

g

SQL Server Agent

Every 5 minutes

Delta Load since last run

No deleted Records

Last Modified Date

Partition for current day

SSIS Dataflow Destination:

Partition Processing Add

EDW

Near Real-time Design

Analysis

(17)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 17

Design

Implementation

Test

(18)

BI

Design

Implementation

Test

BI

ERP

Project plan

Scope (8 cubes)

Ressourcer (1700)

Deadline (1/6-2011)

Limitations

Assumptions/Prer

equisites

Status (weekly)

Riscs

ERP

Integration points change

Solution

(19)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 19

Design

Implementation

Test

(20)

Design

Implementation

Test

Lack of Testdata

Analysis

• Garbage In -> Garbage out

Project plan

Scope (8 cubes)

Ressourcer (1700)

Deadline (1/6-2011)

Limitations

Assumptions/Prer

equisites

(21)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 21

Analysis Chaos

• Introduction

• What happened

(22)

Design

Implementation

Test

Evaluation

• Things have different names in different companies

• Things have different names inside the company

• Things have different names in the same room

• Don’t be afraid to ask questions about the business!

Understanding the business

Analysis

(23)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 23

Design

Implementation

Test

Evaluation

• Proposed in October

by Netcompany

• Demanded in April

by the Business

Niveau 3

Niveau 2

Niveau 1

All Projects Project 1 Project 2 Maintenance Cleaning

Does the business know what they want?

Analysis

(24)

Design

Implementation

Test

Evaluation

Place responsibility in the business

• Who is the buyer?

• Make heads of areas

responsible

• Better overview

• Easier to plan workshops

Analysis

(25)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 25

Design

Implementation

Test

Evaluation

Don’t build BI on systems that are not in

production or completely specified yet.

• Hard to ”guess” the right

table

• Documentation!

• Reimplementation when

functionality or interface

change

• Testdata!

1700 tables in source

BI

ERP

Analysis

(26)

Design

Implementation

Test

Evaluation

Top-Down design

Løn

Progr

es

s

• Ensures

business

requirement

s are met.

Analysis

KPI’s Charts, Reports

Cubes

(27)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 27

Analysis

Design

Implementation

Test

Evaluation

Prototype / Iterations

• Show end users the reports -> better analysis / more value

• Find technical limitations early

• Performance testing early

• Discover heavy administrative processes in company

A

D

I

T

Analysis

Design

Implementation

Test

Creation of SSIS user

account takes 1 month 

(28)

Status now

• End of build phase

• Total estimate of 3900 hours (vs. 1700)!

• Ok because:

– Assumptions.

– Limitations.

– Riscs in status reports.

Plan BI as part of your business releases

(29)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 29

Analysis Chaos

Locating performance bottlenecks

KISS architecture

Visualization – Representing the data

Case – Maersk Drilling

Future technologies – potential in the DSS and BI realm

Microsoft Business Intelligence

(30)

Case: Understand the requirements and challenge the business

Requirement: A Top 100 most expensive events report (OLAP candidate?)

Customers are placed in parent/child hierarchy

About 400 million events a year

The report should be fast! (1-2 sec max)

Business requirements would result in a design with complex ETL, high requirements for

hardware platform.

Candidate for Outlier report, data analysis showed the following:

Challenged the business and ended with a reduction in data of 95%

Result: Simple ETL, customer saves money.

Price

No. of events

Reduction

All

7.150.683

0,00%

> 1,-

4.171.859

41,66%

> 5,-

796.349

88,86%

> 10,-

338.048

95,27%

(31)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 31

Locating performance bottlenecks in SSIS

Locate the bottleneck

Source (source component -> row count task)

Destination (flat file -> destination)

Network (local source vs. remote source)

Transformations (enable/disable transformations)

How to fix a slow source: Multiple reads from slow source (union in DF)

How to fix a slow destination: Multiple writes to slow destination (split destination table into

multiple tables and create view joining the tables)

How to fix a slow network: jumbo frames, network adapters…

Demo: Analyzing and fixing a slow query

Execute query in SSMS and Include Actual Execution Plan

Use SQL Profiler to log the activity on SQL Server

(32)

Analysis Chaos

Locating performance bottlenecks

KISS architecture

Visualization – Representing the data

Case – Maersk Drilling

Future technologies – potential in the DSS and BI realm

Microsoft Business Intelligence

(33)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 33

Architecture KISS

• Plan:

Existing Architecture

– Alternative Architecture

– Lessons learned

(34)

Existing Architecture

Names Insurance Customer Service Contibution Ext Ext Ext Ext Hist Hist Hist Hist 3NF EDW Source systems 1:1 Historical changes

40 minutes

Custom component:

Slowly Changing

Dimension on facts!

Aktuelle Views

Afledt periodisering

(35)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 35

Afledt Periodisering

Sag nr 127

Jens Hansen

Sag nr 127

(gammel)

Jens Hansen

(gammel)

Sag nr 127

Jens Jensen

• Possibly cascading through large parts of Datawarehouse

• Slow

(36)

Existing Architecture

Ext Ext Ext Ext Hist Hist Hist DSH 3NF EDW DMV DMV Insurance Cube Source systems 1:1 Historical changes Member Cube

Actual Views

Afledt periodisering

40 minutes

Custom component:

Slowly Changing

Dimension on facts!

Views on views

Denormalizing

Processing

SQL Query Optimizer

works bad on these

complicated views

Names Insurance Customer Service Contibution

(37)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 37

Dimensions in Insurance Cube

(38)

Architecture KISS

• Plan:

– Existing Architecture

Alternative Architecture

(39)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 39

Alternative design

DSH DIM EDW Insurance Cube Member Cube No historic changes in source: Delta

Load using Created date

Historic changes in source: Delta Load using Last Modified date + delete old rows

Historic changes in source and no Last Modified Date Source systems

NO Actual Views

NO Afledt periodisering

Names Insurance Customer Service Contibution

(40)

Architecture KISS

• Plan:

– Existing Architecture

– Alternative Architecture

(41)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 41

Lessons learned

• Do the requirements come from the business (Extract,

Historical changes, 3NF EDW)?

• 3NF vs. Dimensional EDW

• Pragmatic approcah

• Maintenance / further development

– New/changed field:

– Original architecture: 8 places.

– Alternative architecture: 3 places

• Smaller cubes and fewer in each dimensions (Perspective)

• KISS

(42)

Analysis Chaos

Locating performance bottlenecks

KISS architecture

Visualization – Representing the data

Case – Maersk Drilling

Future technologies – potential in the DSS and BI realm

Microsoft Business Intelligence

(43)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 43

Dashboards

• A dashboard is

”A visual display of the most information needed to achieve one or

more objectives which fits entirely on a single computer screen so

it can be monitored at a glance”

• A dashboard must provide an acurate overview of the data in seconds

• Dashboards should always provide unambigious up-to-date data.

• Dashboards can effectively impact the strategy of a corporation by

visually presenting goals and status

(44)
(45)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 45

(46)

Pie-charts – a pet peeve

Pie-charts are rarely ever effective at presenting

data

Pie-chart of Market

share

Same data but using simple bar

charts

(47)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 47

(48)

Dashboard challenges

Be carefull to avoid

• Scrolling – prevents users of ever getting an overview

• Space wasters – fancy dashboard designs

• Pie charts – non-intuitive data presentation

• Misused terminology – make sure to use terms that the entire

business agrees on

• Data errors – Both from source data, but also from in our own

calculations

Once a user looses trust in your dashboards, it is very hard

to gain it back!

(49)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 49

Analysis Chaos

Locating performance bottlenecks

KISS architecture

Visualization – Representing the data

Case – Maersk Drilling

Future technologies – potential in the DSS and BI realm

Microsoft Business Intelligence

(50)

Analysis Chaos

Locating performance bottlenecks

KISS architecture

Visualization – Representing the data

Case – Maersk Drilling

Future technologies – potential in the DSS and BI realm

Microsoft Business Intelligence

(51)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 51

Trends in Business Intelligence

• Search in Business Intelligence

– WolframAlpha

• Enrich data warehouse with data from online sources

– Azure datamarket (

http://datamarket.azure.com

)

– Amazon AWS (

http://aws.amazon.com

)

– Digitalisér (

http://digitaliser.dk/

)

– Etc.

• True real time Business Intelligence

– StreamInsight

• Self-serviced Business Intelligence

– PowerPivot

(52)

Trends in Business Intelligence: Search

Search in Business Intelligence

(53)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 53

Trends in BI: Add GPS coordinates from address (1/2)

(54)

Trends in BI: Add GPS coordinates from address (2/2)

Use for store location, sale heat map, etc.

(55)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 55

Trends in BI: : True Realtime BI with Microsoft StreamInsight

Applicable when data process is: Event driven, time sensitive, high volumes of data, no need

for persisting data

Used in call centers, financial markets, etc.

(56)

Future Technologies - Pervasive BI

• How do you administer 1.000 users with their own analytical tools?

• What happened to ”

single version of the truth

”?

• Visualization for

non-analysts

• Data warehouse still the main

centralized datastore (MDM?)

• How long will I have a job?

(57)

© 2011 Netcompany A/S Netcompany A/S · Grønningen 19 · 1270 København K · Tlf. 70131440 Side 57

References

• The Data Warehouse Lifecycle Toolkit – Ralph Kimball

• Building the Data Warehouse – W. H. Inmon

• The Visual Display of Quantitative Information - Edward Tufte

• Information Dashboard Design – Stephen Few

• Microsoft BI Conference – Seattle

• Database System Concepts - Silberschatz

• Netcompany References -

http://www.netcompany.com/losninger/business_Intelligence.

(58)

Questions?

David Flenstrup

Senior Consultant

mobil: +45 2228 2945

e-mail: dfl@netcompany.com

Kristian Staalø Knudsen

Senior Consultant

mobil: +45 2510 1280

e-mail: ksk@netcompany.com

References

Related documents

As part of the General Education Program at Kennesaw State University, this course is designed to provide students with basic knowledge of world history expected of an

The real GDP growth, the ratio between domestic money stock and foreign exchange reserves and delays of the real exchange rate, relative to its average over the previous

CUMBERLAND SERVICE,

This survey course introduces students to the major forces and dynamics of change in the modern history of Africa, from the late 19th century to the present. The primary focus is

if after receiving benefits, you return to work for less than 6 months only to become disabled again from the same cause, you will not have to go through another elimination period

This survey course introduces students to the major landmarks in the making of African history and societies from the earliest times to 1870 AD. Beginning with the dual premises

HIST 1110 Survey of World Civilization I HIST 1210 World History I HIST 1120 Survey of World Civilization II HIST 1220 World History II HIST 2010 Survey of American History I HIST

Widgets display data to user in a better manner by using different data layout components like grid, charts, tree, forms, and so on.. In widgets based RIA, the focus is on