• No results found

Chapter 7 Multidimensional Data Modeling (MDDM)

N/A
N/A
Protected

Academic year: 2021

Share "Chapter 7 Multidimensional Data Modeling (MDDM)"

Copied!
37
0
0

Loading.... (view fulltext now)

Full text

(1)

Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

Chapter 7

Multidimensional Data

Modeling (MDDM)

(2)

Learning Objectives Learning Outcomes

1. To assess the capabilities of OLTP and OLAP systems

(a) Appreciate the differences between OLTP and OLAP systems

2. To understand Dimensional Modeling (DM)

(b) Understanding of DM, basics of data warehousing and related terminology; also an overview of what facts and dimensions are

3. To learn data warehousing in further detail through a case-study

(c) Understand how to convert an OLTP schema into a dimensional schema model through various techniques of data warehousing

Learning Objectives and Learning Outcomes

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(3)

Lecture time : 90 minutes approx.

Q/A : 15 minutes

Session Plan

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(4)

Agenda

• Database Concepts – Recap

• Introduction to On-line Analytical Processing (OLAP)

• Multidimensional Data Modeling (MDDM)

– To Answer Why ? Where ? When ? and How ?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(5)

Recap

• Databases and Tables

• Normalization and Keys

– ACID Properties

– Primary, Foreign and Surrogate Keys

– Cardinality

• Transactions

– On-Line Transaction Processing

– On-Line Analytical Processing

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(6)

Recap (contd.)

• Difference between OLTP and OLAP

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(7)

Comparison of OLTP and DSS

OLTP Capability Examples DSS Capability Examples • Search & locate student(s)

• Print student scores

• Filter students above 90% marks

• Update student Grade

• Group by Batch and

compute average score

• Find top 10 high

performance students

• Which courses have productivity impact on-the-job?

• Which colleges need to be rewarded for supplying students with consistent high on-the-job performance?

• What is the customer satisfaction improvement due to extended training?

• How project level profitability is influenced by certification?

• How much training is needed on future technologies for non-linear growth in BI?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(8)

Still a little fuzzy about what OLAP can do?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(9)

Introduction to On-line Analytical Processing (OLAP)

Scenario:

Internal systems department at Infosys maintains all relevant data in a database. Conceptual schema is as shown below.

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(10)

OLAP Contd.

• CEO of the company wants the following information from the IS department.

• Number of employees added in the role of the company during the last quarter/6 months/1 year

Q1. How many table(s) is/are required ?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(11)

Q2. How many employees are currently in the projects ? Q3. How many are on bench?

OLAP Contd.

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(12)

Q4. Which customer from a region has given maximum business during the previous quarter on a domain under specific technology and who are the PMs of the project and assets owned by them.

OLAP Contd.

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(13)

SOLUTION?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(14)

So if there were very few updates and more of data-retrieval queries being made on your database, what do you think would be a better schema to adopt?

OLTP or OLAP

Answer a Quick Question

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(15)

Introduction to Dimensional Modeling (DM)

• DM is a logical design technique used in Data Warehouses (DW). It is quite directly related to OLAP systems

• DM is a design technique for databases intended to support end-user queries in a DW

• It is oriented around understandability, as opposed to database administration

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(16)

However, before we actually jump into MDDM…

let’s first understand the language of

Dimensional Modeling

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(17)

MDDM Terminology

• Grain • Fact • Dimension • Cube • Star • Snowflake

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(18)

Of hierarchies and levels…

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(19)

What Is a Grain?

• Identifying the grain also means deciding the level of detail that will be made available in the dimensional model

Granularity is defined as the detailed level of information stored in a table

• The more the detail, the lower is the level of granularity

• The lesser the detail, higher is the level of granularity

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(20)

What can be measured, can be controlled…

…and do you know how such measurements are stored in a data warehouse?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(21)

Facts and Fact Tables

• Consists of at least two or more foreign keys

• Generally has huge numbers of records

• Useful facts tend to be numeric and additive

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(22)

Fact Types

Facts Additive Non Additive Semi Additive Factless Fact

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(23)

And what about descriptive data?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(24)

What Are Dimensions/Dimension Tables?

• The dimensional tables contain attributes (descriptive) which are typically static values containing textual data or discrete numbers which behave as text values.

• Main functionalities :

• Query filtering\constraining

• Query result set labeling

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(25)

Types of Dimensions

Dimension Type Slowly Changing Dimension Rapidly Changing Dimension Degenerate Dimension Junk (garbage) Dimension Role-playing Dimension

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(26)

Understanding Dimension

Cube

• An extension to the two-dimensional Table.

• For example in the previous scenario CEO wants a report on revenue generated by different services across regions during each quarter

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(27)

Understanding Dimension

Cube (contd.)

Dimension Hierarchy Grain Fact Testing Consulting Production Support N .America Europe Asia Pacific Q1 Q2 Q3 Q4

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(28)

Can a data warehouse schema take different forms with respect to normalization?

Answer a Quick Question

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(29)

Star Schema

The basic star schema contains four components. These are:

Fact table, Dimension tables, Attributes and Dimension hierarchies

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(30)

Snow Flake Schema

• Normalization and expansion of the dimension tables in a star schema result in the implementation of a snowflake design.

• A dimension table is said to be snow flaked when the low-cardinality attributes in the dimension have been removed to separate normalized tables and these normalized tables are then joined back into the original dimension table

.

(31)

Snow Flaking Example

• Consider the Normalized form of Region dimension

Region RegionID Country Code State Code City Code Country Country Code Country Name State Code

City Code State

State code

State Name

City Code City

City Code City Name ZIP

(32)

Armed with these weapons that we call ‘Concepts’, let’s

step into the battlefield!

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(33)

Case Study

Conversion of a ER Model to a Dimensional Model

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(34)

ORDER order_num (PK) customer_ID (FK) store_ID (FK) clerk_ID (FK) date STORE store_ID (PK) store_name floor_type address district CLERK clerk_id (PK) clerk_name clerk_grade PRODUCT SKU (PK) Description category brand CUSTOMER customer_ID (PK) customer_name credit_profile purchase_profile address PROMOTION promotion_NUM (PK) promotion_name ad_type price_type ORDER-LINE order_num (PK) (FK) SKU (PK) (FK) promotion_key (FK) dollars_cost dollars_sold units_sold ER Diagram

(35)

TIME time_key (PK) SQL_date day_of_week month STORE store_key (PK) store_ID store_name floor_type address district CLERK clerk_key (PK) clerk_id clerk_name clerk_grade PRODUCT product_key (PK) SKU category description brand CUSTOMER customer_key (PK) customer_name credit_profile purchase_profile address PROMOTION promotion_key (PK) promotion_name ad_type price_type FACT time_key (FK) customer_key (FK) store_key (FK) clerk_key (FK) product_key (FK) promotion_key (FK) dollars_cost dollars_sold units_sold DIMENSONAL MODEL

(36)

Summary

• Basics of Database • OLTP • MDDM • Cube • Star Schema • Snowflake schema

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

(37)

Food for Thought!

1. Who according to you would be the user of OLAP?

2. Who would need the Multidimensional perspective of data?

“Fundamentals of Business Analytics”

RN Prasad and Seema Acharya Copyright  2011 Wiley India Pvt. Ltd. All rights reserved.

References

Related documents

Tandelilin (2000) also stated that macroeconomic factors empirically have been shown to have an influence on capital market conditions. According to aforementioned descriptions,

We studied the independent effects of amount of woody habitat in the landscape and three levels of isolation from the next woody habitat (patch isolation) on trap nesting bees,

The object of this study is the billboard (as one type of out-of-home advertising) in Yogyakarta which local products and places of interest become its content..

This study examines the effect of exogenous isoprene (20 ppbv) on net photosynthesis, stomatal conductance and production of H 2 O 2 (a reactive oxygen species: ROS) in leaves

The outbreak began on Wednesday the 24th of December 2014, when NHS Greater Glasgow and Clyde (NHS GGC) health board Public Health Protection Unit (PHPU) were notified that a

The first robot programmes, like for example STRIPS developed at Stanford Research Institute, also followed the symbolic tradition by representing all the knowledge of the robot

Required Majors Engineering, Finance, HR, Business, Sales & Market- ing, Administration, Management. Company Profile Management Plus Recruiters (MPlus) was established

Straddling the gulf between vague customer ideas and the clear specifications that will guide the software team’s work, the analyst must first understand the users’ goals for the