Data
Warehouse Design
Part 06
Technology
Data Warehouse Design
Part 06
Data
Warehouse Design
Part 06
Data
Warehouse Design
Part 06
Technology
Data Warehouse Design
Goal:To introduce the students to the idea of data warehouse
To show how it can be designed To give an example
To point some of the difficulties and challenges involved
Data
Warehouse Design
Part 06
Technology
Data Warehouse Design – Part 06
Contents:Metadata
Data
Warehouse Part 06
Technology
Metadata
What you wanted to know about metadata
but never dared to ask
What do we know:
We know Metadata
Rhymes with Petadata
Data Warehouse Part 06 Technology
Metadata
Meta means
”on higher level”
”definitions of ”
”description of ”
Data Warehouse Part 06 Technology
Metadata
A database system has metadata in system tables.Data
Warehouse Part 06
Technology
Metadata
E.F. Codd’s 12 rules . . .
Rule 1: The Information RuleAll data should be presented to the user in table form.
Rule 4: Dynamic On-Line Catalog Based on the Relational Model.
A relational database must provide access to its structure through the same tools that are used to access the data. This is usually accomplished by
storing the structure definition within special system tables.
Data
Warehouse Part 06
Data
Warehouse Part 06
Metadata
The term metadata is a bit vague. It usually includes
• Description about the data in the DW
• type, format, name, description of each item It could include a lot more
Data
Warehouse Part 06
Metadata - could include
•
Information about source systems and the meaning of data from them• Business rules to be applied when reading, transforming, joining data and calculating derived values
• Business rules used in reporting and analysis
• Quality assessment: how complete is any source, how well validated, how reliable
• Who is the owner of data
Data
Warehouse Part 06
Metadata
•
Metadata can and should be used in all the application of the DW• The applications should be metadata-driven •
• The amount of metadata can be very large.
Data
Warehouse Part 06
Technology
There’s always a time for discussion
A time to be sowing
A time to be growing
A time to be courting
A girl of your own
Twas so good to be young then
In the wake of the earth
And to stand by your wife
At the moment of birth
Data
Warehouse Part 06
Technology
Data
Warehouse Part 06
Technology
So. Let us build a Data Warehouse
- And survive
Data
Warehouse Part 06
Technology
So. Let us build a Data Warehouse
Data
Warehouse Part 06
Technology
So. Let us build a Data Warehouse
The concept is over 10 years old. It is widely done.
Data
Warehouse Part 06
Technology
So. Let us build a Data Warehouse
Before we begin the project
Make sure the normal prequisities are there •The support from the top
•Clearly defined responsibilities •Genuine business need
•And so on But:
Data
Warehouse Part 06
Technology
Political problems
Is the project a Data Warehouse project?
It must be called that or it must not be called that Why “must not”:
• There is already a “Data Warehouse” • The previous DW project failed
• Similar projects have failed elsewhere
Data Warehouse Part 06 Technology
Political problems
It could be called• Analytic and Reporting Data Store • Detailed Extract Database
• Integrated Detail Layer
• Common History Datastore • Information Database
• Reporting Database • and so on . . .
Data
Warehouse Part 06
Technology
For whom should it be
• A Data Warehouse is for rather general purpose. • Not for one specific use for a small group.
Data
Warehouse Part 06
Technology
Business issues
At the beginning it might be a good idea to do something more at the same time (or before)
• Process redesign of information delivery • Any isolated islands
• Improved management decision systems • Customer focus – pay attention to this • Any new applications around / needed ? • Risk management
• Any new management visions
Data
Warehouse Part 06
We need it now
• A large, expensive, and long project may not be popular
• Deliver results (and money) quickly
• Does this spell “A quick little data mart”
• With a forethought the project can solve the
immediate need and prepare for the future at the same time
•Try to avoid building multiple independent data marts • Better be consistent
• Build in stages, based on an overall vision and plan for the longer term
Data
Warehouse Part 06
We need it now
• It is a good idea to begin with a limited selection of the data
A1 A2 A3 A4 A5 A6 A7
Data
Warehouse Part 06
We need it now
• It is a good idea to begin with a limited selection of the data
A1 A2 A3 A4 A5 A6 A7
We need it now
• It is a good idea to begin with a limited selection of the data
May be difficult to limit
Data
Warehouse Part 06
Technology
We need it now
• It is a good idea to begin with a limited selection of the data
May be difficult to limit
Data
Warehouse Part 06
Technology
Data
Warehouse Part 06
What information will there be?
• Good to already have conceptual and logical data models
• these can be used to understand the operational data sources
• Starting from scratch one has to construct a logical data model of the existing situation
• Gather subjects, entities and attributes
Data
Warehouse Part 06
About attributes
• Give each attribute a unique name
• Define the attributes
• type, domain, field length, . . .
• Identify the data source for this attribute
• Define the transformation needed for this attribute • Define the checking needed for this attribute
• Understand the timing aspects of this attribute • update cycle, time constraints, retrospection • Document that you have found and done
• This will form the metadata later on
Data
Warehouse Part 06
Being consistent
Try to design and define standards applicable for the whole data warehouse environment
• Naming the items
• E.G. Hungarian notation
• First characters describe the use or the contents • ID – some sort of Id or key: IDCUSTOMER
• NA – name sort of: NACUSTOMER
• EU – monetary attribute: EUCUSTOMER
• Checking routines • Input routines
Data
Warehouse Part 06
How big is too big?
How much can the project handle at one time • Some alarming points might be
• More than 15 source data tables
• More than 2 source application systems • Multiple complex transformations
• Many additive measures • more than 3 or 4 facts Try to concentrate.
Find trouble before it finds you.
Data
Warehouse Part 06
What to do with too big steps?
Some things to consider
• Break a large work into smaller sections • Deliver each one in turn
• Limit the subjects
• Delay some difficult reporting to a later stage • Often requirements change anyway
• Prioritize based on business value
• The highest value may be the highest project risk
• In that case secure considerable commitment from the sponsor
Data
Warehouse Part 06
Technology
Deliver value on the way
If at all possible provide something for the needy.
Some useful part of the total may be extremely valuable
Once you have some live users you have a production Data Warehouse. The deliverables should be right
Data
Warehouse Part 06
Technology
Deliver value on the way
If at all possible provide something for the needy.
Some useful part of the total may be extremely valuable
Once you have some live users you have a production Data Warehouse. The deliverables should be right
Data
Warehouse Part 06
On practical issues and constraints
There are some very practical issues
1. Job scheduling
• How to schedule the data capture • What cycle for the DW ?
• What cycles there are in the operational systems • Understand the dependencies
• What effect retrospection ?
• The date the DW has new data
• correct and ready to use or about such
Data
Warehouse Part 06
On practical issues and constraints
There are some very practical issues
2. Disk Space
• “One cannot be too rich or have too much disk space” • Figure the size of your tables, factor in the growth
• How about • work files
• summary files
• Multiple dimension data bases (MDDB’s) • Areas needed for exploitation tools
• metadata
Data
Warehouse Part 06
On practical issues and constraints
There are some very practical issues
3. Political issues
• “Every file has an owner” • Find the owner
• Who is responsible for data being correct ? • The Data Warehouse is for everyone
• should be treated as such • Usage rights
• who is allowed this or that •
Data
Warehouse Design
Part 06
Technology
Data
Warehouse Design
Part 06
Technology
Data
Warehouse Design
Part 06
Technology
I get up and I see the sunrise. And I feel good, yeah
For my life that begun. You and me were free We went out to see From morning, ooh
Till the end of the day.
The Kinks This is the end