• No results found

Data Warehousing Fundamentals

N/A
N/A
Protected

Academic year: 2021

Share "Data Warehousing Fundamentals"

Copied!
380
0
0

Loading.... (view fulltext now)

Full text

(1)

... 50102GC20 Production 2.0 May 1999 M08761

Data Warehousing

Fundamentals

(2)

Authors Chon S. Chua Richard Green Technical Contributors and Reviewers Jackie Collins Jennifer Jacoby Mike Schmitz John Haydu Russ Pitts Lauran Serhal Brian Pottle Donna Corrigan Patricia Moll Harry Penbert SuiWah Chan Joel Barkin Steve Dressler Publisher Tony McGettigan

Copyright  Oracle Corporation, 1999. All rights reserved.

This documentation contains proprietary information of Oracle Corporation. It is provided under a license agreement containing restrictions on use and disclosure and is also protected by copyright law. Reverse engineering of the software is prohibited. If this documentation is delivered to a U.S. Government Agency of the Department of Defense, then it is delivered with Restricted Rights and the following legend is applicable:

Restricted Rights Legend

Use, duplication or disclosure by the Government is subject to restrictions for commercial computer software and shall be deemed to be Restricted Rights software under Federal law, as set forth in subparagraph (c) (1) (ii) of DFARS 252.227-7013, Rights in Technical Data and Computer Software (October 1988). This material or any portion of it may not be copied in any form or by any means without the express prior written permission of Oracle Corporation. Any other copying is a violation of copyright law and may result in civil and/or criminal penalties.

If this documentation is delivered to a U.S. Government Agency not within the Department of Defense, then it is delivered with “Restricted Rights,” as defined in FAR 52.227-14, Rights in Data-General, including Alternate III (June 1987). The information in this document is subject to change without notice. If you find any problems in the documentation, please report them in writing to Education Products, Oracle Corporation, 500 Oracle Parkway, Box SB-6, Redwood Shores, CA 94065. Oracle Corporation does not warrant that this document is error-free. Data Warehouse Method—A Methodology for Designing Data Warehouse, SQL*Loader, PL/SQL, Pro*C, Oracle7, Oracle8, and Oracle8i, Distributed Option, Parallel Query Option, Parallel Server Option, Media Server, Spatial Data Option, ConText Option, Video Server, Text Server, WebServer, Oracle Universal Server ROLAP Option, Express Server, Web-enabled Express Server, SQL*Net, Developer/2000, Relational Access Manager, Discoverer, Designer/2000, SQL*Bridge, Transparent Gateway Developer’s Kit, Procedural Gateway Developer’s Kit, Express, Express Analyzer, Express Objects, Sales Analyzer, and Financial Analyzer are product names, trademarks, or registered trademarks of Oracle Corporation.

All other products or company names are used for identification purposes only and may be trademarks of their respective owners.

(3)

...

Data Warehousing Fundamentals iii

...Contents

Preface

Profile xi

Related Publications xiv Typographic Conventions xv

Lesson 1: Introduction

Course Objectives 1-3 Agenda 1-5

Questions About You 1-9

Lesson 2: Meeting a Business Need

Overview 2-3

Unsuitability of OLTP Systems for Complex Analysis 2-5 Management Information Systems and Decision Support 2-7 Data Extract Processing 2-9

Business Drivers for Data Warehouses 2-15

Current Situation and Growth of Data Warehousing 2-19 Typical Uses of a Data Warehouse 2-21

Summary 2-23 Practice 2-1 2-25

Lesson 3: Defining Data Warehouse Concepts and Terminology

Overview 3-3

Data Warehouse Definition 3-5 Data Warehouse Properties 3-7 Data Warehouse Terminology 3-21 Components of a Data Warehouse 3-25

Oracle Warehouse Vision, Products, and Services 3-31 Summary 3-41

Practice 3-1 3-43

Lesson 4: Driving Implementation Through a Methodology

Overview 4-3

Warehouse Development Approaches 4-5

(4)

... Contents

Oracle Data Warehouse Method 4-15 DWM Fundamental Elements 4-19

Oracle Warehouse Technology Initiative (WTI) 4-57 Summary 4-61

Practice 4-1 4-63

Lesson 5: Planning for a Successful Warehouse

Overview 5-3

Managing Financial Issues 5-5 Obtaining Business Commitment 5-9 Managing a Warehouse Project 5-15 Identifying Planning Phases 5-29

Identifying Warehouse Strategy Phase Deliverables 5-31 Identifying Project Scope Phase Deliverables 5-35 Summary 5-41

Practice 5-1 5-43

Lesson 6: Analyzing User Query Needs

Overview 6-3 Types of Users 6-5

Gathering User Requirements 6-7 Managing User Data Access 6-9 Security 6-21

OLAP 6-25

Query Access Architectures 6-47 Summary 6-51

Practice 6-1 6-53

Lesson 7: Modeling the Data Warehouse

Overview 7-3

(5)

...

Data Warehousing Fundamentals v

...Contents Summary 7-41

Practice 7-1 7-43

Lesson 8: Choosing a Computing Architecture

Overview 8-3

Architecture Requirements 8-5 The Hardware Architecture 8-7 Database Server Requirements 8-29 Parallel Processing 8-33

Summary 8-39 Practice 8-1 8-41

Lesson 9: Planning Warehouse Storage

Overview 9-3

The Server Data Architecture 9-5 Protecting the Database 9-17 Summary 9-27

Practice 9-1 9-29

Lesson 10: Building the Warehouse

Overview 10-3

Extracting, Transforming, and Transporting Data 10-5 Extracting Data 10-13

Examining Data Sources 10-15 Extraction Techniques 10-23 Extraction Tools 10-35 Summary 10-39 Practice 10-1 10-41

Lesson 11: Transforming Data

Overview 11-3

Importance of Data Quality 11-5 Transformation 11-13

Transforming Data: Problems and Solutions 11-17 Transformation Techniques 11-33

(6)

... Contents

Transformation Tools 11-53 Summary 11-57

Practice 11-1 11-59

Lesson 12: Transportation: Loading Warehouse Data

Overview 12-3

Transporting Data into the Warehouse 12-5 Building the Transportation Process 12-11 Transporting the Data 12-15

Postprocessing of Loaded Data 12-25 Summary 12-39

Practice 12-1 12-41

Lesson 13: Transportation: Refreshing Warehouse Data

Overview 13-3

Capturing Changed Data 13-5

Limitations of Methods for Applying Changes 13-25 Purging and Archiving Data 13-33

Final Tasks 13-39

Selecting ETT Tools 13-43 Summary 13-51

Practice 13-1 13-53

Lesson 14: Leaving a Metadata Trail

Overview 14-3

Defining Warehouse Metadata 14-5 Developing a Metadata Strategy 14-11 Examining Types of Metadata 14-19 Metadata Management Tools 14-33 Common Warehouse Metadata 14-35 Summary 14-37

(7)

...

Data Warehousing Fundamentals vii

...Contents Business Intelligence 15-5

Multidimensional Query Techniques 15-7 Categories of Business Intelligence Tools 15-9 Data Mining in a Warehouse Environment 15-19 Oracle Data Mining Partners 15-33

Summary 15-35 Practice 15-1 15-37

Lesson 16: Web-Enabling the Warehouse

Overview 16-3

Accessing the Warehouse Over the Web 16-5 Common Web Data Warehouse Architecture 16-9

Issues in Deploying a Data Warehouse on the Web 16-11 Evaluating Web-Based Tools 16-19

Summary 16-23 Practice 16-1 16-25

Lesson 17: Managing the Data Warehouse

Overview 17-3

Managing the Transition to Production 17-5 Managing Growth 17-19

Managing Backup and Recovery 17-33

Identifying Data Warehouse Performance Issues 17-45 Summary 17-51

Appendix A: Practice Solutions

Practice 2-1 A-2 Practice 3-1 A-4 Practice 4-1 A-7 Practice 5-1 A-11 Practice 6-1 A-12 Practice 7-1 A-13 Practice 8-1 A-14 Practice 9-1 A-15

(8)

... Contents Practice 10-1 A-18 Practice 11-1 A-20 Practice 12-1 A-21 Practice 13-1 A-23 Practice 14-1 A-24 Practice 15-1 A-26 Practice 16-1 A-28 Glossary

(9)

...

Preface

(10)
(11)

...

Data Warehousing Fundamentals xi

...Profile

Profile

Before You Begin This Course

This course is the entry-level course in the Data Warehousing curriculum. Therefore, there are no prerequisites to this course.

Prerequisites

There are no prerequisites for this course.

How This Course Is Organized

Data Warehousing Fundamentals is an instructor-led course featuring lecture and

paper and pencil exercises as well as group discussions to reinforce the concepts and skills introduced.

Lesson Aim

Lesson 1: Introduction

In this lesson, the class format is reviewed, the class agenda is described, and students introduce themselves. Because this class is expected to appeal to a broad audience, the introduction will give the instructor an idea of the composition of the class in terms of data warehouse knowledge, Oracle knowledge, and the specific role that each student plays with regard to data warehousing. Lesson 2: Meeting a

Business Need

This lesson examines how data warehousing has evolved from early management information systems to today’s decision support systems. The primary motivating factors for data warehouse creation are explored. The types of industries employing data warehouse are considered.

Lesson 3: Defining Data Warehouse Concepts and Terminology

This lesson introduces the Oracle definition of a data warehouse. The lesson offers a general description of the properties of a data warehouse. The standard components and tools required to build, operate, and use a data warehouse are identified.

Lesson 4: Driving Implementation Through a Methodology

This lesson introduces the Oracle Data Warehouse Method (DWM), a methodology employed by Oracle Consulting Services for incremental development of a total warehouse solution using a phased development approach. Partnering initiatives launched by Oracle are described.

Lesson 5: Planning for a Successful Warehouse

This lesson introduces the planning that is critical to the success of a data warehouse project. Planning phases, deliverables, and project roles are identified. Overall warehouse strategy and project scope are defined.

(12)

... Preface

Lesson 6: Analyzing User Query Needs

This lesson identifies the analysis required to identify and

categorize users that may need to access data from the warehouse, and how their requirements differ. Data access and reporting tools are considered.

Lesson 7: Modeling the Data Warehouse

This lesson examines the role of data modeling in a data

warehousing environment. The lesson presents a very high level overview of warehouse modeling steps. You consider the different types of models that can be employed, such as the star schema. Tools available for warehouse modeling are introduced.

Lesson 8: Choosing a Computing

Architecture

This lesson examines the computer architectures that commonly support data warehouses. The benefits of each hardware

architecture and reasons for using distributed warehouses are examined. Students examine the technology requirements of a database server for warehousing.

Lesson 9: Planning Warehouse Storage

This lesson examines the database setup and management issues such as partitioning, indexing, and ways to protect your database. Lesson 10: Building

the Warehouse

In this lesson, you explore the sources of data for the data warehouse data. You consider how the extraction and transformation processes take data from source systems and change it into data that is acceptable to the users of the data warehouse. The lesson also describes typical data anomalies and looks at ways to eliminate them.

Lesson 11: Transforming Data

In this lesson, you explore how the transformation process

transforms data from source systems into data suitable for end user query and analysis applications.

Lesson 12: Transportation: Loading Warehouse Data

In this lesson, you examine how the extracted and transformed data is transported into the warehouse.

Lesson 13: Transportation: Refreshing Warehouse Data

In this lesson, you examine methods for updating the warehouse with changed data, after the first-time load.

(13)

...

Data Warehousing Fundamentals xiii

...Profile

Lesson 14: Leaving a Metadata Trail

This lesson focuses on the concept of warehouse metadata, and the role it plays in a well-developed and managed warehousing environment.

Lesson 15:

Supporting End-User Access

This lesson investigates the ways that users may access the data in the data warehouse. Students are introduced to the concept of business intelligence. The lesson discusses the discovery model used by mining tools, and the reasons enterprises are looking at data mining solutions for discovery of information.

Lesson 16: Web-Enabling the Warehouse

This lesson discusses how to take advantage of the Web to deploy data warehouse information. It addresses internal and external access, as well as the advantages of Web-enabling a data

warehouse. The lesson outlines the steps involved in deploying a enabled data warehouse. Challenges in deploying a Web-enabled data warehouse are also discussed.

Lesson 17: Managing the Data Warehouse

This lesson explores the management issues, critical success factors, and challenges to successful data warehouse

implementation. The lesson addresses issues pertaining to the management of the entire warehouse life cycle.

(14)

... Preface

Related Publications

Oracle Publications

Additional Publications

Oracle DBA Handbook, Loney, Kevin, Osborne McGraw-Hill; ISBN: 007882406.

Oracle: The Complete Reference, Koch, George and Kevin Loney; Oracle Press;

ISBN: 007882396X.

The Data Warehouse Toolkit, Kimball, Ralph; John Wiley & Sons; ISBN:

0471153370.

Building the Data Warehouse, Inmon, W.; John Wiley & Sons; ISBN:

0471141615.

Oracle8 Data Warehousing, Dodge, Gary and Gorman, T.; John Wiley & Sons;

ISBN: 0471199524.

The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing,

Developing, and Deploying Data Warehouses, Kimball, Ralph and others; John

Wiley & Sons, 1998; ISBN: 0471255475.

Data Warehouse Design Solutions, Adamson, C. and Venerable, M.; John Wiley &

Sons, 1998; ISBN 0-471-25195-X.

Data Warehousing:Architecture and Implementation, Humphries, M. et. al.,

Prentice Hall PTR, 1999; ISBN: 0-13-080902-0.

Web Sites

• Data Warehouse Institute Web site, at http://www.dw-institute.com/ index.htm

The Data Warehouse Information Center Web site, at http://

Title URL

Oracle8i for Data Warehousing: Fast and Simple for More Data and More Users (Nov 1998)

http://

websight.us.oracle .com

Large Scale Data Warehousing with Oracle8i, Winter Corporation Sponsored Research Program

http://

websight.us.oracle .com

(15)

...

Data Warehousing Fundamentals xv

...Typographic Conventions

Typographic Conventions

Typographic Conventions in Text

Convention Element Example

Bold italic Glossary term (if there is a glossary)

The algorithm inserts the new key.

Caps and lowercase Buttons, check boxes, triggers, windows

Click the Executable button.

Select the Can’t Delete Card check box. Assign a When-Validate-Item trigger . . . Open the Master Schedule window. Courier new, case sensitive (default is lowercase) Code output, directory names, filenames, passwords, pathnames, URLs, user input, usernames

Code output: debug.seti(’I’,300);

Directory:bin (DOS), $FMHOME (UNIX) Filename: Locate the init.ora file. Password: Use tiger as your password. Pathname: Open c:\my_docs\projects

URL: Go to http://www.oracle.com

User input: Enter 300

Username: Log on as scott

Initial cap Graphics labels (unless the term is a proper noun)

Customer address (but Oracle Payables)

Italic Emphasized words

and phrases, titles of books and courses, variables

Do not save changes to the database. For further information, see Oracle7 Server

SQL Language Reference Manual.

Enter [email protected], where

user_id is the name of the user.

Quotation marks Interface elements with long names that have only initial caps; lesson and chapter titles in cross-references

Select “Include a reusable module component” and click Finish.

This subject is covered in Unit II, Lesson 3, “Working with Objects.”

Uppercase SQL column

names, commands, functions, schemas, table names

Use the SELECT command to view information stored in the LAST_NAME column of the EMP table.

(16)

... Preface

Typographic Conventions in Code

Typographic Conventions in Navigation Paths

This course uses simplified navigation paths, such as the following example, to direct you through Oracle Applications.

(N) Invoice—>Entry—>Invoice Batches Summary (M) Query—>Find (B) Approve

This simplified path translates to the following:

1 (N) From the Navigator window, select Invoice—>Entry—>Invoice Batches Summary.

2 (M) From the menu bar, select Query—>Find.

3 (B) Click the Approve button.

Arrow Menu paths Select File—>Save.

Brackets Key names Press [Enter].

Commas Key sequences Press and release these keys one at a time: [Alt], [F], [D]

Plus signs Key combinations Press and hold these keys simultaneously: [Ctrl]+[Alt]+[Del]

Convention Element Example

Caps and lowercase Oracle Forms triggers

When-Validate-Item

Lowercase Column names,

table names

SELECT last_name FROM s_emp;

Passwords DROP USER scott

IDENTIFIED BY tiger;

PL/SQL objects OG_ACTIVATE_LAYER

(OG_GET_LAYER (’prod_pie_layer’))

Lowercase italic Syntax variables CREATE ROLE role

Uppercase SQL commands

and functions

SELECT userid FROM emp;

(17)

...

1

(18)

... Lesson 1: Introduction

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Course Objectives

After completing this course, you should be able to do the following:

Explain why data warehousing is a popular solution

Describe data warehousing terminology

Identify components of an implementation

Explain the important of employing a method

Identify modeling concepts

Identify the management and maintenance processes

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Course Objectives

Identify the hardware platforms that can be employed with a data warehouse

Identify the features of the database server

Identify tools that can be employed at each stage

Describe user profiles and techniques for querying the warehouse

Identify data warehouse implementation issues and challenges

(19)

...

Data Warehousing Fundamentals 1-3

...Course Objectives

Course Objectives

After completing this course, you should be able to the following:

• Explain why data warehousing is a popular solution in today’s information technology environment

• Describe the terminology used with data warehousing

• Identify the standard components of a data warehouse implementation

• Explain the importance of using a methodology for development, and specifically identify the phases of the Oracle Data Warehouse Method

• Identify and use data warehouse modeling concepts

• Identify the different processes required to manage and maintain the warehouse • Identify the hardware platforms that can be employed with a data warehouse • Identify the features required of a database server for a warehouse implementation • Identify the tools that can be used at each phase during the data warehouse

development cycle

• Describe user profiles and the techniques users may employ for querying the warehouse

• Identify data warehousing implementation issues and challenges • Position the products for the Oracle warehouse

(20)

... Lesson 1: Introduction

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Data Warehousing Fundamentals

Day 1

Lesson 1 Introduction

Lesson 2 Meeting a Business Need

Lesson 3 Defining Data Warehouse Concepts and Terminology

Lesson 4 Driving Implementation Through a Methodology

Lesson 5 Planning for a Successful Warehouse

Lesson 6 Analyzing User Query Needs

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Data Warehousing Fundamentals

Day 2

Lesson 7 Modeling the Data Warehouse

Lesson 8 Choosing a Computing Architecture

Lesson 9 Planning Warehouse Storage

Lesson 10 Building the Warehouse

Lesson 11 Transforming Data

Lesson 12 Transportation: Loading Warehouse Data

(21)

...

Data Warehousing Fundamentals 1-5

...Agenda

Agenda

Day 1

Lesson 1: Introduction

Lesson 2: Meeting a Business Need

Lesson 3: Defining Data Warehouse Concepts and Terminology Lesson 4: Driving Implementation Through a Methodology Lesson 5: Planning for a Successful Warehouse

Lesson 6: Analyzing User Query Needs

Day 2

Lesson 7: Modeling the Data Warehouse Lesson 8: Choosing a Computing Architecture Lesson 9: Planning Warehouse Storage

Lesson 10: Building the Warehouse Lesson 11: Transforming Data

(22)

... Lesson 1: Introduction

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Data Warehousing Fundamentals

Day 3

Lesson 13 Transportation: Refreshing Warehouse Data

Lesson 14 Leaving a Metadata Trail

Lesson 15 Supporting End-User Access

Lesson 16 Web-Enabling the Warehouse

(23)

...

Data Warehousing Fundamentals 1-7

...Agenda

Day 3

Lesson 13: Transportation: Refreshing Warehouse Data Lesson 14: Leaving a Metadata Trail

Lesson 15: Supporting End-User Access Lesson 16: Web-Enabling the Warehouse Lesson 17: Managing the Data Warehouse

(24)

... Lesson 1: Introduction

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Questions About You

To tailor the class to your specific needs and to encourage dialog among all, please answer the following questions:

What is your name and company?

What is your role in your organization?

What is your level of Oracle expertise?

Why are you building a data warehouse or data mart?

(25)

...

Data Warehousing Fundamentals 1-9

...Questions About You

Questions About You

You will get a lot more out of this class if you are aware of the background of your classmates and the issues that they face in the development of a data warehouse. Each student has a unique perspective and an experience and knowledge set from which we can learn. Because this class is expected to appeal to a broad audience, the

introduction will give the instructor an idea of the composition of the class in terms of data warehouse knowledge, Oracle knowledge, and the specific role that each student plays with regard to data warehousing.

(26)

... Lesson 1: Introduction

(27)

...

2

(28)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

® Project Management

(Methodology, Maintaining Metadata) Defining DW Concepts & Terminology Planning for a Successful Warehouse Analyzing User Query Needs Choosing a Computing Architecture Modeling the Data Warehouse Planning Warehouse Storage Overview ETT (Building the Warehouse) Meeting a Business Need Meeting a Business Need Supporting End User Access Managing the Data Warehouse

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Objectives

After completing this lesson, you should be able to do the following:

Describe why an online transaction processing (OLTP) system is not suitable for complex analysis

Describe how extract processing for decision support querying led to data warehouse solutions employed today

Explain why businesses are driven to employ data warehouse technology

Identify some of the industries that employ data warehouses

(29)

...

Data Warehousing Fundamentals 2-3

...Overview

Overview

The top slide on the facing page is a road map representing the flow of the course. The vertical box entitled “Meeting a Business Need” emphasizes that the warehouse is business driven. The determination of the warehouse architecture, data model, and user query needs all stem from business requirements. The horizontal box running across the bottom represents the ongoing project management throughout the warehouse lifecycle.

This lesson examines how data warehousing has evolved from early management information systems to today’s decision support systems. The primary motivating factors for data warehouse creation are explored. The types of industries employing data warehouse are considered.

Objectives

After completing this lesson, you should be able to do the following:

• Describe why an online transaction processing (OLTP) system is not suitable for complex analysis

• Describe how extract processing for decision support querying led to data warehouse solutions employed today

• Explain why businesses are driven to employ data warehouse technology • Identify some of the industries that employ data warehouses

(30)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Characteristics of OLTP Systems

Characteristic OLTP

Typical operation Update

Level of analytical requirements Low

Screens Unchanging

Amount of data per transaction Small

Data level Detailed

Age of data Current

Orientation Records

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Why OLTP Is Not Suitable for Complex Analysis

Complex Analysis Historical information to analyze

Data needs to be integrated

Database design:

Denormalized, star schema OLTP

Information to support day-to-day service Data stored at transaction level

(31)

...

Data Warehousing Fundamentals 2-5

...Unsuitability of OLTP Systems for Complex Analysis

Unsuitability of OLTP Systems for Complex Analysis

Operational systems largely exist to support transactions, for example, the booking of an airline ticket.

Decision support, which is a type of complex analysis, is very different from OLTP. Most OLTP transactions require a single record in a database to be located and updated or an addition of one or more new records. Even a simple decision support query such as “How many luxury cars did we sell in Boston for January 1999” requires very different operations at the database level to an OLTP transaction. A potentially large number of records must be located, and there are no update operations at all.

Characteristics of OLTP Systems

The characteristics of OLTP systems are described below.

Why OLTP Is Not Suitable for Complex Analysis

OLTP databases are fully normalized and are designed to consistently store operational data, one transaction at a time. Complex analysis, on the other hand, requires database design that even business users find directly usable. To achieve this, a different database design techniques are required, for example the use of

dimensional and star schemas with highly denormalized dimension tables.

OLTP focuses on recording and completing different types of business transactions but is unable to provide decision makers with the information they need. The data needed for such complex analysis is scattered throughout different OLTP systems and must first be carefully integrated before the information needed can be obtained. Extracting the data from these OLTP systems demands so much of the system resources that the IT professional must wait until nonoperational hours before running the queries required to produce the report. Thus OLTP systems are not suitable for complex analysis because the database design is not optimized to run such queries. Additionally, OLTP systems do not have an integrated pool of data from all the operation systems within the enterprise in order for business users to derive complex analysis. Also, OLTP systems do not store historical data that is needed for complex analysis.

Characteristic OLTP

Typical operation Update

Level of analytical requirements Low

Screens Unchanging

Amount of data per transaction Small

Data level Detailed

Age of data Current

(32)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Management Information Systems and Decision Support

Operational reports Decision makers

Production platforms

MIS systems provided business data

Reports were developed on request

Reports provided little analysis capability

Decision support tools gave personal ad hoc access to data

Ad hoc access

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Analyzing Data from Operational Systems

Data structures are complex

Systems are designed for high performance and throughput

Data is not meaningfully represented

Data is dispersed

OLTP systems may be unsuitable for intensive queries

Operational reports Production platforms

(33)

...

Data Warehousing Fundamentals 2-7

...Management Information Systems and Decision Support

Management Information Systems and Decision Support

Early Management Information Systems

Early Management Information Systems (MIS) provided management with reports to assess the performance of the business. Report requirements were submitted as a request to the MIS development team, who developed the report and made it available to the user some time afterward—days, weeks, or even months later. The data in the reports was made available in a way that was difficult to use for analysis and

forecasting.

Personal Computing

With the advent of personal computing and 4GL programming techniques, MIS became known as decision support (decision support systems or DSS). DSS was judged to support business users better, by giving them direct access to the operational data for additional ad hoc querying, which provided more flexible reporting as the information was needed.

Analyzing Data from Operational Systems

Although decision support tools are friendly, intuitive, and easy to use, often the structure of data in the online transaction processing systems does not support the user’s real analytical requirements.

• The structure of the operational data is often complex and too highly structured (3NF).

• The system was designed for high performance—high throughput online transaction processing—rather than CPU-intensive analysis of information. • The data is not always meaningfully presented to the end user query tool.

• The same data elements may be defined differently for each operational system. For example, a customer record may hold the customer telephone number. In one system this number is stored as a 15-digit number, and on another as a 20

alphanumeric character value.

• Data is dispersed on multiple and diverse systems, leading to data redundancy and the inability to coordinate data between systems to provide a global picture of the business.

• Running online transaction processing and decision support concurrently on one machine degrades performance of the operational system, response time to users, and performance of networks. The overall impact on the operational system may be too great.

(34)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

End user computing offloaded from the operational environment

User’s own data

Data Extract Processing

Extracts

Operational systems Decision makers

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Management Issues

Extract explosion

Extracts

(35)

...

Data Warehousing Fundamentals 2-9

...Data Extract Processing

Data Extract Processing

DSS and Degradation

The problem of performance degradation was partially solved by using extract

processing techniques, which select data from one environment and transport it to

another environment for user access (a data extract).

Data Extract Program

The data extract program searches through files and databases, gathering data

according to specific criteria. The data is then placed into a separate set of files, which may reside on another environment, for use by analysts for decision support activities. Extract processing was a logical progression from decision support systems. It was seen as a way to move the data from the high-performance, high throughput online transaction processing systems onto client machines dedicated to analysis. Extract processing also gave the user ownership of the data.

Management Issues with Data Extract Programs Although the principle of extracts appears logical, and to some degree represents a model similar to the way a data warehouse works, there are problems with processing extracts.

Extract programs may become the source for other extracts, and extract management can become a full-time task for information systems departments. In some companies hundreds of extract programs are run at any time.

(36)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

® Productivity IssuesDuplicated effortMultiple technologiesObsolete reportsNo metadata

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Data Quality Issues

No common time basis

Different calculation algorithms

Different levels of extraction

Different levels of granularity

Different data field names

Different data field meanings

Missing information

No data correction rules

(37)

...

Data Warehousing Fundamentals 2-11

...Data Extract Processing

Data Extract Program (continued)

Productivity Issues with Extract Processing The productivity issues in an extract processing environment are listed below:

• Extract effort is duplicated, because multiple extracts access the same data and use mainframe resources unnecessarily.

• The program designed to access the extracted data must encompass all technologies employed by the source data.

• A report cannot always be reused, because business structures change.

• There is no common metadata providing a standard way of extracting, integrating, and using the data.

Data Quality Issues with Extract Processing The data quality issues in an extract processing environment are listed below:

• The data has no time basis and users cannot compare query results with confidence. The data extracts may have been taken at a different point-in-time. • Each data extract may use a different algorithm for calculating derived and

computed values. This makes the data difficult to evaluate, compare, and

communicate by managers who may not know the methods or algorithms used to create the data extract or reports.

• Data extract programs may use different levels of extraction.

• Access to external data may not be consistent, and the granularity of the external data may not be well defined.

• Data sources may be difficult to identify, and data elements may be repeated on many extracts.

• The data field names and values may have different meanings in the various systems in the enterprise (lack of semantic integrity).

• There are no data correction rules to ensure that the extracted data is correct and clean.

(38)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

From Extract to Warehouse DSS

Controlled

Reliable

Quality information

Single source of data

Data warehouse Internal and

external systems

Decision makers

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Advantages of Warehouse Processing Environment

No duplication of effort

No need for tools to support many technologies

No disparity in data, meaning, or representation

No time period conflict

No algorithm confusion

(39)

...

Data Warehousing Fundamentals 2-13

...Data Extract Processing

Transitioning from Extract Processing Environment to Warehouse Processing Environment

There was a transition from decision support using data extracts to decision support using the data warehouse. The data warehouse is a complete environment that requires skill, knowledge, and commitment to put together, particularly for the very large scale enterprise implementation.

The data warehouse environment is more controlled and therefore more reliable for decision support than an extract environment. The data warehouse environment supports your entire decision support requirements by providing high-quality information, made available by accurate and effective cleansing routines and using consistent and valid data transformation rules and documented presummarization of data values. It contains one single source of accurate, reliable information that can be used for analysis.

Advantages of the Warehouse Processing Environment over the Extract Processing Environment The advantages of the warehousing processing environment are listed below:

• No duplication of effort

• No need to consider using a query and reporting tool that supports more than one technology

• No disparity with the data and its meaning • No disparity with the way data is represented • No conflict over the time periods employed

• No contention over the algorithms that have been used • No restriction on drill-down capabilities

(40)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Business Motivators

Know the business

Reinvent to face new challenges

Invest in products

Invest in customers

Retain customers

Invest in technology

Improve access to business information

Be profitable

Provide superior services and products

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Business Motivators

Provide supporting information systems

Get quality information – Reduce costs

– Streamline the business – Improve margins

(41)

...

Data Warehousing Fundamentals 2-15

...Business Drivers for Data Warehouses

Business Drivers for Data Warehouses

Businesses in the nineties face challenges such as regulatory control, competition, market maturity, product differentiation, customer behavior, and accelerated product life cycles, all of which require businesses to develop market awareness,

responsiveness, adaptability, innovation, efficiency, and quality.

Critical Success Factors for a Dynamic Business Environment

In order to succeed in an ever-changing business environment a company must: • Know both the market they are in and their business (internally and externally). • Reinvent themselves to face new challenges. This may be changing product

requirements, diverse and effective services, or even changes in internal organizational structures.

• Invest in research and development of new product channels.

• Invest in high-value customers who contribute greater returns to the business. • Retain existing customers and attract new customers.

• Invest in new technology to support business needs.

• Improve access to information so that they can make rapid decisions, based on an accurate picture of the business.

• Be profitable. At the same time, they must be able to invest in resources for the future, such as technology and people.

• Provide superior services and products to keep market share and maintain income.

Information Needed to Ensure Success

To support these strategies, a business needs to have:

• Access to consistent and high-quality information on the behaviors of the business and the external markets, so that they can constantly monitor the state of the business.

• Information that can help to reduce costs, streamline the business, and improve margins.

(42)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

® Technological AdvancesParallelism – Hardware – Operating system – Database – Query – Index – Applications Large databases 64-bit architectures Indexing techniques Affordable, cost-effective open systems

Robust warehouse tools

Sophisticated end user tools

(43)

...

Data Warehousing Fundamentals 2-17

...Business Drivers for Data Warehouses

Technology Needed to Support the Business Needs

Today’s information technology climate provides you with cost-effective computing resources in the hardware and software arena, Internet and intranet solutions, and databases that can hold very large volumes of data for analysis, using a multitude of data access technologies.

Technological Advances Enabling Data Warehousing

Technology (specifically open systems technology) is making it affordable to analyze vast amounts of data, and hardware solutions are now more cost-effective.

Parallelism Recent advances in parallelism have benefited all aspects of computing: • Hardware environment

• Operating system environment

• Database management systems and all associated database operations • Query techniques

• Indexing strategies • Applications Other Factors

• Very large volumes of data can be managed for warehouses greater than one terabyte in size.

• Recently introduced 64-bit architectures are increasing server capacity and speed. • Improved indexing techniques (bitmap index, hash index, star join) provide rapid

access to data.

• Warehouse tools are becoming more robust and less expensive. • Licensing strategies are more effective and affordable.

• Open systems are available.

• Sophisticated, user-friendly, and intuitive tools are available to the user community for all types of data warehouse access.

(44)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Current Situation and Growth

1996 2001 0 5 10 15 20 25 1996 2001 Revenue Projected Growth

USA Europe APAC Other 0 10 20 30 40 50 60

USA Europe APAC Other USA Europe APAC Other

Current Revenue

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Growth Motivators and Inhibitors

Successful implementations

Decreased risk

Robust extraction software

Improving price to performance ratios

Improved staff training

Year 2000 compliance

Skills shortage

Lack of integrated metadata

(45)

...

Data Warehousing Fundamentals 2-19

...Current Situation and Growth of Data Warehousing

Current Situation and Growth of Data Warehousing

Data warehouses are becoming increasingly popular. The statistics for the estimated growth of data warehousing are compelling. These figures are not specific to Oracle but are industry wide.

Revenues

A recent report has shown that in 1996 data warehouse revenues (which include hardware, software, and people-provided services) netted $8 billion (US). It is forecast that in 2001 this figure will rise to $23 billion (U.S.), assuming a compound annual growth rate of around 20% per year.

Geography

Most data warehouse implementations exist in the U.S., with Europe following close behind, and then Asia Pacific.

Growth Motivators

These include:

• Increased successful implementations

• Decreased risk with vendors supplying a total solution • More robust and functional extraction software

• Improved (and improving) price-to-performance equipment ratios • Improved training for IT staff

Growth Inhibitors

These may include: • Year 2000 compliance

• Shortage of skills in specific areas of data warehousing • The lack of integrated metadata components

• The labor-intensive commitment to the data cleaning function and its corresponding dollar and time cost

Enterprisewide Implementations and Data Marts Enterprise data warehouses are in position to dominate the business, compared with the smaller data mart

(46)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Typical Uses of a Data Warehouse

AirlineBankingHealth careInvestmentInsuranceRetailTelecommunicationsManufacturing

Credit card suppliers

Clothing distributors 0 10 20 30 40 Financial Retail Telecom Manufacturing Others

(47)

...

Data Warehousing Fundamentals 2-21

...Typical Uses of a Data Warehouse

Typical Uses of a Data Warehouse

The requirements of a business can be met by employing a data warehouse solution, which collects data from internal business operations and external data from outside organizations to provide a single source of reliable data for analysis.

Typical Users of a Data Warehouse

There are many industries that employ data warehouses:

• Airlines for aircraft deployment, analysis of route profitability, frequent flyer promotions, and maintenance

• Banking for trend analysis, promotion of products and services, and customer service

• Health care for analysis and cost reduction

• Investment and insurance companies for planning, customer analysis, risk assessment, and portfolio management

• Retail stores for trend analysis, buying pattern analysis, promotions, customer profiling, and pricing

• Telecommunications for analysis and for product and service promotions

Other industries that currently use data warehouse solutions are manufacturers, credit card issuers, and clothing distributors

Figures show that the highest proportion of revenues in data warehousing is spent by the financial services, retail, telecommunications, and manufacturing industries

(48)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Summary

This lesson covered the following topics:

Describing why an online transaction processing (OLTP) system is not suitable for complex analysis

Describing how extracting processing for decision support querying led to data warehouse solutions employed today

Explaining why businesses are driven to employ data warehouse technology

Identifying some of the industries that employ data warehouses

(49)

...

Data Warehousing Fundamentals 2-23

...Summary

Summary

This lesson covered the following topics:

• Describing why an online transaction processing (OLTP) system is not suitable for complex analysis

• Describing how extracting processing for decision support querying led to data warehouse solutions employed today

• Explaining why businesses are driven to employ data warehouse technology • Identifying some of the industries that employ data warehouses

(50)

... Lesson 2: Meeting a Business Need

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Practice 2-1 Overview

The practice covers answering questions and discussing how data warehousing meets business needs

(51)

...

Data Warehousing Fundamentals 2-25

...Practice 2-1

Practice 2-1

1 OLTP databases hold up-to-the-minute information and are most commonly designed as read-only databases.

True False

2 In the scenario below, state whether it refers to an operational system or an analytical processing system.

“Show me how a specific brand of printer is selling throughout different parts of the United States and how this specific brand of printer is selling since it was first introduced into my stores.”

This scenario refers to:

a An operational system

b An analytical processing system

3 Who is the target audience for the data warehouse?

a The business community in the organization

b IT professionals

c Data-entry clerks

d None of the above

e All of the above

4 Are the following statements true or false?

a Operational systems display the following qualities:

Good performance _____

Static data contents _____

High availability _____

Unpredictable CPU use _____

b Identify the reasons why business analysis is not easy with operational systems.

Data is not structured for drill-down capablity. _____ The system is not designed for querying. _____

Data analysis can be CPU-intensive. _____

Data is not integrated between systems. _____

5 In groups of three or four, discuss the questions below and present your points to the class at the end of the discussion.

a List some of the reasons that your company is considering implementing a data warehouse or data mart.

(52)

... Lesson 2: Meeting a Business Need

b What are some of the business problems that your company is trying to answer?

c Why is the business community in your organization unable to find the

(53)

...

3

Defining Data Warehouse

Concepts and

Terminology

(54)

... Lesson 3: Defining Data Warehouse Concepts and Terminology

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Overview

Project Management (Methodology, Maintaining Metadata) Defining DW Concepts & Terminology Defining DW Concepts & Terminology Planning for a Successful Warehouse Analyzing User Query Needs Choosing a Computing Architecture Modeling the Data Warehouse Planning Warehouse Storage ETT (Building the Warehouse) Meeting a Business Need Supporting End User Access Managing the Data Warehouse

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Objectives

After completing this lesson, you should be able to do the following:

Identify a common, broadly accepted definition of a data warehouse

Recognize some of the operational properties of a data warehouse

Recognize common data warehousing terminology

Identify the functionality associated with each component required for a successful data warehouse implementation

Identify and position the Oracle Warehouse vision, products, and services

(55)

...

Data Warehousing Fundamentals 3-3

...Overview

Overview

The previous lesson covered how data warehousing has evolved from early

management information systems to today’s decision support systems that meets a business need. This lesson defines data warehouse concepts and terminology. Note that the “Defining Data Warehouse Concepts and Terminology” block is highlighted in the course road map on the facing page.

Specifically, this lesson introduces the Oracle definition of a data warehouse. The lesson offers a general description of the properties of a data warehouse. The standard components and tools required to build, operate, and use a data warehouse are

identified.

Objectives

After completing this lesson, you should be able to do the following: • Identify a common, broadly accepted definition of a data warehouse • Recognize some of the operational properties of a data warehouse • Recognize common data warehousing terminology

• Identify the functionality associated with each component required for a successful data warehouse implementation

(56)

... Lesson 3: Defining Data Warehouse Concepts and Terminology

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Definition of a Data Warehouse

“ An enterprise structured repository of subject-oriented, time-variant, historical data used for information retrieval and decision support. The data warehouse stores atomic and summary data.”

(57)

...

Data Warehousing Fundamentals 3-5

...Data Warehouse Definition

Data Warehouse Definition

This definition of a data warehouse from the Oracle Data Warehouse Method

describes many of the most significant characteristics of a data warehouse. The Oracle Data Warehouse Method was developed using experiences gained from successful data warehouse projects carried out by Oracle Consulting Services. This method is discussed in Lesson 4.

Subject-Oriented

While the data in an OLTP system is stored to support a specific business process (for example, order entry, campaign management, and so on) as efficiently as possible, data in a data warehouse is stored based on common subject areas (for example, customer, product, and so on) for ease of access. That is because the complete set of questions to be posed to a data warehouse are never known. Every question the data warehouse answers spawns new questions. Thus, the focus of the design of a data warehouse is providing users easy access to the data so that current and future questions can be answered.

Time-Variant

The data warehouse contains slices of data across different periods of time. With these data slices, the user can view reports from now and in the past.

Historical

A data warehouse typically contains several years worth of data. This is necessary to support trending, forecasting, and time-based performance reporting (for example, current year versus previous year).

Information Retrieval and Decision Support

A data warehouse is a facility for getting at information to answer questions. It is not meant for direct data entry; batch updates are the norm for refreshing data warehouses.

Atomic and Summary Data

Depending on the purpose of the data warehouse, it may contain atomic data, summary data, or both.

(58)

... Lesson 3: Defining Data Warehouse Concepts and Terminology

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Data Warehouse Properties

Subject Oriented Integrated Time Variant Non Volatile Data Warehouse

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Data is categorized and stored by business subject rather than by application.

Subject-Oriented

OLTP Applications

Customer financial information

Data Warehouse Subject

Equity

Plans Shares

Insurance

(59)

...

Data Warehousing Fundamentals 3-7

...Data Warehouse Properties

Data Warehouse Properties

Bill Inmon defines data warehousing as:

“A Data Warehouse is a subject oriented, integrated, time variant, non volatile collection of data in support of management’s decision making process.”

Subject-Oriented

Subject-oriented data is organized around major subject areas of an enterprise, and is useful for an enterprise-wide understanding of those subjects. For example, a banking operational system keeps independent records of customer savings, loans, and other transactions. A warehouse pulls this independent data together to provide financial information. You can access subject-oriented data related to any major subject area of an enterprise:

• Customer financial information

• Toll calls made in the telecommunications industry • Airline passenger booking information

• Insurance claim data

(60)

... Lesson 3: Defining Data Warehouse Concepts and Terminology

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Integrated

Data Warehouse OLTP Applications

Data on a given subject is defined and stored once.

Savings Current accounts

Loans Customer

Copyright  Oracle Corporation, 1999. All rights reserved.

® Time-Variant 01/97 02/97 03/97 January February March Data Warehouse Time Data

Data is stored as a series of snapshots, each representing a period of time.

1997 1997 1997

(61)

...

Data Warehousing Fundamentals 3-9

...Data Warehouse Properties

Integrated

In many organizations, data resides in diverse independent systems, making it difficult to integrate into one set of meaningful information for analysis. A key characteristic of a warehouse is that data is completely integrated. Data is stored in a globally

acceptable manner, even when the underlying source data is stored differently. The transformation and integration process can be time-consuming and costly. It requires commitment from every part of the organization, particularly top-level managers who make the decisions and allocate resources and funds.

Data Consistency You must deal with data inconsistencies and anomalies before the data is loaded into the warehouse. Consistency is applied to naming conventions, measurements, encoding structures, and physical attributes of the data.

Data Redundancy Data redundancy at the detail level in the warehouse

environment is eliminated; the warehouse only contains data that is physically selected and moved into it; however, selective and deliberate redundancy in the form of aggregates and summaries is required in the warehouse to improve the performance of queries especially drill-down analysis.

Time-Variant

Warehouse data is by nature historical; it does not usually contain the current

transactional data. Data is represented over a long time horizon, from two to ten years, compared with one to three months of data for a typical operational system. The data allows for analysis of past and present trends, and for forecasting using “what-if” scenarios.

Time Element The data warehouse always contains a key element of time, such as quarter, month, week, or day, that determines when the data was loaded. The date may be a single snapshot date, such as 10-JAN-97, or a range, such as 01-JAN-97 to 31-JAN-97.

Snapshots by Time Period Warehouse data is essentially a series of snapshots by time periods that do not change.

Special Dates A time dimension usually contains all the dates required for analysis, including special dates like holidays and events.

(62)

... Lesson 3: Defining Data Warehouse Concepts and Terminology

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Nonvolatile

Typically data in the data warehouse is not updated or deleted. Warehouse Read Insert Read Update Delete Load Operational

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Changing Data

Operational Databases Warehouse Database

First time load

Refresh

Refresh

Refresh

(63)

...

Data Warehousing Fundamentals 3-11

...Data Warehouse Properties

Nonvolatile

Typically, data in the data warehouse is read-only. Data is loaded into the data warehouse for the first-time load, and then refreshed regularly. Warehouse data is accessed by the business users. Warehouse operations typically involve:

• Loading the initial set of warehouse data (often called the first-time load) • Refreshing the data regularly (called the refresh cycle)

Accessing the Data Once a snapshot of data is loaded into the warehouse, it rarely changes. Therefore, data manipulation is not a consideration at the physical design level. The physical warehouse is optimized for data retrieval and analysis.

Refresh Cycle The data in the warehouse is refreshed; that is, snapshots are added. The refresh cycle is determined by the business users. A refresh cycle need not be the same as the grain (level at which the data is stored) of the data for that cycle. For example, you may choose to refresh the warehouse weekly, but the grain of the data may be daily.

Changing Warehouse Data

The following operations are typical of a data warehouse:

• The initial set of data is loaded into the warehouse, often called the first-time load. This is the data by which you will measure the business, and the data containing the criteria by which you will analyze the business.

• Frequent snapshots of core data warehouse data are added, (more occurrences), according to the refresh cycle and using data from the multiple source systems. Warehouse data may need to be changed in other ways:

• The data you are using to analyze the business may change, the data warehouse must be kept up-to-date to keep it accurate.

• The business determines how much historical data is needed for analysis, say five years worth. Older data is either archived or purged.

Inappropriate or inaccurate data values may be deleted from or migrated out of the data warehouse.

(64)

... Lesson 3: Defining Data Warehouse Concepts and Terminology

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Data Warehouse Versus OLTP

Property Response Time Operations Nature of Data Data Organization Size Data Sources Activities Data Warehouse Seconds to hours

Primarily read only Snapshots over time Subject, time Large to very large Operational, Internal, External Analysis Operational Sub seconds to seconds DML 30-60 days Application Small to large Operational, Internal Processes

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Usage Curves

Operational system is predictable

Data warehouse – Variable – Random

(65)

...

Data Warehousing Fundamentals 3-13

...Data Warehouse Properties

Data Warehouse Versus Online Transaction Processing (OLTP)

Response Time and Data Operations Data warehouses are constructed for very different reasons than online transactional processing (OLTP) systems. OLTP systems are optimized for getting data in—for storing data as a transaction occurs. Data warehouses are optimized for getting data out—for providing quick response for analysis purposes.

Since there tends to be a high volume of activity in the OLTP environment, rapid response is critical; whereas, data warehouse applications are analytical rather than operational. Therefore slower performance is acceptable.

Nature of Data The data stored in each database varies in nature: the data warehouse contains snapshots of data over time to support time-series analysis whereas, the OLTP system stores very detailed data for a short time such as 30 to 60 days.

Data Organization The data warehouse is subject specific and supports analysis so data is arranged accordingly. In order for the OLTP system to support subsecond response, the data must be arranged to optimize the application. For example, an order entry system may have tables which hold each of the elements of the order whereas a data warehouse may hold the same data but arrange it by subject such as customer, product, and so on.

Data Sources Since the data warehouse is created to support analytical activities, data from a variety of sources can be integrated. The operational data store of the OLTP system holds only internal data or data necessary to capture the operation or transaction.

Usage Curves

Operational systems and data warehouses have different usage curves.

An operational system has a more predictable usage curve, the warehouse a less predictable, more varied, and random usage curve.

Access to the warehouse varies not just on a daily basis, but may even be affected by forces such as a seasonal variations. For this reason, you cannot expect the operational system to handle heavy analytical queries (DSS) and continue to give good transaction rates for the minute-by-minute processing required.

(66)

... Lesson 3: Defining Data Warehouse Concepts and Terminology

Copyright  Oracle Corporation, 1999. All rights reserved.

®

User Expectations

Control expectations

Set achievable targets for query response

Set SLAs

Educate

Growth and use is exponential

Copyright  Oracle Corporation, 1999. All rights reserved.

®

Enterprisewide Warehouse

Large scale implementation

Scopes the entire business

Data from all subject areas

Developed incrementally

Single source of enterprisewide data

Synchronized enterprisewide data

Single distribution point to dependent data marts

References

Related documents

Economic Development & Welfare: Course taught for the certification program on Border Studies offered by El Colegio de la Frontera Norte, Ciudad Juárez..

Bovet et al 32 reported that among newly diagnosed hypertensive pa- tients in a developing country, the percentage of patients who had good adherence (as determined by

This amplifier is sketched in figure 13-a. This is the most general differential amplifier. The output is taken between both collectors. Ideally the circuit is symetric, that is

The concept of flight testing a new technology such as the da Vinci parachute is similar; together they are a good topic for further class discussion..

Results are presented below, organized by major outcome group: (a) classroom quality and teacher-child interactions, (b) teacher beliefs and practices, (c) teacher feelings about job

The aim of this document is to provide a short and clear description of parameters (data items) that are to be reported in the data collection forms of the Global Monitoring Plan

A research model is developed that proposes both a direct effect of IT infrastructure on organizational performance and an indirect effect mediated by the effects of IT

Elisabeth Gerarda de Waal, daughter of Nicolaas de Waal and Christina Kaljee, was born op 12 september 1891 te Amersfoort and was baptized.. She is waschmeisje,