• No results found

Data Warehouse Overview. Srini Rengarajan

N/A
N/A
Protected

Academic year: 2021

Share "Data Warehouse Overview. Srini Rengarajan"

Copied!
18
0
0

Loading.... (view fulltext now)

Full text

(1)

Data Warehouse

Overview

Srini Rengarajan

(2)
(3)

Please mute

Your cell!

(4)

Agenda

• Data Warehouse Architecture

• Approaches to build a Data Warehouse

– Top Down Approach

– Bottom Up Approach

• Best Practices

• Case Example

(5)
(6)

Today’s data warehousing defined

Data warehousing is the coordinated, architected, and

periodic copying of data from various sources, both

inside and outside the enterprise, into a centralized

repository optimized for reporting and analytical

processing

(7)

Other

Data Sources ERP

Social M edia Content

Unstructured Content

Data Quality Data Transformation

Data Profiling

Data Cleansing

Data Validation

Data Auditing

Real Time

Batch

BI Applications

On Demand Services Browsers

M obile

Source Layer Data Integration Layer Data Warehouse Layer Presentation

Layer

Data Warehouse Architecture

(8)

Data Warehouse Architecture

Source Layer

• Core Systems

• Unstructured /External Data

• Social Media

Data Integration Layer

• Extraction

• Cleansing

• Transformation

• Profiling

• Auditing

(9)

Data Warehouse Architecture

Data Warehouse Layer

• Staging Database

• Operational Data Store

• Data Warehouse

• Data Mart

Presentation Layer

• Reporting

• Dashboards

• Mobile Devices

(10)

Data Warehouse Approaches

Top Down Approach

Suggested by Bill Inmon

Build a centralized repository (DW) to house corporate wide business data

The data in the DW is stored in a normalized form in order to avoid redundancy

Once the DW is implemented , start building subject area specific data marts which contain summarized information based on the end users analytical requirements. This is to enable faster access to the data for the end users analytics

Longer to implement - May fail due to the lack of patience and commitment

Bottom Up Approach

Suggested by Ralph Kimball

The bottom up approach is an incremental approach to build a data warehouse

We build the data marts separately at different points of time as and when the specific subject area requirements are clear

The data marts are integrated or combined together to form a data warehouse

Separate data marts are combined through the use of conformed dimensions and conformed facts. A conformed dimension and a conformed fact is one that can be shared across data marts.

Faster to implement - Implementation in stages

(11)

Data Warehouse Approaches

Top Down Approach

The requirements of users across different levels are gathered and one schema for the entire data warehouse is built. Afterwards, separate data marts are tailored according to the particular characteristics of each business area or process.

The top-down approach emphasizes the users’ requirements thus making it a more user driven approach

Bottom Up Approach

A separate schema is built for each data mart, taking into account the availability of data for decision-making. Later, these schemas are merged, forming a global schema for the entire data warehouse.

The bottom-up approach aims at

identifying all the star schemas that can be implemented using the available source systems making it a more source/data driven approach

(12)

How to Choose ?

Favors Top Down Favors Bottom up

Nature of the organization's decision support Requirements

Strategic Tactical

Data Integration

Requirements Enterprise Wide integration Individual Business areas

Nature of Source Data High rate of change from source systems

Source systems are relatively stable

Time to Delivery Organization's requirements allow for longer start-up time

Need for the first data warehouse application is urgent

Cost to Deploy

Higher start-up costs, with lower subsequent project

Lower start-up costs, with each subsequent project costing about

(13)

Changing Times

 Real-time business intelligence

 Dynamic data warehouse environment

 Consolidation of data warehouse and transaction systems

 Information integrated with other sources

 Speed of delivery and development is critical

 Point-in-time business intelligence

 Batch data warehousing

 Separation of data warehouse and transaction systems

 Self-contained historical data warehouse

 Latency in development and

deployment of business applications

Today Yesterday

(14)

Hybrid Approach

The Hybrid approach aims to harness the speed of the Bottom up

approach and user orientation/integration of the Top down

approach – Combination of User and Data Driven

(15)

Data Warehouse Architecture Best Practices

• Best Practice #1

– Use a Data model that is optimized for Information retrieval

• Dimensional Model

• Hybrid Approach

• Best Practice #2

– Carefully design the data integration processes for your DW

• Ensure the data is processed efficiently and accurately

• Consider an ETL or Data Cleansing tool

• Use them well!

(16)

Data Warehouse Architecture Best Practices

• Best Practice #3

– Design a metadata architecture that allows sharing of

metadata between components of your DW

• Consider metadata standards such as OMG’s Common Warehouse

Metamodel (CWM)

http://www.omg.org/technology/documents/modeling_spec_catalog.htm

• Best Practice #4

– Take an approach that consolidates data into ‘a single

version of the truth’

• Hybrid Model

(17)

Case Example

At a Glance:

• India's largest private bank

• Over 10 million customers

• 364 branches and 46 extension counters

• Network of 1,050 ATMs

• Multiple call center’s and Internet banking

• ICICI Bank considered various data warehousing and customer relationship

management (CRM) solutions. Based on worldwide industry experience, Teradata emerged as the foremost solution

• The implementation of an enterprise data warehouse from Teradata allowed ICICI to analyse its customer database, which includes information from separate operational systems including retail banking, bonds, fixed deposits, credit cards, and more

• .

(18)

Thank you for

your attention!

Any Questions?

References

Related documents

Scheduling to minimize busy times is APX-hard already in the special case where all jobs have the same (unit) processing times and can be scheduled in a fixed time interval.. Our

These steps include creating a high-level customer strategy that builds on the new CRM capabilities to improve the way the company interacts with customers and prospects; developing

Puji dan syukur penulis panjatkan kehadirat Allah Swt yang telah memberikan rahmat, kekuatan serta kemudahan kepada penulis sehingga dapat menyelesaikan

It is also necessary to decide what, if any, data items from sources outside of the application database should be added to the data model.. Once a data warehouse is

dimensional data warehousing environment is excluded targets of related dimension values are fact table also contains more dimensional schema data warehouse keeps

The buffer contains a solution of glycolic acid and its sodium salt, sodium glycolate.. Explain what a buffer is and how this buffer

Working and schema in oracle list of schemas available tables in the same application data schema Edwrep schema passwords for oracle get list schemas in the listed in

– Early Crisis Cover Multiplier, a limited premium payment term supplementary benefit which provides whole of life protection for early and intermediate stages of critical