
Data Quality Management

The Most Critical Initiative You Can Implement

SUGI 29, Montreal, May 2004

Claudia Imhoff President

Jonathan G. Geiger Executive Vice President

(2)

Topics

ƒ What is Data Quality Management?

ƒ Data Quality Management Challenges

ƒ Data Quality Definition

ƒ Four Pillars of Data Quality Management

(3)

Data is an Asset

ƒ Other corporate assets include

ƒ People

ƒ Capital (Money)

ƒ Property

ƒ Materials

ƒ Assigning value is difficult

ƒ Establishing ROI for Data Quality

(4)

What is Data Quality Management?

ƒ Establishment and deployment of:

ƒ Roles,

ƒ Responsibilities,

ƒ Policies and

ƒ Procedures

ƒ Concerning the acquisition, maintenance, dissemination and disposition of data

ƒ Viability of business decisions – contingent on good data...

ƒ Good data – contingent on an effective approach to Data Quality Management

(5)

Data Quality Management

Responsibilities

ƒ Business Responsibilities

ƒ Business rules governing data

ƒ Data quality verification

ƒ Information Technology Responsibilities

ƒ Manage environment for acquiring, maintaining, disseminating, and disposing of electronic data

ƒ Architecture

(6)

ƒ Program Manager and Project Leader

ƒ Organization Change Agent

ƒ Business Analyst and Data Analyst

ƒ Data Steward

(7)

Data Quality Management

Components

ƒ Reactive: addresses problems that already exist

ƒ Deal with inherent data problems, integration issues, merger and acquisition challenges

ƒ Proactive: diminishes the potential for new problems to arise

ƒ Governance, roles and responsibilities, quality expectations, supporting business practices, specialized tools.

(8)

Data Quality Management

Importance

ƒ Companies often realize the importance too late

ƒ Only after several documented problems with the data do they recognize the need to improve its quality.

ƒ Billions of dollars are lost annually due to data quality problems.

ƒ Additional estimates have shown that 15-20% of the data in a typical organization is erroneous or otherwise unusable.

ƒ The importance of Data Quality Management should be evident – so why aren’t companies addressing it more aggressively?

(10)

Data Quality Management

Challenges: Responsibility

ƒ No single business unit is responsible for enterprise data

ƒ Once captured in operational system, business unit washes hands of further responsibility

ƒ Savvy corporations adopt data stewardship approach

(11)

Data Quality Management

Challenges: Cross Functionality

ƒ Horizontal alignment in a vertical world

ƒ Data Quality Management crosses organizational boundaries

(12)

Data Quality Management

Challenges: Problem Recognition

ƒ Corporation must recognize that it HAS a Data Quality Management problem

ƒ Is your company in denial?

ƒ Getting money for an unrecognized problem is

(13)

Data Quality Management

Challenges: Discipline

ƒ Downstream impacts must be understood and considered in decisions

ƒ Corporation must define and assign responsibilities

ƒ In job descriptions

(14)

Data Quality Management

Challenges: Investment

ƒ Time

ƒ Funding

ƒ Resources

ƒ All needed to overcome “unquality”

ƒ Examples

ƒ Duplicate materials to the same customer or prospect

(15)

Data Quality Management

Challenges: On-Going Effort

ƒ This is not a one-time effort

ƒ Data Quality Management Staffing is required

ƒ Should reduce staffing requirements elsewhere

ƒ Governance is the name of the game

(16)

Data Quality Management

Challenges: Return on Investment

ƒ What is the cost of “unquality”?

ƒ Work-arounds absorbed into daily processes

(18)

Quality - Definition

ƒ Quality is conformance to requirements

ƒ Whose requirements?

ƒ How are requirements set?

(19)

Defect Rate Target

Quality - Definition

ƒ Quality is not

(20)

Quality - Definition

ƒ To the user, the data warehouse is the source

ƒ Data model provides basis for data collection

ƒ Definitions

ƒ Validation rules

ƒ Relationship rules

ƒ Actual data must also be examined

ƒ Operational business process implications

ƒ Abuse of defined fields

ƒ Undocumented business rules
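
The validation rules and relationship rules that a data model supplies can be written down as executable checks. The following is a minimal sketch, not a method from the presentation: the order record, its field names, and the two rules are hypothetical examples chosen only to show the two kinds of rule side by side.

```python
def check_order(order, known_customer_ids):
    """Return a list of rule violations for one (hypothetical) order record."""
    problems = []

    # Validation rule (assumed for illustration): quantity is a positive integer.
    quantity = order.get("quantity")
    if not isinstance(quantity, int) or quantity <= 0:
        problems.append("quantity must be a positive integer")

    # Relationship rule (assumed for illustration): every order refers to a
    # customer that exists in the customer data.
    if order.get("customer_id") not in known_customer_ids:
        problems.append("customer_id does not match a known customer")

    return problems


customers = {"C1", "C2"}
good = check_order({"quantity": 3, "customer_id": "C1"}, customers)
bad = check_order({"quantity": 0, "customer_id": "C9"}, customers)
```

Checks like these only cover the documented rules; examining the actual data is still needed to uncover abused fields and undocumented business rules.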

(21)

[Figure: chart relating data completeness (up to 100%) to cost ($). Labels recovered from the figure: "Complete but with errors", "Very Dangerous", "May be a proto-", "Perfect data", "Expensive", "Incomplete but accurate"]

(22)

Four Types of Error Correction

ƒ Reject the error

ƒ Accept the error

ƒ Correct the error

ƒ Use a default value for the data in error

(23)

Reject the Error!

ƒ Better to have missing data than inaccurate data

ƒ Reject the complete record

ƒ Correct at the source and re-extract the data

(24)

Accept the Error!

ƒ Data error is within tolerance limits

ƒ Correct data at the source

ƒ If not correctable, provide meta data on the error

(25)

Correct the Error!

ƒ Data essential for completeness

ƒ Correction is required

ƒ Use temporary file

(26)

Use Default Value for Data in Error!

ƒ Data needed for completeness

ƒ Data is unusable as is

ƒ Data value is replaced with a default value

ƒ Meta Data must be used to explain when and how the default is used
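
The four ways of handling an error described on the preceding slides (reject, accept, correct, use a default) can be sketched as a single routine. This is an illustration under stated assumptions rather than the presenters' implementation: the names `Disposition` and `handle_error`, the record layout, and the shape of the meta data are all hypothetical.

```python
from enum import Enum


class Disposition(Enum):
    REJECT = "reject"    # drop the whole record; fix at the source and re-extract
    ACCEPT = "accept"    # error is within tolerance; load the value as is
    CORRECT = "correct"  # replace the value with a corrected one
    DEFAULT = "default"  # replace the unusable value with a default


def handle_error(record, field_name, policy, *, corrected_value=None, default_value=None):
    """Apply one policy to a field already known to be in error.

    Returns (record_to_load_or_None, meta), where meta is the meta data
    describing what was done to the field.
    """
    meta = {
        "field": field_name,
        "policy": policy.value,
        "original": record.get(field_name),
    }
    if policy is Disposition.REJECT:
        return None, meta
    if policy is Disposition.ACCEPT:
        return dict(record), meta
    if policy is Disposition.CORRECT:
        if corrected_value is None:
            raise ValueError("CORRECT requires corrected_value")
        fixed = dict(record)
        fixed[field_name] = corrected_value
        return fixed, meta
    if policy is Disposition.DEFAULT:
        fixed = dict(record)
        fixed[field_name] = default_value
        meta["default_used"] = default_value
        return fixed, meta
    raise ValueError(f"unknown policy: {policy!r}")


rec = {"id": 1, "state": "ZZ"}
loaded, meta = handle_error(rec, "state", Disposition.DEFAULT, default_value="UNKNOWN")
rejected, _ = handle_error(rec, "state", Disposition.REJECT)
```

Returning the meta data with every outcome reflects the slides' point that accepted errors and defaulted values must be explained to the user.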

(29)

Four Pillars of Data Quality Management

ƒ Data Profiling – Gaining an understanding of existing data relative to quality specifications

ƒ This is your starting point from which improvement (and ROI) is measured

ƒ Is the data complete?

ƒ Is the data accurate?

ƒ Data Quality – Gaining an understanding of the causes of quality problems
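
The two profiling questions above (is the data complete, is the data accurate) can be turned into baseline numbers. The sketch below is illustrative only: the `profile` function, the `zip` field, and the five-digit rule are assumptions, and conformance to a format rule is used as a rough proxy for accuracy, since true accuracy requires comparison with the real-world value.

```python
import re


def profile(records, field_name, is_valid):
    """Compute simple completeness and validity rates for one field."""
    total = len(records)
    # A value counts as present when it is neither missing nor empty.
    present = [r[field_name] for r in records if r.get(field_name) not in (None, "")]
    valid = [v for v in present if is_valid(v)]
    return {
        "total": total,
        "completeness": len(present) / total if total else 0.0,
        "validity": len(valid) / len(present) if present else 0.0,
    }


def is_five_digit_zip(value):
    # Hypothetical quality specification: exactly five digits.
    return re.fullmatch(r"\d{5}", value) is not None


rows = [{"zip": "12345"}, {"zip": ""}, {"zip": "1234"}, {"zip": "98765"}]
baseline = profile(rows, "zip", is_five_digit_zip)
```

Recording these rates before any cleanup gives the starting point against which later improvement can be measured.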

(30)

Four Pillars of Data Quality Management

ƒ Data Integration – Collapsing disparate versions of data into a single one

ƒ Recognition that same data exists in multiple locations with variable content

ƒ Standardize the multiple versions (e.g., customers, products, geographies, etc.) to single version

ƒ Data Augmentation – incorporation of additional external data to gain insight

ƒ Combine internal customer data with third party data to increase understanding of the customer

ƒ External data – competitor, customer demographic or credit history, total industry sales data
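
Collapsing several versions of the same customer into one can be sketched as standardizing a matching key and then merging the records that share it. This is a deliberately simple illustration: the suffix list, the field names, and the "first non-empty value wins" merge rule are assumptions, and real integration tools use far more sophisticated matching.

```python
from collections import defaultdict

# Hypothetical list of company suffixes to ignore when matching names.
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "company", "ltd", "llc"}


def standardize_name(name):
    """Lower-case, strip punctuation, and drop common company suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    tokens = [t for t in cleaned.split() if t not in SUFFIXES]
    return " ".join(tokens)


def integrate(records):
    """Merge records whose standardized names match into a single version."""
    groups = defaultdict(list)
    for rec in records:
        groups[standardize_name(rec["name"])].append(rec)

    merged = []
    for key, members in groups.items():
        single = {"key": key, "sources": sorted({m["source"] for m in members})}
        # Keep the first non-empty value seen for every other attribute.
        for m in members:
            for attr, value in m.items():
                if attr not in ("name", "source") and value and attr not in single:
                    single[attr] = value
        merged.append(single)
    return merged


customers = [
    {"name": "Acme, Inc.", "source": "billing", "phone": ""},
    {"name": "ACME Inc", "source": "crm", "phone": "555-0100"},
    {"name": "Widget Co.", "source": "crm", "phone": ""},
]
single_view = integrate(customers)
```

The same standardized key could then be used to attach third-party data to each merged customer, which is the augmentation step described above.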

(32)

Getting Started

ƒ Education

ƒ Stewardship Program

ƒ Partnerships & Environment

ƒ Four-Phase Program

(33)

Education

ƒ Involve key data warehouse effort participants

ƒ Business users

ƒ Developers

ƒ Influencing people

ƒ Better chance of getting commitment

ƒ Involves various techniques

ƒ Facilitated sessions

ƒ Interviews

(34)

Stewardship - Definition

ƒ Webster’s Dictionary: A steward is one who is called upon to exercise responsible care over possessions entrusted to him/her

ƒ The steward does not own the possessions

ƒ The steward has a responsibility affecting the processes that impact the possessions

ƒ The steward may be a business unit or

(35)

Data Steward Responsibilities

We need to approach this in an organized manner

ƒ Data acquisition: Processes, System roles, Update authority, Validation rules, Business rules, Quality

ƒ Dissemination: Access security, Standard queries and reports, Capabilities, System use, Quality, Meta data provided

ƒ Disposal: Retention, Erasure

ƒ Data management: Data models, Demographics, Naming standards, Meta data requirements

(36)

Partnerships & Environment

[Figure: partnership among Business Units, Information Technology, Executive Management, and Middle Management]

(37)

Partnerships & Environment

ƒ Address quality issues explicitly

ƒ Address known quality problems

ƒ Business processes

ƒ Operational data

ƒ Ensure environment supports quality

ƒ Properly train and equip team

(38)

Partnerships & Environment

ƒ Quality expectations must be:

ƒ Understood

ƒ Negotiated

ƒ Communicated

ƒ Met

ƒ Quality is a business issue -- NOT just a technical issue

ƒ Quality is not an issue for one business unit -- it is a horizontal activity

ƒ Quality Committee

ƒ Data Stewardship

(39)
(40)

Technology Support

ƒ Data Quality Management companies like DataFlux are available to help you get started. They can:

ƒ Help you determine your Data Quality Management needs

ƒ Develop a plan to help meet your needs

ƒ Provide the technology, methodology and services to execute your plan

(41)

Summary

ƒ Data Quality Management is not a luxury – it is essential

ƒ The first step is to recognize that you have data “unquality”

ƒ A sound program consists of four pillars

ƒ Getting started requires commitment and
