Measuring Data Management Practice Maturity
Increasing data management practice maturity levels can positively impact the coordination of data flow among organizations, individuals and systems
- datablueprint.com 3/3/2010 © Copyright this and previous years by Data Blueprint - all rights reserved!
Peter Aiken
• DoD Computer Scientist
  – Reverse Engineering Program Manager/Office of the Chief Information Officer (1992-1997)
• Visiting Scientist
  – Software Engineering Institute/Carnegie Mellon University (2001-2002)
• DAMA International President (http://dama.org)
  – 2001 DAMA International Individual Achievement Award (with Dr. E. F. "Ted" Codd)
  – 2005 DAMA Community Award
• Founding Advisor/International Association for Information and Data Quality (http://iaidq.org)
• Founding Advisor/Meta-data Professionals Organization (http://metadataprofessional.org)
• Founding Director, Data Blueprint, 1993
• BS VCU 1981 Information Systems & Management
• MS VCU 1985 Information Systems
• PhD GMU 1989 Information Technology Engineering
• Full time in information technology since 1981
• IT engineering research and project background
• University teaching experience since 1979
• Seven books and dozens of articles
• Research Areas
  – reengineering, data reverse engineering, software requirements engineering, information engineering, human-computer interaction, systems integration/systems engineering, strategic planning, and DSS/BI
• Director
DM Maturity
Organizations Surveyed
• Results from more than 500 organizations
• 32% government
• Appropriate public company representation
• Enough data to demonstrate European organization DM practices are generally more mature
Local Government: 4%
State Government Agencies: 17%
Federal Government: 11%
Public Companies: 58%
International Organizations: 10%
IT Project Failure Rates
Recent IT project failure rate statistics can be summarized as follows:
– Carr 1994
  • 16% of IT projects completed on time, within budget, with full functionality
– OASIG Study (1995)
  • 7 out of 10 IT projects "fail" in some respect
– The Chaos Report (1995)
  • 75% blew their schedules by 30% or more
  • 31% of projects will be canceled before they ever get completed
  • 53% of projects will cost over 189% of their original estimates
  • 16% of projects are completed on-time and on-budget
– KPMG Canada Survey (1997)
  • 61% of IT projects were deemed to have failed
– Conference Board Survey (2001)
  • Only 1 in 3 large IT project customers were "very satisfied"
– Robbins-Gioia Survey (2001)
  • 51% of respondents viewed their large IT implementation project as unsuccessful
– McDonald's Innovate (2002)
  • Automate fast food network from fry temperature to # of burgers sold: $180M USD write-off
DM Origins – Which arrives first – DM or DBMS?
• A Key Indicator
• 70% reacting instead of anticipating
• Best practices are obvious
[Chart: DM 1st, DBMS 1st, Simultaneously; years 1981 and 2007; values shown: 26%, 68%, 6%, 9%, 75%, 6%]
DM Involvement
[Bar chart: Participation Percentage, axis 0 to 50]
Data Warehousing
XML
Data Quality
Customer Relationship Management
Master Data Management
Customer Data Integration
Enterprise Resource Planning
Enterprise Application Integration
Why Data Projects Fail by Joseph R. Hudicka
• Assessed 1200 migration projects!
  – Surveyed only experienced migration specialists who have done at least four migration projects
• The median project costs over 10 times the amount planned!
• Biggest Challenges: Bad Data; Missing Data; Duplicate Data
• The survey did not consider projects that were cancelled largely due to data migration difficulties
• "… problems are encountered rather than discovered"
[Chart: Median Project Expense and Median Project Cost, axis $0 to $500,000]
Joseph R. Hudicka, "Why ETL and Data Migration Projects Fail," Oracle Developers Technical Users Group Journal, June 2005, pp. 29-31
Monetization: Legacy System Migration to ERP
• Challenge
  – Millions of NSN/SKUs
  – Key and other data stored in clear text/comment fields
  – Original suggestion was manual approach to text extraction
  – Left structuring problem unsolved
• Solution
  – Proprietary, improvable text extraction process
  – Converted non-tabular data into tabular data
  – Saved a minimum of $5 million
An Iterative Approach to Data Quality Engineering
Rev  Unmatched  Unmatched  Ignorable  Ignorable  Items    Avg Extracted   Items Matched  Extracted
#    Items      (% Total)  NSNs       (% Total)  Matched  Items Per Item  (% Total)      Items
1    329948     31.47%     14034      1.34%      N/A      N/A             N/A            264703
2    222474     21.22%     73069      6.97%      N/A      N/A             N/A            286675
3    216552     20.66%     78520      7.49%      N/A      N/A             N/A            287196
4    340514     32.48%     125708     11.99%     582101   1.1000222       55.53%         640324
…    …          …          …          …          …        …               …              …
14   94542      9.02%      237113     22.62%     716668   1.1142914       68.36%         798577
15   94929      9.06%      237118     22.62%     716276   1.1139282       68.33%         797880
16   99890      9.53%      237128     22.62%     711305   1.1153008       67.85%         793319
17   99591      9.50%      237128     22.62%     711604   1.1154392       67.88%         793751
18   78213      7.46%      237130     22.62%     732980   1.2072812       69.92%         884913
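As a rough check on how the columns in the table above relate, the short Python sketch below recomputes the per-item average for a few revisions. It assumes (this is a reading of the flattened slide, not something the slide states) that the average column is extracted items divided by items matched; the function name and the selection of rows are illustrative only.

```python
# Sketch: recompute "Avg Extracted Items Per Item" from the slide's counts.
# Assumption: average = extracted items / items matched (inferred from the numbers).

# (items_matched, extracted_items, average_reported_on_slide) keyed by revision number
REVISIONS = {
    4: (582_101, 640_324, 1.1000222),
    14: (716_668, 798_577, 1.1142914),
    18: (732_980, 884_913, 1.2072812),
}


def avg_extracted_per_matched(items_matched: int, extracted_items: int) -> float:
    """Return extracted items per matched item."""
    if items_matched <= 0:
        raise ValueError("items_matched must be positive")
    return extracted_items / items_matched


for rev, (matched, extracted, reported) in REVISIONS.items():
    computed = avg_extracted_per_matched(matched, extracted)
    print(f"rev {rev}: computed {computed:.7f}, slide reports {reported}")
```

Under that assumption the recomputed ratios agree with the reported values to about seven decimal places.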
Time needed to review all NSNs once over the life of the project:
  NSNs: 2,000,000
  Average time to review & cleanse (in minutes): 5
  Total Time (in minutes): 10,000,000
Time available per resource over a one year period of time:
  Work weeks in a year: 48
  Work days in a week: 5
  Work hours in a day: 7.5
  Work minutes in a day: 450
  Total Work minutes/year: 108,000
Person years required to cleanse each NSN once prior to migration:
  Minutes needed: 10,000,000
  Minutes available person/year: 108,000
  Total Person-Years: 92.6
Resource Cost to cleanse NSN's prior to migration:
  Avg Salary for SME year (not including overhead): $60,000.00
  Projected Years Required to Cleanse/Total DLA Person Year Saved: 93
  Total Cost to Cleanse/Total DLA Savings to Cleanse NSN's: $5.5 million
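The manual-cleansing estimate above is plain arithmetic, and a minimal Python sketch of it follows. All inputs are the slide's figures; the function and key names are made up for illustration. Note that the unrounded product comes to roughly $5.56 million, which the slide reports as $5.5 million.

```python
# Sketch: reproduce the slide's estimate of manual NSN cleansing effort and cost.
# Inputs are the slide's figures; function and key names are illustrative.

def cleansing_estimate(
    nsn_count: int = 2_000_000,
    minutes_per_nsn: float = 5,
    work_weeks_per_year: int = 48,
    work_days_per_week: int = 5,
    work_hours_per_day: float = 7.5,
    sme_salary_per_year: float = 60_000.0,  # excludes overhead, per the slide
) -> dict:
    total_minutes = nsn_count * minutes_per_nsn
    minutes_per_person_year = (
        work_weeks_per_year * work_days_per_week * work_hours_per_day * 60
    )
    person_years = total_minutes / minutes_per_person_year
    return {
        "total_minutes": total_minutes,
        "minutes_per_person_year": minutes_per_person_year,
        "person_years": person_years,
        "cost": person_years * sme_salary_per_year,
    }


estimate = cleansing_estimate()
print(f"Person-years: {estimate['person_years']:.1f}")
print(f"Cost: ${estimate['cost']:,.0f}")
```

Changing the minutes-per-NSN or salary arguments shows how sensitive the total is to each assumption.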
Misunderstanding Data Management
Expanding DM Scope
• DataBase Administration (DBA), 1950-1970
  – Database design
  – Database operation
• Data Administration (DA), 1970-1990
  – Data requirements analysis
  – Data modeling
• Enterprise Data Administration (EDA), 1990-2000
  – Organization-wide DM coordination
  – Organization-wide data integration
  – Data stewardship, Data use
• Data Management (DM), 2000-
  – Data Governance, Data Quality, Data Security, Analytics, Data Compliance, Data Mashups, Business Rules (more ...)
Enterprise Information Management is concerned with Architecture
"He who doesn't lay his foundations beforehand, may by great abilities do so afterward, although with great trouble to the architect and danger to the building."
Niccolò Machiavelli (1469-1527)
Building from the Top
Motivation
• "We want to move our data management program to the next level"
  – Question: What level are you at now?
• You are currently managing your data,
  – But, if you can't measure it,
  – How can you manage it effectively?
• How do you know where to put time, money, and energy so that data management best supports the mission?
"One day Alice came to a fork in the road and saw a Cheshire cat in a tree. Which road do I take? she asked. Where do you want to go? was his response. I don't know, Alice answered. Then, said the cat, it doesn't matter."
Lewis Carroll from Alice in Wonderland
[Diagram: Data Management functions and the flows between them]
Functions: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations
Flow labels: Organizational Strategies, Goals, Integrated Models, Standard Data, Business Data, Business Value, Application Models & Designs, Data Asset Use, Feedback, Implementation, Direction, Guidance
Assign responsibilities for data.
Manage data coherently.
Share data across boundaries.
Engineer data delivery systems.
Maintain data availability.
Data Management: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations
Data Management Capability Maturity Model Levels
Initial (1): Our DM practices are ad hoc and dependent upon "heroes" and heroic efforts
Repeatable (2): We have DM experience and have the ability to implement disciplined processes
Defined (3): We have experience that we have standardized so that all in the organization can follow it
Managed (4): We manage our DM processes so that the whole organization can follow our standard DM guidance
Optimizing (5): We have a process for improving our DM capabilities
This is one concept for process improvement. Others include:
• Norton Stage Theory
• TQM
• TQdM
• TDQM
• ISO 9000
All focus on understanding current processes and determining where to make improvements.
Key Finding: Process Frameworks are not Created Equal
With the exception of CMM and ITIL, use of process-efficiency frameworks does not predict higher on-budget project delivery…
[Chart: Percentage of Projects on Budget, By Process Framework Adoption]
…while the same pattern generally holds true for on-time performance
[Chart: Percentage of Projects on Time, By Process Framework Adoption]
Source: Applications Executive Council, Applications Budget, Spend, and Performance Benchmarks: 2005 Member Survey Results, Washington D.C.: Corporate Executive Board 2006, p. 23.
Assessment Components
Data Management Practice Areas
• Data program coordination: DM is practiced as a coherent and coordinated set of activities
• Organizational data integration: Delivery of data in support of organizational objectives – the currency of DM
• Data stewardship: Designating specific individuals as caretakers for certain data
• Data development: Efficient delivery of data via appropriate channels
• Data support: Ensuring reliable access to data
Capability Maturity Model Levels (examples of practice maturity)
1 – Initial: Our DM practices are ad hoc and dependent upon "heroes" and heroic efforts
2 – Repeatable: We have DM experience and have the ability to implement disciplined processes
3 – Documented: We have standardized DM practices so that all in the organization can perform them with uniform quality
4 – Managed: We manage our DM processes so that the whole organization can follow our standard DM guidance
5 – Optimizing: We have a process for improving our DM capabilities
Weakest Link Results Reporting Results
• Understand five organizational data management practice areas
  – Rate each area per capability maturity model
• Understand the "weakest link" nature of the results reporting
  – Engineered components can only be as strong as their weakest component
  – Low scores seem harsh but are realistic – (and on the upside) easily improvable
  – A single "1" degrades the entire practice area – as shown with "stewardship"
• DMPA results are granularized for each practice area providing improvement process guidance
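The "weakest link" reporting rule described above can be sketched in a few lines of Python. This is only an illustration of the idea that a practice area scores no higher than its lowest-rated component; the component names and ratings below are hypothetical and are not taken from any assessment.

```python
# Sketch of "weakest link" scoring: a practice area is rated at the level of
# its lowest-scoring component. Component names and ratings are hypothetical.
from typing import Mapping


def weakest_link_score(component_scores: Mapping[str, int]) -> int:
    """Return the practice-area score as the minimum component rating (1-5)."""
    if not component_scores:
        raise ValueError("at least one component score is required")
    return min(component_scores.values())


# Hypothetical ratings for one practice area, on the 1-5 maturity scale.
stewardship = {"component_a": 3, "component_b": 1, "component_c": 4}

area_score = weakest_link_score(stewardship)
average = sum(stewardship.values()) / len(stewardship)
print(f"weakest-link score: {area_score}, simple average: {average:.2f}")
```

Comparing the minimum with the simple average shows why low scores "seem harsh": a single 1 sets the area's score regardless of how the other components rate.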
The challenge ahead
[Chart: average scores on a 0.00 to 5.00 scale, x-axis numbered 1 to 28]
The chart represents the average scores presented on the previous slide. Interestingly, none has apparently reached level 3.
Data Management Practices Measurement (DMPA)
Practice areas:
• Data Program Coordination
• Organizational Data Integration
• Data Stewardship
• Data Development
• Data Support Operations
Focus: Implementation and Access
Focus: Guidance and Facilitation
Maturity levels: Initial (I), Repeatable (II), Documented (III), Managed (IV), Optimizing (V)
• CMU's Software Engineering Institute (SEI) Collaboration
• Results from hundreds of organizations in various industries including:
  – Public Companies
  – State Government Agencies
  – Federal Government
  – International Organizations
• Defined industry standard
• Steps toward defining data management "state of the practice"
[Chart: scores on a 0 to 5 scale; labels: Development Guidance, Data Administration, Support Systems, Asset Recovery Capability, Development Training]
Interpreting Assessment Results
for a Sample Organization
[Chart: values shown 2.0, 1.0, 3.0, 1.4; series: Average, Verified]
Perceptions are higher than actual practice
Perceptions are below actual practice
Comparative Assessment Results
Data Program Coordination
Organizational Data Integration
Data Stewardship
Data Development
Data Support Operations
Challenge
Challenge
High Marks for IFC’s Program
[Chart: Data Mgmt Audit scores on a 0 to 5 scale for Leadership & Guidance, Asset Creation, Metadata Management, Quality Assurance, Change Management, Data Quality; series: TRE, ISG, IFC, Industry Benchmarks, Overall Benchmarks]
"These IFC scores represent the highest aggregate scores in the area of data stewardship recorded in our database of hundreds of assessments that has been recognized as a representative scientific sample."
Why is our organizational Data Stewardship score so low?
What expertise do we have in Data Program Coordination?
http://peteraiken.net
Contact Information:
Peter Aiken, Ph.D.
Department of Information Systems
School of Business
Virginia Commonwealth University
1015 Floyd Avenue - Room 4170
Richmond, Virginia 23284-4000
Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com
office :+1.804.883.759
cell:+1.804.382.5957
e-mail:[email protected]
http://peteraiken.net
Copyright 12/18/07 by Data Blueprint - all rights reserved!