Healthcare Analytics 101 Workshop
Necessary Pre-requisites
The Case for the
Chief Data Officer
Recasting the C-Suite to Leverage
Your Most Valuable Asset
Peter Aiken and
Michael Gorman
Copyright 2013 by Data Blueprint
Peter Aiken, Ph.D.
• 25+ years of experience in data management
• Multiple international awards & recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS, VCU (vcu.edu)
• President, DAMA International (dama.org)
• 8 books and dozens of articles
• Experienced w/ 500+ data management practices in 20 countries
• Multi-year immersions with organizations as diverse as the US DoD, Nokia, Deutsche Bank, Wells Fargo, and the Commonwealth of Virginia
With thanks to:
• J. Brian Cassel, PhD, Senior Analyst, Oncology Administration, VCU Health System
• Lisa Shickle, MS, former director Massey Data Analytics (now at Wellpoint)
• Gordon Ginder, MD & Mary Ann Hager, MSN, Massey / VCUHS
• Kathleen Kerr, Kerr Healthcare Analytics
• Tom Smith, Johns Hopkins
• Massey / VCUHS palliative care team
IBM's Data Baby
Bills of Mortality
Mortality Geocoding
Where is it happening?
Plague Peak
When is it happening?
("Whereas of the Plague")
Black Rats or Rattus Rattus
Why is it happening?
What will happen?
101 Workshop: Necessary Pre-requisites
1. Adopting a crawl, walk, run strategy
2. Understanding current and potential organizational maturity and corresponding capabilities
3. Achieving an appropriate technology/human capability balance
4. Implementing useful IT systems development practices
5. Installing necessary non-IT leadership
IT Project Failure Rates
Recent IT project failure rate statistics can be summarized as follows:
– Carr (1994)
  • 16% of IT projects completed on time, within budget, with full functionality
– OASIG Study (1995)
  • 7 out of 10 IT projects "fail" in some respect
– The Chaos Report (1995)
  • 75% blew their schedules by 30% or more
  • 31% of projects will be canceled before they ever get completed
  • 53% of projects will cost over 189% of their original estimates
  • 16% of projects are completed on-time and on-budget
– KPMG Canada Survey (1997)
  • 61% of IT projects were deemed to have failed
– Conference Board Survey (2001)
  • Only 1 in 3 large IT project customers were "very satisfied"
– Robbins-Gioia Survey (2001)
  • 51% of respondents viewed their large IT implementation project as unsuccessful
– McDonald's Innovate (2002)
  • Automate fast food network from fry temperature to # of burgers sold: $180M USD write-off
– Ford Everest (2004)
  • Replacing internal purchasing systems: $200 million over budget
– FBI (2005)
  • Blew $170M USD on suspected terrorist database: "start over from scratch"
Sources: http://www.it-cortex.com/stat_failure_rate.htm (accessed 9/14/02); New York Times 1/22/05 pA31
1 in 3 IT projects suffers on
• Price
• Schedule
IT Project Failure Rates (moving average)
Source: Standish Chaos Reports as reported at: http://www.galorath.com/wp/software-project-failure-costs-billions-better-estimation-planning-can-help.php

Year    Succeeded   Challenged   Failed
1994    16%         53%          31%
1996    27%         33%          40%
1998    26%         46%          28%
2000    28%         49%          23%
2002    34%         51%          15%
2004    29%         53%          18%
2009    32%         44%          24%
% of DM organizations labeled "successful"
[Chart: share of DM organizations rated Successful, Partial Success, Don't know/too soon to tell, Unsuccessful, or Does not exist, 1981 vs. 2007]
• In 25 years:
  – "Successful" DM organizations fell from 43% to 15%
  – "Unsuccessful" increased from 5% to 21%
Why Data Projects Fail by Joseph R. Hudicka
• Assessed 1200 migration projects!
  – Surveyed only experienced migration specialists who have done at least four migration projects
• The median project costs over 10 times the amount planned!
• Biggest Challenges: Bad Data; Missing Data; Duplicate Data
• The survey did not consider projects that were cancelled largely due to data migration difficulties
• "… problems are encountered rather than discovered"
[Chart: Median Project Expense vs. Median Project Cost, axis from $0 to $375,000]
Joseph R. Hudicka "Why ETL and Data Migration Projects Fail" Oracle Developers Technical Users Group Journal June 2005 pp. 29-31
Not Enough Data Management Involvement
Data Warehousing
XML
Data Quality
Customer Relationship Management
Master Data Management
Customer Data Integration
Enterprise Resource Planning
Enterprise Application Integration
£12bn NHS computer system is scrapped
• The biggest civilian IT project of its kind in the world, it has already squandered at least £12.7billion. Some estimates put the cost far higher.
• Following an official review, the ‘one size fits all’ IT project will be replaced by much cheaper regional initiatives, with hospitals and GPs choosing the IT system they need.
Read more:
http://www.dailymail.co.uk/news/article-2040259/NHS-IT-project-failure-Labours-12bn-scheme-scrapped.html#ixzz2R1yb9F1i
Data Strategy in Context
Organizational
IT Strategy
Data Strategy
Only 1 in 10 organizations has a board-approved data strategy!
What does it mean to treat data as an organizational asset?
• Assets are economic resources
  – Must own or control
  – Must use to produce value
  – Value can be converted into cash
• An asset is a resource controlled by the organization as a result of past events or transactions and from which future economic benefits are expected to flow to the organization [Wikipedia]
• Data are an organization's
  – Sole, non-depletable, non-degrading, durable, strategic asset
• With assets:
  – Formalize the care and feeding of data
    • Cash management - HR planning
  – Put data to work in unique/significant ways
    • Identify data the organization will need [Redman 2008]
Reduce-Reuse-Recycle … Data?
• Reduce the amount of organizational data ROT
  – Redundant, obsolete, trivial
• Reuse the remainder
  – Fewer vocabulary items to resolve
  – Greater quality engineering leverage
• Integration is impossible without information architecture components (for mapping)
  – Maintenance of these components promotes greater reuse
• Shared data is typified by organizational ability to use information as a strategic asset
• However, assets are useless without knowledge of the asset characteristics
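
One way to make the ROT idea concrete is a screening pass over a data inventory. The sketch below is illustrative only: the `DataSet` fields, the five-year cutoff for "obsolete", and the use of "no reads in the last year" as a proxy for "trivial" are all assumptions made for this example, not definitions from the workshop.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataSet:
    """Hypothetical inventory entry; all field names are illustrative."""
    name: str
    content_hash: str      # fingerprint of the contents
    last_updated: date
    reads_last_year: int


def classify_rot(inventory, today, max_age_days=5 * 365):
    """Label each data set as redundant, obsolete, trivial, or keep."""
    seen_hashes = set()
    labels = {}
    for ds in inventory:
        if ds.content_hash in seen_hashes:
            labels[ds.name] = "redundant"   # same contents already listed
        elif (today - ds.last_updated).days > max_age_days:
            labels[ds.name] = "obsolete"    # assumed cutoff: five years
        elif ds.reads_last_year == 0:
            labels[ds.name] = "trivial"     # assumed proxy: nobody reads it
        else:
            labels[ds.name] = "keep"
        seen_hashes.add(ds.content_hash)
    return labels


# Made-up inventory, for illustration only.
inventory = [
    DataSet("patients_2012", "a1", date(2013, 1, 10), 420),
    DataSet("patients_2012_copy", "a1", date(2013, 1, 10), 3),
    DataSet("fax_log_1999", "b2", date(1999, 6, 1), 0),
    DataSet("scratch_export", "c3", date(2012, 11, 5), 0),
]
labels = classify_rot(inventory, today=date(2013, 6, 1))
```

Whatever is labeled "keep" is the remainder to reuse; in practice the thresholds would come from the organization's own retention and stewardship rules.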
Leverage is an Engineering Concept
What is meant by use of an information architecture?
• Application of data assets towards organizational strategic objectives
• Assessed by the maturity of organizational data management practices
• Results in increased capabilities, dexterity, and self-awareness
• Accomplished through use of data-centric development practices (including taxonomies, stewardship, and repository use)
Data Leverage
• Permits organizations to better manage their sole non-depletable, non-degrading, durable, strategic asset - data
  – within the organization, and
  – with organizational data exchange partners
• Leverage
  – Obtained by implementation of data-centric technologies, processes, and human skill sets
  – Increased by elimination of data ROT (redundant, obsolete, or trivial)
    • The bigger the organization, the greater the potential leverage
• Treating data more asset-like simultaneously
  1. lowers organizational IT costs and
  2. increases organizational knowledge worker productivity and the pace of innovation
[Figure labels: Less ROT; Technologies; Process; People]
Data Strategy Choices
Q1: Keeping the doors open (little or no proactive data management)
Q2: Increasing organizational efficiencies/effectiveness
Q3: Using data to create strategic opportunities
Q4: Both (Cash Cow)
[Quadrant axes: Improve Operations; Innovation]
Great point of initial inspiration ...
• Formalizing stuff forces clarity
• Special shout out to Chapter 7
  – Measuring the value of information
  – ISBN: 0470539399
  – http://www.amazon.com/How-Measure-Anything-Intangibles-Business
A National Cancer Institute
• This Virginia cancer center is a leader in shaping the fight against cancer
• Over 500 researchers and staff tend to over 12,000 patients annually
• This requires robust information management and analytical services
• The problem: It takes 1 month to run a report on an incident (i.e., a patient’s hospital visit) that shows all touch points
A National Cancer Institute (cont’d)
• Data Blueprint engineered a solution that provides a 360-degree view of an incident, i.e., a patient’s hospital visit
• New solution provides reports in 2 days: 360-degree view of patient’s data including diagnosis, treatment, etc.
• Integrated hospital and physician data enhances financial and asset utilization
• Results include improved quality of care, optimized workflow processes as well as operational performance
1. Manual transfer of digital data
2. Manual file movement/duplication
3. Manual data manipulation
4. Disparate synonym reconciliation
5. Tribal knowledge requirements
6. Non-sustainable technology
[Chart: Current vs. Improved, scale 0 to 100]
Reversing The Measures
• Currently:
  – Analysts spend 80% of their time manipulating data and 20% of their time analyzing data
  – Used to take 1 month to produce key reports
• After rearchitecting:
  – Analysts spend 20% of their time manipulating data and 80% of their time analyzing data
  – Two days to produce key reports
[Chart: Manipulation vs. Analysis]
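
The size of this reversal is easy to work out. The short calculation below is a sketch under two stated assumptions that are not in the slides: a 40-hour analyst week and "1 month" taken as 30 days.

```python
HOURS_PER_WEEK = 40   # assumption: a standard working week
DAYS_PER_MONTH = 30   # assumption: "1 month" taken as 30 days


def analysis_hours(percent_analyzing, hours=HOURS_PER_WEEK):
    """Hours per week left for analysis at a given percentage of time."""
    return percent_analyzing * hours / 100


before = analysis_hours(20)               # share of time before rearchitecting
after = analysis_hours(80)                # share of time after rearchitecting
analysis_gain = after / before            # ratio of analysis time, after vs. before
turnaround_speedup = DAYS_PER_MONTH / 2   # 1 month vs. two days for key reports

print(f"Analysis hours per week: {before:.0f} -> {after:.0f} ({analysis_gain:.0f}x)")
print(f"Report turnaround: about {turnaround_speedup:.0f}x faster")
```

Under these assumptions each analyst gains four times as much analysis time, and key reports arrive roughly fifteen times sooner.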
$300 billion is the potential annual value to health care
Savings come from a variety of agreed-upon categories and values:
• Reduced hospital re-admissions
• Patient Monitoring: Inpatient, out-patient, emergency visits and ICU
• Preventive care for ACO
• Epidemiology
• Patient care quality and program analysis
[Chart values: $165, $108, $47, $9, $5]
[Chart categories: Transparency in clinical data and clinical decision support; Research & Development; Advanced fraud detection-performance based drug pricing; Public health surveillance/response systems]
Book Recommendation
• Permits the reorientation of medicine
  – From populations
  – To individuals
• Big Data Capture
  – Wireless sensors
  – Genome sequencing
  – Printing organs
Analytics in Health Care
• Organization-wide
• Volume and Noise
• Utility
• Meaningful scoring
• Actionable recs
• Realistic goals
• Support
• Manage & measure

Descriptive
  Ask: What happened? What is happening?
  Find: Structured data
  Show: Profiles, Bar/pie charts, Narrative
Predictive
  Ask: What will happen? Why will it happen?
  Find: Structured/unstructured data
  Show: Risk Profiles, Pros/Cons, Care Recs
Prescriptive
  Ask: What should I do? Why should I do it?
  Find: Unstructured/structured data
Results: It is not always about money
• Solution:
  – Integrate multiple databases into one to create holistic view of data
  – Automation of manual process
• Results:
  – Data is passed safely and effectively
  – Eliminate inconsistencies, redundancies, and corruption
  – Ability to cross-analyze
  – Significantly reduced turnaround time for matching patients with a potential donor, which increases the potential to make a life-saving connection in a manner that is faster, safer and more reliable
  – Increased safe matches from 3 out of 10 to 6 out of 10
“Our hospital wants us to use the
existing system, can we create an
Oncology ‘cube’?”
• Can you get all the information you need in a “cube” from an existing business intelligence data system?
  – Would it include outpatient care?
  – Would it capture the whole care continuum?
  – Would it allow you to categorize by disease type?
  – Would it allow you to categorize by modality of care?
Getting the C-suite’s attention:
How much of the hospital’s business is Oncology?
Disease-centered analyses are not limited to cost centers, divisions,
clinics, units, or other “silos” for strategic planning purposes.
Profitability by disease and modality
Market Analysis
• Two measures of market share indicate that 26% - 32% of cancer patients in Central Virginia receive some or all of their treatment here.
• In other words, 68% - 74% receive none of their care here.
• VCU is capturing only 20% of inpatient oncologic surgeries originating in our primary service area (Bon Secours captures 40%, HCA 36%). VCU is 4th in state for oncologic surgeries, behind UVA, Inova Fairfax, and St Mary’s.
• This gives us plenty of opportunity to increase volumes.
State-wide, regional context
Patient-based analyses following patient
from diagnosis through treatment
From diagnosis (Hospital Cancer Registry):
  Primary site of cancer
  Date of diagnosis
  Stage/spread of disease
+
Follow patient interactions over time (Hospital, Physician, Pharm Claims):
  Capture all encounter dates and details
[Figure: one timeline per patient (IDs 1111111 through 5555555), with each encounter marked "x"]
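
The linkage described here, registry facts at diagnosis plus dated encounters from claims, can be sketched as a join on patient ID. Everything in the example is hypothetical: the field names, the records, and the rule that claims for patients missing from the registry are skipped are assumptions for illustration, not the actual Massey / VCUHS design.

```python
from collections import defaultdict
from datetime import date

# Hypothetical registry extract: one row per patient, captured at diagnosis.
registry = {
    "1111111": {"primary_site": "lung", "diagnosis_date": date(2012, 3, 1), "stage": "III"},
    "2222222": {"primary_site": "breast", "diagnosis_date": date(2012, 5, 14), "stage": "I"},
}

# Hypothetical claims extract: many rows per patient, from several billing sources.
claims = [
    {"patient_id": "1111111", "service_date": date(2012, 4, 2), "source": "physician", "detail": "chemotherapy visit"},
    {"patient_id": "1111111", "service_date": date(2012, 3, 20), "source": "hospital", "detail": "inpatient surgery"},
    {"patient_id": "2222222", "service_date": date(2012, 6, 1), "source": "pharmacy", "detail": "prescription fill"},
    {"patient_id": "9999999", "service_date": date(2012, 6, 3), "source": "hospital", "detail": "emergency visit"},
]


def build_timelines(registry, claims):
    """Attach each patient's claims, in date order, to their registry record."""
    by_patient = defaultdict(list)
    for claim in claims:
        if claim["patient_id"] in registry:   # skip patients not in the registry
            by_patient[claim["patient_id"]].append(claim)

    timelines = {}
    for patient_id, diagnosis in registry.items():
        ordered = sorted(by_patient[patient_id], key=lambda c: c["service_date"])
        timelines[patient_id] = {
            **diagnosis,
            "encounters": [
                {
                    "days_from_diagnosis": (c["service_date"] - diagnosis["diagnosis_date"]).days,
                    "source": c["source"],
                    "detail": c["detail"],
                }
                for c in ordered
            ],
        }
    return timelines


timelines = build_timelines(registry, claims)
```

Expressing each encounter as days from diagnosis is one possible design choice; it lets patients diagnosed on different dates be compared on a common clock.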
Consulting firm: “Close down palliative care program”
• VCU Health System opened one of the first Palliative Care Units in the US, May 2000.
• Consultants recommended closing it in 2002.
  – They looked at net margin for hospitalizations ending on the PC Unit and saw that the costs greatly exceeded reimbursement.
  – They thought that getting rid of the unit would get rid of this problem.
• RWJ Foundation supported urgent response.
• Appropriate financial analyses convinced consultants that the unit actually produced valuable hospital outcomes.
  – See KR White & JB Cassel (2009). “The Business Case for a Hospital Palliative Care Unit: Justifying its Continued Existence”. Practice of Evidence-Based Management, T Kovner, D Fine & R D’Aquila (Eds.), Chicago: Health Administration Press, pp 171-180.
Cost-avoidance in drugs (-77%), labs
(-95%), imaging (-95%), supplies (-60%).
8 Hospital Study of Cost Reduction
[Chart: Direct Cost ($) by Day of Admission (days 1 to 21) for three series: PC consult day 10-11, PC consult day 12-13, Usual Care]
Morrison, Penrod, Cassel et al. (2008). Cost savings associated with US hospital palliative care consultation programs. Archives of Internal Medicine 168 (16), 1783-1790.
A closer look …
What we know from the cancer registry…
What we gain from integrating billing claims
1. Adopting a crawl, walk, run strategy
2. Understanding current and potential organizational maturity and corresponding capabilities
3. Achieving an appropriate technology/human capability balance
4. Implementing useful IT systems development practices
5. Installing necessary non-IT leadership
Not Enough Data Management Involvement
Data Warehousing
XML
Data Quality
Customer Relationship Management
Master Data Management
Customer Data Integration
Enterprise Resource Planning
Enterprise Application Integration
Initiative Leader
Initiative Involvement
Not Involved
DM Origins – Which arrives first – DM or DBMS?
• A key indicator of organizational awareness
• 75% reacting instead of anticipating
• Best practices are obvious
[Chart: DM 1st, DBMS 1st, Simultaneously, for 1981 and 2007; values as extracted: 26%, 68%, 6% and 9%, 75%, 6%]
Largely Ineffective Data Management Investments
• Approximately 10% of organizations achieve parity (and potential positive returns) on their DM investments
• Only 30% of DM investments achieve tangible returns at all
• Seventy percent of organizations have very small or no tangible return on their DM investments
[Chart: Investment <= Return: 10%; Investment > Return: 20%; Return ≈ 0: 70%]
Cruiser Collector
Five Integrated DM Practice Areas
[Figure: the five practice areas (Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations) connected by flows labeled Organizational Strategies, Goals, Direction, Guidance, Feedback, Implementation, Standard Data, Business Data, Integrated Models, Application Models & Designs, Data Asset Use, and Business Value]
Organizational DM Practices and Inter-relationships
[Figure labels: Leverage data in organizational activities; Data management processes and infrastructure; Combining multiple assets to produce extra value; Organizational-entity subject area data integration; Provide reliable data access; Achieve sharing of data within a business area]
Practice areas: Data Program Coordination; Organizational Data Integration; Data Stewardship; Data Development; Data Support Operations
Practice goals: Assign responsibilities for data. Manage data coherently. Share data across boundaries. Engineer data delivery systems. Maintain data availability.
Data Management Capability Maturity Model Levels
Initial (1): Our DM practices are ad hoc and dependent upon "heroes" and heroic efforts
Repeatable (2): We have DM experience and have the ability to implement disciplined processes
Defined (3): We have experience that we have standardized so that all in the organization can follow it
Managed (4): We manage our DM processes so that the whole organization can follow our standard DM guidance
Optimizing (5): We have a process for improving our DM capabilities
One concept for process improvement; others include:
• Norton Stage Theory
• TQM
• TQdM
• TDQM
• ISO 9000
and focus on understanding current processes and determining where to make improvements.
Assessment Components

Data Management Practice Areas
Data program coordination: DM is practiced as a coherent and coordinated set of activities
Organizational data integration: Delivery of data in support of organizational objectives – the currency of DM
Data stewardship: Designating specific individuals as caretakers for certain data
Data development: Efficient delivery of data via appropriate channels
Data support: Ensuring reliable access to data

Capability Maturity Model Levels (examples of practice maturity)
1 - Initial: Our DM practices are ad hoc and dependent upon "heroes" and heroic efforts
2 - Repeatable: We have DM experience and have the ability to implement disciplined processes
3 - Documented: We have standardized DM practices so that all in the organization can perform them with uniform quality
4 - Managed: We manage our DM processes so that the whole organization can follow our standard DM guidance
5 - Optimizing: We have a process for improving our DM capabilities
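
The two assessment components, five practice areas each scored on a five-level scale, map naturally onto a small data structure. The sketch below is an illustration only: the example scores are invented, and reporting the weakest area is an assumed summary rule rather than the published assessment method.

```python
from enum import IntEnum


class MaturityLevel(IntEnum):
    INITIAL = 1
    REPEATABLE = 2
    DOCUMENTED = 3
    MANAGED = 4
    OPTIMIZING = 5


PRACTICE_AREAS = (
    "Data Program Coordination",
    "Organizational Data Integration",
    "Data Stewardship",
    "Data Development",
    "Data Support Operations",
)


def weakest_area(scores):
    """Return (area, level) for the lowest-scoring practice area.

    Assumed rule for illustration: the weakest area is reported first.
    Ties go to the area listed first in PRACTICE_AREAS.
    """
    missing = [area for area in PRACTICE_AREAS if area not in scores]
    if missing:
        raise ValueError(f"No score for: {', '.join(missing)}")
    area = min(PRACTICE_AREAS, key=lambda a: scores[a])
    return area, scores[area]


# Invented scores for a hypothetical organization.
example_scores = {
    "Data Program Coordination": MaturityLevel.REPEATABLE,
    "Organizational Data Integration": MaturityLevel.REPEATABLE,
    "Data Stewardship": MaturityLevel.INITIAL,
    "Data Development": MaturityLevel.DOCUMENTED,
    "Data Support Operations": MaturityLevel.DOCUMENTED,
}
area, level = weakest_area(example_scores)
print(f"Weakest area: {area} ({level.value} - {level.name.title()})")
```

Using an ordered enumeration keeps level comparisons explicit, so a score of "Initial" can never be mistaken for a higher level by string sorting.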
Data Management Practices Measurement (DMPA)
• CMU's Software Engineering Institute (SEI) Collaboration
• Results from hundreds of organizations in various industries including:
  ✓ Public Companies
  ✓ State Government Agencies
  ✓ Federal Government
  ✓ International Organizations
• Defined industry standard
• Steps toward defining data management "state of the practice"
[Chart: maturity level, Initial (I), Repeatable (II), Documented (III), Managed (IV), Optimizing (V), for each practice area: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations. Group labels: "Focus: Guidance and Facilitation" and "Focus: Implementation and Access"]
Comparison of DM Maturity 2007-2012
[Chart: 2007 Maturity Levels vs. 2012 Maturity Levels, scale 1 to 5, for Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations]
Service Orient or Be Doomed!
• Service Orient or Be Doomed!
  – How Service Orientation Will Change Your Business (Hardcover) by Jason Bloomberg & Ronald Schmelzer
  – I'm not quite sure what "doom" awaits by not service orienting, other than remaining mired in archaic, calcified and siloed processes, which a lot of businesses do anyway, and still manage to stay afloat. But that's the topic for another posting.
    • Reviewer
How SOA/Services are "Sold"
Integration Possibilities
• User Interface
• Business Process
• Application
• Data
AV Component
• Well-defined components
• Self-contained
• No interdependencies
Analogy derived from D. Barry "Web Services" Intelligent Enterprise 10/10/03 pp. 26-47 - wiring diagram from sunflowerbroadband.com
Contractor Implemented Wiring
Concise Notes on
Software Engineering
– Published in 1979
– 93 pages including appendices & references
– Out of print
– $1.99 at half.com
• Principles of Information Hiding (pp. 32-33)
  – Conceal complex data structures whenever possible
  – Allow only selected service modules to know about the concealed data structures
  – Bind together modules that know about concealed data structures
  – Package such modules along with the data itself
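
The four principles above can be illustrated with a single class that packages a concealed structure together with the only functions allowed to know its layout. This is a minimal sketch, not an example from the book: `EncounterLog` and its methods are hypothetical, and Python marks concealment only by the leading-underscore convention rather than enforcing it.

```python
class EncounterLog:
    """Packages the data together with the only code that knows its layout."""

    def __init__(self):
        # Concealed structure: patient id -> list of (ISO date, description).
        # Callers never see this layout, so it can change without breaking them.
        self._by_patient = {}

    def record(self, patient_id, iso_date, description):
        """Add one encounter for a patient."""
        self._by_patient.setdefault(patient_id, []).append((iso_date, description))

    def encounters(self, patient_id):
        """Return the patient's encounter descriptions in date order."""
        rows = sorted(self._by_patient.get(patient_id, []))
        return tuple(description for _, description in rows)

    def count(self, patient_id):
        """Return how many encounters are stored for a patient."""
        return len(self._by_patient.get(patient_id, []))


log = EncounterLog()
log.record("1111111", "2012-04-02", "chemotherapy visit")
log.record("1111111", "2012-03-20", "inpatient surgery")
history = log.encounters("1111111")
```

Because callers depend only on `record`, `encounters`, and `count`, the dictionary could later be swapped for a database table without changing any calling code.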
All Contents © 2008 Burton Group. All rights reserved.
SOA is Dead; Long Live Services
8 April 2009
Anne Thomas Manes
VP & Research Director
www.burtongroup.com
• We have a replay of the presentation, which I gave in February, on the Burton BrightTalk Channel: http://www.brighttalk.com/channels/750/view (You have to page down to get to the SOA is Dead presentation.)
SOA Obituary
[Cartoon labels: ECONOMY; SOAsaurus]
SOA met its demise on January 1, 2009, when it was wiped out by the catastrophic impact of the economic recession. SOA is survived by its offspring: mashups, SaaS, Cloud Computing, BPM, and all other architectural approaches that depend on "services."
SOA Postmortem: Why did it die?
• Vague abstract architectural concept
• No universally accepted meaning
• Indefensible value proposition
• How do you measure flexibility/agility?
• Cost savings are lower than anticipated
• Success rate is very low
• Ill-defined term of dubious business value
SOA DM Maturity Requirements
[Chart: required maturity, scale 1.00 to 4.00, for Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations]
• Conclusion - more ground to cover than has been attained to date
Lack of Focus on Data Killing SOA
By David Linthicum on July 24, 2009 1:42 PM
Blog: Leveraging Information and Intelligence
http://www.ebizq.net/blogs/linthicum/2009/07/lack_of_focus_on_da…WfF9wsRons67fLqzsmxzEJ8%2Fx6%2BwqT%2Frn28M3109ad%2BrmPBy93Yo%3D
For those of you that have been following me know that I'm very much an advocate of SOA. The
architectural pattern of SOA is helpful in defining an enterprise architecture that much more agile,
and thus pays for itself once the business has to shift and needs IT to follow.
SOA, however, is complex and requires that the architect understand all aspects of the "as is"
architecture before moving to the "to be." This means decomposing the existing architecture down to
a primitive state, and rebuilding it up again at sets of services, with a process configuration or
composite applications layer to define and redefine business functions. I think most get that.
What's missing within most typical SOA projects is the focus on the data, and that is killing SOA.
Since the "S" in SOA, means service, most architects focus on the service definition, abstracting the
existing data into collections of services, but don't pay much attention to the data within the
architecture. Not good.
The truth is that the foundation of a healthy and functional SOA is the data, and you have to deal
with the underlying data first, understand it, perhaps reorganize and abstract it, before defining the
services that will sit on top of the data. While this is architecture 101, the fact is that those driving
SOAs these days have little understanding of the importance of understanding and defining the data,
and thus the architecture ends up being a bunch of well defined services that sit on top of very
dysfunctional data. The end result is performance issues, data integrity issues, and even the lack of
agility which is why you build SOAs in the first place.
The truth is that most failed SOA projects can be traced to the lack of a data level understanding,
and while this is still an issue in this day and time is beyond me. There are many technology and
tools out there to assist you, and we've been doing data for a long long time. Nothing new here, just
data. However, if you ignore it your SOA will be still born.
Comment from Vijay Narayanan (July 26, 2009 10:42 PM):
Completely agree David. I think data from a SOA standpoint is extremely critical for several reasons:
- reducing errors/rework in automated business processes that leverage enterprise data services
Lack of Focus on Data is Killing SOA
What's missing within most typical SOA
projects is the focus on the data
The truth is that the foundation
of a healthy and functional
SOA is the data
Most failed SOA projects can be traced to the
lack of a data level understanding
Hierarchy of Data Management Practices (after Maslow)
http://3.bp.blogspot.com/-ptl-9mAieuQ/T-idBt1YFmI/AAAAAAAABgw/Ib-nVkMmMEQ/s1600/maslows_hierarchy_of_needs.png
Advanced Data Practices
• Cloud
• MDM
• Mining
• Big Data
• Analytics
• Warehousing
• SOA
Basic Data Management Practices
– Data Program Management
– Organizational Data Integration
– Data Stewardship
– Data Development
– Data Support Operations
• The 5 data management practice areas (the data management basics) are necessary but insufficient prerequisites to organizational data leveraging applications, that is, self-actualizing data or advanced data practices
101 Workshop: Necessary Pre-requisites
1. Adopting a crawl, walk, run strategy
2. Understanding current and potential organizational maturity and corresponding capabilities
3. Achieving an appropriate technology/human capability balance
4. Implementing useful IT systems development practices
5. Installing necessary non-IT leadership
J. C. R. Licklider's Man-Computer Symbiosis

Humans Generally Better
• Sense low level stimuli
• Detect stimuli in noisy background
• Recognize constant patterns in varying situations
• Sense unusual and unexpected events
• Remember principles and strategies
• Retrieve pertinent details without a priori connection
• Draw upon experience and adapt decision to situation
• Select alternatives if original approach fails
• Reason inductively; generalize from observations
• Act in unanticipated emergencies and novel situations
• Apply principles to solve varied problems
• Make subjective evaluations
• Develop new solutions
• Concentrate on important tasks when overload occurs
• Adapt physical response to changes in situation

Machines Generally Better
• Sense stimuli outside human's range
• Count or measure physical quantities
• Store quantities of coded information accurately
• Monitor prespecified events, especially infrequent
• Make rapid and consistent responses to input signals
• Recall quantities of detailed information accurately
• Retrieve pertinent details without a priori connection
• Process quantitative data in prespecified ways
• Perform repetitive preprogrammed actions reliably
• Exert great, highly controlled physical force
• Perform several activities simultaneously
• Maintain operations under heavy operation load
• Maintain performance over extended periods of time
2012 London Summer Games
• 60 GB of data/second
• 200,000 hours of big data will be generated testing systems
• 2,000 hours media coverage/daily
• 845 million Facebook users averaging 15 TB/day
• 13,000 tweets/second
• 4 billion watching
• 8.5 billion devices connected
Corporate Governance
• "Corporate governance - which can be
defined narrowly as the relationship of a
company to its shareholders or, more
broadly, as its relationship to
society….", Financial Times, 1997.
• "Corporate governance is about
promoting corporate fairness,
transparency and accountability" James
Wolfensohn, World Bank, President
Financial Times, June 1999.
• “Corporate governance deals with the
ways in which suppliers of finance to
corporations assure themselves of
getting a return on their investment”,
The Journal of Finance, Shleifer and
Vishny, 1997.
Definition of IT Governance
•
IT Governance:
•
"putting structure around how organizations align IT strategy with business
strategy, ensuring that companies stay on track to achieve their strategies
and goals, and implementing good ways to measure IT’s performance.
•
It makes sure that all stakeholders’ interests are taken into account and
that processes provide measurable results.
•
An IT governance framework should answer some key questions, such as
how the IT department is functioning overall, what key metrics
management needs and what return IT is giving back to the business
from the investment it’s making."
CIO Magazine
(May 2007)
According to the IT Governance Institute, there are five areas of focus:
•
Strategic Alignment
•
Value Delivery
•
Resource Management
•
Risk Management
•
Performance Measures
Data Governance Definitions
• The other half of MDM – The Bloor Group
• The formal orchestration of people, process, and technology to enable an organization to
leverage data as an enterprise asset – The MDM Institute
• A convergence of data quality, data management, business process management, and risk
management surrounding the handling of data in an organization – Wikipedia
• A system of decision rights and accountabilities for information-related processes, executed
according to agreed-upon models which describe who can take what actions with what
information, and when, under what circumstances, using what methods – Data Governance Institute
• The execution and enforcement of authority over the management of data assets and the
performance of data functions – KiK Consulting
• A quality control discipline for assessing, managing, using, improving, monitoring,
maintaining, and protecting organizational information – IBM Data Governance Council
• Data governance is the formulation of policy to optimize, secure, and leverage information
as an enterprise asset by aligning the objectives of multiple functions – Sunil Soares
• The exercise of authority and control over the management of data assets – DM BoK
Suicide Mitigation
Data Mapping
[Figure: data objects feeding the suicide analysis (Soldier, Mental illness, Deployments, Work History, Legal Issues, Abuse) mapped to source systems (FAP, DMSS, G1, DMDC, CID, MDR)]
• Data objects complete?
• All sources identified?
• Best source for each object?
• How to reconcile differences between sources?
Senior Army Official
•
A very heavy dose of
management support
•
On any questions as to future data ownership: "they should make an
appointment to speak directly with me!"
•
Empower the team
–
The conversation turned from "can this be
done?" to "how are we going to accomplish
this?"
–
Mistakes along the way would be tolerated
–
Implement a workable solution in prototype form
Communication Patterns
Source: The Challenge and the Promise: Strengthening the Force, Preventing Suicide and Saving Lives - The Final Report of the
Technique/Technical Interdependencies
Master Data Management
Data Quality
Data Governance
Benefits of a Database
•
Data can be shared
•
Redundancy can be reduced
–
Not all redundancy can, or necessarily should, be eliminated
•
Inconsistency can be avoided
–
Data obtained by the Physics department will be the same as
data obtained by the Chemistry department
•
Transaction support can be provided
–
A transfer is not complete until the money has been both added to the
checking account and deducted from the savings account
•
Integrity can be maintained
–
A student might be recorded as having obtained 1000 marks when the maximum is 100; enforcing
integrity constraints prevents this.
•
Security can be enforced
–
Information on demand: Finance needs to see the records related to Human Resources
•
Conflicting requirements can be balanced
–
Volume of data as compared to speed
•
Business standards can be enforced
•
Data Dependency
–
The techniques used to physically store and access data are dictated by the application, and knowledge of the physical
representation and access technique is built into the application code.
–
Not desirable in a database system
–
Different users require different views of the same data
–
Freedom to change the physical representation or access technique in view of the changing requirements
• Changing record types
• Physical storage location
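The transaction-support and integrity bullets above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the account and result tables, their column names, the 0 to 100 marks range, and the helper functions are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute(
    "CREATE TABLE result ("
    " student TEXT PRIMARY KEY,"
    " marks INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("savings", 500), ("checking", 100)]
)
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK on balance rejected an overdraft; the credit was rolled back too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


def record_marks(conn, student, marks):
    """Insert a result; the CHECK constraint rejects marks outside 0..100."""
    try:
        with conn:
            conn.execute("INSERT INTO result VALUES (?, ?)", (student, marks))
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling transfer(conn, "savings", "checking", 1000) returns False and leaves both balances untouched, because the failed debit rolls back the credit that preceded it. record_marks(conn, "pat", 1000) is rejected in the same way by the constraint on marks.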
Architecture is both the process and
product of planning, designing and
constructing space that reflects functional,
social, and aesthetic considerations.
A wider definition may comprise all design
activity from the macro-level (urban
design, landscape architecture) to the
micro-level (construction details and
furniture).
In fact, architecture today may refer to the
activity of designing any kind of system
and is often used in the IT world.
Architecture
Typically Managed Architectures
•
Enterprise Architecture
•
Business Architecture
•
Systems Architecture
Network
Arrangement
Hierarchical
Arrangement
•
Process Architecture
–
Arrangement of inputs -> transformations = value -> outputs
–
Typical elements: Functions, activities, workflow, events, cycles,
products, procedures
•
Systems Architecture
–
Applications, software components, interfaces, projects
•
Business Architecture
–
Goals, strategies, roles, organizational structure, location(s)
•
Security Architecture
–
Arrangement of security controls in relation to IT architecture
•
Technical Architecture/Tarchitecture
–
Relation of software capabilities/technology stack
–
Structure of the technology infrastructure of an enterprise, solution or
system
–
Typical elements: Networks, hardware, software platforms, standards/
protocols
•
Data/Information Architecture
–
Arrangement of data assets supporting organizational strategy
–
Typical elements: specifications expressed as entities, relationships,
attributes, definitions, values, vocabularies
Information Architectures
•
The underlying (information) design principles
upon which construction is based
– Source: http://architecturepractitioner.blogspot.com/
•
… are plans, guiding the transformation of
strategic organizational information needs into
specific information systems development
projects
– Source: Internet
•
A framework providing a structured description
of an enterprise’s information assets —
including structured data and unstructured or
semistructured content — and the relationship
of those assets to business processes,
business management, and IT systems.
– Source: Gene Leganza, Forrester 2009
•
"Information architecture is a foundation
discipline describing the theory, principles,
guidelines, standards, conventions, and factors
for managing information as a resource. It
produces drawings, charts, plans, documents,
designs, blueprints, and templates, helping
everyone make efficient, effective, productive
and innovative use of all types of information."
– Source: Information First by Roger & Elaine Evernden, 2003
ISBN 0 7506 5858 4 p.1.
•
Defining the data needs of the enterprise and
designing the master blueprints to meet those
needs
– Source: DM BoK
Data Architecture – Better Definition
•
All organizations have information
architectures
–
Some are better understood and documented (and therefore more
useful to the organization) than others.
•
Common vocabulary expressing
integrated requirements ensuring
that data assets are stored,
arranged, managed, and used in
systems in support of
Vocabulary is Important-Tank, Tanks, Tankers, Tanked
How one inventory item proliferates data throughout the chain
System     Total items   Attributes/item   Total attributes
System 1        18,214                75          1,366,050
System 2            47               15+                720
System 3        16,594                73          1,211,362
System 4         8,535                16            136,560
System 5        15,959                22            351,098

System 1 items: 555 subassemblies & subcomponents; 17,659 repair parts or consumables

Total for the five systems shown above:
59,350 items
179 unique attributes
3,065,790 values
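The per-system totals appear to be items multiplied by attributes per item. The short check below reproduces that arithmetic from the figures as transcribed from the slide; the variable names are assumptions. System 2 is listed as "15+" attributes per item, so its product is only a lower bound.

```python
# Figures transcribed from the slide: (items, attributes per item, stated total).
# System 2 is listed as "15+" attributes per item, so 15 is a lower bound there.
systems = {
    "System 1": (18_214, 75, 1_366_050),
    "System 2": (47, 15, 720),
    "System 3": (16_594, 73, 1_211_362),
    "System 4": (8_535, 16, 136_560),
    "System 5": (15_959, 22, 351_098),
}

# Does items * attributes/item reproduce the stated total for each system?
exact = {
    name: items * attrs == total
    for name, (items, attrs, total) in systems.items()
}

# Grand totals across the five systems.
total_items = sum(items for items, _, _ in systems.values())
total_values = sum(total for _, _, total in systems.values())
```

The product matches the stated total exactly for every system except System 2, and the stated totals sum to the 3,065,790 values on the slide. The transcribed item counts sum to 59,349, one short of the slide's 59,350, which may be a rounding or transcription difference in the original.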
•
National Stock Number (NSN)
Discrepancies
–
If NSNs in LUAF, GABF, and RTLS are
not present in the MHIF, these records
cannot be updated in SASSY
–
Additional overhead is created to correct
data before performing the real
maintenance of records
•
Serial Number Duplication
–
If multiple items are assigned the same
serial number in RTLS, the traceability of
those items is severely impacted
–
Approximately $531 million of SAC 3
items have duplicated serial numbers
•
On-Hand Quantity Discrepancies
–
If the LUAF O/H QTY and number of items serialized in RTLS conflict, there can
be no clear answer as to how many items a unit actually has on-hand
–
Approximately $5 billion of equipment does not tie out between the LUAF and
RTLS
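The three discrepancy types above can each be detected with a simple check. The sketch below runs on toy records: the field names, NSN values, and in-memory layout are hypothetical and do not reflect the real LUAF, RTLS, or MHIF file formats.

```python
from collections import Counter

# Toy records; field names and values are hypothetical, not the real file layouts.
mhif_nsns = {"1005-01-000-0001", "2320-01-000-0002"}
luaf = [
    {"unit": "A", "nsn": "1005-01-000-0001", "on_hand_qty": 3},
    {"unit": "A", "nsn": "5820-01-000-0003", "on_hand_qty": 1},
]
rtls = [
    {"unit": "A", "nsn": "1005-01-000-0001", "serial": "S1"},
    {"unit": "A", "nsn": "1005-01-000-0001", "serial": "S2"},
    {"unit": "A", "nsn": "5820-01-000-0003", "serial": "S2"},
]


def missing_nsns(records, master):
    """NSNs used in records but absent from the master file."""
    return sorted({r["nsn"] for r in records} - master)


def duplicate_serials(records):
    """Serial numbers assigned to more than one item."""
    counts = Counter(r["serial"] for r in records)
    return sorted(s for s, n in counts.items() if n > 1)


def quantity_mismatches(luaf_rows, rtls_rows):
    """(unit, nsn, on-hand qty, serialized count) wherever the two disagree."""
    serialized = Counter((r["unit"], r["nsn"]) for r in rtls_rows)
    return [
        (r["unit"], r["nsn"], r["on_hand_qty"], serialized[(r["unit"], r["nsn"])])
        for r in luaf_rows
        if r["on_hand_qty"] != serialized[(r["unit"], r["nsn"])]
    ]
```

On the toy data, missing_nsns(luaf, mhif_nsns) flags the one NSN that has no master record, duplicate_serials(rtls) flags serial "S2", and quantity_mismatches(luaf, rtls) reports the unit whose on-hand quantity of 3 is backed by only 2 serialized items.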
Business Implications
Information Architecture Representation
•
Information architectures are the symbolic representation
of the structure, use and reuse of information resources
•
Common components are represented using standardized
notation and are sufficiently detailed to permit both
business analysts and technical personnel to read the
same model independently and come away with a common
understanding, yet the models can still be developed effectively.
Architectural Answers
(Adapted from [Allen & Boynton 1991])
Components: computers; human resources; communication facilities; software;
management responsibilities; policies, directives, and rules; data
•
Where do they go?
•
When are they needed?
•
What standards
should be adopted?
•
What vendors
should be chosen?
•
What rules should govern
the decisions?
•
What policies should guide
the process?
•
How and why do the components interact?
•
Why and how will the changes be implemented?
•
What should be managed organization-wide and what should
be managed locally?