Considerations for Data Management
Handle unstructured data growth, in-flight/streaming (perishable) data & data at rest
Parse, standardize and match/household to create holistic views
Data governance
Reference data management
Data profiling
Data cleansing
Data matching/survivorship
Data monitoring
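As an illustration of the data profiling item above, here is a minimal sketch in Python. The function name `profile`, the column names, and the sample rows are all hypothetical; a real profiling tool would report many more measures (patterns, ranges, outliers).

```python
from collections import Counter

def profile(rows, columns):
    """Return per-column row count, missing count, distinct count and top value."""
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing (an assumption of this sketch).
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report

# Hypothetical sample data, for illustration only.
sample_rows = [
    {"state": "IL", "zip": "30319"},
    {"state": "IL", "zip": ""},
    {"state": "il", "zip": None},
]
report = profile(sample_rows, ["state", "zip"])
```

In this sample the `state` column shows two distinct values ("IL" and "il"), the kind of inconsistency that profiling is meant to surface before cleansing.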
Considerations for Master Data Management
Data Domains
– Customer, member, patient, citizen, product, partner, materials, employees, location, asset
Data Sources
– Structured, semi-structured, unstructured
– Enterprise applications
– Databases, text files, XML
– On premises, off premises (cloud)
Data Latency and Granularity
– Real-time, transaction
– Bulk, batch
– Virtual, federated
The conduit used to move or merge data across disparate processes & systems
Explains business information
Describes business process results
Ensures consistent meaning and interpretation
Supports identification of things
uniquely identifies a business entity
A set of common attributes that in combination are unique
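The idea of a set of attributes that are unique only in combination can be checked directly. The sketch below is one possible way to do that in Python; the helper `duplicate_keys`, the field names, and the sample people are hypothetical.

```python
from collections import Counter

def duplicate_keys(records, key_fields):
    """Return the key tuples that occur more than once for the given attribute set."""
    counts = Counter(tuple(rec[f] for f in key_fields) for rec in records)
    return [key for key, n in counts.items() if n > 1]

# Hypothetical records, for illustration only.
people = [
    {"last_name": "Sosulski", "dob": "1939-04-12", "zip": "30319"},
    {"last_name": "Sosulski", "dob": "1971-08-30", "zip": "30319"},
    {"last_name": "Nguyen", "dob": "1939-04-12", "zip": "30319"},
]

by_name = duplicate_keys(people, ["last_name"])                # name alone collides
by_name_and_dob = duplicate_keys(people, ["last_name", "dob"])  # the combination does not
```

An empty result means the chosen attribute combination identifies every record in the sample uniquely; it says nothing about records not yet seen.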
The IM Supply Chain – Considerations for AM
formulate & answer questions
acquire, explore & prepare data, ensure quality
transform and integrate
develop & evaluate models
validate, deploy & monitor models
deliver & act upon reports and intelligence
Data management
Analytics management
Considerations for Analytics Management
centralize, organize, administer, evaluate & monitor
analytical models and their outcomes
push operations to database – query, scoring, bulk load
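Pushing scoring to the database means the arithmetic runs where the data lives and only the results travel. The sketch below uses Python's built-in sqlite3 module as a stand-in for an enterprise database; the table, the columns, and the linear scorecard coefficients are hypothetical.

```python
import sqlite3

# Hypothetical linear scorecard; the coefficients are illustrative only.
INTERCEPT, W_BALANCE, W_TENURE = 0.5, 0.002, -0.05

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, balance REAL, tenure REAL)")
# Bulk load: one call inserts all rows.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 100.0, 2.0), (2, 250.0, 10.0), (3, 0.0, 1.0)],
)

# The score is computed inside the database engine; only (id, score) pairs come back.
scores = conn.execute(
    "SELECT id, ? + ? * balance + ? * tenure AS score FROM customers ORDER BY id",
    (INTERCEPT, W_BALANCE, W_TENURE),
).fetchall()
conn.close()
```

The same pattern applies to queries and bulk loads: the client sends a statement, not the data.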
Considerations for Decision Management
Development, design, deployment
Determine where source/target tables and transformations are used in jobs and downstream BI & analytics processes - understand the impact of a development change before making it
Design jobs for grid or undistributed execution - understand job steps by modularizing procedure steps
Visual flows, metadata & data quality monitoring & scorecarding
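Understanding the impact of a change before making it amounts to walking lineage metadata downstream from the object being changed. The sketch below assumes the lineage is available as a simple mapping; the object names and the `downstream` helper are hypothetical.

```python
from collections import deque

# Hypothetical lineage metadata: each object maps to the objects that read from it.
LINEAGE = {
    "src.customers": ["job.standardize_customers"],
    "job.standardize_customers": ["tgt.customer_master"],
    "tgt.customer_master": ["report.churn_dashboard", "model.churn_score"],
    "src.orders": ["report.sales_summary"],
}

def downstream(node, lineage):
    """Breadth-first walk returning everything that depends on `node`."""
    seen, queue = [], deque(lineage.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen

impacted = downstream("src.customers", LINEAGE)
```

Here a change to `src.customers` reaches one job, one target table, one report and one model, while `report.sales_summary` is unaffected.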
Consumption Diversity
support an increasingly diverse set of audiences, interactions, and devices in a scalable, contiguous & concurrent fashion
Key Point #2
An organization’s ability to use information strategically is directly
Agenda
What is IM?
Review business drivers for IM
How has IM evolved? - distinguishing ways to conduct IM
Considerations for expanding & sharpening your IM arsenal
Illustrations
Master Data Management
Reference data describes business entities
Subject area specific
Reflects enterprise business rules and values
Critical in integrating operational systems
Customers
Products
Locations
Employees
Suppliers
Assets
Creating a Master Record
ERP: Cust. Id 3721B; First Name Willaim; Middle James; Last Name Corp.; DOB April 12; SSN 56349123; Address 3224 Pkwy G, Los Osos
Master record data: Cust. Id; First Name; Middle; Last Name; DOB; SSN; Address
Contact us: Cust. Id (blank); First Name j; Middle s; Last Name sosuluski; DOB April 12; SSN 123; Address BubbaJ@bubbagroup.com
Billing: Cust. Id 30391244; First Name William; Middle J.; Last Name Sosulski; DOB 4-12-39; SSN 563491234; Address 123 Oak St., Eves, IL
Service claims: Cust. Id 30391-244; First Name William; Middle Jim; Last Name Sosewlsky; DOB 04/12/39; SSN 563-49-1234; Address 123 Oak St., Eves, IL 30319
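The source records above can be turned into a master record by standardizing, matching, and then applying a survivorship rule. The sketch below is one simplified way to do that, not a description of any particular MDM product: the match rule (identical 9-digit SSN) and the survivorship rule (take the first non-empty value in an assumed source trust order) are both assumptions made for illustration.

```python
import re

FIELDS = ["first", "middle", "last", "ssn", "address"]
# Assumed trust order for survivorship; a real deployment would define its own.
SOURCE_PRIORITY = ["billing", "service claims", "contact us"]

def standardize(record):
    """Trim values and reduce the SSN to digits so different formats compare equal."""
    clean = {k: v.strip() for k, v in record.items()}
    clean["ssn"] = re.sub(r"\D", "", clean["ssn"])
    return clean

def is_match(a, b):
    """Assumed match rule: both records carry the same complete 9-digit SSN."""
    return len(a["ssn"]) == 9 and a["ssn"] == b["ssn"]

def survive(records):
    """For each field keep the first non-empty value from the most trusted source."""
    ranked = sorted(records, key=lambda r: SOURCE_PRIORITY.index(r["source"]))
    return {f: next((r[f] for r in ranked if r[f]), "") for f in FIELDS}

# Values taken from the billing, service claims and contact us records above.
records = [standardize(r) for r in [
    {"source": "billing", "first": "William", "middle": "J.", "last": "Sosulski",
     "ssn": "563491234", "address": "123 Oak St., Eves, IL"},
    {"source": "service claims", "first": "William", "middle": "Jim", "last": "Sosewlsky",
     "ssn": "563-49-1234", "address": "123 Oak St., Eves, IL 30319"},
    {"source": "contact us", "first": "j", "middle": "s", "last": "sosuluski",
     "ssn": "123", "address": "BubbaJ@bubbagroup.com"},
]]

anchor = records[0]
cluster = [r for r in records if is_match(anchor, r)]
master = survive(cluster)
```

With these rules the contact us record stays unmatched because its SSN is incomplete; linking it would need a fuzzier rule on name and date of birth, which this sketch leaves out.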
The IM Supply Chain
Formulate & answer questions
Acquire, explore & prepare data, ensure quality
Transform and integrate
Validate, deploy & monitor models
Deliver reports and intelligence
Data management
Illustrations – Analytics Management
Develop Candidate Models
Compare Models
Declare a Champion Model
Validate Model
Deploy Model or Scoring Function
Validate Scoring
Monitor Model Performance
Request New Challenger Model
Retire Model
Model Comparison
Determine champion/challenger
Lift
ROC
K-S (Kolmogorov-Smirnov)
Profile
Delta
Custom reporting
Declare Champion Model
Model promoted for scoring or used in production
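Two of the comparison measures listed above, ROC (summarized as the area under the curve) and K-S, can be computed in a few lines. The sketch below is plain Python with hypothetical scores and labels; picking the champion by area under the ROC curve alone is an assumption of this example, since the choice of measure is a business decision.

```python
def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of events and non-events."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Advance through ties so equal scores are counted together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            cum_pos += pairs[j][1]
            cum_neg += 1 - pairs[j][1]
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation data: 1 marks an event, 0 a non-event.
labels = [0, 0, 1, 0, 1, 1]
candidates = {
    "current": [0.1, 0.2, 0.3, 0.4, 0.8, 0.9],
    "challenger": [0.1, 0.2, 0.7, 0.3, 0.8, 0.9],
}

# Assumed selection rule: the highest AUC becomes the champion.
champion = max(candidates, key=lambda name: auc(candidates[name], labels))
```

In this sample the challenger separates events from non-events perfectly, so both measures favor it.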
Model Validation
Perform scoring tests
Create validation reports for the champion
Validate using test data sources
Model or Scoring Deployment
Publish or export to production system
Validate Scoring
Perform score code validation independently from validation during model development
Score code validated in-database or in operational system
Monitor Model Performance
Summarize model details
Detect shifts in distribution of variable values over time in input and scored data
Evaluate the predicted and actual target values for a champion model at multiple points in time
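Detecting a shift in the distribution of a variable over time needs some measure of distance between a baseline sample and a recent one. The slides do not name a measure; the sketch below uses the population stability index (PSI), a common choice, with hypothetical data and a single hypothetical bin edge. Any alert threshold would be an assumption to tune per model.

```python
import math

def psi(expected, actual, edges):
    """Population stability index over bins defined by ascending cut points `edges`."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # The bin index is the number of cut points the value exceeds.
            counts[sum(v > e for e in edges)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]
    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical score samples, for illustration only.
baseline = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
recent_same = list(baseline)
recent_shifted = [0.6, 0.7, 0.7, 0.8, 0.8, 0.9, 0.9, 0.95]

stable = psi(baseline, recent_same, edges=[0.5])
drifted = psi(baseline, recent_shifted, edges=[0.5])
```

An identical sample gives a PSI of zero, and the value grows as the recent distribution moves away from the baseline. The same function can be run on input variables and on scored output at each monitoring point.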