© 2008 IBM Corporation
MDM Architecture and Implementation
Best Practices
Agenda
Fragmented Data Problems …
Master Data & Master Data Management
IBM InfoSphere Master Data Management Server
MDM Server Implementation Pattern
MDM Architecture
MDM Data Stewardship Application
Islands of Information
Reduce customer satisfaction, decrease revenue, hinder relationships
Document Management: Michael Johnson, Mortgage.tif
ERP: JP Morgan, USA; Cust ID: JP003
CRM Call Center: Mike Johnson, JP Morgan Chase; Last Interaction: 4/11/03 (product not received)
Retail Point of Sale
JP Morgan & Chase; Contact: Michael A Johnson, CIO, 270 West St, NY
Portal: Michael Johnson; User ID: Mjohnso; Personalized access; Gold Customer; Sub: Newsletter 1
CRM Marketing: Michael Johnson; Opt-Out flag; No Promotion flag
Application Forms, Payments, Service and Support, Online Registration, In-Store Interactions, Direct Mail Response
Data Warehouse: Michael P Johnson, 1400 54rd Avenue, NY NY, 212 995-3345
3rd Party Information
Internet Commerce: Michael Johnson; User id: mjohnson; JP Morgan Contract: JP987
Online Purchases
Inconsistent Master Information is a Major Hurdle
Impacts Revenue, Cost, Agility and Compliance
Coffee Beans, GTIN 20012294219421
CAN Code: 21204
USA Code: 21192
BR, CR, MEX Code: 21186
CH, AUT, DE, UK, FR, BEL, NL, IT Code: 21184
DE, FIN, SWE, NOR, ESP, POR Code: 21190
BUL, YUG, CR, RO, SLOV Code: 19616
ISR Code: 21204
MidEa Code: 21204
World Trade Code: 19619, 19616
AUS Code: 21190
HK, TAI, SIN, MAL, S.KOR Code: 21188
JAP, THAI, INDO, PHI Code: 21189
CZ, LIT, EST, SLOV, RU Code: 2002494
ARG Code: 21184
Gaining control over product information results:
Errors in data – 30% of data in retailers' systems is wrong
Lost productivity – 25 minutes manual cleansing per SKU, per year
Slow time to market – 4 weeks to introduce new products
Invoice deductions – 43% of invoices result in deductions
Failed scans – up to 70,000 per week (1 large US Retailer)
Lost sales – up to 3.5% per year
Source: A.T. Kearney, GMA, AMR
Industry Drivers: RFID, Waste Electrical and Electronic Equipment Recycling, Product Information Exchange Standards, Restriction of Hazardous Substances
Master Data
• Facts describing the core business entities: customers, suppliers,
partners, products, materials, bills of materials, charts of
accounts, locations and employees
• The high-value information an organization uses repeatedly
across many business processes
• Generally used across multiple lines of business (LOB)
Master Data Management
Master Data Management (MDM) is the set of disciplines,
technologies and solutions to create and maintain consistent,
complete, contextual and accurate business data for all
MDM Repository / Hub is analogous to a Version Control System,
albeit for your Master Data
MDM Server Implementation Environment
MDM Domains, Business Services
Pre-built, Customizable
Data Stewardship
Party, Account, Product, Custom
Integration
Content, Data, Analytics
InfoSphere MDM Server
RDBMS
Security, Mail, etc.
Real-time/Near-Real-Time Connectivity Services (ESB, EAI, Web Services, MQ, etc.)
Enterprise Data Integration (InfoServer)
Billing/Provisioning
Call Center
Web, Phone, Sales, Vendor & Other
Business Partners Management
Service
Master Data Batch Load
External Data Providers (e.g. D&B, ACXIOM, Experian)
Enterprise Data Warehouse/Data Mart
Corporate & Others, Content Management, Process Server
Order; Customer, Product, Account, Others (repeated for each connected system); New Systems (e.g. SOA-based)
(e.g. Siebel) (e.g. COGNOS)
Real-time
Understand, Cleanse, Transform, Deliver
Unified Deployment
ETL Tooling
Unified Metadata Management
The Complete Picture – Multiform MDM Server
Multiform MDM manages data domains critical to business processes
Multiform MDM leverages merged, cleansed and standardized
IBM Master Data Management
Banking Insurance Government Healthcare Retail Telco
Focused on critical information-intensive business problems
Industry Models & Assets
Multiform Master Data Management
Collaborate Operationalize Analyze
Party (Customer, citizen, prospect, organization, supplier, distributor, etc.)
Product (good, service, product bundle, catalogue, product component, etc.)
MDM Consumers: Admin Web App, Data Stewardship Web App, Web Services Adapter, Batch Processor, Event Manager, ESB / MQ / EAI Broker, Dashboard/Portal UIs, Client Applications
Request Framework: Service Controller, Parser, Constructor
Business Transaction Manager, Business Proxies
MDM Core
Utility Components
Business Logic Components, Extension Controller
Extension Framework: Java Classes, Rule Sets, Rules Engine, Pre/Post Txn, Pre/Post Action
Common Components: Standardization, Audit Data, Metadata, Error Messages, Data-Level Entitlements, MetaData, Event Manager, Performance Tracker, Error Messaging, Logging, Transaction Audit Information Log, External Validation, External Business Rules, Rules of Visibility, Configuration Manager, Rules Engine, Logs, ARM Agent, Configuration Settings, Validation Rules
XML Composite Transaction Handler, Java Classes, Rule Sets, Messaging Adapter, Notification Request Handler
Controller Components: Business Services, Admin Services, Party Services, Contract Services, History Services, Fast Track, Security, Events, Validators, Evergreen Processor, Suspect Processing Components, Notifications, Adapters
Web Services: Asynchronous, Synchronous, Client-Defined Interface
Data: WCC Core, Operational Tables, History Data, History Tables, Rule Data, Code Tables, Data Extension Tables, Behavior Extensions, Data Extensions, JMS Topic, Extension Toolkit
Step 1: Optimizes data for statistical comparisons
– Normalizes & compacts data, creates derived data layer; source data remains intact
– Phonetic equivalences, tokenization, nicknames, etc.
Step 2: Finds all the potential matches
– Casts a wide net – all matches on current or historical attributes, prevents misses
– Partial matches, reversals, anonymous values, etc.
Step 3: Scores accurately via probabilistic statistics
– Compares attributes one-by-one and produces a weighted score (likelihood ratio) for each pair of records
– Frequency weights specific to your business
– Edit distance, proximity of match
Step 4: Custom threshold settings
– Single or dual threshold models
– Link, don’t link, don’t know
– “Learns” from manual input
Score axis: Lowest possible score to Highest possible score
Decision regions: Don’t link, Manual review, Link
Record pair distributions: Should not be linked, Should be linked
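The scoring and threshold steps can be sketched in a few lines of Python. This is a minimal illustration, not the MDM Server algorithm: it assumes a Fellegi-Sunter style likelihood ratio, and the `WEIGHTS` probabilities, the `lower`/`upper` thresholds, and the function names are all hypothetical values chosen for the example. It also compares attributes by exact equality only, whereas the slides mention edit distance and frequency weights.

```python
import math

# Hypothetical per-attribute probabilities (illustrative only):
#   m = P(attribute agrees | records are the same entity)
#   u = P(attribute agrees | records are different entities)
WEIGHTS = {
    "name": (0.95, 0.01),
    "zip": (0.90, 0.05),
    "ssn": (0.99, 0.001),
}


def pair_score(a, b):
    """Sum per-attribute log-likelihood ratios for one pair of records."""
    score = 0.0
    for attr, (m, u) in WEIGHTS.items():
        if a.get(attr) is None or b.get(attr) is None:
            continue  # a missing value contributes no evidence either way
        if a[attr] == b[attr]:
            score += math.log2(m / u)              # agreement adds weight
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement subtracts
    return score


def decide(score, lower=0.0, upper=10.0):
    """Dual-threshold decision: link, don't link, or send to manual review."""
    if score >= upper:
        return "link"
    if score <= lower:
        return "don't link"
    return "manual review"
```

With these assumed numbers, two records that agree on name and ZIP but disagree on SSN score between the thresholds and go to manual review. A single-threshold model is the special case `lower == upper`.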
Data     Derived   Hash
Robert   RBT       121213444
Potter   PTR       34839020

Buckets
Name   1222   2222   3334
ZIP    3456   6666   5435
SSN    3421   3333   3555

7   9   8.63
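Steps 1 and 2 (deriving comparison data and finding candidate matches through buckets) can be sketched as follows. This is a hedged illustration only: the `NICKNAMES` table, the consonant-skeleton rule in `derive_name_key`, and all function names are assumptions made for the example, not the product's derivation or hashing logic. The rule was chosen because it reproduces the `Robert` to `RBT` and `Potter` to `PTR` examples above. Buckets are keyed by the derived value itself rather than by a numeric hash.

```python
from collections import defaultdict
from itertools import combinations

NICKNAMES = {"mike": "michael", "bob": "robert"}  # hypothetical lookup table
VOWELS = set("AEIOU")


def derive_name_key(token):
    """Derived value for one name token: resolve nickname, drop vowels,
    keep each remaining letter once (an assumed, simplified rule)."""
    lowered = token.strip().lower()
    canonical = NICKNAMES.get(lowered, lowered).upper()
    key = []
    for ch in canonical:
        if ch.isalpha() and ch not in VOWELS and ch not in key:
            key.append(ch)
    return "".join(key)


def bucket_keys(record):
    """Yield (attribute, derived value) keys; the source record is untouched."""
    for token in record["name"].split():  # tokenization also handles reversals
        key = derive_name_key(token)
        if key:
            yield ("name", key)
    if record.get("zip"):
        yield ("zip", record["zip"].strip())


def candidate_pairs(records):
    """Return every pair of record ids that shares at least one bucket."""
    buckets = defaultdict(set)
    for rec_id, record in records.items():
        for key in bucket_keys(record):
            buckets[key].add(rec_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs
```

For example, a call-center record for "Mike Johnson" and a portal record for "Johnson Michael" land in the same name buckets and become a candidate pair, while "Robert Potter" shares no bucket with either. Only candidate pairs would then be passed on to scoring.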
Useful Links
IBM Information On Demand website
(http://www-306.ibm.com/software/data/information-on-demand)
IBM Information Management website
(http://www.ibm.com/software/data)
Extreme Leverage website
(http://w3-103.ibm.com/software/xl/portal/viewcontent?type=doc&srcID=XT&docID=B329727F31168I32)
Customer Success Stories
(http://www-306.ibm.com/software/success/cssdb.nsf/topstoriesFM?OpenForm&Site=db2software)
Information Management Demos
(http://demos.dfw.ibm.com/solutions/infomgmt/)
Information as a Service Demo
(http://media.dvdpowertools.com/ibm/infomanagement/interface.php)
Videos