Teradata Big Data Analytics
Zagreb, Nov 5th
Stefan Ruhland
Industry Consultant Banking
Teradata Austria
• Teradata
• Big Data approach
• Banking Use Cases
• Conclusion
2014 Highlights:
- Focus on Advanced Analytics and Big Data
- Revenue: $2.7B
- Employee growth: from 6000 (2008) to more than 12000 (2014)
- International: 77 New Accounts; +8% Revenue vs 2013
- 25 New Big Data Projects in EMEA
- We bolstered our portfolio through acquisitions of 3 SW companies specialized in Big Data Analytics
- We broadened our ecosystem of technology partners, with new and strengthened relationships with Cloudera, Hortonworks, MapR, and MongoDB
Teradata Big Data References
[Figure: customer logos, including HSBC, Overstock.com, InsightExpress, Discover, China Everbright Bank, Aurora Health Care, Machinima, Comcast, Gilt Groupe, Barnes & Noble, Canal+, Verizon Wireless, Wind, Bank of America, Williams-Sonoma, Otto Group, LinkedIn, Vodafone, Swisscom]
• Teradata
• Big Data approach
• Banking Use Cases
• Conclusion
It is not about Volume, Velocity and Variety anymore…
Big Data Discovery Process: a complex iterative process
Typical Challenges
Data Acquisition → Data Preparation → Analysis → Production
LONG PROCESS
Big Data Reference Architecture
Discovery Platform integrated with Hadoop, Teradata, Oracle data sources
Agile Discovery Process – Solving Typical Challenges
Data Acquisition → Data Preparation → Analysis → Production
FASTER PROCESS
EDW Model vs Big Data Discovery
EDW Model: 3 Releases Over 2 years
• Highly Planned & Controlled
• Slow Release Schedule: 3x releases in 2 years
• High Central funding cost: $3m Release 1, $3m Release 2, $3m Release 3
• Low Risk / High Success
Discovery Model: Small Iterative Projects, 40+ Projects Over 2 years
• 40+ Discoveries / 2 years
• Low cost per project: $50k-$70k per project
• BAU funded initiatives
• High Risk / High Fail: iterates to a new project
[Figure: two-year timelines for both models, March to November of the following year, with three $3m EDW releases against 40+ discovery projects]
EDW Model vs Big Data Discovery (cont’d)
EDW Model: 3 Releases Over 2 years
• EDW projects must succeed
• Successful Discoveries productionised as part of release schedule
Discovery Model: 40+ Projects Over 2 years
• Many projects “fail”
• Failure is accepted as part of the process and leads to new innovations and iterative projects
• Successful projects are often productionised on the EDW for execution
[Figure: the same two-year timelines; the three $3m EDW releases are shown against 40+ discovery projects, each marked as Successful Project or Failed Project]
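The funding figures quoted on the two slides above can be compared with simple arithmetic. The sketch below takes the $3m per release and the $50k-$70k per project at face value and assumes exactly 40 discovery projects (the slides say "40+"); the variable names are illustrative only.

```python
# Figures quoted on the slides; 40 is a lower bound ("40+ discoveries").
EDW_RELEASES = 3
EDW_COST_PER_RELEASE = 3_000_000           # $3m per release
DISCOVERY_PROJECTS = 40                    # assumption: exactly 40 projects
DISCOVERY_COST_RANGE = (50_000, 70_000)    # $50k-$70k per project

edw_total = EDW_RELEASES * EDW_COST_PER_RELEASE
discovery_low = DISCOVERY_PROJECTS * DISCOVERY_COST_RANGE[0]
discovery_high = DISCOVERY_PROJECTS * DISCOVERY_COST_RANGE[1]

print(f"EDW model, 2 years:       ${edw_total:,}")
print(f"Discovery model, 2 years: ${discovery_low:,} to ${discovery_high:,}")
```

Under these assumptions, 40 discoveries cost $2.0m to $2.8m over two years against $9m for three EDW releases, which is consistent with the slides' point that individual discovery failures are affordable.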
• Teradata
• Big Data approach
• Banking Use Cases
• Conclusion
Big Data Business Value Framework
Overview of Key Areas
Key areas: Marketing, Customer Experience, Fraud, Risk, Operational (Banking), Operational (Insurance)
Legend: Real & live, POC/POV, Idea
• Online fraud: Unusual usage of authenticated website based on context
• Path to Fraud: ID the detailed multichannel steps that precede fraud
• Fraud Networks: Find connections between related parties
• Claims Fraud: Identification of valid v fraudulent customer claims
• Abandon online purchase: Insight and action to drive follow up
• Mktg Attribution: ID the contribution of each contact to a sale
• Sales Process Improvement: ID and improve sales process effectiveness
• Path to Churn: ID the path leading to attrition
• Identify broken processes: based on multi channel engagement
• Customer Sat/NPS: Understand the cause of dissatisfaction and loyalty
• Predict Complaint: ID root cause and identify opportunity to intervene and fix
• People Like Me: Affinity groupings refine people like me recommendations
• Pre default risk: Path to default via golden path analysis
• Connection risk: High risk associates via social or txn networks
• Collections analytics: Identify path to repay via collections
• Reduction in manual Claims review: Increased productivity
• Automate Claims notification: Optimise handling and client satisfaction
• Advanced Risk & Pricing insights: Minimise adverse selection
• Behavioral-based Pricing with Telematics data
• Real Estate Pricing: Using new data and techniques to enhance risk-based price
• Call Centre Analytics: Adherence to core processes and service standards at busy times
• Sales Compliance & Mis-Selling: Detect key words that mislead client
• Online T&C’s: Email follow up from opt-out or rapid T&C completion
• Service Efficiency: ID the paths leading to high cost service calls and rectify cause
Understand Online Cross-Session Behaviour
Using log data could help understand the customer journey
The data looks like…
The web data provides you with a lot of facts:
• This visitor is interested in home loans & came from a competitor site
• They spent 6 minutes looking at 8 re-mortgage pages
• They spent 2 minutes on the 3rd page reading the detail
• They ran 3 re-mortgage calculator scenarios, then abandoned
• Then they called the call centre
• They want to borrow more than they currently owe
• We can see pressure on the level of hardcore borrowing
• 2 weeks ago they visited the getting married financial planning web pages for 30 minutes over 2 sessions
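Facts such as "6 minutes on re-mortgage pages" can be derived from raw web logs by measuring the gap between consecutive page views. The following is a minimal sketch under stated assumptions: the log layout (visitor id, timestamp, page), the page names and the sample rows are all hypothetical, and a real log would also need session splitting and bot filtering.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical clickstream rows: (visitor_id, timestamp, page).
CLICKS = [
    ("v1", "2014-11-05 10:00:00", "/remortgage/overview"),
    ("v1", "2014-11-05 10:01:00", "/remortgage/rates"),
    ("v1", "2014-11-05 10:03:00", "/remortgage/calculator"),
    ("v1", "2014-11-05 10:06:00", "/exit"),
]

def dwell_seconds(clicks):
    """Return {visitor: {page: seconds}} from consecutive page views."""
    by_visitor = defaultdict(list)
    for visitor, ts, page in clicks:
        by_visitor[visitor].append((datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), page))
    result = {}
    for visitor, views in by_visitor.items():
        views.sort()
        pages = defaultdict(float)
        # Time on a page = gap until the next view; the last view has no gap.
        for (t0, page), (t1, _) in zip(views, views[1:]):
            pages[page] += (t1 - t0).total_seconds()
        result[visitor] = dict(pages)
    return result

facts = dwell_seconds(CLICKS)
print(facts["v1"])
```

Summing the per-page values for the sample visitor gives 360 seconds, i.e. the kind of "6 minutes on re-mortgage pages" statement listed above.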
Remember What the Customer Tells You
Capturing data from forms adds more insight
The data looks like…
• Quote #1: Home Loan Term 12 Years
• Quote #2: Home Loan Term 20 Years
• Term then changed to 15 Years before session abandoned
The data tells you…
• This visitor is happy with the Purpose, Type & Value of loan.
• They are undecided over the term… is this about affordability of monthly payments?
• Knowing the sticking point helps:
  • Gives you a reason to contact the customer
  • Gives you the ‘hook’ to open the conversation
These financial service examples show the strength of an integrated inbound real-time and outbound solution.
When a special event occurs online, you can let your branch network or personal advisor make a follow-up call.
For less urgent matters you can use a cheaper channel like SMS or e-mail, or place a banner ad on the customer's next visit.
Real-time capture of events and actions you might take
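The channel choice described on this slide can be written as a small rule. This is only a sketch: the event names, the opt-in flags and the function name are hypothetical, and which events count as "special" is a business decision not given in the slides.

```python
# Hypothetical event names; a real deployment would use the bank's own catalogue.
SPECIAL_EVENTS = {"mortgage_quote_abandoned", "large_transfer_out"}

def follow_up_channel(event, has_advisor=False, mobile_opt_in=False, email_opt_in=False):
    """Map an online event to a follow-up channel, costliest channels first."""
    if event in SPECIAL_EVENTS:
        # Special events warrant a follow-up call.
        return "personal_advisor_call" if has_advisor else "branch_call"
    # Less urgent matters go to a cheaper channel.
    if mobile_opt_in:
        return "sms"
    if email_opt_in:
        return "email"
    return "banner_on_next_visit"

print(follow_up_channel("mortgage_quote_abandoned", has_advisor=True))
print(follow_up_channel("savings_page_viewed", email_opt_in=True))
```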
Sales Funnels
Analysing Digital Journeys
Analysed Sales Funnels:
- Personal Loan Quote
- Savings Account
- Current Account
- New Credit Card
- Mortgage Application
Funnel steps:
1. Application
2. Loan Quote Request
3. Affordability Details
4. Personal & Financial Details Displayed
5. Personal & Financial Details Update
6. (step label illegible in the source)
7. Review Application
Data is also actionable: opportunity for personalised triggers based upon where customers abandon.
Leads can be delivered in session (via RTIM) and/or offline via CIM into the branch.
Sales Funnels
Analysing Digital Journeys
Analysed Sales Funnels:
- Personal Loan Quote
- Savings Account
- Current Account
- New Credit Card
- Mortgage Application
Funnel steps shown:
4. Personal & Financial Details Displayed
Personal & Financial Details Update (Accept Rate: 36%)
Review Application (Accept Rate: 27%)
Closing the gap in accept rates is worth roughly 5-6k sales per year (worth $3m profit p.a.).
Trial a process with forced update step, at least for certain
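Accept rates like the two above can be computed by splitting journeys on whether they include a given step. The sketch below reads the slide as comparing applications that pass through the update step with those that go straight to review; that reading is an interpretation, since the slide layout is lost. The journey data and step names are hypothetical.

```python
# Hypothetical journeys: (ordered steps, application accepted?).
JOURNEYS = [
    (["details_displayed", "details_update", "review"], True),
    (["details_displayed", "details_update", "review"], False),
    (["details_displayed", "review"], False),
    (["details_displayed", "review"], True),
    (["details_displayed", "review"], False),
    (["details_displayed", "review"], False),
]

def accept_rate_by_update(journeys):
    """Accept rate split by whether the journey included the update step."""
    counts = {True: [0, 0], False: [0, 0]}   # updated? -> [accepted, total]
    for steps, accepted in journeys:
        bucket = counts["details_update" in steps]
        bucket[0] += int(accepted)
        bucket[1] += 1
    return {k: (a / n if n else None) for k, (a, n) in counts.items()}

rates = accept_rate_by_update(JOURNEYS)
gap = rates[True] - rates[False]
print(rates, gap)
```

The same split applied to real funnel data would give the gap that the slide values at roughly 5-6k sales per year.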
Churn Prevention
Customer Retention Improvement Opportunity
Churn Analysis – Statistical & Pattern Matching techniques
[Figure: overlapping regions labelled Churn, Statistical and Pattern Matching]
• Space of all possible customers at risk of churn…
• …customers that can be identified through Classic Statistics, e.g., SAS models…
• …customers that can be identified through pattern matching via path analytics.
Events Preceding Account Closure
Discovery Process – First step
Events Preceding Account Closure
Iterative Process – Reducing the “Noise” to find the “Signal”
“Commission Reduction Request” and “Service Complaint” seem to be “Signals”
Events Preceding Account Closure
Insight Identification – Most common event sequences (aka “golden path”)
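Finding the most common event sequences before account closure amounts to counting the events that immediately precede each closure. The following is a minimal sketch, not the pre-built path function used in the project; the event names, the sample histories and the fixed window of two events are assumptions for illustration.

```python
from collections import Counter

# Hypothetical per-customer event histories ending in account closure.
HISTORIES = [
    ["login", "commission_reduction_request", "service_complaint", "account_closure"],
    ["branch_visit", "commission_reduction_request", "service_complaint", "account_closure"],
    ["login", "balance_check", "service_complaint", "account_closure"],
]

def paths_before(histories, target="account_closure", window=2):
    """Count the `window` events immediately preceding each `target` event."""
    counts = Counter()
    for events in histories:
        for i, event in enumerate(events):
            if event == target and i >= window:
                counts[tuple(events[i - window:i])] += 1
    return counts

golden_path, occurrences = paths_before(HISTORIES).most_common(1)[0]
print(golden_path, occurrences)
```

In this sample the most frequent path is a commission reduction request followed by a service complaint, mirroring the two "signals" named on the slide.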
Path to Churn Outputs
Three possible outputs
1. Business Rules
   Triggers that need to be analyzed to determine whether the bank should add customer names to a list of potential defectors
2. New input variables for current models
   Identification of new statistically reliable input variables:
   – Single Events
   – Paths
   – Frequent Sub-paths
3. New predictive models
   Building new predictive models from scratch:
   – Event path based model
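The second output, new input variables for existing models, can be illustrated by turning single events and sub-paths into 0/1 flags per customer. This is a sketch under assumptions: the sub-paths, event names and function name are hypothetical, and sub-paths are matched as contiguous sequences only.

```python
def subpath_features(events, subpaths):
    """Return 0/1 flags: does `events` contain each contiguous sub-path?"""
    def contains(sub):
        n = len(sub)
        return any(tuple(events[i:i + n]) == sub for i in range(len(events) - n + 1))
    return {" > ".join(sub): int(contains(sub)) for sub in subpaths}

# Hypothetical sub-paths, e.g. taken from a prior path analysis.
SUBPATHS = [
    ("service_complaint",),                                  # single event
    ("commission_reduction_request", "service_complaint"),   # sub-path
]

row = subpath_features(
    ["login", "commission_reduction_request", "service_complaint"], SUBPATHS
)
print(row)
```

Each flag becomes one extra column in the input table of a current churn model.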
• Teradata
• Big Data approach
• Banking Use Cases
• Conclusion
How can we help?
• Strategic Consulting
  Strategic Consulting Service: address organizational changes
  Roadmap Service: What Big Data & Analytic Projects generate most revenue, where to start?
• Implementation
  From strategy to production: supporting the organization in making data become a first-class citizen
  Implementing Big Data & Analytic Projects in your organization
• Analytics as a Service
  Far-reaching between your organization and Teradata
  Shared risk/benefit for projects: only pay for what brings you value!
• Support
  Services on all layers of the Stack
[Figure: Strategic Consulting, Implementation, Analytics as a Service and Support mapped to Big Data Governance and Models, Aster Analytics Platform, Platform Implementation and Data Lake Architecture; all major Hadoop Distributors are Teradata Partners]
Consumer Credit Risk Models
«Traditional» Machine-Learning Algorithms
[Figure: Credit Bureaus data, Transactions data and Account Balance data → Exploratory data analysis variables (ADS) → Model input → Forecast model (Decision Tree)]
Credit Bureau Data
• Total Number of Trade Lines
• Number and balance of home loans
• Balance of all auto loans to total debt
• Total credit-card balance to limits
• ...
Transaction Data
• Number of Transactions
• Total inflow
• Total outflow
• Total expenses at discount stores
• Total clothing stores expenses
• Total restaurant expenses
• Total vehicle related expenses
• Total education related expenses
Deposit Data
• Savings account balance
• Checking account balance
• CD account balance
• Brokerage account balance
• ...
Source: MIT (Massachusetts Institute of Technology). Khandani, Amir E., Adlar J. Kim, and Andrew W. Lo. “Consumer credit-risk models via machine-learning algorithms.” Journal of Banking & Finance 34 (2010)
Consumer Credit Risk Models
«Traditional» Machine-Learning Algorithms
[Figure: Credit Bureaus data, Transactions data and Account Balance data → Exploratory data analysis variables (ADS) → Model input → Forecast model (Decision Tree)]
• Improving current model (higher lift)
• Building new predictive models
• Path & Graph Analytics techniques
• In-database analyses, modeling and scoring of the entire dataset
“…current credit-bureau analytics such as credit scores [e.g. FICO score] are based on slowly varying consumer characteristics…”
“…machine-learning forecasts are considerably more adaptive, and are able to pick up the dynamics of changing credit cycles as well as the…”
Reuse Data Preparation
Same events used for Fraud Detection + some specific events
• The probability indicators of Default include:
– Missed payments
– Shift in spend from discretionary items to essentials; shift in location of spend from higher-end brands to lower-end brands
– Card balances generally increasing
– Canceling recurring subscription payments, e.g. magazines
– Shift in spend from debit to credit
– Changes in amount of direct deposit
– Changes in pattern of spending activity, e.g. someone fills up their car consistently at 8am prior to arriving at work and all of a sudden starts filling up in the middle of the day (potential indicator of a lost job)
– Starting to pull cash down off of a credit card, particularly telling when it is pulling cash down at particular locations like a casino
– Increasing debt levels across all debt mechanisms
– ...
Credit Risk Business Improvement Opportunity
Path Analysis of Account Balance
Find Correlation between Account Balance and Default Risk
[Figure: normalised account balance over time (x-axis: Time [day]), discretised into the symbols A, B and C with SAX (Symbolic Aggregate approXimation); the resulting symbol sequence feeds Path Analysis and Statistical Analysis]
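SAX turns a numeric series into a short string, which is what lets path functions work on account balances. The sketch below follows the usual three steps (z-normalise, piecewise aggregate approximation, map to letters) with a 3-letter alphabet to match the A/B/C symbols in the figure. The balance values, the segment count and the use of the population standard deviation are assumptions, and this is not the platform's own implementation.

```python
from statistics import mean, pstdev

# Gaussian breakpoints for a 3-letter alphabet (equiprobable regions).
BREAKPOINTS = (-0.43, 0.43)
ALPHABET = "ABC"

def sax(series, segments):
    """Z-normalise, average into `segments` chunks (PAA), map to letters."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma else 0.0 for x in series]
    size = len(z) // segments            # assumes len(series) % segments == 0
    paa = [mean(z[i * size:(i + 1) * size]) for i in range(segments)]
    # Letter index = number of breakpoints the segment mean exceeds.
    return "".join(ALPHABET[sum(v > b for b in BREAKPOINTS)] for v in paa)

# Hypothetical daily balances: stable, then higher, then back, then lower.
balances = [100, 100, 150, 150, 100, 100, 50, 50]
word = sax(balances, segments=4)
print(word)
```

The resulting word can then be treated like any other event path, for example to look for symbol sequences that frequently precede default.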
Banking Analytics Use cases
Marketing
Path to Churn. Enables you to study your customers’ omni-channel behavior in order to discover Abandonment Paths, i.e. sequences of events/behaviors that frequently lead to customer churn. These insights help improve churn prediction models and are complementary to traditional approaches.
Pre-built Path and Predictive analytics functions
Multi-channel Attribution. Helps you quantify each channel's effectiveness in driving revenue, in order to identify which channels/ads perform best, calculate true ROI on a per-ad basis and/or run time-sensitive promotions by knowing which ads convert the fastest.
Pre-built Attribution and Path analytics functions
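Quantifying each channel's contribution means splitting the revenue of a sale across the contacts that preceded it. The sketch below shows two common rule-based models, uniform and last-touch; it is a generic illustration rather than the pre-built attribution function, and the journeys, channel names and revenue values are hypothetical.

```python
from collections import defaultdict

# Hypothetical converting journeys: ordered channel touches and revenue.
JOURNEYS = [
    (["search_ad", "email", "branch"], 300.0),
    (["banner", "email"], 100.0),
]

def attribute(journeys, model="uniform"):
    """Split each sale's revenue across channels: 'uniform' or 'last_touch'."""
    credit = defaultdict(float)
    for touches, revenue in journeys:
        if model == "last_touch":
            credit[touches[-1]] += revenue
        elif model == "uniform":
            for channel in touches:
                credit[channel] += revenue / len(touches)
        else:
            raise ValueError(f"unknown model: {model}")
    return dict(credit)

uniform = attribute(JOURNEYS)
last = attribute(JOURNEYS, model="last_touch")
print(uniform)
print(last)
```

Comparing the two outputs shows how much the choice of model changes the apparent value of a channel such as e-mail.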
Location Based Offers. Allows you to analyze the locations most frequented by customers and to identify the types of spend for each customer and the brand share of that spend. The goal is to improve customer loyalty by providing usage-based offers for Credit and Debit Cards and to select Merchants to partner with for location-based offers.
Pre-built Path, Graph and Statistical analytics functions
Banking Analytics Use cases
Marketing
Abandoned Online Purchase. Enables you to understand in detail how customers progress through the online sales process, in order to identify, understand and fix broken processes where customers exit, get stuck or cycle back. Outcomes are higher conversion and efficiency at each step and more revenue from sales, at a lower cost per sale because re-work is reduced.
Pre-built Path and Graph analytics functions
Personalized Recommendation Analysis. Allows you to make product recommendations when you know very little about the customer (e.g. customer is inactive or holds only 1 product), using individual customer browsing combined with ‘people like you’ purchase behaviors. The goal is to improve product penetration amongst segments that are either inactive or hold only one product.
Banking Analytics Use cases
Fraud
Path to Fraud. Enables you to analyze cross-channel customer activities to identify common sequences of events leading up to a fraudulent transaction. This cross-channel fraud prediction path analysis is a substantial improvement over the current fraud models and is complementary to them.
Pre-built Path and Clustering analytics functions
Fraud Networks. Enables you to use graph-based approaches to uncover anomalies in the customer graph, where the anomalies consist of unexpected entities, relationships and patterns that are often related to fraudulent behavior. As with Path to Fraud, these insights can significantly improve fraud prediction models.
Pre-built Statistical, Graph and Path analytics functions
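One simple graph-based check is to find every customer connected, directly or indirectly, to a known fraud case. The sketch below groups customers into connected components with a depth-first search; the link data, the customer ids and the idea of linking on a shared phone, address or device are assumptions for illustration, not the pre-built graph functions.

```python
from collections import defaultdict

# Hypothetical links: two customers share a phone number, address or device.
LINKS = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
KNOWN_FRAUD = {"c3"}

def connected_groups(links):
    """Group customers into connected components of the link graph."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                      # iterative depth-first search
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups

# Customers linked, directly or indirectly, to a known fraud case.
exposed = set().union(
    *(g for g in connected_groups(LINKS) if g & KNOWN_FRAUD)
) - KNOWN_FRAUD
print(sorted(exposed))
```

Membership in an exposed group could then be passed to a fraud prediction model as an additional input.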
Banking Analytics Use cases
Credit Risk
Connections Risk – Consumer & B2B Networks. Allows you to build and analyze customer networks in order to identify explicit and implicit associations and actionable insights that can be vitally important for credit risk detection.
Pre-built Path, Graph and Predictive analytics functions
Pre-Default Risk. Enables you to identify genuine signs of default pressure through the analysis and comparison of events, transactions, interactions and changes over time. The key objective is to discover customer behaviors that frequently lead to customer default and translate them into Business Rules.