Unleash the Value of Big Data through Predictive Analytics with SAP HANA
Philip Mugglestone
September 10-13, 2012
Orlando, Florida
• New market forces are changing the landscape and
offering new opportunities
• Predictive analysis can be for everyone in the
business
• SAP HANA is the cornerstone of SAP’s vision for
predictive analytics
Agenda
The Predictive Analytics Landscape
Predictive Analytics with SAP HANA
Customers Benefit from Predictive Analytics with SAP HANA
Predictive Applications for SAP HANA
Agenda
The Predictive Analytics Landscape
Predictive Analytics with SAP HANA
Customers Benefit from Predictive Analytics with SAP HANA
Predictive Applications for SAP HANA
What if…
. . . You could identify hidden revenue opportunities
within your customer base through predictive
analytics?
. . . You could retain your high-value
customers/employees/vendors/partners with the
right retention offers?
. . . Your call center agents could delight customers
with the best next-step recommendations?
. . . You could increase cross-sell and up-sell
effectiveness through cross-channel
coordination?
. . . You could build long-term
customer/employee/vendor/partner relationships
with intelligent interactions?
Extend Your Analytics Capabilities
C
O
M
P
E
T
IV
E
A
D
V
A
N
TA
G
E
Sense & Respond
Predict & Act
Raw Data Cleaned Data Standard Reports Ad Hoc Reports & OLAP Generic Predictive Analytics Predictive Modeling Optimization What happened?
Why did it happen?
What will happen?
What is the best that could happen?
Predictive analysis encompasses a range of analytic techniques
… the exploration and analysis, by automatic or semi-automatic
means, of large quantities of data in order to discover meaningful
patterns and rules.”
Gordon Linoff and Michael Berry
Authors of “Data Mining Techniques”
“
… the process of discovering meaningful new correlations, patterns
and trends by sifting through large amounts of data stored in
repositories, using pattern recognition technologies as well as
statistical and mathematical techniques.”
Gartner Group
“
Challenges
Challenges
Forecasting
Key
Influencers
Trends
Anomalies
Relationships
How do historical sales, costs, key performance metrics, and so on,
translate to future performance? How do predicted results compare with goals?
What are the main influencers of customer satisfaction, customer churn, employee turnover, and so on, that impact success?
What are the trends:
historical / emerging, sudden step changes, unusual
numeric values that impact What are the correlations
in the data? What are the cross-sell and up-sell opportunities? What anomalies
might exist and conversely what groupings or clusters might exist for specific analysis?
Predictive Analytics
Examples
Data mining and predictive have been around for decades.
But new market forces are changing the landscape and offering new opportunities…
Increased business
interest
Now that BI users know what happened, they are asking why and what’s likely to happen next Explosive demand from sales, marketing, and call center analyses… fraud, and government intelligence/security agencies
Increase
d data value (e.g., Big Data)
Exploding data volume Expanding data varieties
Increasing technology performance
Parallel processing, faster CPUs, and
in-memory technologies reduce time and cost of data processing
Big data matters
Transformational business value from data
Drive Better Profit Margins New Strategies and Business Models Operational Efficiencies Business Value
Velocity
Volume
Variety
MobileCRM Data
Planning OpportunitiesTransactions
Customer Sales Order Things Instant Messages Demand InventoryDatabase to Decision Predictive Analytics
In-database predictive and R integrationIntuitive predictive modeling
Advanced data visualization and easy to use exploration
Harnessing Powerful Big Data Analytics
Real-time on massive amounts of structured and unstructured data Complex questions answered in-memory, lightning fast
Deep Hadoop integration with built-in text analysis
SAP Predictive Analytics Vision
For Everyone in the Business
Seamlessly embedded in business user applications Extended into BI clients and reports
Insight into events instantly delivered to dashboards, alerts, and mobile devices
Predictive Analysis - User Personas
SAP PA designer SAP HANA PAL & R IQ SAP PA visualization Embedding Predictive AnalysisIndustry applications LOB applications BI client tools Number of users SAP PA wizard 5 / 20 Large number 50 / 100 500 / 5,000 User personas Business Analyst
Professional Data Analyst
Predictive Analysis process designer / author
Business Analyst / Interactive Consumer Application End User
• Information Consumer
• Interactive Consumer
All personas
Agenda
The Predictive Analytics Landscape
Predictive Analytics with SAP HANA
Customers Benefit from Predictive Analytics with SAP HANA
Predictive Applications for SAP HANA
Predictive Analytics with SAP HANA
Transforming the Future with Insight Today
Unleash the value of Big Data
through the power of SAP HANA
•
Employ in-database predictive algorithms
•
Access 3,500+ open-source algorithms
via R integration for SAP HANA
Intuitively design and visualize
complex predictive models
•
SAP Predictive Analysis software
Bring predictive insight to everyone
in the business
•
Embed within business applications
•
Extend into BI and reports
SAP HANA In-Memory Predictive Analytics
Predictive Analysis Library (PAL)
Native predictive algorithms
In-database processing for powerful and fast results
Quicker implementations
Support for K-Means, K-Nearest Neighbor, C4.5 Decision
Tree, Multiple Linear Regression, Apriori, ABC Classification,
Weighted Score Tables
R Integration for SAP HANA
Enables the use of the R open source environment (> 3,500
packages) in the context of the HANA in-memory database
R integration enabled via high performing parallelized
connection
R script is embedded within SAP HANA SQL Script
Combine the depth and power of in-memory analytics within SAP HANA with the
breadth of R to support a variety of advanced analytic and predictive scenarios
Y
X
SAP HANA In-Memory Predictive Analytics
Predictive Analysis Library (PAL)
Clustering
K-Means
Classification
K-Nearest Neighbor C4.5 Decision Tree
Multiple Linear Regression
Association
Apriori
Classification
ABC Classification Weighted Score Tables
Association
Lite Apriori (A -> B)
Classification
Exponential / Geometric / Logistic / Natural Logarithmic Regression
CHAID Analysis
Time Series
Single / Double / Triple exponential smoothing
Preprocessing
Outlier Detection (Inter-Quartile Range)
SAP HANA SPS3
SAP HANA SPS4
The Predictive
Analysis Library
(PAL) is part of the
Business Function
Library
It resides in the
Calculation Engine
and consists of
functions executing
at the database
layer and is written
in C++
Additional native predictive analysis functions provide for deeper insights and
faster analytic implementations as well as results
R Integration for SAP HANA
What is R?
R is a software environment for statistical
computing and graphics
Open Source statistical programming language
Over 3,500 add-on packages; ability to write your own
functions
Widely used for a variety of statistical methods
More algorithms and packages than SAS + SPSS + Statistica
Who’s using it?
Growing number of data analysts in industry, government,
consulting, and academia
Cross-industry use: high-tech, retail, manufacturing, CPG,
financial services , banking, telecom, etc.
Why do they use it?
Free, comprehensive, and many learn it at college/university
Offers rich library of statistical and graphical packages
R Integration for SAP HANA
R Integration for SAP HANA
Functionality Overview
R integration for SAP HANA enables the use of the R open source
environment in the context of the HANA in-memory database
Establishes a communication channel between HANA and R for fast data exchange
Improved data exchange mechanism supports transfer of intermediate database tables
directly into vector oriented data structures of R
Performance advantage over standard tuple-based SQL interfaces with no need for data
duplication on the R server
Supported by providing the R-operator as a custom operator in the calcModel of the
calculation engine
This allows the application user to embed R script within SQL script and submit entire query
to the HANA database
As the plan execution reaches R-node, a separate R runtime is invoked using Rserve and
input tables of R node passed to R process using improved data transfer mechanism
Predictive Analytics Process in SAP HANA
Step1
Data Loading
Step 2
Data Pre-Processing
Step 4
Visualization
1. Understand the business and figure out the problems
2. Load the SAP or non-SAP
data into SAP HANA
1. Data selection 2. Data cleansing
3. Data transformation 1. Train Predictive Model
(clustering / classification / association / time series etc.)
1. Visualize the model for
better understanding
2. Store the model and
results in SAP HANA
Agenda
The Predictive Analytics Landscape
Predictive Analytics with SAP HANA
Customers Benefit from Predictive Analytics with SAP HANA
Predictive Applications for SAP HANA
Demo
Mitsui Knowledge Industry
Healthcare – Speed Research & Improve Patient Support
Business Challenges
Reduce delays and minimize the costs associated with new drug discovery by optimizing the process for genome analysis
Improve and speed decision making for hospitals which conduct cancer detection based on DNA sequence matching
Technical Implementation
Leveraged the combination of SAP HANA, R, and Hadoop to store, pre-process, compute, and analyze huge amounts of data Provide access to breadth of predictive analytics libraries
Benefits
For pharmaceutical companies, provide required new drugs on time and aid identification of “driver mutation” for new drug targets Able to provide a one stop service including genomic data
analysis of cancer patients to support personalized patient therapeutics
408,000x
faster than traditional disk-based systems in a technical PoC216x
faster by reducing genome analysis from several days to only 20 minutes making real-time cancer/drug screening possible“ ”
Cisco Systems, Inc.
High-Tech – Deliver Predictive Insight in Near Real-Time
The HANA platform at Cisco has been used to deliver near real-time insights to our execs, and the integration with R will allow us to combine the predictive algorithms in R with this near-real-time data from HANA. The net impact is that we will be able to take the capability which takes weeks and months to put together, and deliver just-in-time as the business is changing.
Business Challenges
Provide Cisco sales with real-time insights to make better decisions Better data delivery decisions to improve the business - for
optimization and to drive topline growth
Lack of understanding of underlying sales performance drivers and the seasonality of buying patterns
Technical Challenges
Unable to transform the results of statistical analysis into actionable insights
Benefits
Better respond to customers and deliver tangible business value Reduce time to transform information into insights and improvement in the quality of decision-making on those insights
Ultimately drive higher profitability and growth
5 seconds
Seasonality analysis with any selected filters including country, segments and sales levels
2 seconds
Cluster analysis across multiple quarters for a
combination of sales level and quarter to determine sales achievement level
Bigpoint
Gaming Industry - Predictive Game Player Behavior Analysis
Business Challenges
Increase conversion rates from free paying player Increase the average revenue per paying player Decrease churn – keep paying players playing longer
Technical Challenges
Leverage real-time data processing in SAP HANA and
classification algorithms with R integration for SAP HANA to deliver personalized context-relevant offers to players
Analyze vast amounts of historical and transactional data to forecast player behavior patterns
Benefits
Real-time insights
Per player profitability analysis and increased understanding of player behavior
Increase data volume and processing capabilities to communicate personalized messages to players
“ ”
5,000
events per second loaded onto SAP HANA (not possible before)10-30%
increase in revenue per yearInteractive
data analysis leading to improved design thinking and game planningAgenda
The Predictive Analytics Landscape
Predictive Analytics with SAP HANA
Customers Benefit from Predictive Analytics with SAP HANA
Predictive Applications for SAP HANA
SAP HANA Predictive Ecosystem
SAP HANA Platform
Predictive Analysis Library
(PAL)
SAP HANA Studio
SAP and Custom Applications R Integration for SAP HANA Business Intelligence Clients
R
SAP Predictive AnalysisSAP HANA – Hadoop Support in SAP Data Services 4.1
Extend the “Big Data” Ecosystem to SAP HANA
Files
Web & Others Databases
SAP HANA, Sybase IQ, other
Target Systems
SAP Data Services
SAP Data Services
SAP HANA can integrate with Hadoop via SAP Data Services 4.1
Ability to load and read large volumes of data into and out of SAP HANA from/to
Hadoop via SAP Data Services 4.1
Reading from and loading into Hadoop
Hive Support (table-like access to Hadoop data) Direct HSFS file support through PIG scripts
Intuitively design complex predictive models
Visualize, discover, and share hidden insights
Unleash Big Data with SAP HANA’s power
• Identify energy consumption pattern that can be used for Customer Segmentation • Smart meter data volume is
huge
Using SAP HANA + PAL
K-Means algorithm
Challenge
Solution
Results
• Clustering of smart meter data applying the k-means clustering to >20 million x 96-dimensional
vectors
• High performance clustering computation
Customer Analytics
Customer Revenue Performance Management
Simulation and Business Advice
Perform what-if scenarios to see the impact of pricing and discount changes
What:
Combination of financial and sales data to
get visibility of profitability and margins of a customer, a customer deal or group
Margin visibility – decomposition of margin
to highlight areas of improvements
Comprehensive customer revenue analysis Call for action – take focus customers and
Customer Analytics
Predictive Customer Segmentation & Targeting
Real-time simulation and segment building
More granular targeting leveraging predictive analytics
Improved alignment amongst Sales, Marketing and Product Development to focus on target Customers
What:
Segmentation on a wider and more granular
set of data not requiring SAP CRM
Tracking of successful implementation of
initiatives out of segmentation
All levels of sales can use segmentation
features to define own target customers to improve sales
Predictive segmentation by identifying
common characteristics within a given group
ALERTS
SEGMENTATION SIMULATE FEED SEARCH PERSONALIZE ADMINISTRATOR
?
?
CRPM
OVERVIEW SEGMENT
All customers with revenue greater than 55 milions Explore
1.503 Search Results: „All customers with revenue greater than 55 milions“
Customer Class A Save Query ATTRIBUTES Customer (0/6) Product (1/8) Type (2/5) Type 2 (0/15) Type 3 (1/45) S u b ty p e 3 .2 (4 ) Attribute 4 Attribute 5 Attribute 6 Attribute 7 Product Attribute 2 Attribute 3 Attribute 1 Customer Label 3 Label 4 Label 5 Status Label 1 Class Label 2 Group 4 Group 2 Group 3 Group 1 Type Group 5 Group 6 Type 5 Type 2 Type 3 Type 1 Type 4 Type 6 Subtype 2.5 Subtype 2. 2 Subtype 2.3 Subtype 2.1 Subtype 2.4 Subtype 3.5 Subtype 3.2 Subtype 3.3 Subtype 3.1 Subtype 3.4 Save Query ATTRIBUTES
Product Group (0/7) Sales Area (1/8)
Distribution Chain (2/5) DChain 2 (0/15) DChain3 (1/45) D C h a in 3 .2 (4 ) Sales Group Customer Attribute 6 Attribute 7 Sales Area Country Quarter Sales Group Product Group PGroup 5 PGroup 6 PGroup 7 PGroup 2 PGroup 3 PGroup 1 PGroup 4 Area 5 Area 2 Area 3 Distribution Chain Area 4 Area 6 Area 7 DChain5 DChain 2 DChain 3 Sales Org DChain 4 DChain6 DChain 2.5 DChain 2. 2 DChain 2.3 DChain 2.1 DChain 2.4 DChain 3.5 DChain 3.2 DChain 3.3 DChain 3.1 DChain 3.4
Customer Class Revenue Margin Purchasing Bu... Country Region
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
Customer N... A $ 25.534...20 % $ 555.555.999... USA Wash...
All Manufacture Customers 30 %
10% 20%
0
50,000 M Class A Class B Class C
S o m e M e a n in g fu l L a b e l
Performance and Insight Optimization Services
Enabling complex use cases with SAP HANA
Back end
Sophisticated aggregation and cleansing algorithms High volume industry specific data relationship algorithms
Industry-specific predictive modeling
Proprietary models for retail, banking, utilities, manufacturing, and beyond
Industry-specific optimization
Algorithms and processes
Examples of innovative
business solutions advanced by
SAP HANA:
Tax Compliance and
discovery engine for public
sector
Affinity insight and
forecasting for retail
High performance
compliance engine for
chemical companies
Startups Powered by SAP HANA
Predictive and Analytics / Security
AlertEnterprise Security Convergence solutions combine Situational Intelligence with Command and Control capabilities for Critical Infrastructure Protection.
Geospatial & Visual
Analytics
Next generation geospatial and visual analytics solutions transform massive volumes of real-time disparate data into intuitive visual displays. Space-Time Insight’s situational intelligence software helps visualize operations with absolute clarity, analyze the cause of a problem and
determine how to prevent another occurrence, and act instantly to address any situation.
Predictive and Analytics / Financial
The Taulia Invoicement® Suite includes a Dynamic Discounting module and self-service vendor portal. Together, these tools reduce Accounts Payable support overhead, improve supplier relations and generate significant cash savings for corporations.
Social and Predictive Analytics
From campaigns to communities, #NextPrinciples empowers anyone to monitor, manage and act on social media interactions across channels.
Agenda
The Predictive Analytics Landscape
Predictive Analytics with SAP HANA
Customers Benefit from Predictive Analytics with SAP HANA
Predictive Applications for SAP HANA
Companies can no longer focus solely on delivering the
best product or service. To succeed, they must:
•
Uncover hidden customer / employee / vendor /
partner trends and insights
•
Anticipate behavior and take proactive action
•
Empower your team with intelligent next steps to
exceed customer expectations
•
Create new offers to increase market share and
profitability
•
Develop and execute a customer-centric strategy
•
Target the right offers to the right customers through
the best channels at the most opportune time
• New market forces are changing the landscape and
offering new opportunities
• Predictive analysis can be for everyone in the
business
• SAP HANA is the cornerstone of SAP’s vision for
predictive analytics
Thank you for participating.
Please provide feedback on this session by
completing a short survey via the event
mobile application.
SESSION CODE: 0212
© 2012 SAP AG. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.
Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.
Microsoft, Windows, Excel, Outlook, PowerPoint, Silverlight, and Visual Studio are registered trademarks of Microsoft Corporation.
IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10, z10, z/VM, z/OS, OS/390, zEnterprise, PowerVM, Power
Architecture, Power Systems, POW ER7, POW ER6+, POW ER6, POW ER, PowerHA, pureScale, PowerPC, BladeCenter, System Storage, Storwize, XIV, GPFS, HACMP, RETAIN, DB2 Connect, RACF, Redbooks, OS/2, AIX, Intelligent Miner, WebSphere, Tivoli, Informix, and Smarter Planet are trademarks or registered trademarks of IBM Corporation. Linux is the registered trademark of Linus Torvalds in the United States and other countries. Adobe, the Adobe logo, Acrobat, PostScript, and Reader are trademarks or registered trademarks of Adobe Systems Incorporated in the United States and other countries. Oracle and Java are registered trademarks of Oracle and its affiliates.
UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiW in are trademarks or registered trademarks of Citrix Systems Inc.
HTML, XML, XHTML, and W3C are trademarks or registered trademarks of W3C®,
World Wide Web Consortium, Massachusetts Institute of Technology.
Apple, App Store, iBooks, iPad, iPhone, iPhoto, iPod, iTunes, Multi-Touch, Objective-C, Retina, Safari, Siri, and Xcode are trademarks or registered trademarks of Apple Inc. IOS is a registered trademark of Cisco Systems Inc.
RIM, BlackBerry, BBM, BlackBerry Curve, BlackBerry Bold, BlackBerry Pearl, BlackBerry Torch, BlackBerry Storm, BlackBerry Storm2, BlackBerry PlayBook, and BlackBerry App World are trademarks or registered trademarks of Research in Motion Limited.
Google App Engine, Google Apps, Google Checkout, Google Data API, Google Maps, Google Mobile Ads, Google Mobile Updater, Google Mobile, Google Store, Google Sync, Google Updater, Google Voice, Google Mail, Gmail, YouTube, Dalvik and Android are trademarks or registered trademarks of Google Inc.
INTERMEC is a registered trademark of Intermec Technologies Corporation. Wi-Fi is a registered trademark of Wi-Fi Alliance.
Bluetooth is a registered trademark of Bluetooth SIG Inc.
Motorola is a registered trademark of Motorola Trademark Holdings LLC. Computop is a registered trademark of Computop Wirtschaftsinformatik GmbH.
SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.
Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.
Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase Inc. Sybase is an SAP company.
Crossgate, m@gic EDDY, B2B 360°, and B2B 360° Services are registered trademarks of Crossgate AG in Germany and other countries. Crossgate is an SAP company. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.
The information in this document is proprietary to SAP. No part of this document may be reproduced, copied, or transmitted in any form or for any purpose without the express prior written permission of SAP AG.