(1)

Predictive Analytics and the Big Data Challenge

Andrei Grigoriev, MBA, MSc

Sr. Director, Custom Development EMEA

SAP

(2)

What is Predictive Analytics

Predictive analytics is about analyzing known facts and making predictions about unknown events.

Analyzing – algorithms

Known facts – (big) data
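
As a minimal sketch of "algorithm plus known facts", the example below fits a straight line to a handful of observations and uses it to predict an unobserved value. The data, the choice of ordinary least squares and the function names `fit_line` and `predict` are illustrative assumptions, not something taken from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (the algorithm)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to an x value that was never observed."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Known facts: hypothetical observations that happen to lie on y = 2x + 1.
    known_x = [1, 2, 3, 4]
    known_y = [3, 5, 7, 9]
    model = fit_line(known_x, known_y)
    # Unknown event: the value at x = 5.
    print(predict(model, 5))
```

With more data and finer resolution the same pattern holds; only the model and the volume of known facts change.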

(3)

Arrow of Time

[Diagram labels: Past / Cause, Future / Effect, Entropy (randomness) increases, Observer]

(4)

Exabyte

Big Data – Will Be Just Data Soon

Big Data is about managing and analyzing large

(5)

Big Data – Are There Limits?

In the context of this paper, Big Data means information about as many relevant events as possible, at the highest possible resolution (granularity).

I believe there is no theoretical limit, i.e. arbitrarily fine data resolution is possible. Whether we will have enough energy to deal with that is outside the scope of this paper.

(6)

Practical Considerations

We only need to predict with reasonable accuracy, i.e. a good prediction is one from which we always gain something. Example: share prices.

What if everyone is able to predict?

Example: two people bet and both predict the same result. Neither gains, but commission is still paid, so resources are drained: either betting stops or hyperinflation follows.

It is not clear what effect this would have on the financial system, but there are areas where the benefits are clear, such as Life Sciences.
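
The betting example can be sketched numerically. The code below assumes a pari-mutuel pool (winners share the pool after the house takes its commission); the starting balance, stake and commission rate are hypothetical figures chosen only for illustration.

```python
def parimutuel_payouts(stakes, picks, outcome, commission):
    """Split the pool among winning bets after the house takes its commission."""
    net_pool = sum(stakes) * (1 - commission)
    winning_total = sum(s for s, p in zip(stakes, picks) if p == outcome)
    if winning_total == 0:
        # Simplifying assumption: if nobody picked the outcome, nothing is paid out.
        return [0.0 for _ in stakes]
    return [net_pool * s / winning_total if p == outcome else 0.0
            for s, p in zip(stakes, picks)]


def simulate(rounds, balance=100.0, stake=10.0, commission=0.05):
    """Two perfect predictors bet against each other for a number of rounds."""
    balances = [balance, balance]
    for _ in range(rounds):
        outcome = "A"                 # the true result of this round
        picks = [outcome, outcome]    # both bettors predict it correctly
        payouts = parimutuel_payouts([stake, stake], picks, outcome, commission)
        balances = [b - stake + p for b, p in zip(balances, payouts)]
    return balances


if __name__ == "__main__":
    # Each bettor gets back the stake minus commission, so both balances shrink.
    print(simulate(10))
```

Under these assumptions each bettor loses stake × commission per round, so the only party that gains is the one collecting the commission.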

(7)

Hardware and Software Innovations

Technology that allows the processing of massive quantities of real-time data in the main memory of the server to provide immediate results from analyses and transactions.

HW Technology Innovations

• Multi-core architecture
• Massive parallel scaling
• Cheap, commodity servers
• Huge data throughput performance
• Dramatic decline in price/performance

Software Technology Innovations

• Row and column store
• Compression
• Partitioning & parallelization
• No aggregate tables

• Column = Fast queries
• 5 – 30x ratio
• Analyze large data sets
• Complex computations
• Parallel processing
• Flexible modeling
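
To illustrate why "Column = Fast queries" and why columns compress well, here is a minimal sketch of a columnar layout with dictionary encoding. It is a toy model under stated assumptions, not SAP's implementation: the table contents and the helper names `to_columns`, `dictionary_encode` and `sum_where` are hypothetical.

```python
def to_columns(rows):
    """Turn a list of row dicts into one list per column (column store layout)."""
    return {key: [row[key] for row in rows] for key in rows[0]}


def dictionary_encode(values):
    """Replace repeated values with small integer codes into a sorted dictionary."""
    dictionary = sorted(set(values))
    index = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in values]


def sum_where(columns, value_col, filter_col, wanted):
    """Aggregate by scanning only the two columns the query touches."""
    return sum(v for v, f in zip(columns[value_col], columns[filter_col])
               if f == wanted)


if __name__ == "__main__":
    # Hypothetical sales table in row layout.
    rows = [
        {"region": "EMEA", "product": "A", "revenue": 120},
        {"region": "APJ", "product": "B", "revenue": 80},
        {"region": "EMEA", "product": "B", "revenue": 50},
        {"region": "EMEA", "product": "A", "revenue": 30},
    ]
    columns = to_columns(rows)
    print(sum_where(columns, "revenue", "region", "EMEA"))
    print(dictionary_encode(columns["region"]))
```

The query never reads the `product` column, and the `region` column shrinks to a two-entry dictionary plus small integer codes. Real compression ratios depend on the data.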

(8)

Life Sciences – Challenge

Creating better drugs and treatments is becoming more of a mathematical and engineering challenge.

Next-generation sequencing is making genome data a commodity, but much more innovation is needed in algorithms and high-performance computing to make the most of it.

We need to get results faster, but we also need to be able to ask deeper and broader questions, look at many more scenarios before making a decision, and analyze data from multiple sources.

(9)

Increased Data Value

Drug: compound designed to cure diseases

Pathway: a molecular pathway is a signaling cascade in a cell with proteins as key components

GENOMICS, PROTEOMICS, METABOLOMICS

Today, 3500 known diseases are caused by DNA changes (expected to be 7000)

(10)

Genome Sequencing

[Diagram: sequencing workflow]

Patient Samples → Sequencing → Raw DNA Reads → Alignment → Mapped Genome → Variant Calling → Discovered Variants → Annotation and Analysis → Follow-up and Validation

Sequencing Service/Lab, e.g. Biologist

Computational Pipeline, e.g. Bioinformatician

Computational Analysis, e.g. Clinicians and Researchers
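
The alignment and variant calling steps named on this slide can be sketched with a toy example. Production pipelines use indexed aligners and statistical variant callers; the brute-force search, the mismatch threshold, the `min_support` rule and the sequences below are simplifying assumptions made only to show the idea.

```python
from collections import Counter, defaultdict


def align(read, reference, max_mismatches=1):
    """Return the offset where the read best matches the reference, or None."""
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (offset, mismatches)
    return None if best is None else best[0]


def call_variants(reads, reference, min_support=2):
    """Pile up aligned reads and report positions where they disagree with the reference."""
    pileup = defaultdict(Counter)
    for read in reads:
        offset = align(read, reference)
        if offset is None:
            continue  # unmapped read
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1
    variants = []
    for pos in sorted(pileup):
        base, count = pileup[pos].most_common(1)[0]
        if base != reference[pos] and count >= min_support:
            variants.append((pos, reference[pos], base))
    return variants


if __name__ == "__main__":
    reference = "GATTACAGCCTA"
    # Hypothetical reads from a sample carrying a C-to-T change at position 5,
    # plus one read that matches the reference.
    reads = ["TTATAG", "TATAGC", "ATAGCC", "CAGCCT"]
    print(call_variants(reads, reference))  # [(5, 'C', 'T')]
```

Each tuple is (position, reference base, observed base). Annotation, follow-up and validation would start from a list like this.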

(11)

Big Data in Life Sciences

Value chain: Research & Development, Planning, Procurement, Storage & Delivery, Production, Quality Assurance, Sales & Marketing

Use cases:

• Analysis of next generation sequencing data
• Real-time complaint and sales reporting
• Analysis of LIMS and recipe data
• Predictive analytics
• High Throughput Screening
• Analysis of patents and documents
• Margin Simulation with Raw Material Prices
• Sales & Operations Planning
• Drug serialization
• Real-time Analysis of process engineering data
• Social Media Analytics
• Customer Segmentation Acceleration
• Predictive Customer Segmentation

(12)

Thank you

(13)
(14)

Abstract

Predictive analytics is about analyzing known facts to make predictions about unknown events.

What if we knew absolutely everything that ever happened and every bit of that data was available instantaneously? Would that enable us to make more accurate predictions?

With examples from Life Sciences and gene expression research, this presentation explores the impact of Big Data and In-Memory technologies on

(15)

Biography

Andrei is Senior Director at SAP with expertise in big data, in-memory computing and analytics. He has over 16 years of diverse international experience in development, product management and organizational leadership.

Andrei frequently speaks at industry events and conferences on big data, business intelligence and analytics. He hosts the annual SAP Life Sciences Innovations Forums. He has lectured and presented at leading universities in the UK and Ireland.

(16)

Disclaimer

This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice.

This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document,

(17)

© 2013 SAP AG or an SAP affiliate company.

All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.

Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. National product specifications may vary.

These materials are provided by SAP AG and its affiliated companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.
