Galit Fein and Einat Shimoni’s work/ Copyright@2014
Do not remove source or attribution from any slide, graph or portion of graph
What does the future hold for
predictive analytics?
Einat Shimoni
EVP and senior analysts
STKI “IT Knowledge Integrators”
It's tough to make predictions, especially about the future
Analytics – as always – a HOT topic
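[Bar chart: areas of projects that started in your organization in 2013 / are planned for 2014. Values shown, in descending order: 80, 76, 71, 68, 62, 53, 53, 53, 53, 50, 44, 32, 29, 21, 12, 1, 1, 1. Category labels not recoverable.]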
Source: STKI inquiry barometer, 2014
Evolution of analytics
• Passive (Classic DW): Pull-only model (reports must be extracted from it). IT does most of the BI work. Classic DW model (single version of the truth), updated about once a day. Structured data only.
• Proactive (Classic DW): BI insights linked to operational processes (e.g., marketing lists sent to call service agents; risk analysis leading to an operational process). Classic DW, structured data only. IT does most of the BI work.
• Self Service and Discoveries: Business users gain control over BI (use of self-service tools). The DW is updated more frequently but still follows the classical model. Advanced visualization.
• Analytics & Insights: More use of predictive and analysis tools by business users. Some analysis of unstructured data in an external "big-data style" data mart.
• Cognitive Insights: Deep use of semantics, text analytics, NLP and machine learning to provide new wisdom. Real-time analysis.
Axes: IT focus to business focus; structured data only to unstructured data; reports to insights.
Letting go
Enabling
experiments
The data sandbox
A data sandbox, in the context of big data, is a standalone, scalable data mart and development platform used to explore an organization's rich information sets through interaction and collaboration.
A data sandbox is primarily explored by data science teams. Data sandbox platforms provide the computing power data scientists need to tackle typically complex analytical workloads.
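As a rough illustration of the sandbox idea, the sketch below copies a random sample of records into a standalone in-memory SQLite database that can be explored freely without touching the source. The table name `orders`, the helper `build_sandbox`, and the sample data are all hypothetical; a real sandbox platform would be far larger and is not described in the slides.

```python
import random
import sqlite3


def build_sandbox(source_rows, sample_size, seed=0):
    """Copy a random sample of rows into a standalone in-memory database."""
    rng = random.Random(seed)
    sample = rng.sample(source_rows, min(sample_size, len(source_rows)))
    sandbox = sqlite3.connect(":memory:")  # isolated from any source system
    sandbox.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    sandbox.executemany("INSERT INTO orders VALUES (?, ?)", sample)
    sandbox.commit()
    return sandbox


# Hypothetical source data standing in for an organization's records.
source_rows = [(f"customer_{i % 5}", float(10 * i)) for i in range(100)]

sandbox = build_sandbox(source_rows, sample_size=20)
row_count = sandbox.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# Free-form exploration happens on the copy, not on the source.
totals = sandbox.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
```

Working on a sampled copy is one possible design choice: experiments can fail or be discarded without any effect on the original data.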
What are we looking for?
I don’t know, but it’s going to be amazing!
Data Warehouse architecture Phase 1: Co-existence
[Diagram labels: Internal transactional data; INFORMATION REPOSITORY; Proactive; Pattern spotting; Events detection; External data; Analytic platform for external, unstructured data; Text analysis; Data Science; Insights from external data]
“Bureaus” that analyze and track social media as an external service.
Data Warehouse architecture Phase 2: Virtual DW/Hybrid BI
[Diagram labels: The virtual Data Warehouse (Metadata; Permissions; Caching; part of the data can be kept here); INFORMATION REPOSITORY; External data; Analytic platform for external, unstructured data; Text analysis; Data Science; Insights from external data]
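The three elements named for the virtual Data Warehouse (metadata, permissions, caching) can be sketched in a few lines. This is only a toy model under stated assumptions: the class `VirtualWarehouse`, its methods, and the `fetch_crm_customers` source are hypothetical names invented for illustration, and no specific product works exactly this way.

```python
class VirtualWarehouse:
    """Toy virtual DW: data stays in its sources; only metadata,
    permissions, and a cache live in the virtual layer."""

    def __init__(self):
        self.metadata = {}     # dataset name -> function that fetches rows from the source
        self.permissions = {}  # role -> set of dataset names the role may read
        self.cache = {}        # dataset name -> rows kept in the virtual layer

    def register(self, name, fetch):
        self.metadata[name] = fetch

    def grant(self, role, name):
        self.permissions.setdefault(role, set()).add(name)

    def query(self, role, name):
        if name not in self.permissions.get(role, set()):
            raise PermissionError(f"{role} may not read {name}")
        if name not in self.cache:      # first request goes to the source
            self.cache[name] = self.metadata[name]()
        return self.cache[name]         # later requests are served from the cache


source_calls = []


def fetch_crm_customers():
    source_calls.append("crm")  # record each trip to the (hypothetical) source system
    return [("alice", "gold"), ("bob", "silver")]


dw = VirtualWarehouse()
dw.register("customers", fetch_crm_customers)
dw.grant("analyst", "customers")
first = dw.query("analyst", "customers")
second = dw.query("analyst", "customers")  # served from the cache
```

The cache here corresponds to the note that part of the data can be kept in the virtual layer; everything else is fetched from the source on demand.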
Data Warehouse architecture Phase 3: OLTP + OLAP
[Diagram labels: The virtual Data Warehouse; Metadata (semantic layer); INFORMATION REPOSITORY; External data; Analytic platform for external, unstructured data; Text analysis; Data Science; Insights from external data]
Same database for both analytical and transactional data
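To make the "same database" idea concrete, here is a minimal sketch in which one SQLite database serves both a transactional write path and an analytical query. SQLite is used only because it ships with Python; it is a stand-in, not a recommendation, and the table `sales` and both function names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")


def record_sale(region, amount):
    """Transactional (OLTP-style) write: one small, atomic insert."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("INSERT INTO sales VALUES (?, ?)", (region, amount))


def revenue_by_region():
    """Analytical (OLAP-style) read: aggregate over the same live table."""
    rows = db.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows)


record_sale("north", 100.0)
record_sale("south", 50.0)
record_sale("north", 25.0)
report = revenue_by_region()
```

Because the report reads the same table the transactions write to, there is no nightly load step between the two.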
Small data = the new big data
The 4 V’s
Source: IBM
Veracity
Big Data Veracity refers to the biases, noise and abnormality in data. Is the data that is being stored and mined meaningful to the problem being analyzed? Inderpal feels veracity in data analysis is the biggest challenge when compared to things like volume and velocity.
Source: http://inside-bigdata.com/2013/09/12/beyond-volume-variety-velocity-issue-big-data-veracity/
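One simple way to look for the noise and abnormality mentioned above is to flag missing values and outliers before analysis. The sketch below uses the modified z-score based on the median absolute deviation, a common heuristic chosen here for illustration and not taken from the source; the function names, the 3.5 threshold, and the sample readings are assumptions.

```python
from statistics import median


def find_missing(values):
    """Return the positions of missing (None) values."""
    return [i for i, v in enumerate(values) if v is None]


def flag_abnormal(values, threshold=3.5):
    """Return the positions of values whose modified z-score exceeds the threshold."""
    clean = [v for v in values if v is not None]
    med = median(clean)
    mad = median([abs(v - med) for v in clean])  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against
    return [
        i
        for i, v in enumerate(values)
        if v is not None and 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical sensor readings with one gap and one suspicious spike.
readings = [10.1, 9.8, 10.3, None, 10.0, 55.0, 9.9]
missing = find_missing(readings)
abnormal = flag_abnormal(readings)
```

A check like this only finds candidates; whether a flagged value is an error or a meaningful event still depends on the problem being analyzed.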
You don’t know the value of your data until you use it or reach a discovery
Wanted: Data Scientist
Data Scientist
The Hottest Job You Haven't Heard Of
• Salary: $140K - $200K
• Major staff shortage:
  • McKinsey: By 2018, the U.S. alone could face a shortage of 140,000-190,000 people (2008-2018: a 10-year cycle for next-generation graduates)
  • Gartner: By 2015, big data demand will generate 1 million jobs in the G1000, but only one-third of those jobs will be filled
  • InformationWeek: 18% of big data-focused companies want to increase staff by 30% in the next two years; 53% expect it will be hard
Data Scientist
Skills (cross-disciplines):
• Structured & unstructured data (also from real-time streams)
• Java programming
• Statistics
• Machine-learning algorithms
• NLP
• Business concepts (MBAs)
Computer Science
Statistics
Kaggle: outsourcing data science work via competitions
Thousands of experts from 100 countries and 200 universities
Wisdom is the application of Knowledge
• Data: discrete elements like words, numbers, names
• Information: linked elements with concepts
• Knowledge: organized information
• Wisdom: applied knowledge
“To attain knowledge, add things every day. To attain wisdom, remove things every day.”
― Laozi
“What’s the difference between information and knowledge? It’s like the difference between knowing Julia Roberts’ phone number and knowing Julia Roberts”
- Woody Allen
Pattern spotting
Events detection
Proactive
New analytics category
Do you know this artist?
David McCandless: infographic artist.
“My pet-hate is pie charts. Love pie. Hate pie-charts”
His works of art
Why do we care so much about sentiment?
Text analytics
• Automatic categorization / content analysis:
  • IBM ICA, Vivisimo
  • Integrators' / BI players' solutions (e.g., Opisoft, Matrix, Taldor, Ness…)
• Sentiment analysis:
  • Radian6 (Salesforce)
  • FocalInfo
  • SAP
  • SAS
  • Tracx (Israeli startup)
  • New social listening in Microsoft Dynamics CRM
• Search players:
  • Attivio
  • Melingo
  • HP (Autonomy)
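For readers unfamiliar with sentiment analysis, the sketch below shows the simplest possible form: counting words from small positive and negative lists. It is not how any of the products listed above work; the word lists and the `sentiment` function are hypothetical, and commercial tools rely on much richer language models.

```python
import re

# Tiny illustrative lexicons; real tools use far larger, weighted vocabularies.
POSITIVE = {"love", "great", "good", "excellent", "happy"}
NEGATIVE = {"hate", "bad", "poor", "terrible", "slow"}


def sentiment(text):
    """Classify text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


praise = sentiment("Great service, I love it")
complaint = sentiment("Terrible and slow support")
mixed = sentiment("Love pie. Hate pie-charts")
```

A word-counting approach like this misses negation and sarcasm, which is one reason dedicated sentiment tools exist.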