What is big data?
Raul F. Chong
Senior program manager – Big data, DB2, and Cloud
• The world is changing…
• What is big data?
• The IBM big data platform
• Examples of big data solutions
• Big data solutions and the cloud
• Setting up a Hadoop cluster on the cloud
2 Billion internet users
4.6 Billion mobile phones
Is it really?
What is big data?
Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools.
Difficulties include capture, storage, search, sharing, analytics, and visualization.
Information is growing at a phenomenal rate: 44x as much data and content over the coming decade, from 800,000 petabytes in 2009 to 35 zettabytes in 2020.
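The 44x growth figure can be checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal (SI) units, so 1 petabyte = 10^15 bytes and 1 zettabyte = 10^21 bytes, and the function and variable names are made up for this example.

```python
# Unit sizes in bytes, assuming decimal SI prefixes.
PETABYTE = 10**15
ZETTABYTE = 10**21


def growth_factor(start_bytes, end_bytes):
    """Return how many times larger end_bytes is than start_bytes."""
    return end_bytes / start_bytes


data_2009 = 800_000 * PETABYTE  # 800,000 petabytes, i.e. 0.8 zettabytes
data_2020 = 35 * ZETTABYTE      # the 2020 figure quoted on the slide

factor = growth_factor(data_2009, data_2020)
print(f"{data_2009 / ZETTABYTE:.1f} ZB -> {data_2020 / ZETTABYTE:.0f} ZB, "
      f"about {factor:.0f}x")
```

Under these assumptions the ratio is 43.75, which rounds to the 44x quoted above.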
• About 80% of the world’s data is unstructured
• It may be data we’ve been collecting before, but could not process
• Data in movement - streams
  • Twitter / Facebook comments
  • Stock market data
  • Sensors: vital signs of a newborn
• Data at rest - oceans
  • Collection of what has streamed
  • Web logs, emails, social media
  • Unstructured documents: forms, claims
  • Structured data from disparate systems
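The difference between data in movement and data at rest can be sketched in a few lines. This is a minimal illustration, not how any particular product works: the readings, the function names, and the alert threshold of 180 are all hypothetical (the threshold is arbitrary, not a clinical value).

```python
from statistics import mean


def sensor_stream():
    """Hypothetical heart-rate readings (beats per minute), one at a time."""
    yield from [128, 131, 190, 127, 185]


def alerts_in_motion(readings, limit=180):
    """Data in movement: inspect each reading as it arrives, keep no history."""
    for bpm in readings:
        if bpm > limit:
            yield bpm


def summary_at_rest(stored):
    """Data at rest: analyze the whole stored collection after the fact."""
    return {"count": len(stored), "mean": mean(stored), "max": max(stored)}


# Streaming: react to individual readings while they flow past.
alerts = list(alerts_in_motion(sensor_stream()))

# At rest: first collect what has streamed, then query the collection.
archive = list(sensor_stream())
summary = summary_at_rest(archive)

print(alerts)
print(summary)
```

The streaming function never holds more than one reading at a time, while the at-rest summary needs the full archive before it can answer.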
[Figure: The IBM big data platform. Labels in the diagram:
• IBM Big Data Solutions
• Client and Partner Solutions
• Big Data User Environments: Developers, End Users, Administrators
• Big Data Enterprise Engines: InfoSphere BigInsights, InfoSphere Streams
• Open Source Foundational Components: Hadoop, HBase, Pig, Lucene, Jaql
• Agents
• Integration: Information Server
• Surrounding categories: Marketing, Warehouse Appliances, Data Warehouse, Database, Content Analytics, Business Analytics, Master Data Mgmt, Data Growth Management
• Products named: InfoSphere Warehouse, Netezza, InfoSphere MDM, DB2, Cognos & SPSS, Unica, InfoSphere Optim, ECM]

[Figure labels: Big Data Platform, Data Warehouse, Enterprise Integration, Traditional Sources, New Sources]
Traditional Approach: Structured & Repeatable Analysis
• Business Users: Determine what question to ask
• IT: Structures the data to answer that question
• Examples: Monthly sales reports, Profitability analysis, Customer surveys

Big Data Approach: Iterative & Exploratory Analysis
• IT: Delivers a platform to enable creative discovery
• Business: Explores what questions could be asked
• Examples: Brand sentiment, Product strategy, Maximum asset utilization
• Make risk decisions based on real-time transactional data
• Multi-channel customer sentiment and experience analysis
• Predict weather patterns to plan optimal wind turbine usage, and optimize capital expenditure on asset placement
• Identify criminals and threats from disparate video, audio, and data feeds
• Detect life-threatening conditions at hospitals in time to intervene
© 2011 IBM Corporation