Copyright © 2014 Oracle and/or its affiliates. All rights reserved.
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions.
The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Oracle
Big Data Management and Analytics
Balaji Yelamanchili
Senior Vice President, Product Development
Business Analytics and Big Data
Big Data Disruption
Digitization × Datafication
Key Success Factors
Simplify access to all data
Discover and predict, fast
Govern and secure all data
de Persgroep
Customer 360
Improve prospecting
Better offers
Increase Share
Oracle Big Data Strategy
BIG DATA MANAGEMENT
BIG DATA ANALYTICS
BIG DATA APPLICATIONS
BIG DATA INTEGRATION
DATA CAPITAL
Connect And Govern Any Data
Simplify Access To All Data
Discover And Predict, Fast
Accelerate Data-Driven Action
Oracle Big Data Architecture
[Figure: architecture diagram. Inputs: Data Streams; Enterprise Data; Web & Social Data. Components: Event Engine; Data Reservoir; Data Factory; Data Warehouse; BI and Reporting; Discovery Lab. Outputs: Actionable Events; Actionable Information; Actionable Insights. Other labels: Execution; Innovation; Discovery Output; Events & Data.]
Oracle Company Confidential
Breakthrough Innovation: Oracle Big Data Discovery
Oracle Big Data Discovery automatically transforms raw data in Hadoop into interactive, in-memory business intelligence, with no ETL; inspired by Endeca
Spend your time finding insights by starting with data as-is
Radically simplify the data preparation and analysis process
Announcement: OpenWorld
Oracle Big Data Discovery
Omri Traub
Oracle Confidential – Internal/Restricted/Highly Restricted
Not Easy to Get Analytic Value from Hadoop Data Reservoir
• Existing analytic tools fall short
– Too much effort on upfront data preparation
– Manual exploration for understanding new data sets
– Depend on ETL to cleanse data and make it ready
– Assume questions known in advance
• Only point solutions emerging
– Separate data wrangling, visualization
– Leads to constant context switching
• Need end-to-end capabilities
• Native Hadoop tools are complex
– Pig, Oozie, Sqoop, Hive, Spark
• Specialized skills are scarce
– Programming languages (e.g. MapReduce, Python, Scala)
– Statistics and machine learning
– Command line interfaces
Big Data Analytics Requires a Fundamentally New Approach
Find, Explore, Transform, Discover
An intuitive, interactive and visual user interface for anyone to quickly find, explore, transform and analyze data in Hadoop, then share results for enterprise leverage
Oracle Big Data Discovery: The Visual Face of Hadoop
Find, Explore, Transform, Discover
• Navigate a rich catalog of all data in the Hadoop cluster
• Familiar search and guided navigation for ease of use
• Access data set summaries, annotation and recommendations
• Provision your own data through self-service upload
• Data is automatically enriched with extracted locations, terms, sentiment
• Browse personal big data projects and those shared by the community
Easily Find Relevant Data Sets
• Understand the shape of the data. Visualize attributes by type
• Machine learning algorithms sort attributes by importance
• View attribute statistics, data quality and outliers
• See statistical correlations between attribute combinations
• Evaluate whether a data set is worthy of further investment
Explore the Data and Understand Potential
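The kind of per-attribute summary described on this slide (statistics, data quality, outliers, correlations) can be sketched in a few lines of Python. This is an illustrative sketch only, not how Oracle Big Data Discovery computes its statistics: the function names `profile` and `pearson`, the 1.5 × IQR outlier rule, and the sample `ages` values are all assumptions made for the example.

```python
import math
import statistics


def profile(values):
    """Summarise one numeric attribute: row count, missing values, mean, IQR outliers."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    fence = 1.5 * (q3 - q1)  # common rule of thumb, chosen for this sketch
    outliers = [v for v in present if v < q1 - fence or v > q3 + fence]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "mean": statistics.fmean(present),
        "outliers": outliers,
    }


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric attributes."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical attribute with one missing value and one suspicious entry.
ages = [31, 35, None, 29, 33, 30, 34, 32, 120]
print(profile(ages))
```

A summary like this is what lets a user decide quickly whether a data set is worth further investment before any transformation work starts.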
• Interactive, intuitive user-driven data wrangling
• Library of data transformations to replace values, convert types, collapse, reshape, pivot, group, custom tag, merge and much more
• Data enrichments for inferring location and language. Theme, entity and sentiment enrichments for text
• Preview results, undo, commit and replay transforms
• Run on sample data in memory or full data set in Hadoop
Transform and Enrich Data to Make it Ready
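The preview, undo, commit and replay workflow listed above can be pictured as a small transform log. The sketch below is a hypothetical model, not the product's API: the class `TransformSession`, its method names, and the sample rows are assumptions. Transforms are previewed on an in-memory sample, and only committed transforms are replayed against the full data set.

```python
class TransformSession:
    """Preview transforms on a sample, then replay the committed ones on the full data."""

    def __init__(self, sample):
        self.sample = sample   # small in-memory sample of rows (dicts)
        self.pending = []      # transforms visible in the preview only
        self.committed = []    # transforms the user has accepted

    def apply(self, transform):
        self.pending.append(transform)

    def undo(self):
        if self.pending:
            self.pending.pop()

    def preview(self):
        return self._run(self.sample, self.committed + self.pending)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()

    def replay(self, full_rows):
        return self._run(full_rows, self.committed)

    @staticmethod
    def _run(rows, transforms):
        out = [dict(r) for r in rows]  # copy so the source rows stay untouched
        for t in transforms:
            out = [t(r) for r in out]
        return out


# Hypothetical transforms: each takes a row and returns a new row.
def upper_city(row):
    return {**row, "city": row["city"].upper()}


def drop_qty(row):
    return {k: v for k, v in row.items() if k != "qty"}


session = TransformSession([{"city": "gent", "qty": "2"}])
session.apply(upper_city)
session.apply(drop_qty)
session.undo()                 # discard the last pending transform
preview_rows = session.preview()
session.commit()
full_rows = session.replay([{"city": "brussel", "qty": "5"}])
```

Keeping the transforms as a replayable list is what makes it possible to test on a sample first and then run the same steps on the full data set.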
• Mash up different data sets for deeper perspectives
• Filter through data with powerful search and intuitive guided navigation
• Drag and drop from a rich library of interactive visualizations to compose discovery dashboards
• Publish blended data sets back to Hadoop
• Share projects, bookmarks and snapshots with team members for collaboration
Analyze the Data to Discover New Insights
Hadoop
• Oracle R Distribution (1)
• Oracle R Advanced Analytics for Hadoop (2)
• SAS High Performance Analytics
(1) Included with BDA
(2) Included with Oracle Big Data Connectors
Oracle Database
• Oracle Advanced Analytics Option
• SAS High Performance Analytics
Statistical & Predictive Analytics
Bring the Analytics to the Data
Share Results and Publish for Enterprise Leverage
• Share and collaborate with the team
– Share projects, bookmarks and snapshots then collaborate and iterate
• Publish back to Hadoop
– Transforms and enrichments may be applied to original data sets in Hadoop
– Publish blended data sets back to HDFS
• Leverage results in other tools
– Publish data to Hadoop in a format optimized for advanced analytic tools (e.g. ORAAH)
– Hadoop-compliant BI tools (e.g. OBIFS) can burst out to the masses
– Leverage any native Hadoop tooling (e.g. Pig, Hive, Impala, Python, etc.)
– Integrate BDD data sets with DWH to secure, govern and optimize for query performance (e.g. Oracle Big Data SQL)
Oracle Big Data Discovery plays well with the Big Data ecosystem
[Figure: raw data from the data reservoir (HDFS) passes through Find, Explore, Transform, Discover, and Share & Collaborate. Publish: transformed data returns to the data reservoir. Leverage: data warehouse, business intelligence, advanced analytics, other Hadoop tools.]
Oracle Big Data Discovery
Walkthrough
Breakthrough Innovation: Oracle Big Data SQL
One fast query, on all your data.
Oracle SQL on Hadoop, NoSQL and beyond, with a Smart Scan service – as in Exadata – and the security and certainty of Oracle Database.
In production
Oracle Big Data SQL
Neil Mendelson
Barriers to Big Data Adoption
Complexity
• Skills
– Lack tools and training to exploit Big Data
– IT Operations' ability to administer and manage Big Data
• Integration
– Adding Big Data to existing architecture is complex
– Too much effort required in data preparation
• Security
– No clear route to governance or enforcement
Big Data Management
Hadoop + NoSQL + Relational…
The Power of Oracle SQL
Wide variety of ‘Big Data’ types
Structured data
Numeric, string, date, …
Unstructured data
LOBs, Text, XML, JSON, Spatial, Graph, Multimedia
Rich SQL Analytic Functions
Ranking, Windowing, LAG/LEAD, Aggregate, Statistical, Linear Regression, Correlations, Cross Tabs, Hypothesis Testing, Distribution Fitting, …
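To make two of the analytic functions named above concrete, the following sketch runs `LAG` and `RANK` as window functions. It uses Python's built-in `sqlite3` module (SQLite 3.25 or later) purely so the example is self-contained; it is not Oracle Database, and the `sales` table and its columns are hypothetical. The `LAG(...) OVER (ORDER BY ...)` and `RANK() OVER (...)` syntax shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2014-01", 100), ("2014-02", 130), ("2014-03", 120)],
)

# LAG looks one row back in month order; RANK orders months by revenue.
query = """
SELECT month,
       revenue,
       revenue - LAG(revenue) OVER (ORDER BY month) AS change,
       RANK() OVER (ORDER BY revenue DESC)          AS revenue_rank
FROM sales
ORDER BY month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first month has no predecessor, so its `change` value is NULL (returned as `None` in Python).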
What gives Exadata extreme performance?
Oracle Database 12c
SQL
Offload Query to Exadata Storage Servers
Small data subset quickly returned
Hadoop & NoSQL
Introducing Oracle Big Data SQL
Massively Parallel SQL Query across Oracle, Hadoop and NoSQL
[Figure: Oracle Database 12c sends SQL to two tiers. Exadata Storage Servers: Offload Query to Exadata Storage Servers, small data subset quickly returned. Hadoop & NoSQL: Offload Query to Data Nodes, data subset returned.]
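The general idea behind query offload is that filtering and column projection happen next to the data, so only a small subset travels back to the database. The sketch below illustrates that idea in plain Python and is not Oracle's implementation: the functions `scan_on_node` and `offloaded_query`, and the sample rows, are assumptions for illustration.

```python
def scan_on_node(block, predicate, columns):
    """Runs next to the data: filter rows and keep only the requested columns."""
    return [{c: row[c] for c in columns} for row in block if predicate(row)]


def offloaded_query(blocks, predicate, columns):
    """Coordinator: send the predicate and column list to every node, merge the results."""
    result = []
    for block in blocks:  # on a real cluster these scans would run in parallel
        result.extend(scan_on_node(block, predicate, columns))
    return result


# Hypothetical data spread over two nodes.
blocks = [
    [{"id": 1, "country": "BE", "spend": 40}, {"id": 2, "country": "NL", "spend": 75}],
    [{"id": 3, "country": "BE", "spend": 90}, {"id": 4, "country": "FR", "spend": 10}],
]
result = offloaded_query(blocks, lambda row: row["country"] == "BE", ["id", "spend"])
print(result)
```

Only the matching rows and requested columns leave each node, which is why the returned subset stays small even when the stored data is large.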
Apply Advanced Security on Hadoop & NoSQL
Same security policies apply to Hadoop & Relational
[Figure: Oracle Database 12c sends SQL to Hadoop; a small, redacted data subset is quickly returned. Data sources: customer data in Oracle; JSON data, unconverted, in Hadoop.]
-- Redact street names with random values for sessions that have the REDACTION_TESTER role enabled
BEGIN
  DBMS_REDACT.ADD_POLICY(
    object_schema => 'txadp_hive_01',
    object_name   => 'customer_address_ext',
    column_name   => 'ca_street_name',
    policy_name   => 'customer_address_redaction',
    function_type => DBMS_REDACT.RANDOM,
    expression    => 'SYS_CONTEXT(''SYS_SESSION_ROLES'', ''REDACTION_TESTER'')=''TRUE'''
  );
END;
/
• Redaction
• Virtual Private Database
• Fine-grained Access Control
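The effect of a role-conditioned redaction policy like the one on this slide can be modelled as follows. This is a simplified sketch, not the behaviour of `DBMS_REDACT` itself: the helper names `redact_random` and `apply_policy` are hypothetical, and replacing a value with random letters of the same length is a choice made for this example rather than a statement about how Oracle generates random values.

```python
import random
import string


def redact_random(value, rng):
    """Replace a string with random letters of the same length (assumption of this sketch)."""
    return "".join(rng.choice(string.ascii_letters) for _ in value)


def apply_policy(rows, column, session_roles, trigger_role, rng=None):
    """Redact `column` only when the session holds `trigger_role`, mirroring the policy expression."""
    rng = rng or random.Random()
    if trigger_role not in session_roles:
        return rows
    return [{**row, column: redact_random(row[column], rng)} for row in rows]


# Hypothetical rows; the column name ca_street_name comes from the slide.
rows = [{"id": 1, "ca_street_name": "Main Street"}]
redacted = apply_policy(rows, "ca_street_name", {"REDACTION_TESTER"}, "REDACTION_TESTER")
untouched = apply_policy(rows, "ca_street_name", {"ANALYST"}, "REDACTION_TESTER")
```

Because the policy is evaluated where the query runs, the same rule applies whether the column comes from a relational table or from an external table over Hadoop data.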
Oracle Big Data SQL
Walkthrough
Easily access all data
Easily query complex data
Build a 360° view
Oracle Database provides rich security policies
Apply those same policies to data sourced from Hadoop
Customer identities are safe.
• OBIEE
– Query & Report on Hadoop, NoSQL & Relational
– Certified with 12c & Big Data SQL
• Endeca
– Use and extend your existing investment
Oracle Business Analytics on Big Data
Data Governance Foundation
Big Data Integration & Governance Capabilities
Real-Time Data Movement: Oracle GoldenGate (Movement)
– Low impact data staging in Hadoop
– Continuous data availability
Data Transformation: Oracle Data Integrator (Transformation)
– Bulk data movement
– Pushdown data processing
Data Federation: Data Service Integrator (Federation)
– Query Hadoop SQL via JDBC
Data Quality & Verification: Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate); Veridata (Online Data Verification)
– Fix quality at the source
– Verify data consistency
Metadata Management: Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
– Lineage and Impact Analysis
– Business Glossary Semantics
Diagram labels: Fast Load; ETL Offload & Machine Learning; Continuous Availability
The race is on
Discovery Lab in a Box
Big Data Appliance X4-2
• 6 Node Starter Rack
– 2 × 8-core Intel Xeon E5 processors per node
– 384 GB memory, expandable to 3 TB (64 GB per node, expandable to 512 GB per node)
– 288 TB disk space (48 TB per node)
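The rack totals quoted above follow directly from the per-node figures. A quick check, using only the numbers on the slide (the variable names are made up for this sketch, and 1 TB is taken as 1024 GB):

```python
NODES = 6
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
MEMORY_GB_PER_NODE = 64
MAX_MEMORY_GB_PER_NODE = 512
DISK_TB_PER_NODE = 48

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET   # cores in the starter rack
memory_gb = NODES * MEMORY_GB_PER_NODE                      # base memory in GB
max_memory_tb = NODES * MAX_MEMORY_GB_PER_NODE / 1024       # fully expanded memory in TB
disk_tb = NODES * DISK_TB_PER_NODE                          # raw disk in TB

print(total_cores, memory_gb, max_memory_tb, disk_tb)
```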
Integrated Software
• Oracle Linux, Oracle Java VM
• Oracle Big Data SQL*, Oracle Big Data Connectors*
• Cloudera Distribution of Apache Hadoop – EDH Edition
• Cloudera Manager
• Oracle R Distribution
• Oracle NoSQL Database
* Licensed separately
21% Cost Savings
33% Faster Time to Value
• Ready to Go – just add Data
• Pre-configured, Tuned
• Integrated Management
• Run any 3rd party software
• Install Exalytics into a BDA Starter Rack
– In-memory Database & Analytics
– OBIEE Certified with 12c & Big Data SQL
Cloud Platform: Big Data Analytics
Big Data Service
• Integrated with DBaaS – SQL on Hadoop
• Hadoop 2.0 Cluster
• NoSQL Service – for key value data
• Persistent Data Reservoir – in Storage Service
• Single tenant or multitenant
• IaaS offerings for performance/QoS – commodity with NAS, Big Data Appliance
Big Data Discovery
• The Visual Face of Big Data
• Business user and data scientist collaboration
• Self-service data discovery and exploration to separate signal from noise
• Fully managed infrastructure by Oracle Cloud operations
• Hadoop scalability and cost economies
How To Get Started
1. Transform the business
2. Lay the foundation
3. Discovery Lab
[Figure labels: Big Data Management; Big Data Analytics; Big Data Applications; Big Data Integration; Create Value from Data]