• No results found

Big Data, Platforms, Tools, and Advanced Analytics Applications and Capabilities What do you Need

N/A
N/A
Protected

Academic year: 2021

Share "Big Data, Platforms, Tools, and Advanced Analytics Applications and Capabilities What do you Need"

Copied!
65
0
0

Loading.... (view fulltext now)

Full text

(1)

Big Data, Platforms, Tools, and Advanced Analytics

Applications and Capabilities – What do you Need

(2)
(3)

Global Digital Data Growth

Growing leaps and bounds by

40+%

YoY!

Structured Data

Unstructured Data

2009 = .8 Zetabytes

= .08 ZB Structured Data

= .72 ZB Unstructured Data

2020 = 35 Zetabytes

= 3.5 ZB Structured Data

= 31.5 ZB Unstructured Data

*

Chart conservatively assumes a constant 9:1 ratio of unstructured data vs. structured data (based upon IDC’s estimate that 90% of all digital data is unstructured).

Chart does not reflect IDC’s projection that unstructured data is currently growing twice as fast as structured data at the rate of 63.7% vs. 32.3% CAGR.

(1 Zetabyte = 1 Trillion Gigabytes)

LEGEND

(4)

4 Copyright © 2011, Oracle and/or its affiliates. All rights

Global Drivers for Analyzing Place-Events-People

BIG Data – Terabytes, Petabytes, Exabytes, Zettabytes, Yottabytes

– Sensors, RFID, VIDEO, LIDAR, Raster, 3D, Terrain and City Models

– Tagged Data, Social Media, Semantics, Ontologies

– SDIs, INSPIRE, Linked Open Data –- Persistent Relationships

BIG Hardware:

– CLOUD Platforms – Public and Private

– Cheaper, more powerful – Clusters of Commodity Servers, Virtualization: = Greener

– Massively parallel database machines (Software Enablement) – e.g. Hadoop, Oracle

BIG Software –

– Real Time Analytics –Biggest value from fastest response – Streams and Events –-

Internet of Things; Spatially Aware

– CyberSecurity

– Location Enable All Applications: ERP, CRM, Business Intelligence, Public Sectors

– Engineered Systems – Fully installed and tested (Labor Cost is now Dominant Factor)

– Support Standards for Interoperability – W3C, ISO, OGC, Wide Range

(5)

Data Volume & Variety Explosion Continues

-Terabytes, Petabyes, Exabytes, Zettabytes

Sensors, RFID, LIDAR, Raster, 3D,

Terrain and City Models, SDIs

New data products for consumers,

mobility, defense, intelligence, land and

water mgmt, transportation,

environment, agriculture, and

constituent services

Terrain Models and 3D for planning,

maintenance, emergency response,

tourism

Tagged Data , Semantics , Ontologies

-- Location is a Powerful Organizing

Principle

Integrate Social Media (

Video

, Audio,

Text, Wikis, Facebook, Imagery) with

Spatial

(6)

6 Copyright © 2011, Oracle and/or its affiliates. All rights

Data Velocity: Spatially-aware Real-Time

Streams / Events / “Internet of Things”

– Ultra-high throughput

(1 million/sec++) and

microsecond latency

– Detect patterns in the flow

of events and message

payloads, Complex Event

Processing-CEP

– Filtering, correlation, and

aggregation across event

sources

Business Intelligence in

Real Time

Track Moving Objects – Cars, UAVs

Real-Time

Business Intelligence

(7)

Volume

– It’s about 100s TeraBytes and Petabytes...

BUT could be smaller volumes – size is in the eye of beholder

Value

– At volume low value data can highlight high value

patterns, trends and insights of significant commercial value

Definition: What is

Big Data

Anyway?

‘The 4 Vs’ of Big Data?

Variety

– Highlights the new types of data from the ‘internet

of things’ that can be stored and analysed

Velocity

– Relates to streams of high frequency data arriving

(8)

8 Copyright © 2012, Oracle and/or its affiliates. All rights

Detecting Signal in Noise = New Insight

Signal

Noise

...…

Insight

What’s the

effectiveness of a

re-employment

campaign? How

has it effected the

economy

Can I track certain

suspicious

individuals?

What do citizens

think about

policies &

services?

Can we detect

issues and trends

before liabilities

increase?

How can sensor

data improve

maintenance

efforts for key

(9)

1. Fraud Prevention

2. Maintenance &

Utilities

3. Constituent

Sentiment

4. Threat Identification

5. Economic Analysis

6. Healthcare

7. Regulatory Compliance, Licensing &

Enforcement

8. Open Government

9. Tax Collections

Big Data in Public Sector

(10)

10 Copyright © 2012, Oracle and/or its affiliates. All rights

Big Data Poses Big Challenge For Military

Intelligence

The amount of data generated is reaching epic

proportions

• Number of sensors deployed on the ground or aloft on

unmanned aerial vehicles (UAVs) and satellites continues to

soar, as does the ability to keep this asset in place for longer

periods

• Volume of data being gathered for intelligence, surveillance

and reconnaissance (ISR) is already huge. In Afghanistan, ISR

acquisition systems gather more than

53TB

of data every day

.

Source: Terry Coslow, Defense Systems, 29 March 2012

Throughout the defense industry, there’s a concerted effort to deal with

so-called big data. There is a recognition that it is simpler to resolve the problems

with technology than to boost the staff of skilled analysts.

“There’s a gap between our growing

ability to collect data and our limited

ability to process that data. We are

collecting

1,500%

more data than we did

5 years ago. At the same time, our head

count has barely risen.”

Gen. Robert Kehler, commander of the

U.S. Strategic Command, late 2011

(11)

What Can Federal Agencies Do With Big Data?

Combining Information on:

Example: Weapons of Mass Destruction Tracking

People

Goods

Money

Materials

Transportation

To give analysts awareness

never before possible

(12)

12 Copyright © 2011, Oracle and/or its affiliates. All rights

Real-time Data Streams

External Data Sources

Transactional &

Operational Systems

Contents Repository

Databases

Web resources

Blogs, Mails, news

Search, Presentation, Report,

Visualization, Query

Example: DHS Risk Analysis

Enterprise Data Management Infrastructure

GeoSpatial

Documents

Secu

red

Historical

Records

POIs

Demographics

Customer Data

Call Records

SMS

Workflow Initiation

Real-time Dashboards

Console Alerts

EV Grid Management

Automatic

Responses and

Publishing

(13)

Maintenance and Utilities

Structured, but massive data sets

Condition Based Maintenance (CBM+) relies on sensors to collect information about the vehicle condition

The condition data is combined with other data such as operational use and cost data

Fleet-wide data sets need to be analyzed for trends that lead to increased costs, but analysts do not always

know what they are looking for

Big Data can help direct the analysts to the areas of most “bang-for-the-buck”

Utilities can use these to spot trends in the massive time series data from millions of smart meter readings

and lower costs or prevent blackouts

(14)

14 Copyright © 2012, Oracle and/or its affiliates. All rights

Non-Defense & Security Uses

National Institutes of Health

Disease outbreak detection & tracking

USDA

Food Stamp Fraud detection/tracking

Use Cases

Housing and Urban Development

(15)

Big Data, Analytics: For Your Domain:

What Tools & Platforms are Most Useful to You?

SCALE: Terabyte, Petabyte, Exabyte Databases

– Hadoop, Map Reduce, Many Open Source tools

– NoSQL, Graph, Relational, Semantics, Ontologies,

– Cyber Security Attacks, Multi-Level Security

– Versioning, Compression, Archiving

– Analytics Software as part of Platforms

HOW do you identify all the technologies you need

HOW to get ALL those technologies in an COTS Platform

(16)

16 Copyright © 2012, Oracle and/or its affiliates. All rights

Hadoop, MapReduce and Related Technologies

Hadoop, HDFS, Hive,

MapReduce, HBase

Mahout, Sqoop,

Zookeeper, Flume,

Oozie, Pig, Whirr

NoSQL, Key Value Stores

What are these? When and why are they important?

For what Use Cases?

What Problems are YOU trying to solve ?

(17)

ACQUIRE

ORGANIZE

ANALYZE

DECIDE

In

-D

at

aba

s

e

Analy

tic

s

Data

Warehouse

Hadoop

(MapReduce)

Oracle Big Data

Connectors

Oracle Data

Integrator

Analytic

Applications

HDFS

Enterprise

Applications

Oracle NoSQL

Database

Oracle Integrated Solution Stack for Big Data

(18)

18 Copyright © 2011, Oracle and/or its affiliates. All rights

Big Data Processing & Query Characteristics

Batch-Oriented

Real-Time

Process data to use

Deliver a service

Bulk storage

Fast access to specific record

Write once, read all

Read, write, delete update

(19)

Big Data Storage Options: Batch vs. Real-Time

Hadoop Distributed File

System (HDFS)

NoSQL Database

(Oracle )

File System

Database

Parallel scanning

Indexed storage / Key Value Pairs

No inherent structure

Simple data structure

High volume writes

High volume random reads and writes

Batch Oriented

Real-Time

(20)

20 Copyright © 2012, Oracle and/or its affiliates. All rights

NoSQL & RDBMS – Used Alongside Each Other

RDBMS

– High value, high density, complex data

– Complex data relationships

– Schema-centric

– Designed to scale up & out

– Lots of general purpose

features/functionality

High overhead ($ per operation)

NoSQL architectures

– Low value, low density, simple data

– Very simple relationships

– Schema-free, unstructured or

semi-structured data

– Distributed storage and processing

– Stripped down, special purpose

data store

(21)

Flexible Key-Value Data Model

ACID transactions

Horizontally Scalable

Highly Available

Elastic Configuration

Simple administration

Intelligent Driver

Commercial grade software and support

Features

Oracle NoSQL Database Enterprise Edition

Scalable, Highly Available, Key-Value Database

Application

Storage Nodes

Datacenter B

Storage Nodes

Datacenter A

Application

NoSQL DB Driver

Application

NoSQL DB Driver

Application

(22)

22 Copyright © 2012, Oracle and/or its affiliates. All rights

Model data in terms of relationships

Flexible schema evolves easily by adding new

relationships

Supports querying and discovery by graph

patterns and traversal

Enables graph analytics such as reachability,

connectivity, transitivity, same as, proximity,

centrality…

Oracle Has Been Shipping Graph Databases

For over 10 YEARS

Graph Database

Why Have Graph Databases Become Very Popular

:California

:USA

:NorthAmerica

:

partOf

:

partOf

:

partOf

owl:TransitiveProperty

rdf:type

:partOf

Query: SELECT ?x ?y

FROM

WHERE { ?x :partOf ?y }

(23)

23 Copyright © 2012, Oracle and/or its affiliates. All rights

Buzzwords For Apps using Graph Technology

What terms to look for:

Semantic Web

W3C RDF/OWL/SPARQL

RDF – Resource Description Framework

Graph Data Management

Knowledge Discovery

Knowledge Mining

Inferencing / Reasoning

Social Network Analysis (SNA)

Ontologies

Taxonomy/Terminology Mgmt

Property Graphs

Sentiment Analysis

Faceted Search

Text Mining

NoSQL Database

Big Data

(24)

24 Copyright © 2012, Oracle and/or its affiliates. All rights

Using Graphs for Enterprise Applications

Enterprise metadata

framework for enterprise data &

public cloud

Semantic terms link instance

data with other resources and

apps

Linked resources enable

interoperability between apps

Apps

Instance Data

(25)

RDF / OWL for Enterprise Integration

Index

Content Mgmt

BI Server

Data Warehouse

Machine Generated Data

RDF metadata layer

(integrated graph metadata)

Transaction Systems

Big Data Appliance

Subscription Services

Human Sourced

Information

Social Media

Event Server

Data Servers

Data Sources / Types

(26)

26 Copyright © 2012, Oracle and/or its affiliates. All rights

Big Data Sources

Applications

End-user and Developer Environments

Streaming

Services

Data Services

Statistics

Text

Analytics

Graph

Analytics

Spatial

Data Mining

Natural Lang.

Processing

Structured

Data

Unstructured

Data

App Services

Developers

Data Integration

Business Users

Business Intelligence

Data Scientists

Discovery

Event

Processing

Data Management

NoSQL

Hadoop

Relational

Web-log

Sessionization

and

Enrichment

Sentiment

Analysis

Social Media

Statistics

Mining

Supporting Breadth of Enterprise Data

JDeveloper

Dashboards

Reference

Architecture

Sound and Video

Images

Compression

Security & Encryption

Vertical

Applications

Horizontal

Applications

ODBC

JDBC

(27)

• Proven scalability for billions of relationships,

terabytes of data

• Standards-based native reasoning

• Supports open source 3

rd

party & Oracle tools

• Graph querying with SPARQL & SQL languages

• Compression, parallelism, and partitioning to

enhance query, load, and inference performance

• DoD strength security: top secret,

compartmented

Key Features

Oracle: RDF, OWL Semantic Graph

(28)

28 Copyright © 2012, Oracle and/or its affiliates. All rights

Graph Alignment of Complex Workflows

Data Sources

Contents Repository

Databases

Web resources

Blogs, Mails, news, RSS feeds

Extract, Transform, Load

•Feature/term Extraction

•Semantic categorization

•Transformation to RDF triple

Extracted Entities &

Relationships

RDF

Intelligence Ontology

SQL/SPARQL

Search, Presentation,

Report, Visualization, Query

National Intelligence Scenario

Enterprise Data

Spatial

Documents

Person: Abduwali Abdukhadir Muse Nationality: Somalian Country: UK Group: Al Shabab Ideology: Islamist Person: ? Nationality: Pakistani Country: Pakistan Group: ? Person: Chehab Abdoulj amid Bouyaly

Country: Morocco Group: al Qaeda Currently resides Member of Currently resides Member of Supports Supports Link ? Link ? Member of Currently resides Has Has

(29)

Allied Nation Intelligence Service

Oracle Spatial and Graph: Social Analysis

Objectives

Profile suspects through telephone, email

and social network communications

Produce “data products” for analysts

Solution

RDF Graph modeling of the social network:

people, groups and places of interest

Graph analytics discover relationships

among individuals & meaning of

pseudonyms, aliases, codes, terminology

Find & label “same-as” relationships

Discover of ~100 million relationships / month

Tagging for 600 TB / 10b triples graph

Top-secret , compartmented security for data

Standards-based tools: W3C RDF & SPARQL

(30)

30 Copyright © 2012, Oracle and/or its affiliates. All rights reserved.

Core Inference Features and Semantic Indexing

Forward chaining based inference engine in the database

Various native rulebases provided

E.g., RDFS, OWL 2 RL, SNOMED (subset of OWL 2 EL), SKOS

Validation of inferred data

User-defined rules, Proof generation

Performance enhancements

Parallel Inference, Incremental Inference, Compact data structures

Optimized owl:sameAs handling

Semantic document indexing

– Programmable API to plug-in 3

rd

party extractors (for any data) into the database.

(31)

More functions in Oracle Database Release

Inference

– Native OWL 2 EL inference support

– User defined inferencing

Allows generation of new RDF resources

Temporal reasoning, Spatial reasoning

Web service callouts

– Ladder Based Inference

Fine grained security for inference graph

– Performance optimization for user defined rules

– Integration with TrOWL*, an external OWL 2 reasoner

(32)

32 Copyright © 2012, Oracle and/or its affiliates. All rights Point Flexibility Point

Data Integration

Metadata Layer

Unified content metadata for

federated resources

Validate semantic and

structural consistency

Apps:

Metadata warehouse

Master Data Management

Taxonomy/vocabulary store

Media metadata repository

Text Mining & Entity

Analytics

Find related content &

relations by navigating

connected entities

“Reason” across entities

Apps:

Documents summary

repository

Documents catalogue

Social Media

Analysis

Analyze social relations

using curated metadata

Blogs, wikis, forums, IM,

calendars, video, voice

Apps:

Social graphs

Recommender systems

HR skills database

Oracle Spatial and Graph:

Graph Use Cases

Network

Analysis

Graph analysis of road,

utility and multimodal

networks, drive-time

polygons

Apps:

Trade area management

Service delivery

optimization

(33)

Intelligence,

Law

Enforcement

Web and Social

Network

Solutions

Health Care and

Life Sciences

Transportation

Network

Management

Financial

Services

Media, Content

Management

Build graphs in

social

networks for

Threat

Analysis

Semantically

index results

of

Text Mining

Semantic ally

link data sets

for

Integrated

Justice

system

Model a graph

of

relationships

for

Social

Network

Analysis (SNA)

Query graph

patterns for …

Activity

Analysis

Attribute

Clusters

Terminology

management

of

EHR systems

Semantic

concept

alignment

of

clinical,

research and

operational

data

Information

discovery

using

inferencing

rules

Fleet

management

Multimodal

transportation

planning

Routing and

Tracking

Facilities mgt.

Assets

monitoring &

maintenance

Outage

management

Network

planning

Coverage area

analysis

Model “same as”

relationships

for

Fraud

detection

Semantic ally link

data sets for

complete

results in

Compliance

Management

Link relevant

metadata for

Risk

Assessment

Query graph

patterns in

Media

Metadata

to

find media of

interest

Semantically

relate media

metadata to

Repurpose

Content

Oracle Spatial and Graph

(34)

34 Copyright © 2012, Oracle and/or its affiliates. All rights

Oracle Spatial and Graph

“Points”

“Lines”

“Polygons”

Rasters

Topologies

3D

f1

f2

n1

n2

e1

e2

e3

e4

Network Graphs

Web Services

(OGC)

Geocoding

Routing

RDF Semantic Graphs

Oracle Spatial and Graph - Types

“Points”

Web Services

(OGC)

SPARQL End Point

Geocoding

Routing

(35)

Ontology-driven Geospatial Applications -

Connect Actionable Knowledge

Simple Features

GeoRaster

Topology

Networks

Gazetteers

RDF & OWL

Metadata

Environmental

Monitoring

Famine

Relief

National Mapping

Private Cloud

Disaster

Response

Spatial

Data

Geographic

Names

Raster

Data

Data Integration

National Map schemas

Geographic names

Temporal

Naïve Geography

(36)

36 Copyright © 2011, Oracle and/or its affiliates. All rights

It’s Not Just about “Linking”

Integrate domains of knowledge

through common vocabularies (ie

SKOS)

Manage relationships between

collections of images and

associated metadata

RDF as flexible and extensible data

model supports powerful search

and end-user discovery of related

content

Rich platform for data integration,

data repurposing, and better quality

control and classification

(37)

Automatic Data Optimization

Active

Frequent Access

Occasional Access

(38)

38 Copyright © 2012, Oracle and/or its affiliates. All rights

ILM: Hot/Cold Data Classification

Enhanced Insight into Data Usage: “heat map”

Recently inserted,

actively updated

Infrequently updated,

Frequently Queried

Retained for long term analytics and

compliance with corporate policies

and regulations

ACTIVE

FREQUENT

ACCESS

DORMANT

Block and Segment level

statistics on last Read and last

Update

(39)

ILM: Automatic Compression & Tiering

Usage based and custom compression and tiering

ALTER TABLE orders

ILM ADD CompressionPolicy

COMPRESS Partitions for Query

AFTER 90 days from creation

;

ALTER TABLE sales

ILM ADD MovePolicy

TIER Partitions TO ‘Archive_TBS’

ON OrdersClosedPolicy;

(40)

40 Copyright © 2012, Oracle and/or its affiliates. All rights

Data Compression

Reduce storage footprint, read compressed data faster

Hot Data

1

110101010101010

0110101010101101

0001011011000110

1001010000010011

1000101010110100

1011010010110001

0100100111110010

0100001000101010

1101000

1

010101011101010011010111

0000101000101101110101010

0101001001000010001010101

1010010110100111000010100

1001010000100100001000101

0101110011010

Warm Data

1010101011101010

0110101110000101

0001011011101010

1001010010010000

1000101010110100

1011010011100001

0100100101000010

0100001000101010

1101001

1010101011101010011010111000010100010110111 0101010010100100100001000101010110100101101 0011100001010010010100001001000010001010101 11001101110011000111010 Archive Data

1010101011101010

0110101110000101

0001011011101010

1001010010010000

1000101010110100

1011010011100001

0100100101000010

0100001000101010

1101001

1010101011101010011010111000010100010110111010101001 0100100100001000101010110100101101001110000101001001 0100001001000010001010101110011011100

3X

Advanced Row Compression

10X

Columnar Query Compression

15X

(41)
(42)

42 Copyright © 2012, Oracle and/or its affiliates. All rights

Connecting: CYBERSECURITY is Major Challenge

Information Security and Privacy

Monitoring

Access Control

Encryption & Masking

Monitoring

Configuration Management

Audit Vault

Total Recall

Access Control

Database Vault

Label Security

Advanced Security

Secure Backup

Data Masking

Encryption &

Masking

Blocking & Logging

(43)

Oracle Database Security Solutions

Comprehensive Security Platform

Database Firewall and

Network Monitoring

Audit Consolidation

Reporting and Alerting

DETECTIVE

Redaction and Masking

Privileged User Controls

Encryption

PREVENTIVE ADMINISTRATIVE

Sensitive Data Discovery

Configuration Scanning

(44)

44 Copyright © 2012, Oracle and/or its affiliates. All rights

Redacting Data Challenges

Redact data in

applications, queries

and reports

Secure sensitive

personal information

Avoid changing

applications, queries

and reports

(45)

Soc. Sec. #

115-69-3428

DOB

11/06/71

PIN

5623

Policy

enforced

redaction of

sensitive data

Redacting Sensitive Data

Mask Application Data Dynamically

Call Center

Operator

Payroll

Processing

(46)

46 Copyright © 2011, Oracle and/or its affiliates. All rights

Connecting: Tools to Find Connections:

Discovery & Predictive Analysis

Problem Classification

Sample Problem

Anomaly Detection

Given demographic data about a set of customers, identify

customer purchasing behavior that is significantly different from

the norm (Fraud?) . For Sensors, Events – problem? Good News?

Association Rules

Find the items that tend to be purchased together and specify

their relationship – market basket analysis

Clustering

Segment demographic data into clusters and rank the probability

that an individual will belong to a given cluster. For customers /

govt constituents, what do they want to know about?

Feature Extraction

Given demographic data about a set of customers, group the

attributes into general characteristics of the customers

(47)

Analysis with R

Transparently leverage Hadoop for

High Performance Analytics to

Oracle Big Data Appliance

(

Oracle R Connector for Hadoop

)

Function push-down

data transformation &

statistics

(48)

48 Copyright © 2012, Oracle and/or its affiliates. All rights

Predictive Analytics with R, BI, Data Mining

Function push-down –

data transformation & statistics

R workspace console

Oracle statistics engine

OBIEE, Web Services

No changes to

the user

experience

Scale to large

data sets

Embed in

operational

systems

Development

Production

Consumption

(49)

Information Discovery and Hadoop

Data Variety -

structured and

unstructured data

No fixed pre-defined

model required

Massive scalability

Batch execution

Technical experts

develop analyses

In-memory analytics

Interactive query

response times

Targets business users

Deep Large-Scale

Processing

Fast Self-service

Information Discovery

Populates

Informs

Complementary Big Data Capabilities

Oracle Endeca

Information Discovery

(50)

50 Copyright © 2013, Oracle and/or its affiliates. All rights reserved. Copyright © 2012, Oracle and/or its affiliates. All rights

50

Cross-referenced the

relationships between 17000

genes and five major cancer

types across 20 million

medical publication abstracts

Cross-referenced genes from

60Million patients and miRNA

for a simulated 900Million

population.

Understanding additional

layers of the pathways these

genes operate in and the drugs

that target them is expected to

help researchers in their work

Big Data in Healthcare

Find relationship between gene to cancer interaction

Use

Case

2012

Government

Big Data

Solution

(51)

Oracle In-Database Analytics Platform

XML

Relational

OLAP

Spatial

Data Layer

RDF

Media

Parallel Processing Engine

Oracle R

Enterprise

Oracle

Data Mining

Text and

Search

Spatial

Analytics

SQL

Analytics

RDF Graph

Analytics

(52)

52 Copyright © 2012, Oracle and/or its affiliates. All rights

Big Data Apparent Disconnect

Two Separate Developer Worlds

MapReduce

Solutions

Distributed

File Systems

NoSQL

Flexible

Specialized

Developer

Centric

SQL

Trusted

Secure

Administered

Transaction

(Key-Value Stores)

DBMS

(OLTP)

DBMS

(DW)

Advanced

Analytics

Extract

Transform

and Load

(ETL)

Unstructured

Data

Structured

Data

Acquire

Organize

Analyze

Acquire

Organize

Analyze

Traditional Environments

Big Data Environments

Hadoop, HDFS, Hive,

MapReduce, HBase

Mahout, Sqoop,

Zookeeper, Flume,

Oozie, Pig, Whirr

(53)

Oracle’s Approach To Big Data

Provide all Technologies, Engineered Together

MapReduce

Solutions

Distributed

File Systems

NoSQL

Flexible

Specialized

Developer

Centric

SQL

Trusted

Secure

Administered

Transaction

(Key-Value Stores)

DBMS

(OLTP)

DBMS

(DW)

Advanced

Analytics

Extract

Transform

and Load

(ETL)

Unstructured

Data

Structured

Data

Acquire

Organize

Analyze

Acquire

Organize

Analyze

Traditional Environments

Big Data Environments

Oracle Exadata

Database Machine

Oracle

Exal

y

tics

In

-Memory

M

achine

Oracle

Big Data

Appliance

Hadoop, HDFS, Hive,

MapReduce, HBase

Mahout, Sqoop,

Zookeeper, Flume,

Oozie, Pig, Whirr

(54)

54 Copyright © 2011, Oracle and/or its affiliates. All rights

Secure

Big Data and Cloud Computing

Next Generation Enterprise Data Platform

CIO

Drive Business Efficiency

and Agility

Deliver Bottom Line Savings

Private

Cloud

Big Data

Drive Business Value

Deliver Top Line Growth

Business

Value

(55)

Public Clouds and Private Clouds

Used by

multiple

tenants on a

shared basis

Hosted and

managed by

cloud service

provider

Exclusively

used by a

single

organization

Controlled and

managed by

in-house IT

Lower

upfront

costs

Outsourced management

OpEx

Lower

total

costs

Greater control over security, compliance, QoS

CapEx & OpEx

Trade-offs

Public Clouds

IaaS

PaaS

SaaS

I

N

T

R

A

N

E

T

Private Cloud

IaaS

PaaS

SaaS

I

N

T

E

R

N

E

T

IaaS

PaaS

IaaS

PaaS

Apps

SaaS

(56)

56 Copyright © 2012, Oracle and/or its affiliates. All rights

Exadata

Big Data Appliance

Exalytics

Oracle Complete Big Data Platform

ACQUIRE

ORGANIZE

ANALYZE

DECIDE

In

-Da

ta

b

ase

A

n

al

yti

cs

Data

Warehouse

Cloudera’s

Distribution

including Apache

Hadoop

Oracle Big Data

Connectors

Analytic

Applications

Alerts,

Dashboards,

MD-Analysis, Reports,

Query

Web Services

BI Abstraction

Applications

Oracle NoSQL

Database

Oracle Data

Integrator

Big Data

Appliance

Exadata

Exalytics

Oracle

Advanced

Analytics

InfiniBand

InfiniBand

Oracle

Database

Open Source R

(57)

Time to Build

Optimizations

Maintenance

(58)

58 Copyright © 2011, Oracle and/or its affiliates. All rights

Required Skills for MapReduce Development

Java

Hadoop Framework

Parallel Algorithms

& Manage Large

Clusters of

(59)

SHUFFLE /SORT SHUFFLE /SORT MAP MAP MAP MAP SHUFFLE /SORT REDUCE REDUCE SHUFFLE /SORT SHUFFLE /SORT REDUCE REDUCE REDUCE INPUT 2 INPUT 1 OUTPUT 2 OUTPUT 1 MAP MAP MAP MAP MAP REDUCE REDUCE REDUCE MAP MAP MAP MAP MAP MAP REDUCE REDUCE MAP MAP MAP MAP MAP REDUCE REDUCE REDUCE

Big Data: Batch-Oriented Processing using

Hadoop: Map – Reduce –

Simple

Example

(60)

60 Copyright © 2011, Oracle and/or its affiliates. All rights

HW/SW Efficiencies: But Labor Costs Growing -

Complete, Engineered System Needed

Services

Hardware

Software

Procurement Leveraging

$100B

$200B

$300B

$400B

$500B

(61)

Oracle Big Data Appliance

Optimized and Complete

– All you need to store and integrate big data

Integrated with Oracle Exadata

– Analyze all your data

Easy to Deploy

– Risk Free, Quick Installation and Setup

Single Vendor Support

(62)

62 Copyright © 2012, Oracle and/or its affiliates. All rights

Oracle Exadata Database Machine

Fastest

Data Warehouse & OLTP

Best Cost/Performance

Data Warehouse & OLTP

Optimized Hardware (per rack)

Processor: up to128 Intel Cores and 2 TB DRAM

Network: 880 GB/Sec Throughput

Storage: 5 TB Flash and up to 336 TB Disk

Software Breakthroughs

Exadata Smart Storage Grid

Smart Flash Cache

Hybrid Columnar Compression

Parallel Scale-Out Database and Storage

Scales from ¼ Rack to 8 Full Racks

(63)

Oracle Exalytics In-Memory Machine

First engineered system

for analytics

Visual Analysis without

limits

Smarter analytic

applications

(64)

64 Copyright © 2012, Oracle and/or its affiliates. All rights

Exadata

Big Data Appliance

Exalytics

Oracle Complete Big Data Platform

ACQUIRE

ORGANIZE

ANALYZE

DECIDE

In

-Da

ta

b

ase

A

n

al

yti

cs

Data

Warehouse

Cloudera’s

Distribution

including Apache

Hadoop

Oracle Big Data

Connectors

Analytic

Applications

Alerts,

Dashboards,

MD-Analysis, Reports,

Query

Web Services

BI Abstraction

Applications

Oracle NoSQL

Database

Oracle Data

Integrator

Big Data

Appliance

Exadata

Exalytics

Oracle

Advanced

Analytics

InfiniBand

InfiniBand

Oracle

Database

Open Source R

(65)

Cloud

Computing

Oracle Engineered

Systems

References

Related documents

- SAS Data mining tools help you to analyze Big data - It is an ideal tool for Data mining, text mining

Big Data Analytics, Social Media Marketing, International Law, Start-up Management, SAP-ERP, Strategic Digital Marketing, Financial Modelling, Wealth Management,

MAIN HYDRAULIC CONTROL VALVE LINES & FITTINGS-UPPER SECONDARY CONTROL VALVE LINES AND FITTINGS - UPPER HYDRAULIC REMOTE CONTROL INSTALLATION.. JOYSTICK CONTROL HAND LEVER

This study showed that, although pre- and post-RT stenosis was a prognostic factor for patients’ survival, complete circumference involvement rather than a higher radiation dose

The physical work environment is appropriate to the services offered and conducive to the safety, privacy, and comfort of the clients, psychologist(s) and support staff. The

Second team honorees included junior catcher Mike Meeuwsen of Grand Rapids, Mich., and sophomore second baseman Matt Klein of DeWitt, Mich.. Ruby and Labbe were also named

You can choose from a wide range of psychology courses, but you may also choose courses from another programme or university or you can spend a semester studying at a partner

The obligation, we believe, falls upon the suspended attorney to try to make arrangements to allow the paralegal to keep his/her position without violating the professional