
Evolution from Big Data to Smart Data


Academic year: 2021


(1)
(2)

Information is Exploding

EVERY MINUTE

• 120 hours of video uploaded to YouTube
• 50,000 apps downloaded
• 204 million e-mails

EVERY DAY

Intel Corporation 2015

(3)

The Data is Changing

                    Performance Optimized    Capacity Optimized
Data Type           Structured               Unstructured
Record Size         Kilobytes or less        Megabytes to Terabytes
Data Updates        Frequent                 Rare/never
Access Frequency    Heavy                    Light
Metadata            Fixed                    Variable
Scale Required      Up to Terabytes          Exabytes

Copyright 2014 IDC

“Unstructured Data accounts for 70-80% of storage capacity growth”

Ashish Nadkarni, IDC

(4)

1. Scale Out
2. Software Defined
3. Smart Data

(5)

Scale-Out Economics

• Start Small – Scale Large

Start from a single node (TBs) but have the ability to scale to multiple independent nodes (PBs)

RAIN Architecture

Granular Resource Scaling

Add CPUs and storage independently as needed

Take advantage of decreasing storage costs and increased storage densities

(6)

Software Defined Storage

Tokyo, New York, London

Deep Archive

Analytics

Modern Apps

Object

HyperStore Smart Storage Platform

HDFS

File

100% S3

Always On

Smart Protect

Multi Datacenter

Smart Policies

Enterprise Grade

(7)

The Era of Smart Data Storage

DATA STORAGE = problem

SMART DATA STORAGE = solution

Active

Timely Insight

Meaning

Actionable

Business Value

DATA

INFORMATION

Passive

Delayed Analytics

Static Data

OBJECT STORE

HYPERSTORE ANALYTICS

(8)

Smart Data – Analytics in Place

Consumer Activity

(Events, GPS, WiFi)

Social Media

Device Tracking and Logs

Result of Analysis

Cloudian HyperStore

INTERNET OF THINGS

BIG DATA

Fast

Efficient

Better business decisions

Event processing platform

Benefits

• Faster time-to-decision
• Analyze more – allows for efficient bulk data analysis in place
• No redundant storage of data
• HyperStore scales out with your data – adding nodes for I/O
• Take advantage of multi-core CPUs – makes sense for MapReduce
• Can feed smarter data to subsequent analytic systems

Analytics

COST EFFICIENT

(9)


Cloudian & Hortonworks

YARN : Data Operating System

Processing engines on YARN (cluster diagram, labeled 1 to N):

• Script: Pig
• Search: Solr
• SQL: Hive/Tez, HCatalog
• NoSQL: HBase, Accumulo
• Stream: Storm
• Others: In-Memory Analytics, ISV engines
• Batch: Map Reduce

Linux, Windows, On-Premise, Cloud

HDFS

S3 Native File System (URI scheme: s3n)

• HDFS Shell Commands

• File I/O Operations

• Mass Upload

• ETL with Pig

• Standard Map Reduce

• Analysis with Hive
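
As a rough illustration of the first bullets above, the sketch below builds ordinary `hadoop fs` shell commands that point at `s3n://` URIs instead of HDFS paths. The bucket name, object keys, and the helper names `s3n_uri` and `hadoop_fs` are hypothetical and not taken from the deck; credentials and endpoint configuration are deliberately left out.

```python
from typing import List


def s3n_uri(bucket: str, key: str = "") -> str:
    """Build an s3n:// URI for a bucket and an optional object key."""
    return f"s3n://{bucket}/{key.lstrip('/')}"


def hadoop_fs(*args: str) -> List[str]:
    """Return the argv list for a `hadoop fs` shell command (not executed here)."""
    return ["hadoop", "fs", *args]


# Hypothetical bucket and keys, for illustration only.
commands = [
    hadoop_fs("-ls", s3n_uri("example-bucket")),
    hadoop_fs("-put", "local.log", s3n_uri("example-bucket", "logs/local.log")),
    hadoop_fs("-cat", s3n_uri("example-bucket", "logs/local.log")),
]

if __name__ == "__main__":
    for cmd in commands:
        print(" ".join(cmd))
```

The commands are only assembled as argument lists; running them would require a Hadoop client configured for the target S3 endpoint.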

(10)

Availability

Peer to peer storage system

Locality – Data Center Locality

Can enforce constraints on the location of Hadoop data and maintain locality of reference for Hadoop

Hadoop can be run on storage nodes

Efficiency

Erasure Coding for efficient bulk data storage

Scale Cluster on demand as needed – dynamic rebalance

Multi-part uploading to improve large object uploads
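
The efficiency claim for erasure coding can be made concrete with a small calculation. This is a minimal sketch, assuming a 4+2 erasure-coding layout and 3-copy replication purely as illustrative parameters; the deck does not state which schemes HyperStore uses, and the function names are hypothetical.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return float(copies)


def erasure_coding_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data with k data + m parity fragments."""
    return (k + m) / k


def raw_capacity_tb(user_tb: float, overhead: float) -> float:
    """Raw capacity needed to hold user_tb of data at a given overhead."""
    return user_tb * overhead


if __name__ == "__main__":
    user_tb = 100.0
    # Both layouts below survive the loss of any two copies/fragments.
    print("3 copies:", raw_capacity_tb(user_tb, replication_overhead(3)), "TB raw")
    print("EC 4+2:  ", raw_capacity_tb(user_tb, erasure_coding_overhead(4, 2)), "TB raw")
```

Under these assumed parameters, erasure coding needs half the raw capacity of triple replication for the same amount of user data, which is why it suits bulk storage.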

Rich metadata

Example: Pig can load filtered data directly from Cloudian HyperStore without passing through HDFS

A = LOAD 's3n://BUCKET' USING CloudianStorage();

B = FILTER A BY (time >= '2015/02/16') AND (time <= '2015/02/20');
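
For readers unfamiliar with Pig, the FILTER expression above can be mirrored in plain Python. This is a sketch that assumes the `time` field is a zero-padded `YYYY/MM/DD` string, so that string comparison matches date order; the record layout and the `filter_by_time` helper are hypothetical.

```python
from typing import Dict, Iterable, List


def filter_by_time(records: Iterable[Dict[str, str]], start: str, end: str) -> List[Dict[str, str]]:
    """Keep records whose 'time' string lies in [start, end], compared as strings."""
    return [r for r in records if start <= r["time"] <= end]


# Hypothetical records, for illustration only.
records = [
    {"time": "2015/02/15", "event": "a"},
    {"time": "2015/02/16", "event": "b"},
    {"time": "2015/02/20", "event": "c"},
    {"time": "2015/02/20 10:30", "event": "d"},  # longer string sorts after "2015/02/20"
    {"time": "2015/02/21", "event": "e"},
]

selected = filter_by_time(records, "2015/02/16", "2015/02/20")

if __name__ == "__main__":
    for r in selected:
        print(r["time"], r["event"])
```

One consequence of string comparison: a value with a time-of-day suffix on the end date falls outside the range, so the upper bound may need adjusting if timestamps carry more than the date.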


(11)


Use Cases

Hadoop for Internet of Things

Clickstream data

Analysis of what people click on: individual web pages and in what order. Clickstream analysis can reveal how users research products and also how they complete their online purchases.

• Internet Marketing
• Online Commerce

Sentiment data

Unstructured data on opinions, emotions, and attitudes from sources like social media posts, blogs, online product reviews and customer support interactions. Organizations use sentiment analysis to understand how the public feels about something and track how those opinions change over time.

• Retail
• Media & Entertainment

Server log data

Large enterprises build, manage and protect their own proprietary, distributed information networks. Server logs are the computer-generated records that report data on the operations of those networks. When there is a problem, it's one of the first places the IT team looks for a diagnosis.

• IT Organizations
• Customer Support

Sensor data

From refrigerators and coffee makers to energy-measuring smart meters, sensor data is everywhere. It is created by the machinery that runs assembly lines and the cell towers that route our phone calls. It is net new data that is increasing exponentially in the information age.

• Manufacturing
• Industrial
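
To make the clickstream use case on this slide more tangible, the sketch below counts page-to-page transitions per user, which is one simple way to see "what people click on and in what order". The event tuples, page names, and the `page_transitions` helper are hypothetical; a production job would run this kind of aggregation in Pig, Hive, or MapReduce rather than in memory.

```python
from collections import Counter, defaultdict
from typing import Iterable, Tuple

Event = Tuple[str, int, str]  # (user id, timestamp, page)


def page_transitions(events: Iterable[Event]) -> Counter:
    """Count consecutive (from_page, to_page) pairs within each user's ordered visits."""
    by_user = defaultdict(list)
    for user, ts, page in events:
        by_user[user].append((ts, page))

    counts: Counter = Counter()
    for visits in by_user.values():
        visits.sort()  # order each user's clicks by timestamp
        pages = [page for _, page in visits]
        counts.update(zip(pages, pages[1:]))
    return counts


# Hypothetical click events, for illustration only.
events = [
    ("u1", 1, "/home"), ("u1", 2, "/product"), ("u1", 3, "/checkout"),
    ("u2", 1, "/home"), ("u2", 5, "/product"),
    ("u3", 2, "/search"), ("u3", 4, "/product"), ("u3", 9, "/checkout"),
]

transitions = page_transitions(events)

if __name__ == "__main__":
    for (src, dst), n in transitions.most_common():
        print(f"{src} -> {dst}: {n}")
```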

(12)

Smart Support


Cloudian Support

HyperStore

Appliances

Hadoop

Cluster

HyperStore

Appliances

S3n://bucket/…

Smart Support

Smart Support

Analytics

CUSTOMER

CLOUDIAN

Telemetrics

Data

(13)

Cloudian HyperStore Platform

(14)

Multi tenancy & QoS

QoS metrics:

• Requests per Min
• Storage Bytes
• Storage Objects
• Inbound Bytes/Min
• Outbound Bytes/Min

HyperStore Software Defined Storage

Tenants: Tenant A, Tenant B, Tenant C

Storage Policies, Tiering, Access Control

Data Placement, Data Access
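
The per-tenant QoS metrics on this slide suggest a simple limit check. The following is a minimal sketch of that idea, not HyperStore's actual implementation or API: the `QosLimits` and `TenantUsage` classes, the `violations` helper, and all numeric limits are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QosLimits:
    """Hypothetical per-tenant limits, one per metric named on the slide."""
    requests_per_min: int
    storage_bytes: int
    storage_objects: int
    inbound_bytes_per_min: int
    outbound_bytes_per_min: int


@dataclass
class TenantUsage:
    """Hypothetical usage counters for one tenant over the current minute."""
    requests_per_min: int = 0
    storage_bytes: int = 0
    storage_objects: int = 0
    inbound_bytes_per_min: int = 0
    outbound_bytes_per_min: int = 0


def violations(usage: TenantUsage, limits: QosLimits) -> List[str]:
    """Return the names of every metric where usage exceeds its limit."""
    names = [
        "requests_per_min",
        "storage_bytes",
        "storage_objects",
        "inbound_bytes_per_min",
        "outbound_bytes_per_min",
    ]
    return [n for n in names if getattr(usage, n) > getattr(limits, n)]


# Illustrative numbers only.
limits = QosLimits(1_000, 10**12, 1_000_000, 10**9, 10**9)
tenant_a = TenantUsage(requests_per_min=1_200, storage_bytes=5 * 10**11)

if __name__ == "__main__":
    print("Tenant A over limit on:", violations(tenant_a, limits))
```

Keeping limits and usage as separate objects lets each tenant carry its own limits while sharing one check.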

(15)

Your Choice of Deployment

Pre-Configured Software-Defined Storage Arrays

• Density optimized appliance with PB-scale architecture
• Seamless scalability on demand
• 8 storage nodes in 8U
• Hot plug everything

Stand-alone Software

OR

• 1U and 2U models
• Scales from 24TB to multiple PBs
• Dedicated, all-in-one, on-premises storage

HSA Series

FL3000 Series

• Efficient data protection with compression, replication and erasure coding
• On-premises S3 with full support for all S3 ecosystem apps
• Dynamic data tiering
• Hadoop-ready
• Geo-replication
• Multi-tenant QoS controls
• Self-healing and

(16)

CLOUDIAN HYPERSTORE SMART DATA STORAGE

STORAGE OPERATIONS

Smart Protect

Proactive Repair

Smart Tiering

Smart Scale

Software Defined

Forever Storage Platform

Real Time Analytics

Search & Discovery

Smart Support

SMART

STORAGE PLATFORM

SMART

STORAGE ANALYTICS

SMART

(17)

HYBRID

CLOUD

WEBSCALE

SIMPLICITY & ECONOMICS

OPEN

ARCHITECTURE

ENTERPRISE

READY

1c per GB per Month

Visit Us:

Booth 415
