• No results found

EMC BACKUP MEETS BIG DATA

N/A
N/A
Protected

Academic year: 2021

Share "EMC BACKUP MEETS BIG DATA"

Copied!
40
0
0

Loading.... (view fulltext now)

Full text

(1)

EMC BACKUP

MEETS BIG DATA

Strategies To Protect Greenplum,

Isilon And Teradata Systems

(2)

Agenda

 Big Data: Overview, Backup and Recovery

 EMC Big Data Backup Strategy

 EMC Backup and Recovery Solutions for Big Data

– EMC Greenplum

– Teradata

– EMC Isilon

(3)

2000s

(Content and Digital Asset Management

)

1990s

(RDMBS, Data Warehouses, etc.) 2010s

(Hadoop, MapReduce, NoSQL, Cloud

)

V OLU ME OF IN FORM A TI ON

LARGE

SMALL

MEASURED IN

TERABYTES

1TB = 1,000GB

MEASURED IN

PETABYTES

1PB = 1,000TB

WILL BE MEASURED IN

EXABYTES

1EB = 1,000PB

Big Data – Volume, Velocity and Variety

(4)

Energy

Big Data Transforms Business

Every Industry Will Feel Impact

Healthcare

Government Cyber Security

Insurance

Financial Telecom Retail

Advertising Gaming

(5)

Big Data Backup And Recovery

 Challenges

– Data Deluge

– Performance And Scale

– Data Islands

 Big Data Backup

– The Journey Has Just Begun…

(6)

EMC Big Data

Backup Strategy

(7)

End-To-End Solutions

Backup Software & Storage Designed To Work Together

Improves

–Performance: Up To 50%

Faster Than Traditional Solutions

–Simplicity: One User

Interface—Common Policy

Management

–Predictability: Proven Best

Practices—Over 20,000

Installations

INTEGRATED

ARCHITECTURE

Storage

Software

(8)

Data Domain Provides The Foundation With Backup Software

EMC Big Data Backup & Recovery Strategy

CIFS, NFS,

NDMP, DD Boost

Ethernet

Virtual Tape

Library (VTL) over

Fibre Channel

Big Data

systems

Backup

App /

Utility

Data Domain

System

(9)

A history of industry firsts

EMC Data Domain: Leadership and Innovation

First deduplication

NAS

First deduplication

volume replication

Largest

deduplication

array

First deduplication

directory replication

First deduplication

virtual tape library

First deduplication

nearline storage

Fastest backup

controller

Cascaded

replication

2003 2004 2005 2006 2007 2008 2009 2010 2011 2012

First

distributed

processing

First long-term

retention

system for

backup and

archive

First inline

deduplication

for compliant

archiving

(10)

PURPOSE BUILT

BACKUP

APPLIANCES

O P E N S Y S T E M S +

MAINFRAME CAPACITY

2011 Total Market

927,474 TB

Source: IDC, Worldwide Purpose-Built Backup Appliance 2012–2016 Forecast and 2011 Vendor Shares , Doc #234489, Apr 2012.

Above: Worldwide Supplier Capacity for 2011, Total PBBA Market

EMC IBM Others HP Quantum Symantec

64.7%

EMC

(11)

Second Friday Full Backup

B C D E F L G H

Store More Backups In A Smaller Footprint

Data Deduplication: Technology Overview

A B C D E F G H I J

Friday Full Backup

A B C D A E F G

Mon Incremental A B H

Tues Incremental C B I

Thurs Incremental A C K

Weds Incremental E G J

Backup Estimated

Data Logical Reduction Physical

Monday Incremental 100 GB 7–10x 10 GB

Tuesday Incremental 100 GB 7–10x 10 GB

K L

Wednesday Incremental 100 GB 7–10x 10 GB

Thursday Incremental 100 GB 7–10x 10 GB

Second FRIDAY FULL 1 TB 50–60x 18 GB

TOTAL 2.4 TB 7.8x 308 GB

FRIDAY FULL 1 TB 2–4x 250 GB

(12)

Over one year of retention in 3U of Data Domain deduplication

storage

Retain: Store More for Longer with Less

Week 1

Backup Cumulative Estimated Physical

Data Logical Reduction

April 14 3.4 TB 10x 326 GB

April 21 4.6 TB 13x 364 GB

April 28 5.8 TB 14x 402 GB

May 31 10.6 TB 19x 554 GB

June 30 15.4 TB 21x 706 GB

TOTAL 15.4 TB 21x 706 GB

April 7 2.2 TB 8x 288 GB

Week 2

Week 3

Month 1

Month 2

Month 3

First Full 1 TB 4x 250 GB

(13)

Data Integrity: Data Invulnerability Architecture

Other

RAID 6

NVRAM

Snapshots

End-to-end data verification

Checksum

Deduplication, write to disk

Verify

Self-healing file system

Cleaning

Expired data

Defrag

Verify

Deduplication

Local Compression

RAID

File System

Generate

Checksum Verify

Data

Verify the file

system metadata

integrity

Verify user data

integrity

Verify stripe

integrity

End-to-end data verification

(14)

Network-Efficient Replication

for True Disaster Recovery

95–99% cross-site bandwidth reduction

Source:

Remote sites

Destination:

Data Center Hub

Supports hundreds

of remote sites

1–5%

1–5%

1–5%

Archive data

Backup data

Data Domain

DD890

Data Domain system

Flexible replication

 One-to-many

 Many-to-one

 Bi-directional

 System-to-

system

 Cascaded

Home

DB

WAN

Home

Data Domain system

Data Domain system

(15)

Data Domain Boost Software

 Distributes parts of deduplication process to

backup server or application clients

 Speeds backups by up to 50 percent

 Enables more efficient resource utilization

 Provides application control of Data Domain

replication process

 Supports majority of backup software market and

native utilities in industry leading databases

Boost DD

(16)

Industry’s Most Scalable Inline Deduplication Systems

DD160 DD620 DD640 DD670 DD860 DD890 DD990

Speed (DD Boost) 1.1 TB/hr 2.4 TB/hr 3.4 TB/hr 5.4 TB/hr 9.8 TB/hr 14.7 TB/hr 31.0 TB/hr

Speed (other) 667 GB/hr 1.1 TB/hr 2.3 TB/hr 3.6 TB/hr 5.1 TB/hr 8.1 TB/hr 15.0 TB/hr

Logical capacity 40–195 TB 83–415 TB 0.32–1.6 PB 0.6–2.7 PB 1.4–7.1 PB

5.7–28.5 PB

1

2.9–14.2 PB 5.7–28.5 PB

13– 65 PB

1

Usable capacity Up to 3.98 TB Up to 8.3 TB Up to 32.2 TB Up to 55.9 TB Up to 142 TB

Up to 570 TB

1

Up to 285 TB Up to 570 TB

Up to 1.3 PB

1

• DD Boost

• DD Encryption

• DD Extended Retention

Small

Enter./RO

BO

Midsize Enterprise

Large Enterprise

• DD Replicator

• DD Retention Lock

• DD Virtual Tape Library

1

With DD Extended Retention software option

Data Domain Software Options

(17)

Overcome Your Big Data Backup And Recovery Challenges

Data Domain And Big Data

Challenge Solution

Data deluge High speed inline deduplication

Performance Up to 248 TB can be backed up in < 8 hours (31 TB/hr)

Scale Protects up to 65 PB of logical capacity in a single system

Data islands Simultaneously supports NFS, CIFS, VTL, DD Boost and NDMP

(18)

EMC Backup

Solutions For

Big Data

(19)

Big Data Systems

 EMC Greenplum

 Teradata

 EMC Isilon

(20)

EMC Greenplum

(21)

EMC Greenplum: Purpose-built For Big Data

 EMC Greenplum

– A Shared Nothing, Massively Parallel Processing (MPP) Data Warehouse System

 EMC Greenplum Data Computing Appliance (DCA)

– Built Using Standard Hardware Components

– Modular For Easy Scaling

 Core Principle Of Data Computing

– Move The Processing Dramatically Closer to the Data and to the People

Fast Data

Loading

Extreme Performance

& Elastic Scalability Unified

Data Access

(22)

Platform Independence

Delivers Choice and Flexibility

Software-Only

Ideal for Q/A or DR

Virtualized Infrastructure

Ideal for Test & Development

Data Computing Appliance

Ideal for Production Environments

(23)

Greenplum Backup & Recovery

Requirements

1. Backup As New Data Is

Added To Database

2. Backup Database Data And

Configuration Info

3. Recover Data To A Consistent

Point In Time

(24)

Efficient Greenplum Backup

With Data Domain Systems

 Only Certified Backup And Recovery

Solution For Greenplum

 Flexible Deployment – NFS Or DD Boost

 10 To 30x Reduction In Backup Storage

Requirements

 Network-efficient, Encrypted Replication

 Backup All Segment Servers In Parallel

Data Domain

System

Greenplum

Data Computing

Appliance

DD Boost

or NFS

(25)

DD Boost for EMC Greenplum

 Under The DBA’s Direct Control

 Faster, More Efficient Backup And DR

 Simplified Configuration And Administration

Collection Replication

Data Domain

System

Greenplum DCA

DD Boost

Data Domain

System

Local

Data Center Disaster

Recovery Site

WAN

Native

Utility

DD Boost

Database Admin

(26)

Teradata

(27)

Teradata – Data Warehousing

 Database Software, Enterprise Data

Warehousing, Data Warehouse

Appliances, And Analytics

 Tape Based Backup Not Meeting

Requirements

(28)

Teradata Backup Requirements

1. Teradata Backup Archive

Restore (BAR) Certified

2. Coordinates Across All The

Components Of The System

3. Minimizes The Time Data

Stays In Read-only Mode

(29)

Efficient Teradata Backup

With Data Domain Systems

 Certified By Teradata As The

Deduplication Platform Of Choice

 Significantly Enhances Database

Backup And Recovery

– Store More Backups Onsite Longer

– Multisite Disaster Recovery

– Ultra-safe Storage for Reliable Recovery

Data Domain

System

Teradata

Backup

Server

(30)

Teradata And EMC – The Relationship

 Teradata An EMC OEM Partner

 Teradata Provides Implementation And Customer

Support Services

 Teradata Certified Data Domain systems as a BAR

―Advocated Solution‖

– Best Practices Package for Backup and Recovery

– Optimized for Teradata Environments

– Scalability Fits EDW and ADW Needs

(31)

Teradata BAR And Data Domain

 Faster, More Efficient Backup And DR

 Teradata BAR Coordinates Fast Data Extraction

 Flexible Deployment

– DD Boost (NetBackup) or VTL (IBM TSM and BakBone NetVault)

Data Domain

System

Teradata

Data Domain

System

Local

Data Center Disaster

Recovery Site

WAN

DD Boost

DD Boost

NetBackup w/

DD Boost

(32)

EMC Isilon

(33)

EMC Isilon – Scale Out Storage

 Broad Adoption

 Over 2,100 Global Customers

 OneFS - Innovative Scale Out,

Operating Environment

 Industry Leading, Linearly

Scalable Performance

(34)

With NetWorker And Data Domain

Efficient Isilon Backup

 Leverage NetWorker Unified Backup

and Recovery Software

– Industry Standard NDMP

– Advanced Integration With Data Domain

– Ability To Exclude Certain Data Types

 3 Deployment Options

(35)

Option 1

NetWorker Storage Node over IP

 NDMP Backup Over IP To NetWorker Storage Node

 Advanced Integration With DD Boost

– Faster, More Efficient Backup

– NetWorker Control Of Data Domain Replication

Isilon

LAN

Storage Node

with DD Boost

Data Domain

System

NetWorker

Control Path Data Path

Data Path

Data Domain

System

LAN WAN

DD Boost

(36)

Option 2

Data Domain NDMP Tape Server over IP

 Backup Over IP Directly To Data Domain

– Via Data Domain NDMP Tape Server Capability

 With EMC NetWorker Or 3rd Party Backup Software

Isilon

LAN LAN

Data Domain

System

Control Path Data Path

Backup

Software

NetWorker

Data Domain

System

WAN

(37)

Option 3

Isilon Backup Accelerator Over FC

 Backup Over Fibre Channel To Data Domain

– Via Isilon Backup Accelerator Node

 Leverage Existing SAN Infrastructure

Isilon with Backup

Accelerator

LAN FC SAN

Data Domain

System

Control Path Data Path

Backup

Software

NetWorker

Data Domain

System

WAN

(38)

Big Data Backup – We’ve Got You Covered!

 Big Data Backup And Recovery

– Critical As Well As An Unaddressed IT Challenge

 EMC Data Domain Deduplication Storage

System Ideal For Big Data Backup And

Recovery

 Integrated Backup/Recovery Solutions

Available For EMC Greenplum, Teradata And

EMC Isilon

(39)

Contact Your EMC Or Partner Sales Rep for Next Steps

Learn More About EMC Backup: emc.com\backup

Join The Conversation: www.thebackupwindow.com

EMC BACKUP

MEETS BIG DATA

(40)

References

Related documents

services. People are potentially be hesitant because they are afraid they may need a larger procedure and then will not have the funds. May end up going in and then find out they

Whether grown as freestanding trees or wall- trained fans, established figs should be lightly pruned twice a year: once in spring to thin out old or damaged wood and to maintain

The main wall of the living room has been designated as a &#34;Model Wall&#34; of Delta Gamma girls -- ELLE smiles at us from a Hawaiian Tropic ad and a Miss June USC

The concept of flight testing a new technology such as the da Vinci parachute is similar; together they are a good topic for further class discussion..

Twenty-Third European Conference on Information Systems (ECIS), Münster, Germany, 2015 12 Concerning the technological dimension, prior research highlighted the flexibility of a

A few countries did not experience much developments in inequalities, though average declines in real disposable incomes were particularly sharp at all levels of the

Potential explanations for the large and seemingly random price variation are: (i) different cost pricing methods used by hospitals, (ii) uncertainty due to frequent changes in

Players can create characters and participate in any adventure allowed as a part of the D&amp;D Adventurers League.. As they adventure, players track their characters’