
EMC Data de-duplication not ONLY for IBM i


(1)

EMC Data de-duplication not ONLY for IBM i

(2)

EMC’s focus is IT Infrastructure

EMC is a TECHNOLOGY

(3)

EMC Portfolio

Cloud Infrastructure and Services

[Figure: timeline of EMC acquisitions, 2003 to 2009/2010, grouped by category]

Categories: Services; Virtualization/Data Mobility; Resource Management; Content Management; Availability/Archiving; Consumer/Small Business; Information Security; Data Warehouse; Big Data

Companies shown: Documentum, Ask Once, Document Sciences, X-Hive, Rainfinity, ProActivity, Acartus, Captiva, VMware, Akimbi, Illuminator, Indigo Stone, Dolphin, Interlink, Internosis, Astrum, Smarts, nlayers, Voyence, BusinessEdge, Geniant, Infra, Legato, Avamar, Kashya, Dantz, Mozy, Pi, Iomega, WysDM, Conchango, Data Domain, FastScale, ConfigureSoft, Authentica, Network Intelligence, RSA, Valyd, Tablus, Verid, Archer, Kazeon, Greenplum, Isilon

(4)

Having Great Technology is Not Enough

(5)

Backup System Infrastructure

Every backup environment has a bottleneck.

It may be a VERY FAST bottleneck, but it will determine the maximum throughput obtainable with your system.

(6)

What is deduplication?

Data deduplication (often called "intelligent compression") is a method of reducing storage needs by eliminating redundant data.

Only one unique instance of the data is actually retained on storage media, such as disk or tape.

Redundant data is replaced with a pointer to the unique data copy.
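The idea of keeping one unique copy and replacing repeats with pointers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `DedupStore`, the fixed-size chunking, and the use of SHA-256 fingerprints are choices made for the example, not a description of how any EMC product segments or fingerprints data.

```python
import hashlib


class DedupStore:
    """Toy store (hypothetical): each unique chunk is kept exactly once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes (the unique copies)
        self.files = {}   # file name -> list of fingerprints (the pointers)

    def write(self, name, data):
        pointers = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this fingerprint has not been seen.
            self.chunks.setdefault(fp, chunk)
            pointers.append(fp)
        self.files[name] = pointers

    def read(self, name):
        # Follow the pointers to rebuild the original byte stream.
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def physical_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
store.write("monday", b"AAAABBBBAAAA")
store.write("tuesday", b"AAAABBBB")
print(store.physical_bytes())  # 20 logical bytes written, 8 bytes kept
```

Two backups totalling 20 bytes occupy 8 bytes here because only the chunks `AAAA` and `BBBB` are stored; every other occurrence is a pointer.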

(7)
(8)

EMC DataDomain Mission …

Make this

(9)
(10)

Customer Example: 20x Footprint Reduction

One DD System

180TB stored

8TB of disk used

20x Reduction

Replicated off-site

Red Line = Amount of data written to Data Domain (virtual storage)

Green Line = Disk Space Consumed (physical storage)
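The reduction factor in this example is simply data written divided by disk consumed. A minimal sketch, with a helper name (`reduction_ratio`) invented for illustration:

```python
def reduction_ratio(logical_tb, physical_tb):
    """Footprint reduction: data written divided by disk space consumed."""
    if physical_tb <= 0:
        raise ValueError("physical_tb must be positive")
    return logical_tb / physical_tb


# Figures taken from the slide: 180 TB stored on 8 TB of disk.
print(f"{reduction_ratio(180, 8):.1f}x")  # prints "22.5x"
```

The two figures on the slide work out to 22.5x, so the headline "20x" is presumably a rounded-down value.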

(11)

What are the reasons you do not have de-duplication yet?

Cost (disk is more expensive than tape)

Flexibility (only SAN support)

Performance (disk I/O bottleneck)

Data Safety & Reliability (only one copy on disk)

(12)

Many vendors offer the same. Is it true?

What needs to be compared?

Speed (backup/restore)

Flexibility (various protocols)

Scalability (upgrade options)

Supportability (IBM i, Open Systems, Mainframe)

Simplicity (management, maintenance)

Size (space, de-dupe ratio)

Efficient replication (bandwidth reduction)

Data Safety (the most important), Encryption

Support (good people, local people)

Cost (compare all costs)

(13)

Why EMC Data Domain for IBM i?

Feature: Integrate with Ease and Flexibility
Benefit: Data Domain presents IBM 3584-L32 tape library/libraries and LTO3 (3580-TD3) drives via Fibre Channel to IBM i hosts

Feature: Supportability
Benefit: BRMS and IBM i native commands support

Feature: Retain More Backups with De-Duplication
Benefit: As IBM i data is compressible, it is in the de-duplication wheelhouse
Benefit: Store weeks of full backups on disk in a minimal footprint for rapid database restores

Feature: Recover Data Reliably and Efficiently
Benefit: De-duplication with replication drives WAN-efficient disaster recovery
Benefit: EMC’s Data Domain Data Invulnerability Architecture (DIA) ensures reliable recovery; DIA should resonate well with IBM i customers

Feature: Improve Performance
Benefit: Unrivaled 8+ TB/hr aggregate and 1+ TB/hr single-stream, inline de-duplication
Benefit: Single-stream throughput capabilities are important to understand when considering DB2 backups
Benefit: Data Domain allows for greater parallelization of backups and restores

Feature: Simplify Infrastructure
Benefit: Dedicate virtual resources to each application: each LPAR can have dedicated virtual drives

(14)

Deduplication Statistics for IBM i

Outliers apply

In general, the same concepts apply to IBM i environments as to any other environment in terms of data de-duplication.

The following de-dupe ratios were discovered during the test process:

Banking: 21×

Retail: 24×

Shipping: 52×
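A de-dupe ratio can also be read as a percentage of disk space saved, using saved = (1 - 1/ratio) x 100. The sketch below applies that formula to the three ratios listed above; the function name `space_saved_percent` is made up for the example.

```python
def space_saved_percent(ratio):
    """Convert a de-dupe ratio (e.g. 21 for 21x) to percent of space saved."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1 - 1 / ratio) * 100


# Ratios as reported on the slide.
ratios = {"Banking": 21, "Retail": 24, "Shipping": 52}
for sector, ratio in ratios.items():
    print(f"{sector}: {ratio}x -> {space_saved_percent(ratio):.1f}% less disk")
```

Note how the savings flatten out: going from 21x to 52x more than doubles the ratio but moves the space saved only from about 95% to about 98%.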

(15)
(16)

Performance: CPU-Centric versus Spindle-Bound

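[Figure: Throughput (MB/s) versus Number of Disk Spindles. Axis values shown: 50 and 6,000 (throughput); 50, 100, 150, 200 (spindles). Series: Data Domain, Fibre Channel, SATA. Truncated annotation: "Most deduplication"]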

(17)

Price / Performance: CPU-centric wins over time

Source: http://seagate.com/docs/pdf/whitepaper/economies_capacity_spd_tp.pdf

Improve price / performance along with CPUs

Keep price competitive with tape automation

Alternative

Speed through spindle count

Huge amounts of wasted disk space

(18)

Data Domain – Data Flow

Appliance-based Disk Systems

[Figure: data-flow diagram with ten blocks marked "#" and a box labeled "Hash Table / Previous Stored Versions"]

(19)

Data Domain Core Focus

Speed

SISL (Stream Informed Segment Layout)

Deduplication

Storage

(20)

Stream Informed Segment Layout (SISL)

Summary Vector: memory-based structure to help quickly identify new segments

Segment Locality: data layout to maximize probability of locating duplicates
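The slide does not say how the Summary Vector is built. One common way to get a compact in-memory structure that can quickly say "this segment is definitely new" is a Bloom filter, and the sketch below assumes that design. The class name, bit count, and hash count are all illustrative choices, not product parameters.

```python
import hashlib


class SummaryVector:
    """Bloom-filter sketch (assumed design): no false negatives,
    occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        # num_bits is assumed to be a multiple of 8.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, fingerprint):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + fingerprint).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, fingerprint):
        for pos in self._positions(fingerprint):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, fingerprint):
        # False means the segment is certainly new, so no disk lookup is needed.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(fingerprint))


vector = SummaryVector()
vector.add(b"fingerprint-of-segment-1")
print(vector.might_contain(b"fingerprint-of-segment-1"))  # True
```

A `False` answer lets the system skip the on-disk index entirely for new data; a `True` answer still has to be confirmed against the real index, since a Bloom filter can report false positives.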

(21)

Data Invulnerability Architecture (DIA)

Four key elements of the Data Domain Data Invulnerability Architecture:

End-to-end verification

Fault avoidance and containment

Continuous fault detection and healing

File system recoverability
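The first element, end-to-end verification, can be illustrated with a toy store that reads data back and checks it against a checksum before acknowledging a write, and checks again on every read. This is a generic sketch of the idea, not the DIA implementation; `VerifiedStore` and its in-memory dictionary are hypothetical stand-ins for a real file system.

```python
import hashlib


class VerifiedStore:
    """Toy store (hypothetical) that verifies data after write and on read."""

    def __init__(self):
        self._blocks = {}  # key -> (data, checksum)

    def write(self, key, data):
        checksum = hashlib.sha256(data).hexdigest()
        self._blocks[key] = (bytes(data), checksum)
        # Read back what was just stored and verify it before acknowledging.
        self.read(key)
        return checksum

    def read(self, key):
        data, checksum = self._blocks[key]
        if hashlib.sha256(data).hexdigest() != checksum:
            raise IOError(f"checksum mismatch for {key!r}")
        return data


store = VerifiedStore()
store.write("backup-1", b"payload")
print(store.read("backup-1"))  # b'payload'
```

If the stored bytes are later altered, `read` raises an error instead of silently returning bad data, which is the property a backup target most needs at restore time.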

(22)

Data Domain Basics

Easy Integration with Existing Environments

Protocols:

1 – CIFS

2 – NFS

3 – NDMP

4 – OST

5 – DD Boost

6 – VTL

[Figure labels: Backup & Archive Applications; Control Tier, Target Tier, DR Tier; LAN, SAN, WAN; Replication; DD890 Appliance]

DD890 Appliance

10 and 1 Gb Ethernet; 4 and 8 Gb Fibre Channel

Up to 285 TB usable capacity with disk shelves

Deduplicating file system

(23)

Industry’s Most Scalable Inline Deduplication Systems

System                      Speed (DD Boost)  Speed (other)  Logical capacity  Raw capacity  Usable capacity
DD140                       490 GB/hr         450 GB/hr      9–43 TB           1.5 TB        0.86 TB
DD610                       1.3 TB/hr         675 GB/hr      40–195 TB         Up to 6 TB    Up to 3.98 TB
DD630                       2.1 TB/hr         1.1 TB/hr      84–420 TB         Up to 12 TB   Up to 8.4 TB
DD670                       5.4 TB/hr         3.6 TB/hr      0.6–2.7 PB        Up to 76 TB   Up to 55.9 TB
DD860                       9.8 TB/hr         5.1 TB/hr      1.4–7.1 PB        Up to 192 TB  Up to 142 TB
DD890                       14.7 TB/hr        8.1 TB/hr      2.9–14.2 PB       Up to 384 TB  Up to 285 TB
Global Deduplication Array  26.3 TB/hr        10.7 TB/hr     5.7–28.5 PB       Up to 768 TB  Up to 570 TB
DD Archiver                 9.8 TB/hr         4.3 TB/hr      5.7–28.5 PB       Up to 768 TB  Up to 570 TB

Software options: DD Boost, DD Virtual Tape Library, DD Replicator, DD Retention Lock, and DD Encryption

(24)
(25)

Replication Topologies

Entire “Collection”

Source

Destination

BOOST Backup Image

(26)

Multi-Site Protection for Remote Office

[Figure labels: Remote Sites; Data Center Hub; Data Domain System (shown three times); WAN; Backup Data; Archive Data; Home; DB; DIR A; 1-5% (shown three times)]

(27)
(28)

DD Encryption Software

Industry’s first encryption of deduplicated data at rest

Protects against loss of disk or system

Inline encryption provides immediate protection while preserving deduplication

Works with all protocols and applications

Uses RSA BSAFE® FIPS 140-2 validated cryptographic libraries

Replicate encrypted data

Security officer role for dual authentication

Requires one admin user and one security officer role user for lock, passphrase, and disable functions

Inline: deduplication and encryption before storing

(29)

DD Boost Software

Distributes parts of deduplication process to backup server

Supports majority of backup software market

Symantec NetBackup and Backup Exec

EMC NetWorker

Speeds backups by up to 50%

Process more backups with existing resources

20–40% less overall impact to backup server

80–99% less LAN bandwidth

Enables Data Domain replication management from the backup application
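Moving part of the deduplication work to the backup server saves LAN bandwidth because only segments the target does not already hold need to cross the network. The sketch below shows that general idea only. It is not the DD Boost protocol: the function names, the SHA-256 fingerprints, and the `appliance_index` set are assumptions made for illustration, and the small cost of exchanging fingerprints is ignored.

```python
import hashlib


def fingerprint(segment):
    return hashlib.sha256(segment).hexdigest()


def segments_to_send(segments, appliance_index):
    """Return only segments whose fingerprints the appliance lacks.

    appliance_index is a hypothetical set of fingerprints already stored.
    """
    seen = set(appliance_index)
    to_send = []
    for segment in segments:
        fp = fingerprint(segment)
        if fp not in seen:
            to_send.append(segment)
            seen.add(fp)  # a repeat within this backup is sent only once
    return to_send


def lan_savings(segments, to_send):
    """Fraction of payload bytes that never cross the LAN."""
    total = sum(len(s) for s in segments)
    sent = sum(len(s) for s in to_send)
    return 1 - sent / total if total else 0.0


backup = [b"aaaa", b"bbbb", b"aaaa", b"cccc"]
index = {fingerprint(b"bbbb")}  # the appliance already holds "bbbb"
sent = segments_to_send(backup, index)
print(len(sent), lan_savings(backup, sent))  # 2 0.5
```

In this toy run half of the payload stays off the LAN. Savings in the 80–99% range quoted above would require a correspondingly higher share of segments to be already present on the target.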

(30)

Data Domain Archiver

Data Domain Controller

Active Tier

Archive Tier

Backups

90 days

7 years +

(31)

EMC Deduplication Makes it Better

• Faster

• Greater Scalability

• More Efficient

(32)
