• Today’s Backup Challenges
• Benefits of Deduplication
• Source and Target Deduplication
• Introduction to EMC Backup Solutions
– Avamar, Disk Library, and NetWorker
Data growth is unavoidable
y
Exponential growth in backup
–
Typically represents a factor of 4-30x
plus production capacity
–
Daily, weekly, and monthly full
backups kept for months or years
y
New requirements to keep more
data for longer periods
–
Cost for management, media, and
offsite storage costs multiply
y
24x7 data center reality
–
No good time to run backups
–
Bandwidth limitations
–
Virtualization drives consolidation
Figure: Amount of digital information created and replicated each year (x-axis: 1996–2011). Digital information grows at ≈60% CAGR, from 173 billion gigabytes to 1,773 billion gigabytes (1.773 zettabytes).
Source: IDC White Paper, “The Diverse and Exploding Digital Universe”, March 2008, sponsored by EMC
Today’s Backup Challenges
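As a quick sanity check on the chart's figures, the sketch below works out how many years of compounding at 60% are needed to grow from 173 to 1,773 billion gigabytes, and what annual rate a five-year span would imply. The five-year span is an assumption made for illustration; the slide does not state which years the two figures belong to.

```python
import math

# Figures quoted on the chart (billion gigabytes) and the stated growth rate.
start_bn_gb = 173.0
end_bn_gb = 1773.0
cagr = 0.60

# Years of compounding needed so that start * (1 + cagr) ** years == end.
years = math.log(end_bn_gb / start_bn_gb) / math.log(1.0 + cagr)

# Conversely, the annual rate implied by an assumed five-year span.
assumed_span_years = 5  # assumption, not stated in the source
implied_cagr = (end_bn_gb / start_bn_gb) ** (1.0 / assumed_span_years) - 1.0

print(f"years at 60% CAGR: {years:.2f}")
print(f"implied CAGR over {assumed_span_years} years: {implied_cagr:.1%}")
```

Both results sit close to five years and 60%, so the two endpoint figures and the quoted growth rate are mutually consistent.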
“The process of detecting and identifying the unique data segments within a
given set of information, enabling the elimination of redundancy when stored
or moved.”
Before: total segments = 39
After: Unique segments = 6
Data Set 3
Data Set 2
Data Set 1
Deduplication
Figure: three backup instances of the same data set. Only unique data segments are backed up.
1. First Instance (May 2007): segments A, B, C, D. Unique data stored on disk, available for immediate recovery.
2. Duplicate Instance (May 2007): segments A, B, C, D. Data already backed up, so only a unique ID pointer is stored (20 bytes).
3. Modified Instance (June 2008): segments B, C, D, E. New data segment (E) identified and backed up.
Data Deduplication: How it Works
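The first, duplicate, and modified instances can be mimicked with a toy content-addressed store. This is a minimal sketch, not any product's implementation: the class and method names (`SegmentStore`, `backup`, `restore`) are hypothetical, segments are single letters rather than variable-length chunks, and SHA-1 is chosen only because its 20-byte digest happens to match the pointer size quoted on the slide. The slide does not say which hash is used.

```python
import hashlib


class SegmentStore:
    """Toy content-addressed store that keeps one copy per unique segment."""

    def __init__(self):
        self.segments = {}  # digest -> segment bytes (the unique data on disk)

    def backup(self, segments):
        """Back up a list of segments; return (pointers, newly stored count)."""
        pointers = []
        new_count = 0
        for seg in segments:
            digest = hashlib.sha1(seg).digest()  # 20-byte ID (assumed hash choice)
            if digest not in self.segments:
                self.segments[digest] = seg  # new segment: store the data once
                new_count += 1
            pointers.append(digest)  # known or new, the backup keeps a pointer
        return pointers, new_count

    def restore(self, pointers):
        """Rebuild a backup from its list of pointers."""
        return [self.segments[p] for p in pointers]


store = SegmentStore()
first = [b"A", b"B", b"C", b"D"]
duplicate = [b"A", b"B", b"C", b"D"]
modified = [b"B", b"C", b"D", b"E"]

_, new_first = store.backup(first)
_, new_dup = store.backup(duplicate)
ptrs_mod, new_mod = store.backup(modified)

print(new_first, new_dup, new_mod, len(store.segments))
```

The first instance stores four segments, the duplicate stores none, and the modified instance stores only E, while every instance remains fully restorable from its pointers.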
DEDUPLICATION AT TARGET
DEDUPLICATION AT SOURCE
Source
• Client software agents identify repeated sub-file data segments at the source
• Only new, unique segments are transferred across the network and stored to disk
• Shorter backup window, reduces daily impact on physical/virtual infrastructure
Target
• Backup application sends native data to a target storage device
• Data is deduplicated once it reaches the target, during or after the backup
• Found in VTLs or LAN B2D appliances
• Transparency to backup application offers users a “plug and play” experience
IMMEDIATE DEDUPLICATION AT SOURCE
Immediate at the source, before data is sent across the network
• Data is deduplicated at source (client)
• Ideal for slow, congested infrastructure (e.g. remote offices, VMware)
• Leverages existing network links and infrastructure for fast, daily full backups
IMMEDIATE OR SCHEDULED DEDUPLICATION
Immediate, while the backup is running
• Content is deduplicated while backup happens
• Ideal for when the backup window is not a limiting design factor, and for optimizing capacity
Scheduled, after some or all backup is complete
• Content stored in original format, dedupe later
• Well-suited for optimal performance in tight
When Can Data Deduplication Occur?
SOURCE AND TARGET DEDUPLICATION
These factors apply for all backup deduplication technologies
Data deduplication performance is tied to a number of factors; even small variations can have a significant impact
Factors that Impact Data Deduplication Ratios
• Type of data
– Duplication in user-generated data is greater than from natural sources
– Encrypted and compressed data are not ideal candidates for dedupe
– More user created content = higher deduplication ratio
• Data change rate
– Small data change rates = more duplicate data in subsequent backups
– Less change = higher deduplication ratio
• Retention policy
– Longer retention increases chances data will be repeatedly backed up
– Longer retention policy = higher deduplication ratio
• Ratio of full backups to incremental backups
– More full backups increase the amount of data being repeatedly backed up
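The direction of the change-rate and retention effects can be illustrated with a deliberately simplified model. It is an assumption made for illustration, not a vendor formula: every backup is a full backup of a fixed-size data set, the first full stores everything, and each later full stores only the fraction that changed. Compression and metadata overhead are ignored, and the function name `dedup_ratio` is hypothetical.

```python
def dedup_ratio(retained_fulls: int, change_rate: float) -> float:
    """Logical-to-stored ratio for `retained_fulls` full backups of one data set.

    Simplified model: the first full stores the whole data set (1 unit);
    each later full stores only `change_rate` units of new data.
    """
    if retained_fulls < 1 or not 0.0 <= change_rate <= 1.0:
        raise ValueError("need at least one backup and a change rate in [0, 1]")
    logical = float(retained_fulls)  # what the backups would occupy without dedupe
    stored = 1.0 + (retained_fulls - 1) * change_rate  # what is actually kept
    return logical / stored


for fulls in (7, 30, 90):
    for change in (0.01, 0.05):
        ratio = dedup_ratio(fulls, change)
        print(f"{fulls:>3} fulls, {change:.0%} change -> {ratio:.1f}:1")
```

Under these assumptions the ratio rises with the number of retained fulls and falls as the change rate grows, which matches the qualitative claims in the list above.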
REPLICATE AFTER DEDUPLICATION
Backup deduplication
Without deduplication
• No reduction in local backup storage
• No reduction in replication time or bandwidth
• No reduction in offsite storage
Leveraging deduplication
• Reduced local backup storage
• Reduced replication time and bandwidth
• Reduced offsite storage
OFFSITE REPLICATION WITHOUT DEDUPLICATION
Primary Site Remote Site Primary Site Remote Site
Data Deduplication Impact
Remote Replication and Bandwidth Requirements
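To make the replication-time claim concrete, the sketch below estimates transfer time over a WAN link with and without deduplication. All inputs (1,000 GB nightly backup, 100 Mb/s link, 20:1 ratio) are hypothetical values chosen for illustration, and the model assumes the link is fully available and that deduplication reduces the transferred data by exactly the stated ratio.

```python
def replication_hours(data_gb: float, link_mbps: float, dedup_ratio: float = 1.0) -> float:
    """Hours needed to replicate `data_gb` over a `link_mbps` link.

    `dedup_ratio` of 1.0 means no deduplication. Decimal units are assumed
    (1 GB = 8,000 megabits).
    """
    if data_gb < 0 or link_mbps <= 0 or dedup_ratio < 1.0:
        raise ValueError("invalid input")
    megabits = data_gb * 8 * 1000 / dedup_ratio
    seconds = megabits / link_mbps
    return seconds / 3600


# Hypothetical inputs for illustration only.
nightly_backup_gb = 1000.0
wan_mbps = 100.0

plain = replication_hours(nightly_backup_gb, wan_mbps)
deduped = replication_hours(nightly_backup_gb, wan_mbps, dedup_ratio=20.0)
print(f"without deduplication: {plain:.1f} h")
print(f"with 20:1 deduplication: {deduped:.1f} h")
```

In this model the transfer time scales inversely with the deduplication ratio, so the same link can carry the replication in a fraction of the time, or a smaller link can be used for the same window.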
• Lowers infrastructure costs
– Reduces backup infrastructure requirements
– Reduced power, cooling, and floor space
• Enables longer backup retention periods
– Less data is easier and less costly to manage
– Meets regulatory requirements
• Improves data protection
– Daily full backup now achievable
– Disk-based backup also speeds restore times
• Improved security
– Disk eliminates risks of lost tapes
DEDUPLICATION
Disk Library Family
• Backup-to-disk solution, now with the power of policy-based data deduplication
• Works with existing backup applications and infrastructure
• Flexible solutions from small to large environments
• High performance, direct tape creation, and HA architecture
EMC Avamar
• Complete backup and recovery solution
• Dedupes at the source and globally
• Single step recovery
• Integrated HA (RAIN)
• Flexible deployment (e.g. Data Store, virtual appliance, SW only)
EMC NetWorker
• Industry-leading backup and recovery software
• Integration with both Avamar and Disk Library
NetWorker
Avamar
Disk Library
EMC Data Deduplication Backup Solutions
• Full-featured backup solution
– Software and hardware with data deduplication
• Source-based, global data deduplication
– Reduces data at source (client)
– Reduces data globally (at backend disk)
• Fast, daily full backups
– Up to 10x faster daily full backups
– Leverages existing infrastructure
• Integrated high availability and reliability
– RAIN for high availability and fault tolerance
– Avamar server and data recoverability verified daily
• Flexible deployment options
– Avamar software
– Avamar Data Store
– Avamar Virtual Edition for VMware environments
Avamar Data Store
Scalable, turnkey solution for small offices to datacenters
EMC Avamar
EMC Avamar: Real-World Results
Data Type | Amount of Primary Data Backed Up | Amount of Data Moved Daily | Daily De-duplication Ratio
Windows file systems | 3,573 GB | 6.1 GB | 586:1
Mix of Windows, Linux, and UNIX file systems | 5,097 GB | 11.7 GB | 436:1
Engineering files on NAS (NDMP backups) | 3,265 GB | 24.2 GB | 135:1
Mix of 20 percent databases, 80 percent file systems (Windows and UNIX) | 9,583 GB | 80.0 GB | 120:1
Mix of Linux file systems and databases | 7,831 GB | 104.2 GB | 75:1
Source: EMC
Avamar Daily Full Backups vs. Traditional Daily Full Backups
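The daily ratios in the table appear to be the amount of primary data backed up divided by the amount of data moved that day. The short check below recomputes each ratio from the table's own figures; rounding to the nearest integer is an assumption about how the quoted values were derived, and the helper name `daily_ratio` is hypothetical.

```python
# (data type, primary data backed up in GB, data moved daily in GB, quoted ratio)
rows = [
    ("Windows file systems", 3573.0, 6.1, 586),
    ("Mix of Windows, Linux, and UNIX file systems", 5097.0, 11.7, 436),
    ("Engineering files on NAS (NDMP backups)", 3265.0, 24.2, 135),
    ("20% databases, 80% file systems (Windows and UNIX)", 9583.0, 80.0, 120),
    ("Mix of Linux file systems and databases", 7831.0, 104.2, 75),
]


def daily_ratio(primary_gb: float, moved_gb: float) -> int:
    """Primary data divided by data moved, rounded to the nearest integer."""
    return round(primary_gb / moved_gb)


for name, primary_gb, moved_gb, quoted in rows:
    computed = daily_ratio(primary_gb, moved_gb)
    print(f"{name}: computed {computed}:1, quoted {quoted}:1")
```

Each computed value agrees with the quoted ratio, which supports reading the last column as primary data over data moved.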
Avamar Success Story: Corporate Express
Before Avamar
• Storage demands were rapidly increasing
• Tape library was reaching slot capacity and upgrading was not ideal due to age and maintenance costs
• Needed to control costs and simplify data management
• Backup and disaster recovery were time-consuming
With Avamar
• Reduced stored data by more than 50%, from 92 TB to 44 TB
• Achieved significant financial savings
• Enabled disk-based backups to be completed in 30 minutes, compared with 6 hours in the past for tape
• Reduced restoration times for business-critical data from 24 hours to minutes
Time Shortened, Costs Reduced for Remote Office/Branch Office Backup
“We were blown away by the simplicity of the management interface and the comprehensive capabilities offered by Avamar. After carrying out a proof of concept, we clearly understood the benefits Avamar would bring to our business.”
• Virtual tape libraries and LAN backup-to-disk platforms
• Policy-based deduplication
• IP or SAN connectivity
• IP replication of deduplicated content
• Industry-proven CLARiiON back-end
– High performance
– Five-nines (99.999%) reliability
Disk Library Family
• Up to 8 TB/hr performance
• 4–674 TB scalability
• Hardware compression
• Energy-efficiency options
• Consolidated media management
• IP or SAN connectivity
• IP replication
Data Deduplication Capabilities for All Platforms
EMC Disk Library Family
• DL3D 1500
– 4–36 TB capacity
– Up to 720 GB/hour performance (SAN)
• DL3D 3000
– 8–148 TB capacity
– Up to 1.44 TB/hour performance (SAN)
• Policy-based data deduplication
– Select ‘Immediate’ or ‘Scheduled’ deduplication
– Optimize for storage utilization or for backup performance
• Replication of deduplicated content for HA
– Up to 10 sources to one target
– Data encryption (128-bit AES) with ability to turn on/off
DL3D 3000
– 8 Gigabit Ethernet ports for CIFS/NFS
– 4 Fibre Channel SAN ports (VTL)
– 4 TB upgrades
– 3-year Enhanced warranty
DL3D 1500
– 6 Gigabit Ethernet ports for CIFS/NFS
– 2 Fibre Channel SAN ports (VTL)
– 4 TB upgrades
– 3-year Enhanced warranty
New LAN-based backup-to-disk platforms with Data Deduplication
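For backup-window planning, the quoted peak throughput figures can be turned into a rough ingest time. The sketch below uses only the maximum capacity and peak SAN throughput listed for each model; it assumes decimal units and sustained peak throughput, which real workloads may not reach, and the helper name `ingest_hours` is hypothetical.

```python
def ingest_hours(data_tb: float, throughput_tb_per_hour: float) -> float:
    """Hours to ingest `data_tb` at a sustained rate (decimal TB assumed)."""
    if data_tb < 0 or throughput_tb_per_hour <= 0:
        raise ValueError("data must be non-negative and throughput positive")
    return data_tb / throughput_tb_per_hour


# (maximum capacity in TB, peak SAN throughput in TB/hour) as quoted on the slide.
models = {
    "DL3D 1500": (36.0, 0.72),
    "DL3D 3000": (148.0, 1.44),
}

for name, (max_capacity_tb, peak_tb_per_hour) in models.items():
    hours = ingest_hours(max_capacity_tb, peak_tb_per_hour)
    print(f"{name}: about {hours:.0f} h to ingest {max_capacity_tb:.0f} TB at peak rate")
```

The same function can be called with a nightly backup size instead of the full capacity to check whether a given model fits a particular backup window.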
• Based on proven CLARiiON CX3-80 arrays
– Single or dual engine systems
• Over 1 PB of usable compressed capacity
– 1 TB SATA drives; up to 930 drives
• Enhanced system throughput
– Hardware compression
– First and only end-to-end 4 Gb/s solution
• Policy-based data deduplication
– Optimize performance; reduce storage and replication costs
• Energy-efficient
– Automatic drive spin-down and low-power drives
DL4000 Series
Industry’s only virtual tape library, built from the ground up with 4 Gb/s components
Industry’s Most Popular SAN VTL—Now with Deduplication
DL4000 Series
Disk Library Family
Success Story: Oil & Gas
Time Shortened for Backup and Restore
Before Disk Library
• Not meeting backup windows
• Needed to speed restores
With Disk Library
• Provided flexibility and control to increase performance
• Increased overall performance to meet backup windows
• Provided simplicity, reliability, and more efficient management
• Generated significant cost savings
Oil & Gas
Disk Library 3000 policy-based de-duplication provides the flexibility and control to optimize ingest performance and overall
EMC NetWorker
• Centralized control of traditional and next-generation backup
– Combining today’s technologies with tomorrow’s in a common framework
• Industry-leading global data deduplication
– Reduces backup storage by up to 50x and data moved by up to 500x; ideal for VMware environments
• Broad backup to disk
– Disk library integration, replication, snapshot management, continuous data protection, and NAS backup to disk
• Enterprise performance
– Secure backups and reliable recoveries
• Better recoverability from tape backups
– Future-proofed Open Tape Format with better recoverability from damaged tape media
Complete Backup and Recovery from EMC
NetWorker Server and Management Console
EMC NetWorker and Deduplication
• NetWorker client and Management Console communicate with Avamar
• Avamar appears as a NetWorker dedupe node enabled via client properties
• NetWorker manages metadata and data sent to the dedupe node
Avamar
Storage Node
Server and Storage
Dedupe Node
NetWorker Clients
• Integrated deduplication
– Select source or target based on need
– Optimize dedupe for the greatest benefit
• Source using NetWorker client, integrated with Avamar
– Managed via NetWorker for client config, schedules and policies, monitoring and reporting, full indexing, etc.
• Target using EMC Disk Libraries
– DL 1500/3000 for LAN backup-to-disk or VTL
– DL 4000 and optional policy-based deduplication
• Keep your backup infrastructure running smoothly with EMC Data Protection Advisor
DL3D 1500/3000 DL 4000
NetWorker Clients
NetWorker
Avamar Data Store
EMC NetWorker and Deduplication
Manage Source and Target Data Deduplication
NetWorker Success Story: Retail
NetWorker integrated with Avamar provides an efficient solution for the centralized management of data deduplication and backup
Retail
Centralized Backup Management, Increased Efficiencies, Reduced Costs
Before NetWorker
• 85% of the environment was virtualized on VMware
• Restores unreliable and difficult to manage
With NetWorker
• Provided integration with Avamar to deduplicate the VMware environment, reducing the size of file system backups
• Offered centralized backup management
• Depends on:
– Application and data type
– Service-level requirements
– Current backup challenges and environment
• EMC tools are available to help you understand the benefits of each solution
– Deduplication analyzer tools
– TCO tools
– Backup, e-mail, and file system assessments
Let Us Help You Determine the Right Solution
Which EMC Deduplication is Right for You?
• Comprehensive, integrated set of deduplication solutions
– Avamar, Disk Library, NetWorker…
– Saves money and drives efficiencies throughout the backup and recovery lifecycle
• Only vendor that can deliver a deduplication solution for any customer need
– From refresh to redesign of the backup and recovery infrastructure
– Tailored to the size of your company, specific need, and budget