EMC BACKUP AND RECOVERY SOLUTIONS
Next Generation Data Protection
Horia Constantinescu
horia.constantinescu@emc.com
Cluj
EMC Backup Recovery Systems Division
• Division HQ: Santa Clara, CA
• 10 R&D locations
– 2,000 employees
• Data protection storage systems
– More than 60,000 systems installed
– More than 45,000 customers
– More than 15,000 PB under protection worldwide
• Global sales, support, and services
EMC Backup and Recovery Market Position
• Avamar
– #1 deduplication backup software worldwide
– 8,000 installations
– 4,400 customers
• Data Domain
– #1 deduplication storage worldwide
– 12,000 installations
– 5,100 customers
• Disk Library
– #1 virtual tape library (VTL) worldwide
– >$1B in sales
• NetWorker
Protection Storage Disrupts Backup Market
[Chart: percentage scale from 10% to 100%; series: Data Protection and Recovery Software, Purpose-Built Backup Appliances, Tape Automation]
Source: IDC Tape Automation Worldwide
Source: IDC
• Tape Marginalized
Purpose-Built Backup Appliances
Open Systems + Mainframe
[Chart: supplier revenue share for EMC, IBM, HP, Oracle, Quantum, Sepaton, FalconStor, Dell, and Others]
2010 Total Market: $1.69B
Source: IDC Purpose Built Backup Appliances, April 2011. Above: Worldwide Supplier Revenue, Total PBBA Market
Backup and Recovery Market Leadership
[Chart: revenue in $ million (axis 0 to 1,800) for EMC, Symantec, IBM, Oracle, Quantum, and CA; series: Data Protection & Recovery Software, Purpose-Built Backup Appliances, Tape Automation (est.)]
Source: IDC 2010 factory revenue worldwide for Tape Automation, Purpose-Built Backup Appliances (PBBA), Data Protection & Recovery Software (DPRS). Note: PBBA and
Backup and Recovery Architectures: In Transition from Tape to Disk
[Diagram: conventional (tape-centric) and transformational (disk-centric) backup/recovery architectures, on premise and off premise]
Stages: Application Backup Clients; Backup/Media Manager; Onsite Backup Storage; Disaster Recovery Storage
Configurations shown: backup software with tape; backup software with VTL and VTL/tape; backup software with deduplication storage; deduplication backup software and system
Products shown: NetWorker, Disk Library, Data Domain, Avamar
Data Protection Management Software
Data Deduplication: Technology Overview
Store more backups in a smaller footprint
Deduplication Impact on Data Size: 10–30 times less data stored versus fulls plus incrementals with typical retention policies
[Diagram: segments contained in each backup]
Friday Full Backup: A B C D A E F G
Mon Incremental: A B H
Tues Incremental: C B I
Weds Incremental: E G J
Thurs Incremental: A C K
Second Friday Full Backup: B C D E F L G H
Segments stored after deduplication: A B C D E F G H I J K L

Backup                   Logical Data   Estimated Reduction   Physical Data
Monday Incremental       100 GB         7–10x                 10 GB
Tuesday Incremental      100 GB         7–10x                 10 GB
Wednesday Incremental    100 GB         7–10x                 10 GB
Thursday Incremental     100 GB         7–10x                 10 GB
Second Friday Full       1 TB           50–60x                18 GB
TOTAL                    2.4 TB         7.8x                  558 GB
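The reduction figures above are ratios of logical data to physical data. As a minimal sketch of that arithmetic (the function name is illustrative, and 1 TB is assumed to be 1,000 GB):

```python
def physical_size_gb(logical_gb: float, reduction: float) -> float:
    """Physical space used when `logical_gb` is reduced by a factor of `reduction`."""
    if reduction <= 0:
        raise ValueError("reduction factor must be positive")
    return logical_gb / reduction

# A 100 GB incremental at 10x reduction occupies 10 GB on disk.
monday = physical_size_gb(100, 10)

# Four such incrementals, Monday through Thursday.
week_incrementals = 4 * monday

# A 1 TB full at 55x (the midpoint of 50-60x) lands near the 18 GB shown above.
second_full = physical_size_gb(1000, 55)

print(monday, week_incrementals, round(second_full, 1))
```

The later full backup reduces far more than the incrementals because almost every segment in it has already been stored.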
It’s Not All Deduplication Out There
Regular storage array: 1:1
LZ compression (whitespace reduction): ~2:1
Single instance storage (file level): ~3:1
Fixed block (fixed blocks, snapshots): ~3:1
Variable segment (backup target, variable segment): ~20:1
Deduplication significantly reduces:
• Replication WAN bandwidth
• Power
• Heat
• Cooling
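One reason variable-segment deduplication can outperform fixed-block approaches is that content-defined boundaries survive insertions, while fixed boundaries all shift. The following is a toy sketch of that effect, not Data Domain's actual algorithm: the window size, the bit mask, and the use of SHA-256 in place of a rolling hash are all assumptions chosen for brevity.

```python
import hashlib
import random

def fixed_chunks(data: bytes, size: int = 64) -> list[bytes]:
    """Cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def variable_chunks(data: bytes, window: int = 16, mask: int = 0x3F) -> list[bytes]:
    """Cut wherever a hash of the preceding `window` bytes ends in six zero bits."""
    chunks, start = [], 0
    for end in range(window, len(data) + 1):
        digest = hashlib.sha256(data[end - window:end]).digest()
        if digest[-1] & mask == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last cut
    return chunks

def shared(old: list[bytes], new: list[bytes]) -> int:
    """Number of distinct old chunks that reappear unchanged in the new stream."""
    return len(set(old) & set(new))

rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))
edited = b"!" + original  # one byte inserted at the front

fixed_reuse = shared(fixed_chunks(original), fixed_chunks(edited))
variable_old = variable_chunks(original)
variable_reuse = shared(variable_old, variable_chunks(edited))

print(f"fixed blocks reused: {fixed_reuse} of {len(fixed_chunks(original))}")
print(f"variable segments reused: {variable_reuse} of {len(variable_old)}")
```

Because each cut depends only on the bytes just before it, every segment after the first one is found again in the edited stream, whereas the fixed blocks no longer line up.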
Centralized Management
FILE SYSTEMS AND SERVER RECOVERY
APPLICATION SUPPORT
REMOTE AND BRANCH OFFICES
VIRTUALIZATION
SAP
Oracle
Microsoft
Tape
Cloud
DEDUPLICATION
EMC STORAGE PLATFORMS
Avamar
Data Domain
NetWorker
Symmetrix
Centera
EMC Avamar and EMC Data Domain
Retain, replicate, recover
Deduplicate everything without changing anything
Simplify backup, archiving, and disaster recovery with easy integration across workloads, infrastructures, and backup software
Never back up the same data twice
Revolutionize your backup by moving less data to solve your toughest VMware, NAS, remote office, and desktop/laptop backup challenges
Data Domain Deduplication Storage Systems
Data Domain Basics
Easy integration with existing environment
Replication
CIFS, NFS, NDMP, DD Boost
Ethernet
Virtual Tape Library (VTL) over Fibre Channel
DD890 appliance
Control Tier
Target Tier
Disaster Recovery Tier
DD890 appliance
2U
2 to 10 ports: 10 and 1 Gigabit Ethernet; 8 Gb/s Fibre Channel
RAID 6
Up to 285 TB usable capacity with shelves
2 TB or 1 TB 7.2K rpm SATA HDD in shelf
File system
NVRAM
N+1 fans and redundant, hot-plug power supplies
Data Integrity: Data Invulnerability Architecture
Other
RAID 6
NVRAM
Snapshots
End-to-end data verification
Checksum
Deduplication, write to disk
Verify
Self-healing file system
Cleaning
Expired data
Defrag
Verify
[Diagram: Generate Checksum and Verify steps applied to data across the File System, Deduplication, Local Compression, and RAID layers]
Verify the file system metadata integrity
Verify user data integrity
Verify stripe integrity
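The end-to-end verification idea above (generate a checksum, write to disk, then read back and verify) can be sketched in a few lines. This is a toy illustration only: the `VerifiedStore` class, the in-memory dictionary standing in for disk, and the choice of SHA-256 and zlib are assumptions, not a description of the Data Domain file system.

```python
import hashlib
import zlib

class VerifiedStore:
    """Toy store that re-reads and verifies every write before acknowledging it."""

    def __init__(self) -> None:
        self._disk: dict[str, bytes] = {}  # stand-in for physical storage

    def write(self, data: bytes) -> str:
        checksum = hashlib.sha256(data).hexdigest()  # generate checksum on ingest
        self._disk[checksum] = zlib.compress(data)   # local compression, then "write to disk"
        self.verify(checksum)                        # read back and verify
        return checksum

    def verify(self, checksum: str) -> None:
        stored = zlib.decompress(self._disk[checksum])
        if hashlib.sha256(stored).hexdigest() != checksum:
            raise IOError(f"integrity check failed for {checksum}")

    def read(self, checksum: str) -> bytes:
        self.verify(checksum)                        # verify again on every read
        return zlib.decompress(self._disk[checksum])

store = VerifiedStore()
key = store.write(b"backup segment")
print(store.read(key) == b"backup segment")
```

Verifying after the write, rather than only at restore time, means a fault is caught while the source data is still available to be backed up again.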
Network-Efficient Replication for True Disaster Recovery
Lowers WAN costs; improves service level agreements
95–99% cross-site bandwidth reduction
Source: Remote sites
Destination: Data Center Hub
Supports hundreds of remote sites
[Diagram: remote sites replicate backup data and archive data over the WAN to a Data Domain system or Data Domain Global Deduplication Array at the hub; each of the three links shown is labeled 1–5%]
Flexible replication: one-to-many, many-to-one, bi-directional, system-to-system, cascaded
DD Boost Software
• Distributes parts of deduplication process to backup server or application clients
– Licensable software works across Data Domain portfolio
• Supports majority of backup software market
– EMC Avamar and NetWorker
– Symantec NetBackup and Backup Exec
• Speeds backups by up to 50 percent
• Process more backups with existing resources
– 20–40 percent less overall impact to backup server
– 80–99 percent less LAN bandwidth
• Enables Data Domain replication management from the backup application
DD Boost for OpenStorage Deployment – NetWorker
[Diagram: Clients and Server with Primary storage, a Backup/media server, Onsite Retention Storage, and Offsite Disaster Recovery Storage across the WAN. Flow labels: Backup; Retention/Restore; Replication; DR; Archive to tape (as required)]
• Data Domain OST plug-in runs on the NetWorker storage node
• Benefits: increases management simplicity and flexibility
– Optimized data transfer over IP
– Single pane: manages backups and replication, retention of copies individually
– Eliminates overhead associated with VTL setup and management
Additional Data Domain Software Options
Data Domain Replicator
• Network-efficient and encrypted
• Transfers only compressed, deduplicated data over the WAN
• Consolidate up to 270 remote sites into a single system
[Diagram: Data Domain replication from Onsite Retention Storage to Offsite Disaster Recovery Storage]
Data Domain Virtual Tape Library
• Easily integrates with Fibre Channel
• Emulates multiple tape libraries
• Supports open systems and IBM i operating environments
Data Domain Encryption
• Inline encryption of data at rest
• Satisfies internal governance rules and compliance regulations
• Protects against theft or loss of a physical system
Data Domain Retention Lock
• File locking to satisfy IT governance and compliance policies
DD Archiver Overview
Cost-optimized long-term retention
• Data Domain system for backup and archive
– Active tier: short-term data protection; less than 90 days
– Archive tier: scalable long-term retention; multiple years
• High-throughput deduplication storage
– Up to 9.8 TB/hr
• Cost optimized for long-term retention
– Up to 570 TB usable, 28.5 PB logical capacity
– Low cost per gigabyte while maintaining high throughput
– Fault isolation of archive units for long-term recoverability
• Leverage existing Data Domain system advantages
– Supports DD Replicator and DD Retention Lock software options
– Data Domain Data Invulnerability Architecture to ensure data
Industry’s Most Scalable Inline Deduplication Systems

Model                        Speed (DD Boost)   Speed (other)   Logical capacity   Raw capacity
DD140                        490 GB/hr          450 GB/hr       9–43 TB            1.5 TB
DD610                        1.3 TB/hr          675 GB/hr       40–195 TB          Up to 6 TB
DD630                        2.1 TB/hr          1.1 TB/hr       84–420 TB          Up to 12 TB
DD670                        5.4 TB/hr          3.6 TB/hr       0.6–2.7 PB         Up to 76 TB
DD860                        9.8 TB/hr          5.1 TB/hr       1.4–7.1 PB         Up to 192 TB
DD890                        14.7 TB/hr         8.1 TB/hr       2.9–14.2 PB        Up to 384 TB
Global Deduplication Array   26.3 TB/hr         10.7 TB/hr      5.7–28.5 PB        Up to 768 TB
DD Archiver                  9.8 TB/hr          4.3 TB/hr       5.7–28.5 PB        Up to 768 TB

Software options: DD Boost, DD Virtual Tape Library, DD Replicator, DD Retention Lock, and DD Encryption
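The speed figures above translate directly into a best-case backup window. A rough sketch, assuming the quoted DD Boost rates are sustained peaks (real throughput depends on the workload) and using a hypothetical helper name:

```python
# Peak DD Boost ingest rates quoted above, in TB/hr (a subset of models).
DD_BOOST_TB_PER_HR = {
    "DD140": 0.49,
    "DD670": 5.4,
    "DD890": 14.7,
    "Global Deduplication Array": 26.3,
}

def min_backup_hours(dataset_tb: float, model: str) -> float:
    """Best-case hours to ingest `dataset_tb` at the model's peak rate."""
    return dataset_tb / DD_BOOST_TB_PER_HR[model]

for model in DD_BOOST_TB_PER_HR:
    print(f"{model}: {min_backup_hours(50, model):.1f} h for 50 TB")
```

Treat the result as a lower bound when sizing a backup window, since the table lists maximum rates.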
Why Avamar?
Deduplication backup system
Up to…
• 95% reduction in data moved
• 90% reduction in backup times
• 50% reduction in disk impact
• 95% reduction in NIC usage
• 80% reduction in CPU usage
Solve Backup Problem Areas
Avamar is ideally suited to the unique requirements of the most challenging environments
NAS: Speeds daily full NDMP backups
VMware infrastructure (VMware ESX): Deliver even greater efficiencies
Remote offices and branch offices: Extend data center best practices
Desktops/laptops
Data Deduplication is the Enabler
[Diagram: segments A, B, C, and D recurring across several data sets and sent over the network]
1. Divides data into sub-file segments
2. Determines if segments are unique or duplicate
3. Backs up only data that is unique
4. Sends data compressed and encrypted
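The four steps above can be sketched as a small client-side loop. This is an illustration under stated assumptions, not Avamar's implementation: fixed-size segments, SHA-256 fingerprints, a Python set standing in for the server's index, and the `backup` function name are all hypothetical, and encryption is omitted to keep the example dependency-free.

```python
import hashlib
import zlib

def backup(data: bytes, server_index: set[str], segment_size: int = 4096) -> list[bytes]:
    """Return the compressed payloads that would actually be sent for `data`."""
    payloads = []
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]       # 1. divide into sub-file segments
        fingerprint = hashlib.sha256(segment).hexdigest()
        if fingerprint in server_index:                    # 2. unique or duplicate?
            continue                                       #    duplicate: nothing to send
        server_index.add(fingerprint)                      # 3. back up only unique data
        payloads.append(zlib.compress(segment))            # 4. send compressed (encryption omitted)
    return payloads

index: set[str] = set()
file_v1 = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
first_run = backup(file_v1, index)
second_run = backup(file_v1, index)
print(len(first_run), len(second_run))
```

The first run sends two segments because the repeated block of `A` is a duplicate within the file; the second run sends nothing because every segment is already known.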
Backing up 1000 VMs – 50 TB in 43 Min

Day                 TB Protected   TB Scanned   Total Time (Min)
Day 1 (250 VMs)     12.5           12.5         68
Day 2 (500 VMs)     25             13.13        79
Day 3 (750 VMs)     37.5           13.75        89
Day 4 (1000 VMs)    50             14.38        101
Day 5 (1000 VMs)    50             2.5          43
Day 6 (1000 VMs)    50             2.5          43
Day 7 (1000 VMs)    50             2.5          43
Real World Results
Avamar daily full backups vs. traditional daily full backups

Data Type                                                    Amount of Primary Data Backed Up   Amount of Data Moved Daily
Windows file systems                                         3,573 GB                           6.1 GB
Mix of Windows, Linux, and UNIX file systems                 5,097 GB                           11.7 GB
Engineering files on NAS (NDMP backups)                      3,265 GB                           24.2 GB
Mix of 20% databases, 80% file systems (Windows and UNIX)    9,583 GB                           80.0 GB
Mix of Linux file systems and databases                      7,831 GB                           104.2 GB

Source: EMC
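To put the figures above in proportion, the share of primary data moved each day can be computed directly. A short sketch using the reported values (the dictionary and function names are illustrative):

```python
# (primary data backed up in GB, data moved daily in GB), as reported above
RESULTS = {
    "Windows file systems": (3573, 6.1),
    "Mix of Windows, Linux, and UNIX file systems": (5097, 11.7),
    "Engineering files on NAS (NDMP backups)": (3265, 24.2),
    "Mix of 20% databases, 80% file systems": (9583, 80.0),
    "Mix of Linux file systems and databases": (7831, 104.2),
}

def daily_moved_pct(primary_gb: float, moved_gb: float) -> float:
    """Percentage of the primary data set that is moved in one daily backup."""
    return 100 * moved_gb / primary_gb

for name, (primary_gb, moved_gb) in RESULTS.items():
    print(f"{name}: {daily_moved_pct(primary_gb, moved_gb):.2f}% moved daily")
```

In every row the daily transfer is under 2% of the protected data, with the database-heavy mixes at the higher end.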
Recover Reliably
[Diagram: Avamar Server with verified checkpoint, utility and spare node, and parity across storage nodes]
• Avamar fault tolerance for reliable protection and access
• Redundant Array of Independent Nodes (RAIN) architecture
• Grid architecture for online scalability and performance
• Daily Avamar server integrity checks
• Data recoverability verified daily
• RAID protection from disk
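Parity across storage nodes lets a grid rebuild the contents of a failed node from the survivors. The slides do not describe Avamar's parity scheme, so the sketch below assumes simple XOR parity over equal-sized blocks purely to illustrate the principle; the node contents and function name are made up.

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One equal-sized block per storage node (illustrative data).
node_data = [b"node-one", b"node-two", b"node-3!!"]
parity = xor_blocks(node_data)  # kept on a separate node

# Node 1 fails: rebuild its block from the surviving nodes plus parity.
survivors = [node_data[0], node_data[2], parity]
recovered = xor_blocks(survivors)
print(recovered == node_data[1])
```

XOR parity tolerates the loss of any single block in the set; tolerating more simultaneous failures requires a stronger code.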
Avamar for Enterprise Applications
• Integration with Data Domain offers speed and scale
• Accelerates backup of larger, high-change-rate environments
• Combined product IP
– Data Domain Boost, Data Domain Stream Informed Segment Layout (SISL) Scaling Architecture, and Data Invulnerability Architecture
– Avamar client plug-ins, VMware integrations and GUI
• All Avamar features and functions persist in the integrated model
[Diagram: Enterprise applications: VM, NAS, DB, SharePoint]