NAME: Yoel Ben-Ari
TITLE: VP Business Development, GH Israel
Archive Before Backup
EMC recommended practice
1. Archive valuable information to tiered infrastructure
2. Back up active production information to disk
3. Retrieve from archive or recover from backup
[Diagram labels: Application; Primary; Archive process; Backup/recovery process]
Customer Challenges: Long-Term Retention
• Requirements:
  – Backup throughput and deduplication
  – Long-term retention model
• Backup platforms today assume months of retention, not years
  – Throughput/capacity ratio
  – No logical isolation of old data
• NAS and cloud
  – Throughput too slow for backups
  – Wrong deduplication, expensive
• Tape still has a place to live
Today’s platforms are for backup or archive, not both
Data Domain Archiver
First long-term retention platform for backup and archive
[Diagram labels: Backup/archive servers; File servers and users; Data Domain Controller; Active tier; Archive units 0 to 2]
System Overview
• Common Data Domain controller, management, and namespace
• Tiered storage: active and archive tiers
• Periodically migrates aging data from active tier to next archive unit
• When full, archive units are sealed for fault isolation, but remain online
Data Movement to Archive Tier
• Data movement policy based on last-modified time
• Only the “ready target” unit may receive tiered data (one way)
• Data movement process runs periodically, subject to throttle
[Diagram labels: Data Domain Controller; Active tier; Archive units 0 to 3; unit states: ready active, ready target]
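The data movement rules above can be sketched in a few lines of Python. This is an illustrative model only, not the Data Domain implementation: the `FileEntry`, `Tier`, and `migrate_aged` names are hypothetical, the throttle is modeled as a per-run byte budget, and moving the oldest files first is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    name: str
    size: int      # bytes
    mtime: float   # last-modified time, seconds


@dataclass
class Tier:
    name: str
    files: list = field(default_factory=list)


def migrate_aged(active, target, now, age_threshold, byte_budget):
    """One periodic run: move files older than age_threshold (by mtime)
    from the active tier to the ready-target unit, oldest first.
    byte_budget is a hypothetical stand-in for the throttle."""
    moved, used = [], 0
    for entry in sorted(active.files, key=lambda e: e.mtime):
        if now - entry.mtime < age_threshold:
            break  # everything after this is newer
        if used + entry.size > byte_budget:
            break  # throttle reached; the rest waits for the next run
        used += entry.size
        moved.append(entry)
    for entry in moved:
        active.files.remove(entry)   # movement is one way: active -> target
        target.files.append(entry)
    return [entry.name for entry in moved]


active = Tier("active", [FileEntry("a", 10, 0.0),
                         FileEntry("b", 10, 50.0),
                         FileEntry("c", 10, 95.0)])
target = Tier("archive-unit-3")
first_run = migrate_aged(active, target, now=100.0,
                         age_threshold=30.0, byte_budget=15)
second_run = migrate_aged(active, target, now=100.0,
                          age_threshold=30.0, byte_budget=100)
```

In this run the first pass moves only `a` because the budget is exhausted, the second pass moves `b`, and `c` stays in the active tier because it was modified too recently.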
Single File System with Scalable Namespace
• All data is visible to users, and file system metadata is in the active tier
• Access through CIFS, NFS, and Data Domain Boost
• Trade-off: much larger capacity vs. access times
  – Archive units may be swapped out of memory, in which case access may see slower response times (less than 1 minute)
[Diagram labels: Active tier; Archive units 0 to 3; unit states: ready active, ready target, ready sealed, standby sealed; access via CIFS, NFS, and Data Domain Boost; data access delay]
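A toy model of the single namespace may help make the trade-off concrete. It assumes a hypothetical `Namespace` class in which the path index lives with the active tier, so every file is always listed, while reading from a unit that is not currently loaded reports a delay. None of these names come from the product.

```python
class Namespace:
    """Hypothetical sketch: one namespace spanning active and archive tiers."""

    def __init__(self):
        self.index = {}            # path -> unit name; kept in the active tier
        self.units = {}            # unit name -> {path: data}
        self.loaded = {"active"}   # units currently resident in memory

    def write(self, path, data, unit="active"):
        self.units.setdefault(unit, {})[path] = data
        self.index[path] = unit

    def listing(self):
        # All data is visible regardless of which tier holds it.
        return sorted(self.index)

    def read(self, path):
        unit = self.index[path]
        delayed = unit not in self.loaded
        if delayed:
            # Stand-in for bringing a swapped-out archive unit back.
            self.loaded.add(unit)
        return self.units[unit][path], delayed


ns = Namespace()
ns.write("/new.bak", b"recent")
ns.write("/old.bak", b"aged", unit="archive-unit-0")
```

Here `ns.listing()` shows both files, the first read of `/old.bak` reports a delay, and later reads of the same unit do not.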
Data Integrity: Data Invulnerability Architecture
• End-to-end data verification
  – Checksum
  – Deduplication, write to disk
  – Verify
• Self-healing file system
  – Cleaning
  – Expired data
  – Defrag
  – Verify
• Other
  – RAID 6
  – NVRAM
  – Snapshots
[Diagram labels: Data; Generate checksum; File System; Deduplication; Local Compression; RAID; Verify: file system metadata integrity, user data integrity, stripe integrity]
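The checksum, write, and verify sequence can be illustrated with a short sketch. It is a simplification under stated assumptions: SHA-256 and `zlib` stand in for whatever checksum and local compression the product uses, a dict stands in for disk, and `write_verified` and `read_verified` are hypothetical names.

```python
import hashlib
import zlib


def write_verified(store, key, data):
    """Generate a checksum on ingest, write, then read back and verify."""
    checksum = hashlib.sha256(data).hexdigest()    # before any transformation
    store[key] = (zlib.compress(data), checksum)   # compress, then "write to disk"
    stored, saved = store[key]                     # read back what was written
    if hashlib.sha256(zlib.decompress(stored)).hexdigest() != saved:
        del store[key]
        raise IOError(f"verification failed after write: {key}")
    return checksum


def read_verified(store, key):
    """Recompute the checksum on read so silent corruption is detected."""
    stored, saved = store[key]
    data = zlib.decompress(stored)
    if hashlib.sha256(data).hexdigest() != saved:
        raise IOError(f"checksum mismatch on read: {key}")
    return data


store = {}
payload = b"backup block " * 100
saved_checksum = write_verified(store, "block-1", payload)
restored = read_verified(store, "block-1")
```

The point of the design is that the checksum is computed before the data passes through compression and storage, so an error introduced by any later layer shows up as a mismatch.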
Fault Isolation
• When the target storage unit becomes full, it is sealed with its metadata and contents
• Failure of a storage unit impacts only its contents
• As a last resort, a sealed unit may be connected to a new controller (joining a new system) so that its contents can be read
[Diagram labels: Data Domain Controller (shown twice); Active tier; Archive units 0 to 3]
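The sealing behavior can be sketched as follows. This is a hypothetical model, not the product's on-disk format: `ArchiveUnit` and `Controller` are invented names, and the unit is sealed when the next write would not fit, which is a simplification of "becomes full". The key idea is that a sealed unit carries its own metadata, so a different controller can read it.

```python
class ArchiveUnit:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity   # bytes
        self.contents = {}         # path -> data
        self.metadata = None       # written into the unit at seal time
        self.sealed = False

    def used(self):
        return sum(len(d) for d in self.contents.values())

    def add(self, path, data):
        if self.sealed:
            raise PermissionError(f"{self.name} is sealed")
        if self.used() + len(data) > self.capacity:
            self.seal()            # full: seal instead of accepting more
            return False
        self.contents[path] = data
        return True

    def seal(self):
        # Metadata is stored with the contents, making the unit self-describing.
        self.metadata = {p: len(d) for p, d in self.contents.items()}
        self.sealed = True


class Controller:
    def __init__(self):
        self.units = {}

    def attach(self, unit):
        if not unit.sealed:
            raise ValueError("only sealed units can join a new system")
        self.units[unit.name] = unit

    def read(self, path):
        for unit in self.units.values():
            if path in unit.metadata:
                return unit.contents[path]
        raise FileNotFoundError(path)


unit0 = ArchiveUnit("archive-unit-0", capacity=10)
accepted = unit0.add("/a", b"12345678")
rejected = unit0.add("/b", b"12345")   # does not fit, so the unit is sealed
replacement = Controller()
replacement.attach(unit0)
```

Because the replacement controller needs nothing beyond the unit itself, losing the original controller or another unit does not affect what `archive-unit-0` holds.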
Disaster Recovery Configuration
• Data center: active tier, archive units 0 to 3
• Remote disaster recovery site: active tier, archive units 0 to 3
• Unit-to-unit collection replication over the WAN
• Single-unit recovery
Why DD Archiver?
Long-Term Retention of Backup and Archive Data
• Cost-effective scalability (570 TB usable; up to 28.5 PB logical)
• Fault isolation of archive units ensures long-term data retention
• Modular upgrades and migrations
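The two capacity figures above imply a data reduction ratio, which a quick calculation makes explicit. The sketch assumes decimal units (1 PB = 1000 TB); the function name is hypothetical.

```python
def implied_reduction_ratio(usable_tb, logical_pb):
    """Ratio of logical (pre-deduplication) capacity to usable capacity.
    Assumes decimal units: 1 PB = 1000 TB."""
    return (logical_pb * 1000) / usable_tb


ratio = implied_reduction_ratio(usable_tb=570, logical_pb=28.5)
```

With the figures from the slide the ratio comes to 50, so the "up to" logical capacity corresponds to roughly 50x reduction from deduplication and compression.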
High Throughput Deduplication Storage
•