Welcome To
Disaster Recovery 2.0
Presented by Bob Graney
Food for Thought
“To go forward, you must back up.”
- Cardinal rule of computing
“If it wasn’t backed up, then it wasn’t important.”
Agenda
• Realities of Disaster Recovery Today
• Industry Trends / Statistics
• Disaster Recovery Components
• External Waves
- Financial Tsunami
- Going Green
Questions Posed
• Disk based backup versus tape based. Do I still need tape?
• Replication options: host, network, application specific and array
based solutions. Which is the right approach for my organization?
• How can snapshots be used to improve recovery times?
• Why is virtualization having such a big impact on DR strategies?
• What infrastructure systems need to be “self healing” to ensure
application recovery?
• How do WAN accelerators fit into the DR strategy? Do they work in
all scenarios (live replication versus replicating backups offsite)?
DR Realities
• Production environments have grown more complex
with increasing availability requirements
- 24 X 7 “always on” Internet apps
- Work from home → increased availability for all apps
- Days of kicking off the backup job at 6:00 PM and sending the tapes offsite the next morning are going or gone!
DR Realities
• Has your DR capability kept current?
- Are you trapped in a tape based recovery strategy?
- Are critical applications missing from DR strategy?
• Does your test program validate recovery objectives?
- Can you hit the defined RTOs?
- Time for declaration, travel etc. included? Are they realistic?
- Good mix of exercises across all critical applications?
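One way to check whether an RTO is realistic is to add up every step between the incident and restored service, including declaration and travel. The sketch below does that in Python. The step names and durations are illustrative assumptions, not figures from this presentation.

```python
# Hypothetical recovery timeline, in hours. All names and values are illustrative.
recovery_steps_hours = {
    "assess incident and declare disaster": 4.0,
    "travel to recovery site": 6.0,
    "restore infrastructure (AD, DNS, backup catalog)": 5.0,
    "restore application and data": 8.0,
    "validate and release to users": 2.0,
}


def rto_gap(steps_hours, rto_hours):
    """Return (total elapsed hours, hours over the RTO).

    A negative second value means the plan has that much margin.
    """
    total = sum(steps_hours.values())
    return total, total - rto_hours


total, gap = rto_gap(recovery_steps_hours, rto_hours=24.0)
print(f"Estimated recovery time: {total:.1f} h, gap versus RTO: {gap:+.1f} h")
```

With these made-up numbers, the restore work alone fits a 24-hour RTO, but the plan misses it once declaration and travel are counted.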
DR Proclamations
1. Tape is dead – Long live Disk!
• Not quite dead yet, but only disk for critical DR
o Tape may be only option for single site and SMBs
o Still needed for long term retention (archive) requirements
o Also good for less critical systems with RTO in days/weeks
2. Infrastructure needs to be resilient
• “Recover AD” better not be Step One of DR plan
• Has the tape catalog been recovered at the DR site?
3. Work from Home has become, or will be, the dominant work
area recovery platform
The Need for Resiliency
[Figure: THEN (Disaster Recovery) versus NOW (Operational Resiliency)]
Attribute        THEN                          NOW
Outage Duration  Long-Term                     Immediate
Environment      SPOF                          Redundant
Service Level    Degraded Performance          Near Equivalent
Data Currency    Time of Last Backup           No Data Loss
Alternate Site   Shared                        Dedicated
Budget           Cost Containment              Return on Investment
Data Base        Full Recovery & Roll Forward  Restartable
…
DR 2.0 Planning
• It is the maintenance that kills you
- How manual is your update process?
• Do you have a tool or database in place?
- Is there a current application / server map?
o All applications or specific to certain data centers? Just “in scope” or just DR?
DR 2.0 Testing
• Is there a documented Test schedule? Enforced?
- Different examples of two tests per year
• [1] mainframe & [1] distributed application
• [1] big (almost all) & [1] little (NetBackup, one DB)
• [1] 50% of apps & [1] the other 50%
- Another approach
• All apps with defined DR solution are tested every 18 months
• Plans updated/enhanced as a result of testing?
• Are walkthroughs done effectively?
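A test schedule is easier to enforce when it is checked automatically. The sketch below flags applications whose last DR test is 18 or more months old, following the "every 18 months" approach above. The application names and dates are hypothetical.

```python
from datetime import date

MAX_MONTHS_BETWEEN_TESTS = 18  # from the "tested every 18 months" approach


def months_between(earlier, later):
    """Whole calendar months elapsed from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not complete yet
    return months


def due_for_test(last_tested, today):
    """Return the sorted names of applications due for another DR test."""
    return sorted(
        app
        for app, tested_on in last_tested.items()
        if months_between(tested_on, today) >= MAX_MONTHS_BETWEEN_TESTS
    )


# Hypothetical test log: application name -> date of last DR test.
last_tested = {
    "Exchange": date(2008, 1, 15),
    "Document management": date(2007, 3, 1),
    "Payroll": date(2006, 11, 20),
}

print(due_for_test(last_tested, today=date(2008, 9, 1)))
```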
DR 2.0 Data Center Availability
                                  Tier One  Tier Two  Tier Three               Tier Four
Building Type                     Tenant    Tenant    Stand-Alone              Stand-Alone
Delivery path (Power)             One       One       One Active, One Passive  Two Active
Delivery path (Cooling)           One       One       One Active, One Passive  Two Active
Redundant Components              N         N+1       N+1                      2(N+1) or S + S
Concurrently Maintainable         No        No        Yes                      Yes
Site Availability                 99.67%    99.75%    99.98%                   100.00%
Hours of IT downtime due to site  28.8 hrs  22.0 hrs  1.6 hrs                  0.8 hrs
The Uptime Institute, Inc. has developed a four Tier classification approach to facility infrastructure functionality that addresses the need for a common benchmarking standard. Availability considerations on site infrastructure should use this standard.
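The availability percentages and downtime hours in the tier table are two views of the same quantity. The sketch below converts between them, assuming a non-leap year of 8,760 hours. The computed values land close to the table's figures but not exactly on them, presumably because the published figures are rounded.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_percent):
    """Expected hours of downtime per year at a given site availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def availability_percent(downtime_hours):
    """Site availability implied by a given number of downtime hours per year."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


# Availability figures for the first three tiers, taken from the table above.
for tier, availability in [("Tier One", 99.67), ("Tier Two", 99.75), ("Tier Three", 99.98)]:
    print(f"{tier}: {annual_downtime_hours(availability):.1f} hrs of downtime per year")

# The Tier Four downtime figure converted back to an availability percentage.
print(f"0.8 hrs of downtime per year is {availability_percent(0.8):.3f}% availability")
```

The last line shows why the Tier Four availability displays as 100.00% when rounded, even though the table still lists some downtime.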
DR 2.0
• Migration from Hot Site vendors to Internal
- Coincides with migration to DR replication solutions
- Co-Lo facilities as part of Internal strategy
• Outsourcing getting multi-faceted
- Infrastructure support
- Software as a Service (SaaS)
- Cloud Computing
DR 2.0 Cloud Computing
• It started with SaaS
- Salesforce.com / LDRPS
• Cloud vendors now looking to provide scalability & performance flexibility
• Vendors
– Web 2.0 (Google, Amazon)
– Traditional IT (IBM, MS)
DR 2.0 Cloud Computing
Cloud – DR Considerations
• Risk Profile has changed
- Risk is diversified by moving app/data off corporate network
• Negotiate/define DR SLAs in contract
• Consider compliance requirements
DR 2.0 WAN Accelerators
• Who are the players?
- Riverbed has mindshare
- Also Cisco, Blue Coat, Juniper
• Types of environments
- Branch to Data Center vs. Data Center to Data Center
- Application Specific WAN Optimizers
• Common apps are CIFS, HTTP
• Also Oracle, SQL
• Drivers to implementation
- Data Center consolidation
- Software as a Service (SaaS)
- Branch environments
DR 2.0 Case Study
International Law Firm
Critical Applications
- MS Exchange
- Files / Doc Mgmt
DR 2.0 Case Study
WAN Accelerators
- Eliminated resync bottleneck
- Kept bandwidth requirements down
- Reduced data loss (RPO)
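The link between WAN optimization, resync time and RPO can be estimated with simple arithmetic. The sketch below is a rough calculation only: the data volume, link speed and reduction ratio are hypothetical, and protocol overhead and latency are ignored.

```python
def transfer_hours(data_gb, link_mbps, reduction_ratio=1.0):
    """Hours needed to send data_gb over a link_mbps link.

    reduction_ratio is the assumed data reduction from WAN optimization
    (1.0 means no reduction). Decimal units: 1 GB = 8 * 10**9 bits.
    """
    if link_mbps <= 0 or reduction_ratio <= 0:
        raise ValueError("link speed and reduction ratio must be positive")
    bits_on_wire = data_gb * 8 * 1000**3 / reduction_ratio
    seconds = bits_on_wire / (link_mbps * 1000**2)
    return seconds / 3600


# Hypothetical case: 100 GB of changed data to resync over a 45 Mbps link.
plain = transfer_hours(100, 45)
optimized = transfer_hours(100, 45, reduction_ratio=4.0)
print(f"Without optimization: {plain:.1f} h, with an assumed 4x reduction: {optimized:.1f} h")
```

A shorter resync window means the replica falls less far behind production, which is how bandwidth savings translate into a lower RPO.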
DR 2.0
• Work from Home Deployment
- Supplement / replace fixed workarea recovery
• Major Deployments
- VPN
- Citrix
DR 2.0 Recovery Tier Chart
Tier  Criticality       RTO              RPO                Investment
0     Self-Healing      Immediate        PoF¹               Very High
1     Mission Critical  <24 hrs          PoF or Intra-Day²  High
2     Critical          24 to 48 hrs     Intra-Day or LC³   Moderate to High
3     Required          48 to 96 hrs     LC                 Moderate
4     Non-Critical      96 hrs - 1 Week  LC                 Low
5     Deferrable        As time allows   LC                 $'s ATOD*
Application Tier Examples
Tier 0: DNS, LDAP, Active Directory, Authentication
Other tiers: To be determined by BIA activity
* At Time of Disaster
¹ Point-of-Failure – data protected to the time of disaster
² Intra-Day – data protected periodically during the business day
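The RTO bands in the tier chart can be expressed as a small lookup, which is handy when classifying applications during a BIA. This is a minimal sketch of one reading of the chart. The band edges (for example, treating exactly 24 hours as Tier 2 and one week as 168 hours) are assumptions.

```python
# Upper RTO bound in hours for tiers 2 to 4, as read from the chart.
UPPER_BOUNDS = [
    (2, "Critical", 48),
    (3, "Required", 96),
    (4, "Non-Critical", 168),  # one week
]


def tier_for_rto(rto_hours):
    """Map a required RTO in hours to a (tier, criticality) pair."""
    if rto_hours < 0:
        raise ValueError("RTO cannot be negative")
    if rto_hours == 0:
        return (0, "Self-Healing")  # immediate recovery
    if rto_hours < 24:
        return (1, "Mission Critical")
    for tier, criticality, upper in UPPER_BOUNDS:
        if rto_hours <= upper:
            return (tier, criticality)
    return (5, "Deferrable")  # recovered as time allows


for hours in (0, 4, 24, 72, 200):
    print(hours, tier_for_rto(hours))
```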
DR 2.0
• High Availability Candidates
- Directories/Authentication
– Active Directory (AD)
– Lightweight Directory Access Protocol (LDAP)
- Network Routing
– Domain Name System (DNS)
• Disaster Recovery Candidates
- Firewall and other security devices
- Data backup system
DR 2.0 Virtualization Defined
Virtualization is the creation of a virtual layer between the actual physical element and the virtual interface. The virtualization layer shields the user from hardware differences and masks changes to the actual element.
There are four areas of IT where Virtualization is making inroads:
- Network Virtualization: Established technology including the use of VLANs, VPNs
- Server Virtualization: Great fit for DR/Test/Dev environments. Production also.
- Storage Virtualization: Gaining traction in the market. It is the pooling of physical storage from multiple devices into what appears to be a single storage device.
- Desktop Virtualization: The latest hot topic in this space. Tremendous potential to simplify workarea recovery, but requires network bandwidth.
Bottom Line: Server Virtualization is a game changer.
Storage Virtualization is just starting.
DR 2.0 Server Virtualization
• Survey of 1,038 VMware customers from North America,
Europe and Asia Pacific
- 45% of respondents cited business continuity as main driver for virtualization deployment.
DR 2.0 Virtualization
Server Virtualization
- Vendors: VMware, MS Hyper-V, XenSource
- DR Variations: V2V, P2V, V2P
- Primarily Windows – moving slowly into Linux
- Watch out for management issues
DR 2.0 Virtualization Case Study
International Manufacturing Company
- Outsourced iSeries platform with RTO 24 hours
- Windows environment with no DR capability
Tier   Criticality   RTO        Servers
0      Self-Healing  Immediate  4
1      Critical      < 4 hrs    19
2      Very High     < 24 hrs   6
3      High          < 3 days
4      Medium        < 7 Days   5
5      Low           < 30 Days  14
Total                           61
DR 2.0 Virtualization Case Study
Cost                             PHYSICAL     VIRTUAL
Capital Year 1
Servers/Midrange Processors      $607,700     $111,600
Storage Requirements             $520,033     $520,033
Software                         $25,575      $48,825
Tape Recovery                    $27,300      $27,300
Network Equipment                $68,735      $68,735
Implementation Manpower Cost     $0           $0
Data Center Buildout             $128,400     $23,100
BC/DR Support Manpower           $200,000     $200,000
Workarea                         $0           $0
SubTotal Capital                 $1,577,743   $999,593
Other (miscellaneous) 5%         $78,887      $49,980
Total Capital                    $1,656,630   $1,049,573

Operating Expense                PHYSICAL     VIRTUAL
Servers/Midrange Processors      $4,290       $1,500
Storage Requirements             $0           $0
Software Maintenance             $4,830       $4,830
Hardware Maintenance             $183,565     $10,638
Tape Recovery                    $0           $0
Network                          $117,270     $117,270
Dedicated Space (Colo racks)     $0           $0
Dedicated Space - Leased         $0           $0
Power Cost                       $120,870     $30,038
Facility Maintenance & Support   $18,108      $3,177
BC/DR Support Manpower           $100,000     $100,000
DR Plan Development Manpower     $0           $0
Test/Exercise Manpower           $0           $0
Equipment Support Manpower       $100,000     $100,000
Workarea (qship or mobile)       $0           $0
Email Recovery (If Outsourced)   $0           $0
Subtotal Expense                 $648,933     $367,453
Miscellaneous 2%                 $12,979      $7,349
Total Operating Expense          $661,912     $374,802
International Manufacturing
Company
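The totals in the cost comparison can be rechecked from the line items. The sketch below sums the non-zero line items for each column and applies the miscellaneous percentage (5% on capital, 2% on operating expense). The function and variable names are illustrative, and rounding to the nearest dollar is an assumption about how the figures were produced.

```python
def total_with_misc(line_items, misc_rate):
    """Return (subtotal, miscellaneous allowance, total) in whole dollars."""
    subtotal = sum(line_items)
    misc = round(subtotal * misc_rate)
    return subtotal, misc, subtotal + misc


# Non-zero line items from the case study, in the order listed.
capital_physical = [607_700, 520_033, 25_575, 27_300, 68_735, 128_400, 200_000]
capital_virtual = [111_600, 520_033, 48_825, 27_300, 68_735, 23_100, 200_000]
opex_physical = [4_290, 4_830, 183_565, 117_270, 120_870, 18_108, 100_000, 100_000]
opex_virtual = [1_500, 4_830, 10_638, 117_270, 30_038, 3_177, 100_000, 100_000]

cap_p = total_with_misc(capital_physical, 0.05)
cap_v = total_with_misc(capital_virtual, 0.05)
op_p = total_with_misc(opex_physical, 0.02)
op_v = total_with_misc(opex_virtual, 0.02)

print("Capital   physical/virtual:", cap_p[2], cap_v[2])
print("Operating physical/virtual:", op_p[2], op_v[2])
print("Year 1 capital saving:", cap_p[2] - cap_v[2])
print("Annual operating saving:", op_p[2] - op_v[2])
```

By this arithmetic the virtual subtotal of $367,453 plus the 2% miscellaneous line of $7,349 gives a virtual operating total of $374,802.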
DR 2.0 Disk vs. Tape
DR of critical apps must be on disk
- Tape adds too much risk
- Tape means 1 to 2 days of data loss. Is that acceptable?
- Cost differential between Tape / Disk is closing
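The "1 to 2 days" figure follows from how tape rotation works: the newest copy that survives a site loss is the last backup that had already left the building. The sketch below models that. The 14-hour offsite delay (a 6:00 PM backup shipped around 8:00 AM the next day) and the 15-minute replication interval are assumptions for illustration.

```python
def data_loss_window_hours(backup_interval_hours, offsite_delay_hours):
    """Return (best case, worst case) data loss in hours after a site loss.

    Best case: the disaster hits just after a backup went offsite.
    Worst case: it hits just before the next backup would have gone offsite.
    """
    best = offsite_delay_hours
    worst = backup_interval_hours + offsite_delay_hours
    return best, worst


# Nightly tape backup, tapes shipped offsite the next morning (assumed 14 h later).
print("Tape:", data_loss_window_hours(24, 14))

# Disk replication every 15 minutes with no shipping delay (hypothetical).
print("Replication:", data_loss_window_hours(0.25, 0))
```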
DR 2.0 Data Replication
Replication options:
- Array based
• SAN (also NAS)
• iSCSI is gaining traction on Fibre Channel
- Host (Server to Server)
- Application specific
DR 2.0 Data Replication
Replication vendors:
- Array based
• SAN – EMC, Hitachi, IBM
• NAS – NetApp, BlueArc
- Host
• Double-Take, Neverfail
- Application specific
DR 2.0 Data Backup
Advances in Data Backup Systems
- Data De-duplication
• 20-50x or more compression ratio often achieved
- Continuous Data Protection
- Snapshots
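High de-duplication ratios come from storing each repeated block of data only once. The sketch below illustrates the idea with fixed-size chunks and SHA-256 fingerprints. It is a simplification for illustration, not how any particular backup product works, and commercial systems often use variable-size chunking instead.

```python
import hashlib


def dedup_ratio(data, chunk_size=4096):
    """Logical bytes divided by bytes kept when each unique chunk is stored once."""
    unique_chunks = {}  # chunk fingerprint -> chunk length
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        unique_chunks.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique_chunks.values())
    return len(data) / stored if stored else 1.0


# A 1 MiB "full backup" made of 256 distinct 4 KiB chunks (synthetic data).
full_backup = b"".join(i.to_bytes(4, "big") * 1024 for i in range(256))

# Ten unchanged full backups in a row, as with repeated nightly fulls.
ten_fulls = full_backup * 10
print(f"De-duplication ratio: {dedup_ratio(ten_fulls):.0f}x")
```

Ratios in practice depend on how much data changes between backups and how long backups are retained.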
Surviving The Meltdown
Current financial environment will drastically
impact us all.
What can the BCP / DR professional do about it?
Option 1: Hide in a corner and wait for 2010
Surviving The Meltdown
• Tighten up plans
• Identify Top Two Risks & address them
- Reallocate funds
- Gap Analysis
• Install $$$ saving technologies
- Virtualization
- WAN Accelerators
- IP based replication
Going Green
• By 2008, Gartner estimates that 48% of all IT budgets
will be spent on energy alone.*
• The energy used to power the nation’s data centers
doubled between 2000 and 2006, and could double
again in another five years.**
* Network World – 10 Ways to make your data center more efficient
** Federal Times.com - Huge savings seen in power-hungry data centers