Best Practices. World Wide Technology

(1)

Disaster Recovery Best Practices

WWT Educational Webcast

Ed Levens, World Wide Technology

David L. Jones, EMC

(2)

Questions are Encouraged

You can ask questions during the presentation by using the link provided in the Webcast Viewer.

(3)

Your Success Drives Ours

Relentless Focus on People, Process & Partnerships

Strong Partner Relationships

Over 1,000 Talented Employees

Proven Processes

Nearly $3 Billion in Revenues

Strong Credit Line - $350MM +

Key Contract Vehicles: VHA, HPG, ITES-2H, GSA, SEWP

(4)

Our Focus

Technology / Solution

Unified Communications: Integrated voice, video and data networks can lower costs and provide employees with productivity benefits.

Security: Adaptive threat response that stops network threats before they stop your business.

Mobility: Maintain your competitive advantage through the freedom and flexibility of wireless networks.

Data Center: Intelligent storage architectures can help reduce expenses; increase agility for changing priorities;

(5)

Disaster Recovery Best Practices

David L. Jones, EMC

(6)

Agenda

ƒ Today's Reality

ƒ IT Business Continuance and Disaster Recovery Considerations

ƒ Technology Choices

ƒ EMC RecoverPoint

ƒ Questions?

(7)

Unfortunately, disasters do happen…

(8)

Unfortunately, disasters do happen…

Of all the organizations surveyed…

55% had an incident that disabled their primary data center

60% of these had a regional backup site that was also disabled by the incident

When systems go down, the losses add up

(9)

Types of Disasters

Type of Disaster: Example

Nature / Man-Made: Katrina / 9/11

Sudden / Time to Prepare: Earthquake / Hurricane

Building / Local Area / Region: Fire / Power Outage / Flood

(10)

Most Frequent Impacts to IT Availability

Disasters represent a fraction of Environmental issues

[Pie chart: impacts to IT availability by category: Server, Application Software, Client Application Software, Network S/W]

(11)

Dilbert Does Disaster recovery …

(12)

Definitions

ƒ Business continuance / COOP describes the processes and procedures an organization puts in place to ensure that essential functions can continue during and after a disaster

ƒ Disaster recovery is the process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster

ƒ High availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period

(13)

Continuity of Operations Policy (COOP)

It is the policy of the United States to have in place a comprehensive and effective program to ensure continuity of essential Federal functions under all circumstances.

As a baseline of preparedness for the full range of potential emergencies, all Federal agencies shall have in place a viable COOP capability which ensures the performance of their essential functions during any emergency or situation that may disrupt normal operations.

(14)

Agenda

ƒ Today's Reality

ƒ IT Business Continuance and Disaster Recovery Considerations

ƒ Technology Choices

ƒ EMC RecoverPoint

ƒ Questions?

(15)

Business Continuance – EMC / WWT Approach

Build on our understanding of our customers, their business / mission, and their critical processes and objectives

Capitalize on our long pedigree in designing, building and managing business/mission-critical systems for the Data Center

Technology

Business Continuance

(16)

IT Considerations

ƒ Management buy-in and commitment is critical

ƒ Know the regulations specific to your agency or organization

ƒ Conduct a risk assessment and identify critical priorities

ƒ Determine response for different disaster scenarios

ƒ Establish clearly defined roles & responsibilities for personnel

ƒ Establish effective communication channels

ƒ Maintain necessary resources, tools, and supplies

ƒ Testing! Testing! and more Testing!

ƒ Disaster recovery must be included as part of every process

(17)

IT Considerations

ƒ Disaster recovery must become part of the IT mindset, not an afterthought

ƒ High availability and disaster recovery go hand in hand

ƒ Define Architectures that build disaster recovery in from the beginning

ƒ Application Development

ƒ Infrastructure Design

ƒ QA, QC, Test and Development

ƒ Make use of industry recognized processes and architectures

ƒ ITIL, MOF, MSA / WSSRA, etc…

ƒ Recovery of applications without user interruption is nirvana but

(18)

IT Considerations

ƒ Recovery Point Objective (RPO) – The last saved data that the restarted application will reflect following the recovery. Also, a measure of the amount of time for which work may be lost in the event of an unplanned outage at the primary site.

ƒ Periodic tape backup vs. continuous disk-to-disk replication

ƒ Synchronous vs. Asynchronous

ƒ Recovery Time Objective (RTO) – The time that will pass before an infrastructure is available. In order to reduce RTO, data must be online and available at another site.

ƒ Distance – Data must be recovered on undamaged hardware outside the disaster zone. Required distance between primary and recovery sites should be based on likely regional threats.
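
To make the RPO definition concrete, the following minimal Python sketch estimates the worst-case data-loss window for a given copy interval. The function names and the sample intervals (a 24-hour tape cycle, a 30-second asynchronous cycle with 5 seconds in flight) are illustrative assumptions, not figures from the webcast.

```python
from datetime import timedelta


def worst_case_rpo(copy_interval: timedelta,
                   transfer_lag: timedelta = timedelta(0)) -> timedelta:
    """Worst-case data loss: a failure just before the next copy completes
    loses one full interval plus the time the copy spends in flight."""
    return copy_interval + transfer_lag


def meets_rpo(copy_interval: timedelta, target_rpo: timedelta,
              transfer_lag: timedelta = timedelta(0)) -> bool:
    """True if the protection method can satisfy the target RPO."""
    return worst_case_rpo(copy_interval, transfer_lag) <= target_rpo


# Assumed example values, for illustration only.
nightly_tape = worst_case_rpo(timedelta(hours=24))
async_replication = worst_case_rpo(timedelta(seconds=30), timedelta(seconds=5))
synchronous = worst_case_rpo(timedelta(0))

print(nightly_tape, async_replication, synchronous)
```

Under these assumptions a nightly tape cycle can lose up to a day of work, while synchronous replication has no loss window because each write is acknowledged only after both copies hold it.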

(19)

Agenda

ƒ Today's Reality

ƒ IT Business Continuance and Disaster Recovery Considerations

ƒ Technology Choices

ƒ EMC RecoverPoint

ƒ Questions?

(20)

Business Requirements should Drive Technology Options

Business Considerations: RTO, RPO, Protection Gap, Isolation

Infrastructure Alternatives: Cold Site (RTO=Days), Warm Site, Hot Site, Active-Active (RTO=Zero)

(21)

Data Center Design and Architecture

ƒ Data Center design should be a high priority to ensure all the aspects of power, cooling, access and security have been core to the design

ƒ The distance between data centers will change the options that you have for the deployment of a disaster recovery strategy for all the services IT provides

ƒ Cold Site, Hot Site, Bunkers, Fully Active / Active

ƒ This is a business decision first

ƒ Make effective use of and leverage your existing facilities

ƒ Leveraging disaster recovery assets can provide maximum value BUT can also extend time to recovery or RTO

ƒ This choice will impact the technology decisions and options that are

(22)

Reference Architectures

(23)

Virtual and Physical Considerations

ƒ Server, Storage and Network Virtualization can maximize resources and streamline operations and disaster recovery

ƒ Server virtualization is mature and there are many choices

ƒ VMware

ƒ Microsoft Hyper-V

ƒ Citrix / Xen

ƒ Cisco “California”

ƒ Storage virtualization is mature but not as widely deployed

ƒ EMC Invista

ƒ HDS Array based

ƒ NetApp V-Series

ƒ Other

ƒ Network virtualization is a developing technology

(24)

Virtual and Physical Considerations

ƒ Disaster recovery considerations for virtualized environments

ƒ Physical to Virtual

ƒ Virtual to Physical

ƒ Physical to Physical

ƒ Virtual to Virtual

ƒ Consolidated disaster recovery using virtualization technologies can maximize resources

ƒ “DR in a box”

ƒ Maximum utilization of disaster recovery resources

ƒ Virtualization can present management challenges

ƒ Virtual to Physical Mappings

ƒ Management infrastructure must provide visibility

ƒ Server

(25)

Understanding Data Consistency

Applications and data are interrelated (Federated)

All data movement must be stopped/started at the same point in time

To restart applications you must have all the data, not parts of it

Recovery requires dependent-write consistency across all volumes and systems

[Diagram: Order Entry, CRM, and SCM applications, each with its own DB]
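
The dependent-write requirement can be illustrated with a small Python check. This is a sketch under a simplifying assumption: every write is treated as depending on all earlier writes, which is stricter than real applications need. The `Write` type, the volume names, and the `cut` representation are hypothetical and chosen only for this example.

```python
from typing import NamedTuple


class Write(NamedTuple):
    seq: int      # global order in which the application issued the write
    volume: str   # volume that received it


def is_dependent_write_consistent(all_writes: list[Write],
                                  cut: dict[str, int]) -> bool:
    """cut[volume] is the highest seq applied on that volume's copy.
    The recovery image is consistent only if the applied writes form a
    prefix of the global order: no missing write precedes an applied one."""
    applied = [w.seq for w in all_writes if w.seq <= cut.get(w.volume, 0)]
    missing = [w.seq for w in all_writes if w.seq > cut.get(w.volume, 0)]
    return not applied or not missing or max(applied) < min(missing)


writes = [
    Write(1, "log"),   # transaction log: intent recorded
    Write(2, "data"),  # data page updated (depends on write 1)
    Write(3, "log"),   # commit marker (depends on write 2)
]

same_point = {"log": 1, "data": 0}   # both volumes stopped at the same instant
skewed = {"log": 3, "data": 0}       # log copy is ahead of the data copy

print(is_dependent_write_consistent(writes, same_point))
print(is_dependent_write_consistent(writes, skewed))
```

The skewed image holds a commit marker for a data page that never arrived, which is the kind of copy an application cannot restart from.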

(26)

Infrastructure Services

ƒ Without disaster-recovery-enabled infrastructure, most other disaster recovery efforts will fail

ƒ Core services like Networks, DNS, Directory Services, etc… are required for all of the other processes that run in the Data Center

ƒ VPN and remote access services can be your best ally in the event of disaster and must be core to your plans

ƒ Management infrastructure will play a role in conducting root cause analysis ONLY if it is available

ƒ In most cases infrastructure services are COTS based and have been designed to provide availability using a geographically distributed scale-out model

ƒ Vendor selection and partnership is key in this area because most

(27)

Applications

ƒ Applications are very rarely standalone

ƒ Multi-tiered applications (WEB, App Server, Database) will almost always require all tiers to operate

ƒ Most applications will not work if the required infrastructure is not also part of the plan

ƒ Data consistency between the tiers makes recovery much easier and more timely

ƒ Network based or Software based load balancing is the most common method for making WEB and Application tiers resilient

ƒ Applications that require persistent data storage may have additional requirements

(28)

Applications – An example via email

ƒ Email IS NOT a standalone application

ƒ An enterprise class email implementation will usually consist of at least the following:

ƒ Main email data servers

ƒ SMTP (Inbound and outbound mail)

ƒ Integration point with a directory server

ƒ Blackberry, Blueberry, Strawberry, you get the point…

ƒ WEB based email front end

ƒ Real Time Collaboration – SharePoint, DB system, IM, etc…

ƒ Multiple Infrastructure touch points – DNS, WINS, VPN, etc…

ƒ External Vendors – Cellular provider

(29)

Databases

ƒ Different types of databases require different kinds of disaster recovery solutions

ƒ Read only / Data warehousing

ƒ Transactional

ƒ Most common types of disaster recovery solutions in the database space are

ƒ Oracle GRID/RAC based or scale out implementations - Clustering

ƒ Storage replication with application tie in

ƒ Database level replication

ƒ Most disaster recovery solutions for databases require a tight integration with the application tier solution in order to ensure transaction level recovery

(30)

Storage / Data Protection

Daily backup: Daily recovery points, from tape or disk

Snapshots: More frequent disk-based recovery points

Continuous Data Protection: All recovery points (any point in time)

Significant points in time: Database checkpoint, Pre-app patch, Post-app patch, Quarterly close, Any user-configurable event

(31)

Storage / Data Protection

ƒ Creating remote and local copies of your data is a must for disaster recovery

ƒ The replication of storage data is a complex process that requires knowledge of what is being stored, detailed performance analysis and network impact analysis

ƒ Synchronous vs. Asynchronous

ƒ It’s all about distance

ƒ Adaptive solutions can provide dynamic RPO

ƒ Application level consistency is paramount

ƒ Many types of storage replication technologies exist

ƒ Array Based – Usually locks you into storage array choices
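
Why distance decides between synchronous and asynchronous replication can be shown with a back-of-the-envelope calculation. The sketch below assumes light travels through optical fibre at roughly 200,000 km/s (about two thirds of its speed in vacuum) and counts propagation delay only; the function name and the `round_trips` parameter are illustrative assumptions, since the number of round trips per write depends on the replication protocol.

```python
FIBRE_KM_PER_MS = 200.0  # approx. speed of light in optical fibre, km per millisecond


def sync_write_penalty_ms(distance_km: float, round_trips: int = 1) -> float:
    """Minimum latency added to every write by synchronous replication,
    counting propagation delay only (no switching, queuing, or array time)."""
    return round_trips * 2 * distance_km / FIBRE_KM_PER_MS


for km in (10, 100, 1000):
    print(f"{km:>5} km: +{sync_write_penalty_ms(km):.1f} ms per write")
```

At about 1 ms of added latency per 100 km for a single round trip, synchronous replication is usually practical only over metropolitan distances, and longer links push the design toward asynchronous replication with a non-zero RPO.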

(32)

Storage / Data Protection

ƒ A data replication solution that allows the flexibility of applying different RPO policies to both storage and in turn applications is key

ƒ Ability to prioritize RPO application by application

ƒ Create tiered model based on business requirements

ƒ Data Backup is here to stay and having a robust backup AND restore environment is crucial

ƒ Tape

ƒ Backup to Disk (VTL & CDP)

ƒ Offsite storage of backup data

ƒ Data Security

ƒ Data protection can reside on many tiers; consolidating its management is key
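
A tiered model that assigns RPO application by application might be sketched as follows. The tier names, RPO values, protection methods, and application names are all hypothetical; in practice they would come from the business requirements.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Tier:
    name: str
    rpo: timedelta   # worst-case data loss this tier delivers
    method: str


# Hypothetical tiers, for illustration only.
TIERS = [
    Tier("tier-1", timedelta(0), "synchronous replication"),
    Tier("tier-2", timedelta(minutes=15), "asynchronous replication"),
    Tier("tier-3", timedelta(hours=24), "daily backup to disk or tape"),
]


def assign_tier(required_rpo: timedelta) -> Tier:
    """Pick the least aggressive tier whose RPO still meets the requirement.
    Assumes required_rpo is non-negative, so tier-1 always qualifies."""
    candidates = [t for t in TIERS if t.rpo <= required_rpo]
    return max(candidates, key=lambda t: t.rpo)


# Hypothetical applications and the RPO the business asks for.
apps = {
    "order-entry": timedelta(0),
    "crm": timedelta(hours=1),
    "file-archive": timedelta(days=2),
}
plan = {app: assign_tier(rpo).name for app, rpo in apps.items()}
print(plan)
```

Choosing the least aggressive qualifying tier keeps the most expensive protection for the applications that actually need it.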

(33)

Vendor Choice is Critical

ƒ Disaster recovery IS complex

ƒ Disaster recovery spans internal IT organizations and specific technology disciplines

ƒ Management buy-in is critical for success

ƒ Disaster recovery involves many internal and external partners

ƒ Partnering with vendors is key, as are the partnerships between your vendors!

(34)

Agenda

ƒ Today's Reality

ƒ IT Business Continuance and Disaster Recovery Considerations

ƒ Technology Choices

ƒ EMC RecoverPoint

ƒ Questions?

(35)

Data Replication Pain Points in Heterogeneous Environments

[Diagram: local site and remote site, each running Oracle, Exchange, and SQL on a SAN. Labelled pain points: application response time, application platform support, application-consistent recovery, corruption protection, disaster-recovery testing, existing infrastructure, communications cost]

(37)

RecoverPoint Concurrent Local and Remote (CLR) Data Protection

[Diagram: a production site and a disaster recovery site, each with cluster nodes (active and passive), RecoverPoint appliances, and a SAN, linked over SAN/WAN by the replication data flow. Also shown: tape backup manager, tape library, standby disaster recovery server, local journal, remote journal, and replicated storage groups and logs]

Performance architecture

Out-of-band design leveraging intelligent host and fabric interfaces*

– Supports CLARiiON write splitting on CX3 and CX4 arrays

True CDP data protection for applications

– All writes stored in Journal with application bookmarks for recovery

– Supports Microsoft Volume Shadowcopy Service (VSS) and VDI APIs

(38)

Journaling for Application-Aware Recovery

Journal Includes Data Plus Metadata

Time/date

– Identifies the time image was saved

Bookmarks:

– System-generated group bookmarks

ƒ e.g., Volume Shadowcopy Service (VSS) backup

– User-generated bookmarks

– Other EMC product bookmarks

ƒ EMC Replication Manager

– System-event-generated bookmarks

– Microsoft SQL Server

ƒ Microsoft Virtual Device Interface (VDI) operations

– Microsoft Exchange

ƒ Microsoft VSS
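
The idea of a journal that stores every write with a timestamp, plus named bookmarks, can be sketched in a few lines of Python. This is a conceptual model only: the class names, fields, and methods are hypothetical and do not represent RecoverPoint's journal format or API.

```python
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    timestamp: float   # when the write was captured
    block: int         # block address written
    data: bytes


@dataclass
class Journal:
    entries: list = field(default_factory=list)     # appended in time order
    bookmarks: dict = field(default_factory=dict)   # bookmark name -> timestamp

    def record(self, timestamp, block, data):
        self.entries.append(JournalEntry(timestamp, block, data))

    def bookmark(self, name, timestamp):
        self.bookmarks[name] = timestamp

    def image_at(self, point):
        """Rebuild the volume image as of a timestamp or a bookmark name
        by replaying every journaled write up to that point."""
        cutoff = self.bookmarks[point] if isinstance(point, str) else point
        image = {}
        for entry in self.entries:
            if entry.timestamp <= cutoff:
                image[entry.block] = entry.data
        return image


journal = Journal()
journal.record(1.0, 0, b"v1")
journal.record(2.0, 1, b"a")
journal.bookmark("pre-patch", 2.0)      # e.g. a user-generated bookmark
journal.record(3.0, 0, b"v2-corrupt")   # a bad write after the bookmark

print(journal.image_at("pre-patch"))
```

Recovering to the bookmark replays only the writes captured before the bad one, which is what lets an application roll back to a known-good point rather than to the most recent copy.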

(39)

Grouping for a Consistent View

Allows application recovery to be tiered by service level

– Multiple volumes per group

– Mixed recovery point objectives within same infrastructure

Provides independent replication controls

– Recover by group, locally or remotely

– Start/stop by group

Enables grouping of optimization

– Importance

– Resource usage

– Recovery point and recovery time objectives

[Diagram: consistency groups 1 to 3 containing OE, CRM, E-mail, and SCM volumes, each protected by CDP and/or CRR]

(40)

Grouping for Federated Environments

Each tier has different service level agreements

– Consistency groups per tier

– Operational recovery of tier

Parallel consistency across tiers

– Federated environments

– Recover to a known point for all applications

– Disaster recovery for tier or application

– Spans operating systems, applications, storage, and servers

Enables advanced functions

– Full environment cloning

– Application upgrade testing

[Diagram: three tiers, each in its own consistency group: 1: Linux (Web OE); 2: Windows (CRM); 3: UNIX (SCM, Financials…)]

(41)

RecoverPoint/Cluster Enabler (RecoverPoint/CE)

Each named cluster group’s associated devices reside in a single RecoverPoint consistency group of the same name

Supports Microsoft Cluster Server on Windows Server 2003 and Microsoft Failover Cluster on Windows Server 2008 Enterprise and Datacenter Editions

[Diagram: two cluster sites with RecoverPoint/CE installed, connected by RecoverPoint over the WAN, with a File Share Witness. CG1: Devices for Cluster Group1]

(42)

VMware Infrastructure 3.5: Value and Innovations

[Diagram: three numbered layers. 1, Virtualization: VMware Virtual Machine File System, Virtual SMP. 2, Virtual Infrastructure: Resource Management, Availability, Mobility, Security, with VirtualCenter, VMotion, Storage VMotion, High Availability, Consolidated Backup, Distributed Resource Scheduler (DRS), DPM. 3, Management and Automation: Infrastructure Optimization, Business Continuity, Desktop Management, Software Lifecycle, with Converter, VDI, ACE, Lab Manager, Workstation, Site Recovery Manager, Update Manager]

Consolidate and contain servers

Optimize your infrastructure

Manage and secure desktops

Maximize continuity and uptime

Automate your virtual labs

(43)

VMware Site Recovery Manager Integration

Simplifies and automates disaster recovery workflows

– Setup, testing, and failover

Makes disaster recovery a property of the virtual machine (VMware Distributed Resource Scheduler and High Availability)

Provides central management of recovery plans from VirtualCenter

Turns manual recovery processes into automated recovery plans

Four EMC products integrated with VMware Site Recovery Manager

– SRDF family

– MirrorView

– Celerra Replicator

– RecoverPoint

[Diagram: virtual machines (APP/OS) at the PRODUCTION and RECOVERY sites]

(44)

Agenda

ƒ Today's Reality

ƒ IT Business Continuance and Disaster Recovery Considerations

ƒ Technology Choices

ƒ EMC RecoverPoint

ƒ Questions?

(45)

Questions are Encouraged

You can ask questions during the presentation by using the link provided in the Webcast Viewer.

(46)

Thank You…

Disaster Recovery Best Practices

Ed Levens, World Wide Technology

David L. Jones, EMC
