• No results found

OpenSAF and VMware from the Perspective of High Availability

N/A
N/A
Protected

Academic year: 2021

Share "OpenSAF and VMware from the Perspective of High Availability"

Copied!
27
0
0

Loading.... (view fulltext now)

Full text

(1)

OpenSAF and VMware from the Perspective of

High Availability

Ali Nikzad, Ferhat Khendek Maria Toeroe

(2)

}

Introduction & Objectives

}

Some Background

}

Testbed, Failures and Metrics

}

The baseline architectures

}

Measurements

}

Analysis

}

Architectures combining OpenSAF and virtualization

}

Conclusion

(3)

Introduction & Objectives

} Service availability and continuity have become important requirements in

several domains

} High availability (demanded in some domains) is defined as at least 99.999%

availability which is maximum of 5.26 minutes of downtime in a year

} Including scheduled downtime for upgrade for instance

} The computing world is moving toward cloud services and cloud computing

} Virtualization is an important aspect

(4)

Introduction & Objectives

} Evaluate and position virtualization and the SAForum middleware with

respect to each other

} Using OpenSAF and VMware in the test-bed

} Analyzing the measurements

} Determine pros and cons of using virtualization in/for high availability (HA)

(5)

Background – SAForum and OpenSAF

SAForum is a consortium of telecommunication and computing companies developing standards for enabling highly available services and applications

OpenSAF is an open source project focused on Service Availability implementing the SAForum service specifications, among others AMF, SMF and IMM. Application Interface Platform Interface Logically grouped components Applications SAForum Middleware Operating System Hardware Platform AMF Checkpointing Notification Log Etc.

(6)

Background-AMF

} AMF is one of the most important services defined by SAForum

} AMF configuration: AMF logical entities …

} AMF Node

} AMF Cluster

} Component

} Component Service Instance (CSI)

} Service Unit (SU)

} Service Instance (SI)

} Service Group (SG) } Redundancy Models } Application SA-Aware Non-SA-Aware Pre-instantiable Non-pre-instantiable

2N N+M N-Way N-Way-Active No-Redundancy

Ap pl ic a' on Servic e  G roup Node  1 Service  Unit  1 Component  1 Service  Instance Component  Service   Instance Node  2 Service  Unit  2 Component  2 Node  3 Service  Unit  3 Component  3

(7)

Background-Virtualization and VMware

} Virtualization is the separation of a resource or request for a service from the

underlying physical delivery of that service

} VMs are hosted on software called hypervisor

} Native (bare metal)

} Hosted (non-bare metal)

(8)

Background - VMware Availability solutions

(9)

Testbed, Failures and Metrics

} Test-beds:

} 5 nodes cluster

} Case study application:

} VLC for Video streaming

} Failures

} VLC component failure (application failure)

} VM failure

} Physical node failure

} More than 45 sets of measurements for metrics in different architectures for different

(10)

Testbed – Metrics and Failures

} Metrics

} Qualitative (criteria)

} Complexity of using the solution } Supported Redundancy Models } Scope of failure } Service continuity } Supported platforms } Quantitative Metrics } Reaction Time } Repair Time } Recovery Time } Outage Time Time Failure First  reac.on  

Resump.on  or  Restart   of  service

Faulty  unit  repaired

  Reac.on  Time   Recovery  Time   Repair  Time   Outage  Time

(11)

Baseline architectures

SAF based architecture VMware HA

ESXi   Node  1 Sh ar ed  sto rag e

VMware  vSphere  cluster

ESXi   Node  2 VM VLC Ubuntu Physical  node  1 SU1 VLC   component IP component Physical  node  2 SU2 VLC   component IP component SG 2N  Red. Ubuntu OpenSAF Ubuntu SI   VLC-­‐CSI IP-­‐CSI

SA-Aware and Non-SA-Aware versions of VLC on physical and virtual nodes VM availability with VMware HA

(12)

Baseline architectures

} OpenSAF on VMware HA

} VMs’ lifecycle depends on the hypervisor’s

availability mechanisms

(13)

Baseline architectures

Architectures VLC component

failure VM failure Node Failure

OpenSAF on physical nodes with SA-Aware VLC component √ Not

applicable √

OpenSAF on physical nodes with Non-SA-Aware VLC

component √

Not

applicable √

OpenSAF on virtual nodes with SA-Aware VLC component

(with/without VMware HA enabled) √ √ √

OpenSAF on virtual nodes with Non-SA-Aware VLC component

(with/without VMware HA enabled)

√ √ √

(14)

Baseline architectures: Measurements

(15)

Baseline architectures: Measurements

(16)

Baseline architectures: Measurements

I

(17)

Baseline architectures: Measurements

  Repair of Repair of Repair of  

Failed component Failed VM Failed Node

OpenSAF on Standalone machine Yes - No

OpenSAF in VM Yes No No

VMware HA No Yes (restarting the VM on another host) Yes(restarting the VM on another host)

OpenSAF in VM + HA Yes Yes Yes(restarting the VM on another host)

(18)

Baseline architectures: Analysis

}

SAF based architectures

ü Service continuity with SA-Aware components and better reaction time

ü Support application failure detection and recovery

ü Less overhead

û No repair for the cluster node

}

VMware HA

ü Repair of the node is supported

û but it is very long (~100s)

û Inevitable 15% to 30% overhead due to the VM compared to the physical host deployment

û No support for application failure

}

Baseline combined architecture

ü The advantages of both architectures

(19)

Baseline architectures: Analysis

The reaction time to the failure in OpenSAF is much faster

(20)

Architectures combining OpenSAF and virtualization: Availability in non-bare-metal hypervisor

Physical Cluster  1  Node  1 Hypervisor VMware  worksta'on Physical Cluster  1  Node  2 Hypervisor VMware  worksta'on SU  1 SU  2 OpenSAF Cluster  1 OpenSAF  cluster  2   Node  2 VM  component  2 SU2 VLC   component IP component Ubuntu OpenSAF  cluster  2   Node  1 VM  component  1 SU2 VLC   component IP component Ubuntu OpenSAF  Cluster  2 SG 2N  Red. SG No  Red.

(21)

  Reaction Repair Recovery Outage

VMware HA 72.166 27.16627.166 99.332

OpenSAF with ESXi (VMware HA

manages the VMs) 1.905 107.90 0.047 1.953

The new availability management in

non-bare-metal hypervisor 3.449 3.73 0.056 3.505

Comparison of different architectures for VM failure

Architectures combining OpenSAF and virtualization:

Availability in non-bare-metal hypervisor

(22)

Architectures combining OpenSAF and virtualization:

Availability in non-bare-metal hypervisor

 

Reaction Repair Recovery Outage

VMware HA Not coveredNot coveredNot coveredNot covered

OpenSAF with no virtualization 0.009 0.136 0.046 0.055

OpenSAF with ESXi (VMware HA

manages the VMs) 0.013 0.243 0.068 0.081

The new availability management in

non-bare-metal hypervisor 0.012 0.848 0.580 0.592

Failure of the SA-Aware VLC component in different architectures

(23)

ESXi  hypervisor  1  

Manager  VM1  

OpenSAF  cluster  1  node  1 SU1 VM  1  Comp SU2 VM  2  Comp Ubuntu ESXi  hypervisor  2   Manager  VM  2

OpenSAF  cluster  1  node  2 SU3 VM  1  Comp SU4 VM  2  Comp Ubuntu OpenSAF SI  1 VM  1-­‐CSI SI  2 VM  2-­‐CSI SG  1 2N  red.   SG  2 2N  red.   SU1 VLC   component IP component SU2 VLC   component IP component SG 2N  Red. Ubuntu OpenSAF Ubuntu SI   VLC-­‐CSI IP-­‐CSI

Architectures combining OpenSAF and virtualization:

Availability in bare-metal hypervisor

(24)

Architectures combining OpenSAF and virtualization:

Availability in bare-metal hypervisor… VM Failure

(25)

Architectures combining OpenSAF and virtualization :

(26)

Conclusion

} From baseline architectures to combinations

} Use the powerful availability management of OpenSAF and benefiting from virtualization

} Increased outage time in the new deployment is because of the non-bare-metal hypervisor

} The bare-metal architecture can possibly fix it…

} Improve the repair time of a failed VM through management of the VM life cycle

} Cover the different types of hypervisors (bare-metal and non-bare-metal)

} The VM’s life cycle management

} is not limited to a specific solution

} can support other hypervisors like XEN and Linux KVM because of using similar and standard

interfaces like libvirt

(27)

References

Related documents

ASPHER started its European Public Health Core Competences Programme in 2006, involving in the first place about 100 European public health researchers and teachers and,

Assuming that unskilled labour is the relative abundant factor in developing countries, trade liberalisation should increase its relative returns when compared to capital and

17th IPHS Conference, Delft 2016 | HISTORY - URBANISM - RESILIENCE | VOlUme 02 The Urban Fabric | morphology, Housing and Renewal | Historic Urban Morphology..

The COMELEC cannot cancel her COC on the ground that she misrepresented facts as to her citizenship and residency because such facts refer to grounds for ineligibility in

This unit covers the skills and knowledge required to maintain the effectiveness of business technology in the workplace.. It includes activities such as the maintenance of existing

• Q2 – Have students circle other nouns in the sentence and note whether they are abstract or concrete; and/or have students explain how they know the noun is abstract

Research results show that the role of lecturers, the social behavior of students, the innovation of the training program and the application of information technology have

If the strata with standard penetration test (SPT) ‘N’ value greater than 100.. with characteristics of rock is met with earlier, the bore hole shall be advanced further