High Availability Storage

Academic year: 2021

(1)

High Availability Storage

High Availability Extensions

Goldwyn Rodrigues

High Availability Storage Engineer

(2)

High Availability Extensions

Highly available services for mission critical systems

Integrated suite over robust open-source technologies

Business Continuity

Protect Data Integrity

Reduce Unplanned Downtime

Commodity Hardware for high availability

(3)

Cluster

A set of computers interacting with each other

One goes down, another picks up its responsibility

Should be available as much as possible

Cluster Types:

Active/Active vs Active/Passive (N+1, N+M)

Physical vs Virtual vs Hybrid

Local Clusters vs Metro vs Geo Clusters

(4)

Why Cluster

Increased Availability

Improved Performance

Low cost of operation

Scalability

Disaster Recovery

Data Protection

Server Consolidation

Storage Consolidation

(5)

CAP Theorem

Brewer's Theorem

Consistency

All nodes see the same data at the same time

Availability

A guarantee that every request receives a response about whether it was successful or failed

Partition Tolerance

The system continues to operate despite arbitrary message loss or failure of part of the system

(6)

Fault Tolerance vs High Availability

Fault Tolerance

Specialized hardware to detect a hardware fault and switch to redundant hardware

Expensive redundant and replicated components

High Availability

A set of computers with system-wide, shared resources that cooperate to guarantee essential services

Software and hardware to quickly restore services

(7)

Cluster

[Diagram: a cluster running a DB Server, NFS Server, Mail Server, Application Server and Web Server]

(8)

A Dysfunctional Cluster

SPLIT BRAIN

(9)

STONITH

Shoot the Other Node In the Head

RESET

(10)

Quorum

(11)

Quorum Policies

Ignore

Continue cluster operations as usual

Freeze

Resource management continues

New resources are not started

Stop

All resources in the affected partition are stopped

Suicide

Fence all nodes in affected partition
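
The four policies above can be sketched as a small decision function. This is only an illustration, not the cluster stack's actual code: the strict-majority rule in `has_quorum` is an assumption (the slides do not define how quorum is computed), and the names `NoQuorumPolicy`, `has_quorum` and `decide` are hypothetical.

```python
from enum import Enum


class NoQuorumPolicy(Enum):
    """The four policies listed on the slide (names are illustrative)."""
    IGNORE = "ignore"
    FREEZE = "freeze"
    STOP = "stop"
    SUICIDE = "suicide"


def has_quorum(partition_size: int, cluster_size: int) -> bool:
    # Assumption: a partition is quorate when it holds a strict majority.
    return partition_size > cluster_size // 2


def decide(partition_size: int, cluster_size: int, policy: NoQuorumPolicy) -> str:
    """Return what a partition should do, given its size and the policy."""
    if has_quorum(partition_size, cluster_size):
        return "operate normally"
    actions = {
        NoQuorumPolicy.IGNORE: "continue cluster operations as usual",
        NoQuorumPolicy.FREEZE: "keep managing running resources, start no new ones",
        NoQuorumPolicy.STOP: "stop all resources in this partition",
        NoQuorumPolicy.SUICIDE: "fence all nodes in this partition",
    }
    return actions[policy]
```

With a three-node cluster, a single isolated node has no quorum, so `decide(1, 3, NoQuorumPolicy.STOP)` selects the stop action, while the two-node side keeps operating normally. In an even split such as 2 of 4, neither side is quorate under this rule.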

(12)

Resource Agents

Open Cluster Framework (OCF)

Manage resources

Web Server

IP Address

Shared Filesystem

Resource Operations

Id

Name

Interval

Timeout
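
A resource agent is essentially a program that the cluster calls with an action name and that answers with an exit code. The sketch below shows that shape with `start`, `stop` and `monitor` actions. The exit code values follow the OCF convention (0 for success, 7 for "not running", 3 for an unimplemented action); the "resource" itself is just a marker file, and `STATE_FILE` and `run` are hypothetical names chosen for the example.

```python
import os
import tempfile

# Exit codes as defined by the OCF resource agent convention.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7

# Hypothetical stand-in for a real service: a marker file means "running".
STATE_FILE = os.path.join(tempfile.gettempdir(), f"demo-resource-{os.getpid()}.state")


def start() -> int:
    # Create the marker file; a real agent would launch the service here.
    open(STATE_FILE, "w").close()
    return OCF_SUCCESS


def stop() -> int:
    # Stopping an already stopped resource still counts as success.
    if os.path.exists(STATE_FILE):
        os.remove(STATE_FILE)
    return OCF_SUCCESS


def monitor() -> int:
    # Report whether the resource is currently running.
    return OCF_SUCCESS if os.path.exists(STATE_FILE) else OCF_NOT_RUNNING


ACTIONS = {"start": start, "stop": stop, "monitor": monitor}


def run(action: str) -> int:
    """Dispatch an action name to its handler and return the exit code."""
    handler = ACTIONS.get(action)
    if handler is None:
        return OCF_ERR_UNIMPLEMENTED
    return handler()
```

The Interval and Timeout fields of a resource operation would then govern how often the cluster invokes `monitor` and how long it waits for an answer.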

(13)

Configuration Tools

Cluster Resource Manager (CRM)

Powerful Command line tool

YaST

Basic Cluster setup

DRBD

IP Load balancing

High Availability Web Konsole (Hawk)

Web based

(14)

Architecture

(15)

Resource Agent Constraints

Constraint types: Location, Colocation, Order

[Diagram: resources IPAddr (1), Web Server (2) and Database related by constraints]
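
The three constraint types can be modelled in a few lines. This is a toy model under stated assumptions, not how the cluster resource manager evaluates constraints: the resource names, the node names `node1` and `node2`, and all scores are made up, and chained colocations are not handled.

```python
from graphlib import TopologicalSorter

# Order: each resource maps to the resources that must start before it.
ORDER = {"web-server": {"ip-addr"}, "ip-addr": set(), "database": set()}

# Colocation: a resource maps to the resource whose node it must share.
COLOCATION = {"web-server": "ip-addr"}

# Location: per-node preference scores (hypothetical); the highest wins.
LOCATION = {
    "ip-addr": {"node1": 100, "node2": 50},
    "database": {"node1": 10, "node2": 200},
}


def start_order(order: dict) -> list:
    """Return one start sequence that respects every order constraint."""
    return list(TopologicalSorter(order).static_order())


def place(resource: str) -> str:
    """Pick a node: follow the colocation first, then the best location score."""
    anchor = COLOCATION.get(resource, resource)
    scores = LOCATION[anchor]
    return max(scores, key=scores.get)
```

Here `place("web-server")` follows the IP address to `node1`, `place("database")` prefers `node2`, and any sequence returned by `start_order(ORDER)` brings up the IP address before the web server.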

(16)

Shared Storage

What's the problem here?

(17)

Shared Storage

A common view for all nodes

Node Failure: One node can pick up from where the other left

Local filesystems don't work

Data corruption when one node writes to files that other nodes are accessing

Node Cache Inconsistencies

(18)

DLM

Distributed Lock Manager

Provides a Cluster-wide locking for data access

Different lock levels for a wide variety of uses

No centralized control

Easy takeover

The node accessing the object first gets to create the distributed lock object

Lock Value Block (LVB) for data synchronization

(19)

DLM

Distributed Lock Manager

Lock mode compatibility

Mode  NL   CR   CW   PR   PW   EX
NL    Yes  Yes  Yes  Yes  Yes  Yes
CR    Yes  Yes  Yes  Yes  Yes  No
CW    Yes  Yes  Yes  No   No   No
PR    Yes  Yes  No   Yes  No   No
PW    Yes  Yes  No   No   No   No
EX    Yes  No   No   No   No   No
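
The compatibility matrix translates directly into a lookup table. The sketch below encodes it and checks whether a requested mode can be granted alongside the modes already held by other nodes. The EX row is filled in by symmetry with the EX column (EX is compatible only with NL); `COMPATIBLE` and `can_grant` are illustrative names, not part of the DLM API.

```python
# For each lock mode, the set of modes it can coexist with.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},  # derived from the EX column of the matrix
}


def can_grant(requested: str, held_modes: list) -> bool:
    """True if `requested` is compatible with every mode currently held."""
    return all(held in COMPATIBLE[requested] for held in held_modes)
```

For example, a protected read (PR) can be granted while other nodes hold PR or CR locks, but an exclusive (EX) request has to wait until every other holder is at NL.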

(20)

cLVM

Clustered Logical Volume Manager

Logical Volume Manager for the cluster

Add or remove devices as storage needs change

Linear

Add storage as required

Simple addition of devices in a linear form

Mirrored

Redundancy of devices over the cluster
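
The linear layout described above amounts to concatenating devices end to end. A minimal sketch of the address mapping follows; it is a simplification, since a real volume manager maps fixed-size extents rather than single offsets, and the device names and sizes in the usage note are hypothetical.

```python
def locate(devices: list, offset: int) -> tuple:
    """Map an offset in a linear volume to (device name, offset on that device).

    `devices` is an ordered list of (name, size) pairs, concatenated in order.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    for name, size in devices:
        if offset < size:
            return name, offset
        # Skip past this device and keep looking in the next one.
        offset -= size
    raise ValueError("offset beyond end of volume")
```

With `devices = [("disk-a", 100), ("disk-b", 50)]`, offset 100 lands at the start of `disk-b`. Growing the volume is just appending another `(name, size)` pair, which is why adding storage to a linear volume is simple.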

(21)

DRBD

Distributed Replicated Block Device

[Diagram: DRBD running on a Primary Node and a Secondary Node; one label reads "Disk 2"]

(22)

DRBD

Distributed Replicated Block Device

Shared Storage without a SAN

Governed by network speeds

RAID1 over network block device and local device

Fully synchronous, memory synchronous or asynchronous modes of operation

Dual Primary for clustered filesystems
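
The three modes of operation differ only in when a write is reported as complete to the application. The sketch below captures that rule, following the DRBD documentation, which calls the asynchronous, memory synchronous and fully synchronous modes protocols A, B and C. `Mode`, `WriteState` and `write_complete` are hypothetical names for illustration, not DRBD code.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ASYNCHRONOUS = "A"
    MEMORY_SYNCHRONOUS = "B"
    FULLY_SYNCHRONOUS = "C"


@dataclass
class WriteState:
    local_disk_done: bool   # the local disk write has finished
    in_send_buffer: bool    # the replication packet is in the local TCP send buffer
    peer_received: bool     # the replication packet has reached the peer node
    peer_disk_done: bool    # the peer has confirmed its disk write


def write_complete(mode: Mode, state: WriteState) -> bool:
    """Return True once the write may be acknowledged to the application."""
    if not state.local_disk_done:
        return False
    if mode is Mode.ASYNCHRONOUS:
        return state.in_send_buffer
    if mode is Mode.MEMORY_SYNCHRONOUS:
        return state.peer_received
    return state.peer_disk_done
```

The trade-off is latency against safety: the asynchronous mode acknowledges earliest but can lose the most recent writes on failover, while the fully synchronous mode waits for a network round trip and the peer's disk on every write, which is why throughput is governed by network speed.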

(23)

Shared Filesystem

ocfs2

Simultaneously mounted on all nodes

All nodes should have access to the data

Cluster Filesystem with a good throughput

Should not be bogged down with multiple access

Forces a cache flush on other nodes using the filesystem when the current node accesses the file

(24)

OCFS2 Features

B-tree Extent based

Inline data

Indexed Directories

Metadata ECC

Refcount

Extended Attributes

ACL

Quota

Read man mkfs.ocfs2 for more information.

(25)

Usage Scenarios

Database Server Applications

Common database repository

CTDB (Samba)

Web Servers

WWW root

Highly Available Virtual Machines

VM Disks

(26)

Generic Clustering Tips

Always use STONITH

Recommend multiple STONITH devices

SBD is an alternative

Time synchronization

Read the logs when things are not working

Record Time of event

Redundant Communications

Network Device bonding

Redundant Ring Protocol

(27)

OCFS2 Tips

Prefer hardware based RAID (mirroring)

If you don't want a feature don't enable it

Quota, ACL and indexed directories use additional data on disk and additional lookups

You can always enable a feature later using tunefs.ocfs2

Indexed directories for a large number of files

MetaECC for better data protection

Protects filesystem from getting corrupted further

Immediately run fsck.ocfs2

(28)

Thank you.

SUSE® Linux Enterprise High Availability Extension

https://www.suse.com/products/highavailability/

(29)
(30)

Unpublished Work of SUSE LLC. All Rights Reserved.

This work is an unpublished work and contains confidential, proprietary and trade secret information of SUSE LLC.

Access to this work is restricted to SUSE employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of SUSE.

Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability.

General Disclaimer

This document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. SUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for SUSE products remains at the sole discretion of SUSE. Further, SUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All SUSE marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.
