
(1)

Red Hat Cluster Suite

HP User Society / DECUS

17 May 2006

Joachim Schröder

Red Hat GmbH

(2)

Two Key Industry Trends

Clustering (scale-out) is happening

20% of all servers shipped will be clustered by 2006. - Gartner

Linux clusters are growing at a CAGR of 44%. - IDC

30%+ of Red Hat inquiries are about implementing clusters.

Clustering software is a $500m+ market today.

Storage capacity and complexity are growing

Storage (capacity) will continue to grow at 40-60% annually. - Aberdeen

SAN management software use grew by 36% year over year, driven "primarily by customer need to support larger and more complex storage networks". - IDC

(3)
(4)

Clustering – motivation

Reduce TCO by:

Consolidating services onto fewer systems

Simplifying management tasks

Increasing service uptime

Approaches:

Scale-up: a few very powerful systems running multiple services

Scale-out: many small systems collaborating on several services

Requirements:

Data sharing

Monitoring infrastructure

Service management layer

(5)

Clustering – “buzzwords”

Application Failover

Detect service failures and fail over to standby systems

Load Balancing

Distribute service load over multiple systems

Common for network based systems

Active/Passive

Active node owns service, passive node is idle

Failover requires “hand-over” of resources to the passive node

Advantage: works with generally any application

Active/Active

Multiple nodes own the service

Typically requires data sharing

(6)

Red Hat Clustering Architecture

• Red Hat Cluster Suite provides

● Application failover

● Improves application availability

● Included with GFS

● Core services for enterprise cluster configurations*

• Red Hat Global File System (GFS)

● Cluster-wide concurrent read-write file system

● Improves cluster availability, scalability and performance

● Includes Cluster Logical Volume Manager (CLVM)*

[Diagram: software stack. Red Hat Enterprise Linux: single-node LVM2. Red Hat Cluster Suite: HA services (failover), IP load balancing, and core services* (DLM, connection manager, service manager, I/O fencing, heartbeats, management GUI). Red Hat Global File System (GFS): Cluster Logical Volume Manager*, cluster file system.]

* These features will be available with the launch of Cluster Suite v4 and GFS 6.1 – availability concurrent with RHEL4 Update 1

(7)

Red Hat Cluster Suite: Core Cluster Services

• Core functionality for both Clustering and GFS is delivered in Red Hat Cluster Suite

● Membership management

● I/O fencing

● Lock management

● Heartbeats

● Management GUI

• Support for up to 300 nodes

• Two selectable lock management models

● Client-server with SLM/RLM (single/redundant lock manager)

● Was in prior version

● Distributed Lock Manager (DLM)

● New with Red Hat Cluster Suite v.4

(8)

Red Hat Cluster Suite Overview

• Provides two major technologies

● High Availability failover – suitable for unmodified applications

● IP Load Balancing – enables network server farms to load share IP load

• New with Cluster Suite v4

● Elimination of requirement for shared storage

● Significantly reduces the cost of high availability clustering

● Shared Quorum partition is no longer required

● Service state, previously stored in Quorum partition, is now distributed across cluster

● Online resource management modification

● Allows services to be updated without shutting down (where possible)

● Additional fencing agents

(9)

Red Hat Cluster Manager Architecture

[Diagram: two-node cluster (Node 0, Node 1) with clients, network power switch, SAN, and heartbeat; services shown: Oracle, NFS, bind]

(10)

Handling Failures

• Software failures

● Monitoring provided by resource agents handles restarting

● If a resource fails and is correctly restarted, no other action is taken

● If the resource fails to restart, RHCM stops the entire service and relocates it to another node

• Hardware and cluster failures

● If the cluster infrastructure evicts a node or nodes from the cluster, RHCM selects new nodes for the services they were running, based on the failover domain if one exists

● If a NIC fails or a cable is pulled (but the node is still a member of the cluster), the service is relocated

Double faults: it is usually difficult or impossible to choose a universally correct course of action when one occurs. Example: a node with iLO losing all power vs. having all of its network cables pulled.
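The restart-then-relocate policy on this slide can be sketched in a few lines of Python. This is only an illustration, not RHCM code: the `Service` class, the helper names, and the treatment of a failover domain as an ordered list of allowed nodes are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Service:
    name: str
    node: str  # node currently running the service
    # Hypothetical model: ordered list of allowed nodes; empty means "no domain".
    failover_domain: List[str] = field(default_factory=list)


def pick_node(service: Service, members: List[str]) -> Optional[str]:
    """Choose a new node among current members, excluding the current one."""
    candidates = [n for n in members if n != service.node]
    if service.failover_domain:
        # Restrict to the domain and keep the domain's preference order.
        candidates = [n for n in service.failover_domain if n in candidates]
    return candidates[0] if candidates else None


def handle_resource_failure(service: Service, members: List[str],
                            restart: Callable[[Service], bool]) -> str:
    """Software failure: try a local restart first, relocate only if it fails."""
    if restart(service):
        return service.node  # restarted in place, no other action taken
    target = pick_node(service, members)
    if target is None:
        raise RuntimeError(f"no node available for service {service.name}")
    service.node = target  # stop and relocate the entire service
    return target


def handle_node_eviction(services: List[Service], evicted: str,
                         members: List[str]) -> None:
    """Hardware/cluster failure: move every service that ran on the evicted node."""
    survivors = [n for n in members if n != evicted]
    for svc in services:
        if svc.node == evicted:
            target = pick_node(svc, survivors)
            if target is not None:
                svc.node = target
```

The sketch deliberately ignores fencing and double faults, which, as the slide notes, have no universally correct handling.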

(11)

After a node failure...

[Diagram: the same two-node cluster (Node 0, Node 1) after a node failure, with clients, network power switch, SAN, and heartbeat; services shown: Oracle, NFS, bind]

(12)
(13)
(14)

Data Sharing

Purpose: parallel access to data resources

Files and Directories

Requires identical or single (shared) data

Data Sharing technologies

SAN – block based sharing

NAS – file based sharing

Data Replication on cluster nodes

File or block based, typically not symmetric

(15)

Red Hat Global File System v6.1

• New version for Red Hat Enterprise Linux v.4

● Uses new common cluster infrastructure in Red Hat Cluster Suite (included)

• Provides two major technologies

● GFS cluster file system – concurrent file system access for database, web serving, NFS file serving, HPC, etc. environments

● CLVM cluster logical volume manager

• Fully POSIX compliant

• Much faster fsck

● Ported to GFS 6.0 on RHEL 3 Update 5

• Data and meta-data journaling (per-node journals, cluster-wide recovery)

• Maximum file size and file system size: 16 TB on 32-bit systems, 8 EB on 64-bit systems

• Supports file system expansion

• Supports SAN, iSCSI, GNBD

• Default use of new Distributed Lock Manager (DLM)
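The 16 TB and 8 EB limits quoted above line up with simple powers of two. The slides do not say where the limits come from; the short Python check below assumes one common explanation, namely a 32-bit page index with 4 KiB pages on 32-bit systems and a signed 64-bit byte offset on 64-bit systems.

```python
# Assumptions (not stated in the slides): 4 KiB pages, a 32-bit page index on
# 32-bit systems, and a signed 64-bit byte offset on 64-bit systems.
PAGE_SIZE = 4 * 1024

limit_32_bit = 2**32 * PAGE_SIZE  # bytes addressable via a 32-bit page index
limit_64_bit = 2**63              # bytes addressable via a signed 64-bit offset

TIB = 2**40
EIB = 2**60

print(limit_32_bit // TIB, "TiB")  # 16 TiB
print(limit_64_bit // EIB, "EiB")  # 8 EiB
```

Under these assumptions the "TB" and "EB" on the slide are binary units (TiB and EiB).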

(16)

Distributed Lock Manager

• Red Hat Cluster Suite v.4 includes a Distributed Lock Manager (DLM)

● Primarily used by Red Hat Global File System

● Available for general purpose use by any application

● Closely mirrors the original Digital VMS DLM

• A DLM is a highly functional, distributed (cluster-wide), application synchronization subsystem

● Processes use the DLM to synchronize access to a shared resource (e.g. a file, program, or device) by establishing locks on named resources

● Permits the creation of distributed applications

● e.g. Oracle RAC (which uses a private DLM)

● Provides a collection of services

● Multiple lock spaces and concurrency (lock) modes

● Lock hierarchies/domains (resources & subresources)

● Range locking

● Lock conversions & value blocks

● Blocking notifications
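To make "concurrency (lock) modes" and "lock conversions" concrete, here is a small single-process sketch. It uses the six lock modes and compatibility rules of the VMS-style DLM model that the slide refers to (NL, CR, CW, PR, PW, EX); these names do not appear in the slides. The `ToyLockSpace` class is hypothetical and is not the Red Hat DLM API, and it does nothing distributed.

```python
from collections import defaultdict

# VMS-style lock modes: null, concurrent read, concurrent write,
# protected read, protected write, exclusive.
# COMPATIBLE[m] = set of modes that may be held by others while m is granted.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}


class ToyLockSpace:
    """Hypothetical in-memory lock space: named resources, one lock per owner."""

    def __init__(self):
        self._granted = defaultdict(dict)  # resource name -> {owner: mode}

    def try_lock(self, resource: str, owner: str, mode: str) -> bool:
        """Grant a new lock, or convert the owner's existing lock, if the
        requested mode is compatible with every other holder's mode."""
        holders = self._granted[resource]
        for other, held in holders.items():
            if other != owner and held not in COMPATIBLE[mode]:
                return False  # a real DLM would queue the request instead
        holders[owner] = mode
        return True

    def unlock(self, resource: str, owner: str) -> None:
        self._granted[resource].pop(owner, None)


locks = ToyLockSpace()
print(locks.try_lock("file-a", "node0", "PR"))  # True: first reader
print(locks.try_lock("file-a", "node1", "PR"))  # True: readers share
print(locks.try_lock("file-a", "node2", "EX"))  # False: blocked by readers
```

A real DLM would queue the blocked request and send blocking notifications to the current holders rather than simply returning a failure.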

(17)

Cluster Logical Volume Manager (CLVM)

CLVM builds upon LVM 2.0 and the kernel device mapper component included in 2.5 and 2.6 Linux kernels

Essentially a cluster-aware version of LVM 2.x

Commands, features, and functions all work in a cluster; any Linux server may mount any volume

Provides

● Cluster-safe volume operations

● Cluster-wide concatenation and striping of volumes

● Dynamic volume resizing

● Cluster-wide snapshots (mid '06)

● Cluster-wide mirroring (mid '06)

● Other RAID levels (mid '06)

(18)

Use Case: High availability with local data

• Red Hat Enterprise Linux (ext3) + Red Hat Cluster Suite v4

• Requires NO ADDITIONAL HARDWARE

● No physically shared storage

● Prior to RHEL4, physically shared storage was required

• Ensures that an application stays running

● Monitors all cluster nodes

● Fails over applications/services from stopped nodes

● Simple management GUI

● Service definitions are automatically propagated and synchronized, cluster-wide

Topology:

- 2 or more nodes

- NIC card

- NO shared storage

Applications:

- Read-only data

- Small web serving

- NFS/FTP serving

- Edge of network

Data model:

- Unshared, static data

- Replication w/ rsync

- ext3 file system

[Diagram: local storage only, application failover]

(19)

Use Case: Shared file datastore, NAS

• Red Hat Enterprise Linux (NFS)

• Uses an external NFS file server

Data model:

- Physically shared data

- File based (NFS)

Applications:

- Small/medium web serving

- NFS/FTP serving

- Read/write data

- File level access

Additional cluster features:

- Increased storage capacity

- Applications can fail over and access the same data as before

- NFS sharing model

[Diagram: HA cluster with file-based shared storage (NFS); cluster nodes are NFS clients of an external file server (NetApp, etc.)]

(20)

Use Case: Shared block datastore, SAN

• Red Hat Enterprise Linux (ext3, integrated HBA drivers)

• Offers higher, SAN-based, performance

Data models:

- Storage Area Network

- Physically shared storage

- Block based

Applications:

- Medium/large web serving

- Medium database

- Read/write data

- Block level access

Additional cluster features:

- SAN access provides:

-- improved performance

-- heterogeneous access by other systems

[Diagram: block-based shared storage (SAN); partitions accessed on a per-node basis]

(21)

Use Case: Concurrent file system access

• Red Hat Enterprise Linux + Red Hat Global File System

● Includes Red Hat Cluster Suite

• Shared file system access

• Scalable performance

• Same hardware as previous configurations

• Common software infrastructure with previous configuration

Data models:

- Storage Area Network

- Shared file system access

- Block based

- GFS file system

Applications:

- Large web serving

- Large database

- Read/write data

- Block level access

- Concurrent access

Additional cluster features:

- SAN access provides:

-- improved performance

-- heterogeneous access by other systems

[Diagram: block-based shared storage (SAN); partitions concurrently accessed by all nodes]

(22)

About Piranha & IPVS

• Piranha

● Front-end to IPVS (or LVS) which enables IPVS director failover

● Provides a web-based configuration GUI

• IPVS – does the heavy lifting

● Layer-4 (transport) switching/routing

● Multiple scheduling algorithms: round-robin, weighted round-robin, least-connections, weighted least-connections, and more...

● Several routing topologies: NAT, Direct Routing, IP Tunneling

● DR and TUN are tricky to set up, and require configuration on the Real Servers
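The four scheduling algorithms named above can be illustrated with a short Python sketch. This is a simplified model, not IPVS code: the `RealServer` class and function names are made up for the example, and the weighted round-robin shown here is a naive expansion rather than the kernel's actual algorithm.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class RealServer:
    name: str
    weight: int = 1  # relative capacity
    active: int = 0  # current active connections


def round_robin(servers: List[RealServer]) -> Iterator[RealServer]:
    """Hand out servers in turn, ignoring weight and load."""
    return cycle(servers)


def weighted_round_robin(servers: List[RealServer]) -> Iterator[RealServer]:
    """Naive variant: a server with weight w appears w times per cycle."""
    return cycle([s for s in servers for _ in range(s.weight)])


def least_connections(servers: List[RealServer]) -> RealServer:
    """Pick the server with the fewest active connections."""
    return min(servers, key=lambda s: s.active)


def weighted_least_connections(servers: List[RealServer]) -> RealServer:
    """Pick the server with the lowest connections-to-weight ratio."""
    usable = [s for s in servers if s.weight > 0]
    return min(usable, key=lambda s: s.active / s.weight)


pool = [RealServer("RS0", weight=1, active=3),
        RealServer("RS1", weight=2, active=4),
        RealServer("RS2", weight=1, active=5)]
print(least_connections(pool).name)           # RS0
print(weighted_least_connections(pool).name)  # RS1
```

With the sample pool, plain least-connections picks RS0, while the weighted variant prefers RS1 because its load relative to its weight is lowest.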

(23)

Piranha Architecture

• Appears as one mega-machine to clients: SSI (single system image) at the IP level

• One or two nodes acting as IPVS directors

● Stand-alone cluster: not dependent on CMAN, GULM, etc.

● IPVS routing table is replicated between the primary and backup directors for hot-standby operation

● No dependencies on shared storage (no shared data to protect)

• Real servers can use GFS, NFS, or manual replication (using rsync, for example) for static content

• Provides high availability for applications via redundant Real Servers and

(24)

Piranha Architecture

[Diagram: traffic from the Internet reaches a primary and a backup director in front of real servers RS0, RS1, and RS2. Legend: heartbeat, virtual IP, director, real server.]

(25)

Piranha – Real Server Failure

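[Diagram: the same layout (primary and backup directors, real servers RS0, RS1, RS2, Internet) after a real server failure. Legend: heartbeat, virtual IP, director, real server.]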

(26)

Piranha – Director Failure

[Diagram: the same layout (primary and backup directors, real servers RS0, RS1, RS2, Internet) after a director failure. Legend: virtual IP, heartbeat, director, real server.]
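The director failure case can be sketched as a heartbeat timeout on the backup director: once no heartbeat has arrived for a configured interval, the backup takes over the virtual IP. The Python model below is a minimal illustration under that assumption. The class name, the `deadtime` parameter, and the absence of any failback logic are choices made for this example, not Piranha internals.

```python
from dataclasses import dataclass


@dataclass
class BackupDirector:
    deadtime: float               # seconds of silence before takeover (assumed parameter)
    last_heartbeat: float = 0.0   # time the last heartbeat was seen
    owns_virtual_ip: bool = False

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the primary director."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over the virtual IP if the primary has been silent too long.
        Returns True when this director owns the virtual IP."""
        if not self.owns_virtual_ip and now - self.last_heartbeat > self.deadtime:
            self.owns_virtual_ip = True  # a real director would also start routing
        return self.owns_virtual_ip


backup = BackupDirector(deadtime=6.0)
backup.on_heartbeat(now=10.0)
print(backup.check(now=14.0))  # False: primary seen 4 s ago
print(backup.check(now=17.0))  # True: 7 s of silence exceeds the deadtime
```

Timestamps are passed in explicitly so the behaviour is deterministic; a real implementation would read a monotonic clock.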

(27)

Red Hat and GFS Joint Linux Clustering Solution

HP Serviceguard for Linux creates a cluster of Linux servers that make application services available despite hardware or software failures or planned downtime.

Red Hat Global File System (GFS) is a cluster file system that enables multiple servers to simultaneously read and write to a single shared file system on a Storage Area Network.

Recently tested together, the two products now provide best-of-breed clustering technologies for Linux. The combination offers greater availability of applications and data using a cluster of low-cost Linux servers.

White Paper

Solution Brief

Joint marketing

New!

(28)
(29)

Thank you!

Joachim Schröder, Solution Architect

[email protected]
