Oracle Solaris Cluster
Gia-Khanh Nguyen
The following is intended to outline our general product direction. It is intended
for information purposes only, and may not be incorporated into any contract. It
is not a commitment to deliver any material, code, or functionality, and should
not be relied upon in making purchasing decisions. The development, release,
and timing of any features or functionality described for Oracle’s products
remains at the sole discretion of Oracle.
Oracle Solaris Cluster
Datacenter Evolution
From Traditional Datacenters to Cloud Infrastructure
[Figure: datacenter stages (Dedicated Servers, Virtualized Systems, Mission Critical Clouds) shown alongside Solaris 9, Solaris 10, and Solaris 11]
21st Century Cloud Infrastructure
Highly Available For All Mission Critical Applications
[Figure: software stack with Oracle Solaris Cluster on Oracle Solaris, hosting a Solaris 11 Zone, a Solaris 10 Zone, and a Solaris Legacy Zone; Oracle VM; SPARC and x86 hardware; managed with Oracle Enterprise Manager Ops Center]
Oracle Solaris Cluster
For Mission Critical Clouds
Virtual Clusters
• Built-in server, storage, network virtualization
• Multi-tenant configuration from Web to database
• Secure, isolated Oracle Solaris Zone clusters
• Application fault isolation in zone clusters
• Failover Zones
• Dedicated zone network and data resources
Cluster-level Load Balancing
• Optimized distribution of application load, with priority management for optimized distribution
• Soft and hard resource limits for flexible behavior
Enterprise High Availability
• Instant system failure detection
• Orchestrated policy-based, application-specific failover
• Pre-adapted fencing and quorum for data integrity
• Broadest data management and networking support
Ultimate Disaster Recovery
• Business continuity across unlimited distance
• One-click, automated switchover
• One-click, automated takeover
• Physical and virtualized environments
Best HA. Best Integration. Oracle Solaris 11.
Oracle Solaris Cluster Functions
• Monitors health of all cluster components:
– Servers, storage, network
– OS, virtual machines
– Applications
• Tolerates failures by exploiting hardware redundancy and software algorithms
• Recovers cluster infrastructure and applications in the event of failures
High Availability Framework
Heartbeats
• Monitors nodes over a private network
• Triggers reconfiguration when a node leaves or joins the cluster
Membership
• Establishes consistent inclusion of nodes in cluster
• Coordinates reconfiguration
Cluster Configuration Repository
• Provides local copy on each node
• Enables automatic updates
• Enables nodes to arbitrarily join or leave the cluster
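The heartbeat and membership points above can be illustrated with a minimal Python sketch. It is not the cluster framework's implementation: the function names, the timeout value, and the timestamps are all hypothetical, chosen only to show how missed heartbeats could translate into a membership change that triggers reconfiguration.

```python
def live_members(last_heartbeat, now, timeout):
    """Return the nodes whose most recent heartbeat is no older than `timeout` seconds."""
    return {node for node, seen in last_heartbeat.items() if now - seen <= timeout}


def needs_reconfiguration(previous, current):
    """A reconfiguration is needed whenever the set of members changes."""
    return previous != current


# Hypothetical timestamps (seconds) of the last heartbeat seen from each node.
last_heartbeat = {"node1": 100.0, "node2": 91.0}
previous = {"node1", "node2"}

current = live_members(last_heartbeat, now=101.0, timeout=5.0)
if needs_reconfiguration(previous, current):
    print("membership changed:", sorted(current))
```

In this run node2 has been silent for 10 seconds, longer than the assumed 5-second timeout, so it drops out of the membership and a reconfiguration would follow.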
Preserving Data Integrity
• Quorum
– Prevents partitions (split brain, amnesia) in the cluster
– Protects against data corruption
– Uses a majority voting scheme
– Two-node clusters require a quorum device (an external tie-breaker)
• Disk Fencing
– Used to preserve data integrity
– Non-cluster nodes are fenced off from updating any shared data
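The majority voting scheme can be made concrete with a small Python sketch. It assumes, purely for illustration, one vote per node and one vote for the quorum device; the function name is hypothetical and this is not the product's quorum algorithm.

```python
def has_quorum(votes_present, total_votes):
    """A partition may keep running only if it holds a strict majority of all configured votes."""
    return votes_present > total_votes // 2


# Two nodes, no quorum device: after a split, each side holds 1 of 2 votes.
print(has_quorum(votes_present=1, total_votes=2))  # False: neither side may continue

# Two nodes plus a quorum device (3 votes in total, assumed one vote each):
print(has_quorum(votes_present=2, total_votes=3))  # True: the node that wins the device continues
print(has_quorum(votes_present=1, total_votes=3))  # False: the other node must stop
```

This is why a two-node cluster needs the external tie-breaker: without it, a split leaves both sides at exactly half the votes and neither has a majority.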
Application Level Management
Resource Group Manager
• Rich and extensible framework for plugging applications into Oracle Solaris Cluster
– Application is wrapped by an RGM resource, supplying methods for controlling the application: Start, Stop, Monitor, Validate (aka agent)
– Closely related resources placed in Resource Groups (RG)
• Supports rich dependencies between Resources and RGs
– Facilitates proper startup/shutdown sequencing
– Dependencies can have various flavors such as strong/weak/restart
– Works between RGs and across different cluster nodes
• Oracle Solaris Service Management Facility (SMF) support
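How dependencies drive startup/shutdown sequencing can be sketched with a topological sort. The Python example below uses the standard-library graphlib module (Python 3.9+); the resource names are hypothetical, and the sketch ignores the strong/weak/restart flavors, so it illustrates the idea rather than the RGM's behavior.

```python
from graphlib import TopologicalSorter

# Hypothetical resources: each key maps to the set of resources it depends on.
dependencies = {
    "storage-rs": set(),
    "hostname-rs": set(),
    "app-rs": {"storage-rs", "hostname-rs"},
}

# Start order: every resource comes after the resources it depends on.
start_order = list(TopologicalSorter(dependencies).static_order())

# Stop order: the reverse, so dependents are stopped before what they rely on.
stop_order = list(reversed(start_order))

print("start:", start_order)
print("stop: ", stop_order)
```

The application resource always starts last and stops first; the relative order of the two independent resources is not significant.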
Simple example of an HA web server
Apache web server on ZFS
Install Apache and set up web pages and scripts using ZFS file systems in a zpool “webpool” on shared storage and using the webhost hostname
Create the resource group and its logical hostname resource
# clrg create websrv-rg
# clrslh create -g websrv-rg \
  -h webhost websrv-lh-rs
Create the storage resource for the zpool
# clrt register HAStoragePlus
# clrs create -t HAStoragePlus -g websrv-rg \
  -p zpools=webpool websrv-hasp-rs
Create the Apache resource, dependent on the storage resource
# clrt register apache
# clrs create -g websrv-rg -t apache \
  -p Bin_dir=/webpool/install/apache/bin \
  -p Resource_dependencies=websrv-hasp-rs \
  -p Port_list=80/tcp websrv-rs
Bring the resource group online
# clrg online websrv-rg
Switchover to the other node
# clrg switch -n 2 websrv-rg
Oracle VM for SPARC Cluster Models
Virtualization
• Guest domain as HA resource (black box): Oracle Solaris Cluster running only in control domain
• Domain as cluster node (control, I/O or guest domain)
[Figure: the two models on Oracle VM for SPARC, showing control domains and other domains running Oracle Solaris, Oracle Solaris Cluster, and applications]
High Availability Designed for Virtualization
Oracle Solaris Zones HA Deployments
Zone Clusters
Ideal for multi-tiered workloads and multi-tenants
• Application protection: policy-based management
• Ease of use and security: delegated administration extended to virtual cluster
Failover Zone
Ideal for packaged workloads
• Zone level protection: resource dependencies management, restart and failover
• Ease of migration: support for older Oracle Solaris environments in a zone
[Figure: a physical cluster with an HR Cloud zone cluster and a Finance Cloud zone cluster running applications, a failover zone, and Oracle VM for SPARC]
Application-specific HA
Efficient Availability. Multi-tier Savings.
• Multi-tenant clusters
• Combine application, web and database tiers
• Application-specific failover for virtualized applications
Example of Multi-tier consolidation
PeopleSoft HCM in Zone Clusters with ZFS SA 7000
# clzc status
=== Zone Clusters ===
--- Zone Cluster Status ---
Name        Node Name      Zone Host Name   Status   Zone Status
----        ---------      --------------   ------   -----------
appsrv-zc   people01-s10   ipseapp1         Online   Running
            people02-s10   ipseapp2         Online   Running
websrv-zc   people01-s10   ipseweb1         Online   Running
            people02-s10   ipseweb2         Online   Running
dbsrv-zc    people01-s10   ipsedb1          Online   Running
            people02-s10   ipsedb2          Online   Running
Example of Multi-tier consolidation
PeopleSoft HCM in Zone Clusters with ZFS SA 7000
# hostname
people01-s10
# clnode list
people01-s10
people02-s10
# clrg list | wc -l
0
# zlogin websrv-zc
# hostname
ipseweb1
# clnode list
ipseweb1
ipseweb2
# clrg list
websrv-lb-rg
weblh1-rg
weblh2-rg
websrv1-rg
websrv2-rg
scalmnt-rg
# clrs list
web-lb-rs
web-lblh-rs
weblh1-rs
web-admin-rs
adminlh-rs
weblh2-rs
web-PIA1-rs
web-PIA3-rs
web-PIA2-rs
web-PIA6-rs
web-PIA5-rs
web-PIA4-rs
fs-psftweb-rs
# clrt list
SUNW.LogicalHostname:4
SUNW.SharedAddress:2
SUNW.wls:4
SUNW.ScalMountPoint:3
Configurations addressing Disaster Recovery
Multi-site stretched/campus cluster
Multi-site, multiple clusters
Shared-storage Campus Cluster
[Figure: a single cluster spanning Room 1 and Room 2, with a quorum server in Room 3. The rooms are connected by the public network and a private IP network, and the FC/AL switches are joined by a fiber link. No distance limit.]
Replication-based Campus Cluster
[Figure: a single cluster spanning Room 1 and Room 2, with a quorum server in Room 3. The rooms are connected by the public network and a private IP network, and Array 1 and Array 2 are joined by an array-based replication link. No distance limit.]
Geographic Edition enabled Clusters
[Figure: a primary site and a backup site connected over the public IP network (subnet 1, subnet 2), with an admin client, an optional heartbeat network, and an optional storage network]
OSCGE Architectural model
• Layered Extension of Solaris Cluster
– Built on Solaris Cluster (SPARC, x64)
– Can be installed/removed with no application downtime
– Manages cluster applications and associated replication
– Builds on same hierarchical concepts with associated admin CLI
• Partnership of two clusters (physical or zone clusters)
– Each cluster has one or more nodes
• Protection Group contains Resource Groups and Resources
– Resource Group contains application and storage resources
• Heartbeat for connection monitoring in partnership
– Standard TCP/IP heartbeat, optional plug-ins
– Alerts by email, SNMP trap, custom script
OSCGE Architectural model (continued)
• Protection Groups are the managed entities
– Within a Cluster, Resource Groups fail/switch over
– Within a Partnership, Protection Groups fail/switch over
• “Big Red Button” switch-over of Protection Group
– Synchronized switchover of application and replication
– Action script is triggered
– Can be used to fail-over name server entries, etc.
• “One-click” manual switchover by design
– Standard Business Continuity practice
– Enables integration with non-IT parts of BC plan (people, buildings, etc.)
Oracle Solaris Cluster
Fastest Failure Detection and Recovery
Largest Portfolio of Applications
Out of the Box Support. Top Enterprise Applications.
Tiers covered: Web Tier / Presentation Tier, Business Logic Tier, Database Tier, Infrastructure Tier
• Oracle Communications Calendar Server
• Oracle Communications Instant Messaging Server
• Oracle Communications Messaging Exchange Server
• Oracle WebLogic Server
• Oracle E-Business Suite
• Oracle Application Server
• Oracle Siebel CRM
• Oracle Business Intelligence Enterprise Edition
• Oracle PeopleSoft Enterprise
• Oracle DB and Oracle RAC
• Sybase ASE
• Informix IDS
• MySQL
• SAP/MaxDB Database
• PostgreSQL
• TimesTen
• Sun Grid Engine
• DNS, NFS, DHCP
• ZFS, QFS
• Samba
• Oracle iPlanet Web Server
• Oracle iPlanet Proxy Server
• Apache Web/Proxy Server
• Apache Tomcat
• Sun Java System Message Queue, Directory Server
• Agfa IMPAX
• IBM WebSphere MQ, WebSphere Message Broker
• SAP liveCache, J2EE Engine, Enqueue Server
• SWIFT Alliance Access/Gateway
• Oracle Solaris Zone
• Oracle VM for SPARC
GUI to add custom applications
• No scripting or development required
Best Availability for Enterprise Applications
Pre-tested with the most networking, storage, and data management components
Hardware
• SPARC
• x86
• Vertical: 1-106 processors
• Horizontal: 1-16 nodes
Protocol
• IPv4
• IPv6
Private Interconnect
• InfiniBand
• 2-6 independent interconnects
Public Network
• InfiniBand
• Fast Ethernet
• Gigabit Ethernet
• 10 Gigabit Ethernet
• IPMP
• Sun Trunking
• Jumbo Frames
• VLAN
Replication & Mirroring
• Oracle Data Guard
• Sun StorageTek Availability Suite
• MySQL Replicator
• Hitachi Universal Replicator and True Copy
• EMC SRDF
File System
• Root: ZFS, UFS, VxFS
• Failover file system: ZFS, UFS, NFS, QFS, VxFS
• Cluster file system: ACFS, PxFS, shared QFS
Storage Solutions
• Sun ZFS Storage Appliance
• EMC, Fujitsu, Hitachi, NetApp, and more
Storage Technologies
• Fiber Channel, SCSI, iSCSI
Volume Manager
• ZFS, SVM, VxVM, ASM
Managed Takeover or Managed Switchover
Best Availability for Oracle Applications
Complete Oracle Integration
• HA out-of-the-box for Oracle applications, middleware
• Seamless integration with Oracle database and Oracle RAC
• Built on Oracle Solaris, for Oracle Solaris
• Secure High Availability for virtualized environments
• Pre-tested, pre-engineered with SPARC, x86, SPARC SuperCluster and Sun storage
Oracle Solaris Cluster for SPARC SuperCluster
Ultimate Mission Critical Platform
• Co-engineered with SPARC SuperCluster
• Kernel-level Oracle Solaris integration
– Faster failure detection
– Faster services recovery
• Data integrity protection with ZFS SA
– NFS IO fencing and files lock release
• Combines mission-critical HA
[Figure: T4-4 Node 1 and T4-4 Node 2 on the internal InfiniBand network. Each node runs an APPS VM + S10 IO Domain with zone clusters for Oracle Web Proxy Server, Oracle WebLogic Server, and Oracle PeopleSoft Application Server, and a DB VM + S11 running Oracle RAC. Storage is a ZFS 7320 Storage Appliance cluster.]
Oracle Solaris Cluster
Availability Meets The Cloud
Oracle Solaris Cluster 4.0
High Availability
Cluster membership, Heartbeat
Disk fencing, Quorum,
Component, Storage Resources,
Quorum, Disk Path Monitoring
SMF Integration
Configuration Checker
Built-in Application support
Wide range of Oracle applications and databases, web-, application and infrastructure solutions
Security
Role based access control, zones security isolation
Virtualization
Oracle Zone cluster, failover zone
Oracle VM for SPARC: cluster node, failover node
Disaster Recovery
Stretched Cluster and Multi-site/Multi-cluster with automated failover and replication solution
Oracle Solaris 11
• Integrated end-to-end fault healing of hardware, OS and virtualization
• Production-safe observability
• Component-Level Dynamic Reconfiguration
• Zero overhead virtualization
• Secure live migration
• Dynamic Domains
• No software to install
• Solaris Zones
• SPARC / x86 Hypervisors
• Oracle VM Templates
• Instant provisioning
VMware
• HA for VMs
• vMotion
• Virtualization mgmt tools
• Support for x86
Availability
Full Integration with Oracle Solaris 11 delivery framework
– IPS based delivery
• Unified installation experience
• Error-free software updates
• Automatic patch dependencies resolution
– Boot environment
• Instant snapshot and rollback
• Lower risk updates
– Automated Installer
• Common provisioning tool
• Easy full stack, multi-node installation
Cloud-optimized software distribution
Unique HA Protection For Oracle Solaris 11
Virtualization
Cloud-ready application protection with Oracle Solaris Zone cluster for Oracle Solaris 11, Failover Zone for Oracle Solaris 10 and 11, and Oracle VM for SPARC
Software Life-cycle Management
Cloud-optimized software distribution with automated dependency analysis through Automated Installer and IPS packaging support
Disaster Recovery
Reliable protection from disaster through automated application failover and coordination for replication solutions such as StorageTek Availability Suite 4.0, Oracle Data Guard and script-based plug-in. Supports peer OSC3.3u1 cluster.
Built-in Application Management
Increased application availability, simplified service deployment and management for Apache, Apache Tomcat, DHCP, DNS, NFS, Oracle Database 11.2.0.3 (single instance and
Oracle Solaris Cluster
Mix of S10 and S11 zones in a 4-node OSC4.0 cluster
Nodes: ptomcat1, ptomcat2, ptomcat3, ptomcat4 (Solaris 11)
Per node: 1 processor, 6 cores / 24 threads, load limit = 20
L = load factor (the dedicated CPU count), P = priority

Zone     Class       L    P    OS
Zone 1   Critical    12   30   Solaris 11
Zone 2   Critical    12   30   Solaris 10
Zone 3   Important   8    20   Solaris 11
Zone 4   Important   8    20   Solaris 10
Zone 5   Test        8    10   Solaris 11
Zone 6   Test        8    10   Solaris 10
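The interplay of load factors, load limits, and priorities can be explored with a short Python sketch. It uses the figures from this slide, but the greedy policy (highest priority first, onto the least-loaded node that still fits) is an assumption made for illustration, and the names Zone and place are hypothetical. It is not the cluster's actual placement algorithm, so its results need not match the slides that follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    load: int      # load factor (dedicated CPU count on the slide)
    priority: int  # zones with a higher value are placed first


ZONES = [
    Zone("Zone 1", 12, 30), Zone("Zone 2", 12, 30),
    Zone("Zone 3", 8, 20), Zone("Zone 4", 8, 20),
    Zone("Zone 5", 8, 10), Zone("Zone 6", 8, 10),
]


def place(zones, nodes, load_limit):
    """Assumed greedy policy: highest priority first, onto the node with the most free capacity."""
    free = {node: load_limit for node in nodes}
    placement, offline = {}, []
    for zone in sorted(zones, key=lambda z: -z.priority):
        candidates = [n for n in nodes if free[n] >= zone.load]
        if not candidates:
            offline.append(zone.name)  # treated as a hard limit: the zone stays offline
            continue
        target = max(candidates, key=lambda n: free[n])
        free[target] -= zone.load
        placement[zone.name] = target
    return placement, offline


print(place(ZONES, ["ptomcat1", "ptomcat2", "ptomcat3", "ptomcat4"], load_limit=20))
print(place(ZONES, ["ptomcat3", "ptomcat4"], load_limit=20))
```

With all four nodes online, every zone fits within the load limit of 20. With only two nodes, the total capacity is 40 against a total load of 56, so under this assumed hard-limit policy the lowest-priority zones are the ones left offline.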
Zones availability gracefully maintained
Planned outage: maintenance shutdown of ptomcat1
[Figure: the four-node diagram (ptomcat1 to ptomcat4, Zones 1 to 6 with their load factors and priorities) during the shutdown]
Automatic optimal placement
Planned outage: switchover of zones completed
[Figure: the four-node diagram (ptomcat1 to ptomcat4, Zones 1 to 6 with their load factors and priorities) after the switchover]
Additional unplanned outage also handled
Human error: power off of ptomcat2
[Figure: the four-node diagram (ptomcat1 to ptomcat4, Zones 1 to 6 with their load factors and priorities) after the power off]
Zones availability maintained as per preset policy
Re-balancing: Critical Zone 2 given priority, Test Zone 6 offlined
[Figure: the four-node diagram (ptomcat1 to ptomcat4) after re-balancing]
Back to optimal placement based on nodes availability
Initial configuration restored after servers online and remastering of RGs
[Figure: the four-node diagram (ptomcat1 to ptomcat4, Zones 1 to 6 with their load factors and priorities) back in its initial configuration]
Oracle Solaris Cluster
PeopleSoft on SPARC SuperCluster
HA Configuration
[Figure: T4-4 Node 1 and T4-4 Node 2. On each node the hypervisor hosts a DB LDom (Solaris 11, RAC 11gR2 database) and an Apps LDom (Solaris 10) with zone clusters WLS-ZC (Proxy, WLS) and PSFT-ZC (PSFT). Hardware (CPU, memory, PCIe) per LDom: 2 CPUs, 512 GB RAM, 2 PCI root complexes.]