HP Integrity Servers with Microsoft®
Windows Server™ 2003
Cluster Installation and Configuration Guide
HP Part Number: 5992-4441 Published: April 2008
© Copyright 2008 Hewlett-Packard Development Company, L.P.
Legal Notices
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.
The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Microsoft®, Windows®, and Windows NT® are trademarks of Microsoft Corporation in the U.S. and other countries.
Intel® and Itanium® are registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Java® is a U.S. trademark of Sun Microsystems, Inc.
UNIX® is a registered trademark of The Open Group.
Table of Contents
About This Document
Intended Audience
New and Changed Information in This Edition
Document Organization
Typographic Conventions
Related Information
Publishing History
HP Encourages Your Comments
1 Introduction
Clustering Overview
Server Cluster Versus Network Load Balancing
Server Cluster
NLB
Cluster Terminology
Nodes
Cluster Service
Shared Disks
Resources
Resource Dependencies
Groups
Quorums
Single Quorum
Majority Node Set (MNS) Quorum
Heartbeats
Virtual Servers
Failover
Failback
2 Administering the Cluster
Verifying Minimum System Requirements
Gathering Required Installation Information
Creating and Configuring the Cluster
Configuring the Public and Private Networks
Private Network
Public Networks
NIC Teaming
Preparing Node 1 for Clustering
Configuring the Shared Storage
Preparing Node 2+ for Clustering
Creating the Cluster
Joining Node 2+ to the Cluster
Configuring Private and Public Network Role and Priority Settings
Validating Cluster Operation
Method 1: Simulate a Failover
Method 2: Run the Cluster Diagnostics and Verification Tool
Upgrading Individual Nodes
List of Figures
1-1 NLB Example
1-2 Single Quorum Example
1-3 MNS Quorum Example
2-1 Example cluster hardware cabling scheme
List of Tables
1-1 Server Cluster and NLB Features
2-1 Installation and Configuration Input
About This Document
This document describes how to install and configure clustered computing solutions using HP Integrity servers running Microsoft® Windows Server™ 2003.
The document printing date and part number indicate the document’s current edition. The printing date changes when a new edition is printed. Minor changes may be made at reprint without changing the printing date. The document part number changes when extensive changes are made.
Document updates may be issued between editions to correct errors or document product changes.
To ensure that you receive the updated or new editions, you should subscribe to the appropriate product support service. See your HP sales representative for details.
The latest version of this document can be found online at http://www.docs.hp.com.
Intended Audience
This document is intended for system administrators and HP support personnel responsible for installing, configuring, and managing clustered computing solutions using HP Integrity servers.
This document is not a tutorial.
New and Changed Information in This Edition
This document includes the following changes since its last release:
• Updated for Windows-on-Integrity Release 6.0
Document Organization
This document is organized as follows:
Chapter 1: “Introduction”
Describes cluster concepts and terminology.
Chapter 2: “Administering the Cluster”
Describes how to set up and administer clusters.
Typographic Conventions
This document uses the following typographical conventions:
WARNING A warning calls attention to important information that if not understood or followed will result in personal injury or nonrecoverable system problems.
CAUTION A caution calls attention to important information that if not understood or followed will result in data loss, data corruption, or damage to hardware or software.
IMPORTANT This alert provides essential information to explain a concept or to complete a task.
NOTE A note contains additional information to emphasize or supplement important points of the main text.
KeyCap The name of a keyboard key or graphical interface item (such as buttons, tabs, and menu items). Note that Return and Enter both refer to the same key.
Computer output Text displayed by the computer.
User input Commands and other text that you type.
Command A command name or qualified command phrase.
Ctrl+x A key sequence. A sequence such as Ctrl+x indicates that you must hold down the key labeled Ctrl while you press another key or mouse button.
[] The contents are optional in command line syntax. If the contents are a list separated by |, you must choose one of the items.
{} The contents are required in command line syntax. If the contents are a list separated by |, you must choose one of the items.
... The preceding element can be repeated an arbitrary number of times.
Indicates the continuation of a code example.
| Separates items in a list of choices.
Related Information
You can find more information about HP Integrity servers, server management, and software in the following locations:
• For more information about cluster configuration and supported storage systems:
http://h18004.www1.hp.com/solutions/enterprise/highavailability/answercenter/configuration-all-list.html#03ei
• For more information about troubleshooting and maintaining clusters:
http://technet2.microsoft.com/windowsserver/en/library/549145e4-4f5d-4545-a9b5-53ebd86d75911033.mspx?mfr=true
• For more information about network load balancing (NLB):
http://technet2.microsoft.com/WindowsServer/en/library/98d46a24-96d8-412c-87d8-28ace62323d21033.mspx?mfr=true
Publishing History
The publishing history below identifies the edition dates of this manual. Updates are made to this publication on an unscheduled, as needed, basis. The updates will consist of a complete replacement manual and pertinent online or CD documentation.
Publication Date: April 2008
Supported Products (Servers): BL860c, BL870c, rx2660, rx3600, rx6600, rx7620, rx7640, rx8620, rx8640, Superdome, Superdome/sx2000
Supported SmartSetup Version: Version 6.0
Supported Operating Systems: Microsoft Windows Server 2003 for Itanium-based Systems, 64-bit
Manufacturing Part Number: 5992-4441
HP Encourages Your Comments
HP encourages your comments concerning this document. We are committed to providing documentation that meets your needs. Send any errors found, suggestions for improvement, or compliments to:
Please include the document title, manufacturing part number, and any comment, error found, or suggestion for improvement you have concerning this document.
1 Introduction
This document describes how to install and configure clustered computing solutions using HP Integrity servers running Microsoft Windows Server 2003.
The clustering improvements for Microsoft Windows Server 2003, 64-bit Edition (over Microsoft Windows 2000) include the following:
Larger cluster sizes: 64-bit Enterprise and Datacenter Editions now support up to eight nodes.
Enhanced cluster installation wizard: Built-in validation and verification function helps ensure base components are ready to be clustered.
Installation: Clustering software is automatically copied during operating system installation.
Multi-node addition: Multiple nodes can now be added in a single operation instead of one by one.
Active Directory integration: Tighter integration including a virtual computer object, Kerberos authentication, and a default location for services to publish service control points. Users can access the virtual server just like any other Windows server.
Clustering Overview
A cluster is a group of individual servers, or nodes, configured to appear as a single, virtual server to both users and applications. The nodes making up the cluster run a common set of applications. They are physically connected by cables and programmatically connected by the clustering software.
Clusters provide the following advantages over standalone servers:
High availability: Clusters avoid single points of failure. Applications can be distributed over more than one node, creating a high degree of parallelism and failure recovery.
Manageability: Clusters appear as a single system to end users, applications, and the network, while providing a single point of control for administrators both locally and remotely.
Scalability: You can increase the cluster's computing power by adding more processors or computers. Applications can also be scaled according to need as your organization grows.
Because of the inherent redundancy of hardware and software, clusters protect businesses from system downtime due to single points of failure, power outages, natural disasters, and even during routine system maintenance or upgrades. In addition, clusters help businesses eliminate penalties and other costs associated with not being able to meet Service Level Agreements.
A cluster is similar to a general distributed system, except that it provides the following additional capabilities:
• Every node has full connectivity and communication with the other nodes in the cluster through the following methods:
Hard disks on a shared bus: One or more shared buses are used for storage. Each shared bus attaches one or more disks that hold data used to manage the cluster. Cluster service provides a dual-access storage model whereby multiple systems in the cluster can access the same storage.
Private network: One or more private networks, or interconnects, carry only internal cluster communication (heartbeats). At least one private network is required.
Public network: One or more public networks can be used as a backup for the private network and can be used both for internal cluster communication and to host client applications. Network adapters, known to the cluster as network interfaces, attach nodes to networks.
• Each node tracks cluster configuration. Every node in the cluster is aware when another node joins or leaves the cluster.
• Every node in the cluster is aware of the resources that are running locally and the resources that are running on the other nodes.
You can create clusters using nodes that have different numbers of CPUs, CPUs with different clock speeds, or even nodes running different Integrity platforms. Diverse configurations are tested, qualified, and certified frequently by HP. The only limitations are that each node must be an HP Integrity system, and each node must have the same Host Bus Adapters (HBAs), HBA drivers, and HBA firmware.
Server Cluster Versus Network Load Balancing
Windows Server 2003 provides two types of clustering services:
Server Cluster: Available only in Windows Server 2003 Enterprise Edition or Datacenter Edition, this service provides high availability and scalability for mission-critical applications such as databases, messaging systems, and file and print services. The nodes in the cluster remain in constant communication. If one of the nodes becomes unavailable because of failure or maintenance, another node immediately begins providing service, a process known as failover. Users accessing the service continue to access it, unaware that it is now being provided from a different node. Both Windows Server 2003 Enterprise Edition and Datacenter Edition support server cluster configurations of up to eight nodes.
Network Load Balancing (NLB): Available in all editions of Windows Server 2003, this service load balances incoming IP traffic across clusters. NLB enhances both the availability and scalability of Internet server-based programs such as web servers, streaming media servers, and Terminal Services. By acting as the load balancing infrastructure and providing control information to management applications built on top of Windows Management Instrumentation (WMI), NLB can integrate into existing web server farm infrastructures. NLB clusters can scale to 32 nodes.
Table 1-1 summarizes some of the differences between these two technologies.
Table 1-1 Server Cluster and NLB Features
NLB | Server Cluster
Used for web servers, firewalls, and web services | Used for databases, email services, line of business (LOB) applications, and custom applications
Included with all versions of Windows Server 2003 | Included with Windows Server 2003 Enterprise Edition and Datacenter Edition
Provides high availability and scalability | Provides high availability and server consolidation
Generally deployed on a single network, but can span multiple networks if properly configured | Can be deployed on a single network or a geographically distributed network
Supports clusters up to 32 nodes | Supports clusters up to eight nodes
Doesn't require any special hardware or software | Requires the use of shared or replicated storage
Server Cluster
Use a server cluster to provide high availability for mission-critical applications through failover. A server cluster uses a shared-nothing architecture, which means that a resource can be active on only one node in the cluster at any given time. Because of this, it is well suited to applications that maintain some sort of fixed state (for example, a database). In addition to database applications, enterprise resource planning (ERP), customer relationship management (CRM), online transaction processing (OLTP), file and print, email, and custom application services are typically clustered using server cluster.
When you deploy a server cluster, first configure the two to eight servers that will act as nodes in the cluster. Then configure the cluster resources that are required by the application you are clustering. These resources can include network names, IP addresses, applications, services, and disk drives. Finally, bring the cluster online so that it can begin processing client requests.
Most clustered applications and their associated resources are assigned to one cluster node at a time. If a server cluster detects the failure of the primary node for a clustered application, or if that node is taken offline for maintenance, the clustered application is started on a backup cluster node. Client requests are immediately redirected to the backup cluster node to minimize the impact of the failure.
NOTE: Though most clustered services run on only one node at a time, a cluster can run many services simultaneously to optimize hardware utilization. Some clustered applications can run on multiple Server Cluster nodes simultaneously, including Microsoft SQL Server.
NLB
Use NLB to provide high availability for applications that scale out horizontally and do not maintain a fixed state, such as web servers, proxy servers, and other services that need client requests distributed across nodes in a cluster. NLB uses a load balancing architecture, which means that a resource can be active on all nodes in the cluster at any given time.
NLB clusters do not use a quorum, and so they don't impose storage or network requirements on the cluster nodes. If a node in the cluster fails, NLB automatically redirects incoming requests to the remaining nodes. If you take a node in the cluster offline for maintenance, you can use NLB to allow existing client sessions to finish before taking the node offline. This eliminates any end-user impact during planned downtime. NLB can also weigh requests, which enables you to mix high-powered servers with legacy servers and ensure all hardware is efficiently used.
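The effect of weighting and of removing a failed node can be illustrated with a short sketch. This is a generic weighted-hashing example in Python, not a description of how NLB distributes traffic internally; the function name pick_node, the weights mapping, and the node names are all hypothetical.

```python
import hashlib


def pick_node(client_ip, weights):
    """Map a client to a node, giving each node a share proportional to its weight.

    `weights` maps a node name to a non-negative integer weight (hypothetical values).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one node must have a positive weight")
    # Hash the client address so the same client always lands on the same node
    # for a given set of nodes and weights.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") % total
    for node in sorted(weights):
        if point < weights[node]:
            return node
        point -= weights[node]


# A high-powered server takes three times the share of a legacy server.
cluster = {"new-server": 3, "legacy-server": 1}
print(pick_node("192.0.2.10", cluster))

# After a node fails, requests are redistributed over the remaining nodes only.
print(pick_node("192.0.2.10", {"legacy-server": 1}))
```

In this sketch a failed node is handled simply by leaving it out of the mapping, so every request goes to a node that is still present.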
Most often, NLB is used to build redundancy and scalability for firewalls, proxy servers, or web servers, as illustrated in Figure 1-1. Other applications commonly clustered with NLB include virtual private network (VPN) endpoints, streaming media servers, and Terminal Services.
For more information about the key features of this technology and its internal architecture and performance characteristics, go to:
http://technet2.microsoft.com/WindowsServer/en/library/98d46a24-96d8-412c-87d8-28ace62323d21033.mspx?mfr=true
Figure 1-1 NLB Example
Cluster Terminology
A working knowledge of clustering begins with the definition of some common terms. The following terms are used throughout this document.
Nodes
Individual servers or members of a cluster are referred to as nodes or systems (the terms are used interchangeably). A node can be an active or inactive member of a cluster, depending on whether or not it is currently online and in communication with the other cluster nodes. An active node can act as host to one or more cluster groups.
Cluster Service
Cluster service refers to the collection of clustering software on each node that manages all cluster-specific activity.
Shared Disks
Shared disks are devices (normally hard disk drives) that the cluster nodes are attached to by a shared bus. Applications, file shares, and other resources to be managed by the cluster are stored on the shared disks.
Resources
Resources are physical or logical entities (such as file shares) managed by the cluster software.
Resources can provide a service to clients or be an integral part of the cluster. Examples of resources are physical hardware devices such as disk drives, or logical items such as IP addresses, network names, applications, and services. Resources are the basic unit of management by the cluster service. A resource can only run on a single node in a cluster at a time, and is online on a node when it is providing its service on that node.
At any given time, a resource can exhibit only one of the following states:
• Offline
• Offline pending
• Online
• Online pending
• Failed
When a resource is offline, it is unavailable for use by a client or another resource. When a resource is online, it is available for use. The initial state of any resource is offline. When a resource is in one of the pending states, it is in the process of either being brought online or taken offline.
If the resource cannot be brought online or taken offline after a specified amount of time, the resource is set to the failed state. You can specify the amount of time that cluster service waits before failing the resource by setting its pending timeout value in Cluster Administrator.
Resource state changes can occur either manually (when you use Cluster Administrator to make a state transition) or automatically (during the failover process). When a group fails over, the states of each resource are altered according to their dependencies on the other resources in the group.
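The resource states and the pending timeout described above can be pictured as a small state machine. The following Python sketch is illustrative only and does not represent the cluster service's implementation; the names ResourceState, Resource, and pending_timeout_s are hypothetical, and the timeout values used in the example are arbitrary.

```python
from enum import Enum


class ResourceState(Enum):
    OFFLINE = "Offline"
    OFFLINE_PENDING = "Offline pending"
    ONLINE = "Online"
    ONLINE_PENDING = "Online pending"
    FAILED = "Failed"


class Resource:
    """Hypothetical model of a cluster resource; not a real cluster API."""

    def __init__(self, name, pending_timeout_s):
        self.name = name
        self.pending_timeout_s = pending_timeout_s  # how long a pending state may last
        self.state = ResourceState.OFFLINE          # the initial state is always offline

    def request_online(self):
        self.state = ResourceState.ONLINE_PENDING

    def request_offline(self):
        self.state = ResourceState.OFFLINE_PENDING

    def complete(self, elapsed_s):
        """Finish a pending transition, or fail it if it exceeded the pending timeout."""
        if self.state not in (ResourceState.ONLINE_PENDING, ResourceState.OFFLINE_PENDING):
            return self.state  # nothing is pending
        if elapsed_s > self.pending_timeout_s:
            self.state = ResourceState.FAILED
        elif self.state is ResourceState.ONLINE_PENDING:
            self.state = ResourceState.ONLINE
        else:
            self.state = ResourceState.OFFLINE
        return self.state


share = Resource("File Share", pending_timeout_s=60)
share.request_online()
print(share.complete(elapsed_s=5).value)    # transition finished within the timeout
share.request_offline()
print(share.complete(elapsed_s=120).value)  # transition took longer than the timeout
```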
Resource Dependencies
A dependency is a reliance between two resources that makes it necessary for both resources to run on the same node (for example, a Network Name resource depending on an IP address).
The only dependency relationships that cluster service recognizes are relationships between resources. Cluster service cannot be told, for example, that a resource depends on a Windows 2003 service; the resource can only be dependent on a resource representing that service.
Groups
Groups are a collection of resources to be managed as a single unit for configuration and recovery purposes. Operations performed on a group, such as taking it offline or moving it to another node, affect all resources contained within that group. Usually a group contains all the elements needed to run a specific application, and for client systems to connect to the service provided by the application.
If a resource depends on another resource, both resources must be members of the same group. For example, the group containing a file share resource must also contain the disk resource and network resources (such as the IP address and NetBIOS name) to which clients connect to access the share. All resources within a group must be online on the same node in the cluster.
NOTE: During failover, entire groups are moved from one node to another node in the cluster. A single resource cannot fail over from one node to another.
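One way to picture how dependencies within a group constrain the order of state changes is a dependency graph in which each resource comes online only after the resources it depends on. The Python sketch below uses the file share example; the resource names and the dependency map are assumptions made for illustration, and the ordering logic is a generic topological sort rather than the cluster service's own algorithm.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9 or later

# Hypothetical file share group: each resource maps to the resources it depends on.
file_share_group = {
    "IP Address": set(),
    "Physical Disk": set(),
    "Network Name": {"IP Address"},
    "File Share": {"Physical Disk", "Network Name"},
}


def online_order(dependencies):
    """Return an order in which every resource follows the resources it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


def offline_order(dependencies):
    """Reverse of the online order, so dependent resources are taken offline first."""
    return list(reversed(online_order(dependencies)))


print(online_order(file_share_group))
print(offline_order(file_share_group))
```

Because the whole group moves as a unit, the same ordering would apply on whichever node hosts the group.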
Quorums
Each cluster has a special resource called the quorum. The quorum provides a means for arbitration leading to node membership and cluster state decisions. Only one node at a time can own the quorum. That node is designated as the primary node. When a primary node fails over to a backup node, the backup node takes ownership of the quorum.
The quorum resource also provides physical storage to maintain the configuration information for the cluster. This information is kept in the quorum log, which is a configuration database for the cluster. The log holds cluster configuration information, such as which servers are part of the cluster, what resources are installed in the cluster, and what state those resources are in. By default the quorum log is located at \MSCS\quolog.log.
NOTE: Quorums are only used in the server cluster, not in NLB. References to quorums throughout the remainder of this document apply to server clusters only.
The quorum resource is important in a cluster for the following reasons:
Consistency: Because a cluster is multiple physical servers acting as a single virtual server, it is critical that each physical server have a consistent view of how the cluster is configured. The quorum acts as the definitive repository for all configuration information relating to the cluster. In the event that the cluster service is unable to read the quorum log, it cannot start because it is unable to guarantee that the cluster is in a consistent state.
Arbitration: The quorum is used as the tie-breaker to avoid split-brain scenarios. A split-brain scenario occurs when all network communication links between two or more cluster nodes fail. In these cases, the cluster can split into two or more partitions that cannot communicate with each other. The quorum then guarantees that any cluster resource is brought online on one node only. It does this by allowing the partition that owns the quorum to continue, while the other partitions are evicted from the cluster.
There are two types of quorums: the single quorum and the Majority Node Set (MNS) quorum. Single quorum clusters behave differently than MNS quorum clusters, so take care when choosing a model for your cluster.
For example, if you have only two nodes in your cluster, the MNS model is not recommended because failure of one node leads to failure of the entire cluster (a majority of nodes is impossible).
After the cluster is created, you cannot change the quorum from one type to another.
Single Quorum
A single quorum uses a quorum log file located on a single disk hosted on a shared storage interconnect that is accessible by all nodes in the cluster. Single quorums are available in Windows Server 2003 Enterprise Edition and Datacenter Edition.
IMPORTANT: The quorum must use a physical disk resource, as opposed to a disk partition, because the entire physical disk resource is moved during failover.
NOTE: You can configure clusters to use the local hard disk on one node to store the quorum, but this is only supported for testing and development purposes. Do not use this configuration in a production environment.
Each node connects to the shared storage through some type of interconnect, with the storage consisting of either external hard disks (usually configured as RAID disks), or a storage area network (SAN), where logical slices of the SAN are presented as physical disks.
The following figure shows a single quorum in a four-node cluster.
Figure 1-2 Single Quorum Example
Single quorums are sufficient for most clusters. The following are typical single quorum situations:
Highly available data in a single location: Most customers that require their data to be highly available need this on a per-site basis only. If they have multiple sites, each site has its own cluster. Typical applications that use this type of cluster include Microsoft SQL Server, file shares, printer queues, and network services (for example, DHCP and WINS).
Stateful applications: Applications or Windows NT services that require only a single instance at any time and require state information to be stored typically use single quorums, because they already have shared state information storage.
Connecting all nodes to a single storage device simplifies transferring control of the data to a backup node. Another advantage is that only one node must remain active for the cluster to function.
However, this architecture has several weaknesses. If the storage device fails, the entire cluster fails. If the storage area network (SAN) fails, the entire cluster fails. And while the storage device and SAN can be designed with complete redundancy to eliminate those possibilities, one component in this architecture can never be truly redundant: the facility itself.
Floods, fires, earthquakes, extended power failures, and other serious problems cause the entire cluster to fail. If your business requires that work continue even if the facility is taken offline, a single quorum cluster solution will not meet your needs.
Majority Node Set (MNS) Quorum
A Majority Node Set (MNS) quorum appears to the server cluster as a single quorum resource.
However, the data is stored on the system disk of each node of the cluster by default. The clustering software ensures that the configuration data stored on the MNS is kept consistent across the different disks. MNS quorums are available in Windows Server 2003 Enterprise Edition and Datacenter Edition.
As Figure 1-3 shows, MNS clusters require only that the cluster nodes be connected by a network.
That network does not need to be a LAN. It can be a wide area network (WAN) or a virtual private network (VPN) connecting cluster nodes in different buildings or even cities. This enables the cluster to overcome geographic restrictions imposed by its storage connections.
Figure 1-3 shows an MNS quorum in a four-node cluster.
Figure 1-3 MNS Quorum Example
Although the disks that make up the MNS can be disks on a shared storage fabric, the MNS implementation provided as part of Windows Server 2003 uses a directory on each node's local system disk to store the quorum data. If the configuration of the cluster changes, that change is reflected across the different disks. The change is committed, or made persistent, only if it is made on at least the following number of nodes (with the division rounded down to a whole number):
(<Number of nodes configured in the cluster>/2) + 1
This ensures that a majority of the nodes have an up-to-date copy of the data. The cluster service itself only starts up and brings resources online if a majority of the nodes configured as part of the cluster are up and running the cluster service. If there are fewer nodes, the cluster does not have a quorum and the cluster service waits until more nodes join. Only when a quorum of nodes are available can the cluster service start up and bring the resources online. This ensures that the cluster always starts up with the most up-to-date configuration.
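The majority rule can be worked through for the supported cluster sizes with a few lines of Python. This is a minimal sketch of the arithmetic only; the function names mns_majority, has_quorum, and tolerated_failures are hypothetical, and the division is assumed to round down to a whole number.

```python
def mns_majority(configured_nodes):
    """Nodes required for quorum: (configured nodes / 2) + 1, using integer division."""
    return configured_nodes // 2 + 1


def has_quorum(running_nodes, configured_nodes):
    """True if the running nodes form a majority of the configured nodes."""
    return running_nodes >= mns_majority(configured_nodes)


def tolerated_failures(configured_nodes):
    """How many nodes can fail while the remaining nodes still hold a majority."""
    return configured_nodes - mns_majority(configured_nodes)


# Columns: configured nodes, nodes required for quorum, node failures tolerated.
for n in range(2, 9):
    print(n, mns_majority(n), tolerated_failures(n))
```

For a two-node cluster the majority is two, so no node failure is tolerated, which is why the MNS model is not recommended for two nodes. A four-node cluster needs three running nodes, and an eight-node cluster needs five.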
In the case of a failure or split-brain, all partitions that do not contain an MNS quorum are terminated. This ensures that if there is a partition running that contains a majority of the nodes, it can safely start up any resources that are not running on that partition. Thus, it can be the only partition in the cluster that is running resources.
MNS quorums have strict requirements to ensure they work correctly. Be sure that you fully understand the issues involved in using MNS-based clusters before implementing this solution.
Use MNS quorums only in the following situations:
Geographically dispersed clusters: A single MSCS cluster has members in multiple geographic sites. Though geographic clusters can use a standard quorum, presenting the quorum as a single, logical shared drive among all sites can be challenging. MNS quorums solve these challenges by enabling the quorum to be stored on the local hard disk.
NOTE: HP supports geographically dispersed clusters with shared quorums using Cluster Extension XP (CLX) and Continuous Access (CA) technology on XP storage products.
Clusters with no shared disks: Some specialized configurations need tightly consistent cluster features without shared disks. For example:
• Clusters that host applications that can failover, but where another, application-specific method can keep data consistent between nodes (for example, database log shipping for keeping the database state up-to-date, or file replication for static data).
• Clusters that host applications that have no persistent data but must cooperate in a tightly coupled way to provide a consistent volatile state.
• Independent Software Vendors (ISVs): By abstracting storage from the cluster service, an MNS quorum provides ISVs with greater flexibility to design sophisticated cluster scenarios.
Heartbeats
Heartbeats are network packets periodically broadcast by each node over the private cluster network. Heartbeats inform other nodes of a single system's health, configuration, and network connection status. When the other nodes do not receive heartbeat messages from a node as expected, the cluster service interprets this as a node failure, and a failover begins.
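Heartbeat-based failure detection can be sketched as a monitor that records when each node was last heard from and reports nodes that have been silent for too long. The Python example below is a generic illustration, not the cluster service's detection logic; the class name HeartbeatMonitor and the interval_s and max_missed parameters are hypothetical, and the values used are arbitrary.

```python
import time


class HeartbeatMonitor:
    """Hypothetical sketch: report a peer as failed after too many missed heartbeats."""

    def __init__(self, interval_s, max_missed, clock=time.monotonic):
        self.interval_s = interval_s  # assumed time between heartbeats
        self.max_missed = max_missed  # assumed number of missed heartbeats tolerated
        self.clock = clock            # injectable so the sketch can be tested
        self.last_seen = {}

    def record_heartbeat(self, node):
        """Note the time at which a heartbeat from `node` arrived."""
        self.last_seen[node] = self.clock()

    def failed_nodes(self):
        """Return known nodes whose last heartbeat is older than the allowed window."""
        window = self.interval_s * self.max_missed
        now = self.clock()
        return sorted(n for n, seen in self.last_seen.items() if now - seen > window)


monitor = HeartbeatMonitor(interval_s=1.0, max_missed=3)
monitor.record_heartbeat("node-a")
monitor.record_heartbeat("node-b")
print(monitor.failed_nodes())  # both nodes were just heard from
```

In a real cluster the result of such a check would trigger failover of the groups hosted on the silent node.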
Virtual Servers
Groups that contain an IP address resource and a network name resource (along with other resources) are published to clients on the network under a unique server name. Because these groups appear as individual servers to clients, they are called virtual servers. Users access applications or services on a virtual server the same way they access applications or services on a physical server. They do not need to know that they are connecting to a cluster and have no knowledge of which node they are connected to.
Failover
Failover is the process of moving a group of resources from one node to another in the case of a failure. For example, in a cluster where Microsoft Internet Information Server (IIS) is running on node A and node A fails, IIS fails over to node B of the cluster.
Failback
Failback is the process of returning a resource or group of resources to the node on which it was running before it failed over. For example, when node A comes back online, IIS can fail back from node B to node A.
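The IIS example used for failover and failback can be traced with a short sketch. The Python code below is a simplified model, not a cluster API; the Group class, the preferred_owner and owners attributes, and the node names are assumptions introduced for illustration.

```python
class Group:
    """Hypothetical model of a cluster group and the node that currently hosts it."""

    def __init__(self, name, preferred_owner, owners):
        self.name = name
        self.preferred_owner = preferred_owner
        self.owners = list(owners)  # nodes allowed to host the group, in order
        self.current_owner = preferred_owner


def fail_over(group, online_nodes):
    """Move the whole group to the first possible owner that is still online."""
    if group.current_owner in online_nodes:
        return group.current_owner  # the current host is healthy; nothing to do
    for node in group.owners:
        if node in online_nodes:
            group.current_owner = node
            return node
    group.current_owner = None      # no node is available; the group stays offline
    return None


def fail_back(group, online_nodes):
    """Return the group to its preferred owner once that node is online again."""
    if group.preferred_owner in online_nodes:
        group.current_owner = group.preferred_owner
    return group.current_owner


iis = Group("IIS", preferred_owner="A", owners=["A", "B"])
print(fail_over(iis, online_nodes={"B"}))       # node A failed
print(fail_back(iis, online_nodes={"A", "B"}))  # node A is back online
```

The sketch moves the group as a unit, consistent with the rule that a single resource does not fail over on its own.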
2 Administering the Cluster
This chapter provides step-by-step installation and configuration directions for HP Integrity clustered systems running Microsoft Windows Server 2003, 64-bit Edition.
Verifying Minimum System Requirements
To verify that you have all of the required software and firmware and have completed all the necessary setup tasks before beginning your cluster installation, complete the following steps:
1. Before installation, see the HP Cluster Configuration Support website for details about the components that make up a valid cluster configuration. A support matrix for each clustering solution lists all of the components necessary to provide a quality-tested and supported solution.
View these support matrices at:
http://h18004.www1.hp.com/solutions/enterprise/highavailability/answercenter/configuration-all-list.html#03ei
Select the operating system and storage platform for your clustering solution. View the support matrix and verify that you have two or more supported HP Integrity servers, supported Fibre Channel Adapters (FCA), two or more supported network adapters, two supported Fibre Channel switches, and one or more supported shared storage enclosures.
Also verify that you have the required drivers for these components.
2. (This step is for servers running non-preloaded, Enterprise versions of the OS only.)
Use the Microsoft Windows Server 2003, 64-bit Enterprise Edition CD to install the OS on each of the nodes that will make up the clustered system. For more information about this step, see the appropriate “Windows on Integrity: Smart Setup Guide” document at:
http://docs.hp.com/en/windows.html
NOTE: Datacenter versions of Windows Server 2003 always come preloaded on HP Integrity servers, so this step does not apply to those systems, or to systems on which the Enterprise edition comes preloaded per the purchase agreement.
3. Use the Smart Setup CD to install the Support Pack on each node. This installs or updates the system firmware and operating system drivers. Insert the Smart Setup CD, click the Support Pack tab, and follow the onscreen instructions.
4. Use the Smart Update CD (if shipped with your system) on each node to install any Microsoft quick fix engineering (QFE) updates or security patches that have been published for the operating system.
5. Locate your HP Storage Enclosure configuration software CD.
6. Locate your HP Storage Enclosure Controller firmware, and verify you have the latest supported version installed.
7. Locate your HP StorageWorks MultiPath for Windows software.
NOTE: You must use MultiPath software if you have redundant paths connected to your Fibre Channel storage. Installing more than one HBA per cluster provides multiple connections between the clusters and your shared storage (see Figure 2-1). Multiple HBAs, along with MultiPath software, are highly recommended because they provide continuous access to your storage system and eliminate single points of failure.
8. Locate your HP Fibre Channel switch firmware, and verify that you have the latest supported version installed.
9. Verify that you have sufficient administrative rights to install the OS and other software onto each node.
10. Verify that all of the required hardware is properly installed and cabled (see Figure 2-1).
For information about best practices for this step, go to:
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/library/ServerHelp/f5abf1f9-1d84-4088-ae54-06da05ac9cb4.mspx
NOTE: Figure 2-1 is an example only. It might not represent the actual cabling required by your system.
11. Determine all the input parameters required to install your clustered system and record them in the table in “Gathering Required Installation Information” (page 23). Also see the Microsoft discussion of this topic at:
http://technet2.microsoft.com/windowsserver/en/library/f5abf1f9-1d84-4088-ae54-06da05ac9cb41033.mspx?mfr=true
12. Run the Microsoft Cluster Configuration Validation Wizard (also known as “ClusPrep”) to test your collection of servers for proper configuration before you create your cluster.
ClusPrep validates that your system is configured properly by taking inventory of your system configuration, highlighting discrepancies in service pack levels and driver versions, and evaluating and testing your network and storage configuration. To download this tool, go to:
http://www.microsoft.com/downloads/details.aspx?FamilyID=bf9eb3a7-fb91-4691-9c16-553604265c31&DisplayLang=en
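The kind of discrepancy check described in step 12 can be illustrated with a short sketch. This is not how ClusPrep is implemented; it only shows the idea of comparing a hand-collected inventory across nodes. The function name, the inventory layout, and all version strings are assumptions made for this example.

```python
from collections import defaultdict


def find_discrepancies(inventory):
    """Return {component: {version: [nodes]}} for components whose version differs across nodes."""
    by_component = defaultdict(lambda: defaultdict(list))
    for node, components in inventory.items():
        for component, version in components.items():
            by_component[component][version].append(node)
    return {
        component: {version: sorted(nodes) for version, nodes in versions.items()}
        for component, versions in by_component.items()
        if len(versions) > 1  # more than one distinct version means a mismatch
    }


# Hypothetical inventory values, for illustration only.
inventory = {
    "node1": {"service_pack": "SP2", "fca_driver": "9.1.4"},
    "node2": {"service_pack": "SP2", "fca_driver": "9.1.2"},
}
report = find_discrepancies(inventory)
```

Here `report` lists only the `fca_driver` entry, because the service pack level matches on both nodes. A component that is missing entirely from one node is not flagged by this sketch.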
Figure 2-1 Example cluster hardware cabling scheme
Gathering Required Installation Information
Use Table 2-1 to record the input parameters you need to install the OS and configure the cluster.
Record the information in the Value column next to each description.
Table 2-1 Installation and Configuration Input
Input Description / Value

Node name
  Node 1:
  Node 2:
  Node 3:
  Node 4:
  Node 5:
  Node 6:
  Node 7:
  Node 8:

Public network connection, IP address, and subnet mask for each node
  Node 1: IP address:            Subnet mask:
  Node 2: IP address:            Subnet mask:
  Node 3: IP address:            Subnet mask:
  Node 4: IP address:            Subnet mask:
  Node 5: IP address:            Subnet mask:
  Node 6: IP address:            Subnet mask:
  Node 7: IP address:            Subnet mask:
  Node 8: IP address:            Subnet mask:

Private network connection (cluster heartbeat), IP address, and subnet mask for each node
  Node 1: IP address:            Subnet mask:
  Node 2: IP address:            Subnet mask:
  Node 3: IP address:            Subnet mask:
  Node 4: IP address:            Subnet mask:
  Node 5: IP address:            Subnet mask:
  Node 6: IP address:            Subnet mask:
  Node 7: IP address:            Subnet mask:
  Node 8: IP address:            Subnet mask:

WWID, slot number, and bus of each FCA
  Node 1: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:
  Node 2: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:
  Node 3: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:
  Node 4: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:
  Node 5: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:
  Node 6: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:
  Node 7: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:
  Node 8: FCA 1 WWID:            FCA 1 slot and bus:
          FCA 2 WWID:            FCA 2 slot and bus:

Cluster name
  Value:

Cluster IP address and subnet mask
  IP address:
  Subnet mask:

Default gateway IP address
  Value:

WINS server IP address
  Value:

DNS IP address
  Value:

Local system administrator password (used during OS installation)
  NOTE: For security reasons, do not record the password here.

Domain name
  Value:

Domain administrator user name and password (used during OS installation for machine to join domain)
  NOTE: For security reasons, do not record the password here.

Domain account user name and password for cluster service (this account has special privileges on each cluster node)
  NOTE: For security reasons, do not record the password here.
Creating and Configuring the Cluster
The following sections describe the steps you follow to create and configure a cluster.
Configuring the Public and Private Networks
NOTE: Private and public NICs must be configured in different subnets; otherwise, the cluster service and the Cluster Administrator utility cannot detect the second NIC.
In clustered systems, node-to-node communication occurs across a private network, while client-to-cluster communication occurs across one or more public networks. To review the Microsoft recommendations and best practices for securing your private and public networks, go to:
http://technet2.microsoft.com/windowsserver/en/library/f64e46ba-2d09-4f1a-ba9c-f2b1f71821eb1033.mspx?mfr=true
When configuring your networks, observe the following guidelines:
• Set your private network IP address to a unique, nonroutable value on each cluster, as discussed in Microsoft Knowledge Base Article Number 142863, “Valid IP Addressing for a Private Network”:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;142863
For example, use a valid private IP address of 10.1.1.1 with a subnet mask of 255.0.0.0 for node A, and an IP address of 10.1.1.2 with a subnet mask of 255.0.0.0 for node B, and so on.
• If possible, install your private and public NICs in the same slots of each node in the cluster (private NICs in same slot, public NICs in same slot, and so on).
• Use NIC Teaming to provide redundancy of your public network (see “NIC Teaming” (page 28)).
NOTE: You cannot use NIC Teaming to provide redundancy for your private network. However, you can provide redundancy for your private network without having it fail over to the public network by configuring an additional NIC on each cluster member in a different, nonroutable subnet, and setting it for “Internal Cluster communication only”. This method is described in the Microsoft Knowledge Base Article Number 258750, “Recommended private “Heartbeat” configuration on a cluster server,” at:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;258750
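The addressing guidelines above (a nonroutable private address, and private and public NICs in different subnets) can be checked before you touch the nodes. The following is a minimal sketch using Python's standard `ipaddress` module; the function name and the public address in the usage example are assumptions for illustration, not values from this guide.

```python
import ipaddress


def check_networks(private, public):
    """Return a list of problems found in the (ip, mask) pairs for one node's NICs."""
    problems = []
    priv_if = ipaddress.ip_interface(f"{private[0]}/{private[1]}")
    pub_if = ipaddress.ip_interface(f"{public[0]}/{public[1]}")
    # The private (heartbeat) address should come from a nonroutable range.
    if not priv_if.ip.is_private:
        problems.append(f"{priv_if.ip} is not in a private (nonroutable) range")
    # The private and public NICs must not share or overlap a subnet.
    if priv_if.network.overlaps(pub_if.network):
        problems.append("private and public NICs are in overlapping subnets")
    return problems


# Node A from the example above, with a hypothetical public address.
node_a = check_networks(("10.1.1.1", "255.0.0.0"), ("172.16.5.10", "255.255.0.0"))
```

With these inputs `node_a` is an empty list. Note that a 255.0.0.0 mask on the private NIC claims the entire 10.0.0.0 network, so any public address that starts with 10 would be reported as overlapping.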
Private Network
Before configuring the private network for your cluster, see the following Microsoft Knowledge Base articles for Microsoft recommendations and best practices:
• Article Number 258750, “Recommended Private “Heartbeat” Configuration on a Cluster Server,” at:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;258750
• Article Number 193890, “Recommended WINS Configuration for Microsoft Cluster Server,” at:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;193890
• Article Number 142863, “Valid IP Addressing for a Private Network,” at:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;142863
To configure your private network, complete the following steps:
1. Right-click the My Network Places icon on your desktop, and select Properties.
2. Determine which Local Area Connection icon in the Network Connections window represents your private network. Right-click that icon, select Rename, and change its name to Private.
3. Right-click the Private icon and select Properties.
4. Click the General tab. Be sure that only the Internet Protocol (TCP/IP) checkbox is selected.
5. If you have a network adapter that transmits at multiple speeds, manually specify a speed and duplex mode. Do not use an autoselect setting for speed, because some adapters can drop packets while determining the speed. The speed for the network adapters must be hard set to the same speed on all nodes according to the card manufacturer specification.
If you are not sure of the supported speed of your card and connecting devices, Microsoft recommends setting all devices on that path to 10 Mbps and Half Duplex.
NOTE: All network adapters in a cluster attached to the same network must be configured identically to use the same Duplex Mode, Link Speed, Flow Control, and so on. Contact your adapter manufacturer for specific information about appropriate speed and duplex settings for your network adapters.
6. Click Internet Protocol (TCP/IP), then click Properties.
7. Click the General tab and verify that you have selected a static IP address that is not on the same subnet or network as any public network adapter.
For example, use 10.10.10.10 for the private adapters on node 1 and 10.10.10.11 on node 2 with a subnet mask of 255.0.0.0. Be sure to use a different IP address scheme than that used for the public network.
8. Verify that no values are defined in the Default Gateway or Use the following DNS server addresses fields, and click Advanced.
9. Click the DNS tab and verify that no values are defined in this field. Be sure that the Register this connection’s address in DNS and the Use this connection’s DNS suffix in DNS registration checkboxes are cleared.
10. Click the WINS tab and verify that no values are defined in this field. Click Disable NetBIOS over TCP/IP.
11. Close the dialog box, and click Yes if the following message appears:
This connection has an empty primary WINS address. Do you want to continue?
12. Repeat Step 1 through Step 11 for all remaining nodes in the cluster, using different static IP addresses.
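Because the private-network settings must be repeated identically on every node, it can help to record them and audit the record before creating the cluster. The sketch below works on values you enter by hand; it does not query Windows. The `PrivateAdapter` class, its field names, and the `audit` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PrivateAdapter:
    """Settings recorded by hand for one node's private NIC (hypothetical model)."""
    node: str
    ip: str
    speed_mbps: int
    duplex: str
    gateway: str = ""
    dns_servers: list = field(default_factory=list)
    netbios_over_tcpip: bool = False


def audit(adapters):
    """Return a list of findings that contradict the private-network steps above."""
    findings = []
    for a in adapters:
        if a.gateway:
            findings.append(f"{a.node}: default gateway must be empty")
        if a.dns_servers:
            findings.append(f"{a.node}: DNS server list must be empty")
        if a.netbios_over_tcpip:
            findings.append(f"{a.node}: NetBIOS over TCP/IP must be disabled")
    # Speed and duplex must be hard set to the same values on all nodes.
    if len({(a.speed_mbps, a.duplex) for a in adapters}) > 1:
        findings.append("speed/duplex settings differ between nodes")
    # Each node needs its own static private address.
    if len({a.ip for a in adapters}) != len(adapters):
        findings.append("private IP addresses are not unique")
    return findings


nodes = [
    PrivateAdapter("node1", "10.10.10.10", 100, "full"),
    PrivateAdapter("node2", "10.10.10.11", 100, "full"),
]
findings = audit(nodes)
```

An empty `findings` list means the recorded values are consistent with steps 5 through 12; it says nothing about what is actually configured on the adapters.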
Public Networks
To configure your public networks, complete the following steps:
1. Right-click the My Network Places icon on your desktop, and select Properties.
2. Determine which Local Area Connection icon in the Network Connections window represents your public network. Right-click that icon, select Rename, and change the name to Public. If you have more than one public network (recommended), you can name them Public-1, Public-2, and so on. Repeat for each of your public network connection icons.
3. Right-click the Public icon and select Properties.
4. Click the General tab, click Internet Protocol (TCP/IP), then click Properties.
5. Click the General tab, and assign the IP Address and Subnet Mask values determined by your network administrator.
6. Click OK several times to implement the changes and exit the connection properties window.
7. Click Start→Settings→Control Panel and double-click Network Connections. In the Network Connections Advanced menu, select Advanced Settings. In the Connections box, verify that your connections are listed in the following order:
1. External public network
2. Internal private network and heartbeat
3. Remote access connections
NOTE: If your public network paths are teamed, you must put your teamed connection at the top of the list (instead of the external public network).
8. Repeat Step 1 through Step 7 for each node in the cluster. Be sure to assign a unique IP address to each node while keeping the subnet mask the same for all nodes.
9. If you are running multiple public networks (for example, Public-1, Public-2, and so on), repeat Step 1 through Step 8 for each network, until all are configured.
10. When all your public networks are configured, continue to the section, NIC Teaming.
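The connection order required in step 7 (public or teamed connections first, then the private network, then remote access connections) can be expressed as a simple ordering check. This sketch operates on a list you transcribe from the Advanced Settings dialog; the role labels and function name are assumptions for illustration.

```python
# Lower rank means higher in the Connections list. A teamed connection counts as "public".
ROLE_RANK = {"public": 0, "private": 1, "remote": 2}


def binding_order_ok(connections):
    """connections: list of (name, role) tuples, top of the Connections list first."""
    ranks = [ROLE_RANK[role] for _, role in connections]
    # The order is valid when the roles never step back to a higher-priority role.
    return ranks == sorted(ranks)


order = [
    ("Public-1", "public"),
    ("Public-2", "public"),
    ("Private", "private"),
    ("Remote Access connections", "remote"),
]
ok = binding_order_ok(order)
```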
NIC Teaming
To configure NIC teaming for multiple public networks, complete the following steps:
1. Double-click the Network Configuration Utility (NCU) icon in the lower right corner of your taskbar.
2. In the list that displays in the NCU main window, click each of your public network NICs (one per network), then click Team>Properties.
3. The Team Properties window appears. Accept the default settings or change them as determined by your network administrator, and then click OK to create the NIC team.
4. Right-click the My Network Places desktop icon and select Properties.
5. A new connection icon appears in the Network Connections window. This is the single teamed connection that represents the multiple networks you just teamed together.
6. Right-click the new icon, select Rename, and change the name to TEAM.
7. Right-click the TEAM icon and select Properties.
8. Click the General tab, click Internet Protocol (TCP/IP), then click Properties.
9. Click the General tab, and assign the IP Address and Subnet Mask values determined by your network administrator.
10. Click OK several times to implement the changes and exit the connection properties window.
NOTE: Microsoft does not recommend the use of NIC Teaming for private networks.
Preparing Node 1 for Clustering
To prepare node 1 for clustering, complete the following steps:
1. Power on and boot node 1.
2. Click Start→Settings→Control Panel→HP Management Agents.
3. Click the Services tab, select Clustering Information on the right side, then click Add to move it to the left side. The Cluster Agent service starts, which forwards cluster status information and makes it accessible from the System Management Homepage. Click OK.
4. Right-click the My Computer desktop icon and select Properties.
5. Click the Computer Name tab and click Change. Select the Domain Name radio button and enter the domain name determined by your network administrator. Reboot when prompted and log into the new domain.
Configuring the Shared Storage
To review the Microsoft recommendations and best practices for securing the shared data in your cluster, go to:
http://technet2.microsoft.com/windowsserver/en/library/f64e46ba-2d09-4f1a-ba9c-f2b1f71821eb1033.mspx?mfr=true
To configure the shared storage, complete the following steps:
1. Power on node 1 and log into the domain.
2. Install and configure your HP StorageWorks MultiPath for Windows software.
For an overview and general discussion of the MultiPath software, go to:
http://h18006.www1.hp.com/products/sanworks/secure-path/spwin.html
HP MultiPathing IO (MPIO) Device Specific Module software can be used as an alternative to HP StorageWorks Secure Path to provide multipath support.
NOTE: You must use MultiPath software if more than one host bus adapter (HBA) is installed in each cluster. The reason for installing more than one HBA per cluster is to provide multiple connections between the clusters and your shared storage (see Figure 2-1). HP strongly recommends multiple HBAs and MultiPath software because they provide continuous access to your storage system and eliminate single points of failure.
3. Connect the node to the shared storage.
4. For details about creating logical drives, see your storage system user guide. Using those directions, create a logical drive with at least 510 MB of space.
NOTE: While the absolute minimum allowable size is 50 MB, Microsoft recommends at least 500 MB for the cluster quorum drive (specifying 510 MB ensures that the disk size is at least 500 MB of formatted space). The extra space in the logical drive is used for internal disk size calculations by your Storage Array Configuration Utility. For information about this topic, see the Microsoft Knowledge Base article 280345, “Quorum Drive Configuration Information,” at:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;280345
Information about disk sizes is also available in the cluster node Help documentation.
Server clusters do not support GPT shared disks. For information, see the Knowledge Base article 284134, “Server clusters do not support GPT shared disks,” at:
http://support.microsoft.com/default.aspx?scid=kb;en-us;284134
5. Select Start→Programs→Administrative Tools→Computer Management→Disk Management. Use this tool to create the NTFS partitions, making them the MBR type.
When running Disk Management, complete the following tasks:
• Allow Disk Management to write a disk signature when initializing the disk.
• Establish unique drive letters for all shared disks, typically starting in the middle of the alphabet to avoid local and network drive letters.
• Establish a meaningful volume label on each shared disk, such as Quorum Disk Q or SQL Disk S.
IMPORTANT: Do not upgrade the logical drives from Basic to Dynamic. Microsoft Cluster Services do not support Dynamic disks.
6. Close Disk Management for Microsoft Windows Server 2003, 64-bit Edition.
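The shared-disk rules in this procedure (MBR rather than GPT, Basic rather than Dynamic, unique drive letters, and the quorum size thresholds) can be collected into a single planning check. The sketch below validates a hand-written disk plan; it does not read from Disk Management. The dictionary keys and function name are assumptions for illustration.

```python
MIN_QUORUM_MB = 50           # absolute minimum stated in the NOTE above
RECOMMENDED_QUORUM_MB = 500  # Microsoft recommendation stated in the NOTE above


def check_shared_disks(disks):
    """disks: list of dicts with keys letter, label, size_mb, partition_style, disk_type, quorum."""
    issues = []
    letters = [d["letter"] for d in disks]
    if len(set(letters)) != len(letters):
        issues.append("drive letters are not unique")
    for d in disks:
        if d["partition_style"] != "MBR":
            issues.append(f"{d['letter']}: must be MBR (GPT shared disks are not supported)")
        if d["disk_type"] != "Basic":
            issues.append(f"{d['letter']}: must be a Basic disk (Dynamic disks are not supported)")
        if d.get("quorum"):
            if d["size_mb"] < MIN_QUORUM_MB:
                issues.append(f"{d['letter']}: quorum disk is below the {MIN_QUORUM_MB} MB minimum")
            elif d["size_mb"] < RECOMMENDED_QUORUM_MB:
                issues.append(f"{d['letter']}: quorum disk is below the recommended {RECOMMENDED_QUORUM_MB} MB")
    return issues


plan = [
    {"letter": "Q", "label": "Quorum Disk Q", "size_mb": 510,
     "partition_style": "MBR", "disk_type": "Basic", "quorum": True},
    {"letter": "S", "label": "SQL Disk S", "size_mb": 20480,
     "partition_style": "MBR", "disk_type": "Basic", "quorum": False},
]
issues = check_shared_disks(plan)
```

The 20480 MB size for the data disk is an arbitrary example value; this guide states size requirements only for the quorum drive.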
Preparing Node 2+ for Clustering
To prepare node 2 for clustering, complete the following steps:
1. Power on and boot node 2.
2. Click Start→Control Panel→HP Management Agents.
3. Click the Services tab, select Clustering Information on the right side, then click Add to move it to the left side. The Cluster Agent service starts, which forwards cluster status information and makes it accessible from the System Management Homepage.
4. Right-click the My Computer desktop icon and select Properties.
5. Click the Computer Name tab, and click Change. Select Domain Name and enter the domain name determined by your network administrator. Reboot when prompted and log into the new domain.
6. Install the MultiPath software on this node.
7. All other nodes should be powered off before completing this step. Click Start→Programs→Administrative Tools→Computer Management→Disk Management. Use this tool to confirm that consistent drive letters and volume labels have been established by the first node.
8. Repeat Step 1 through Step 7 for each of the remaining nodes in the cluster (up to a maximum of 8 nodes).
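The consistency check in step 7 amounts to comparing each node's drive-letter-to-label mapping against the one established on node 1. A minimal sketch follows, assuming you have written down the mapping seen on each node; the function name and the sample mappings are hypothetical.

```python
def compare_drive_maps(reference, other):
    """Return drive letters whose volume label differs from, or is missing on, the other node."""
    mismatches = {}
    for letter in sorted(set(reference) | set(other)):
        ref_label = reference.get(letter)
        other_label = other.get(letter)
        if ref_label != other_label:
            # Store (label on reference node, label on other node); None means not present.
            mismatches[letter] = (ref_label, other_label)
    return mismatches


node1 = {"Q": "Quorum Disk Q", "S": "SQL Disk S"}
node2 = {"Q": "Quorum Disk Q", "T": "SQL Disk S"}  # same disk, wrong letter
diff = compare_drive_maps(node1, node2)
```

In this example `diff` reports both `S` (missing on node 2) and `T` (unexpected on node 2), which together point to a single disk that was assigned the wrong letter.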
Creating the Cluster
To create the cluster using node 1, complete the following steps:
1. Power all the nodes off except node 1 and log into the domain.
2. On node 1, select Start→Programs→Administrative Tools→Cluster Administrator.
3. In the Action menu, select Create New Cluster, click OK, and click Next. The Cluster Creation wizard begins.
4. Assign a Cluster Name, keep the default value for Domain, and click Next.
5. Be sure the computer name appears in the Server Name list box and click Next. Cluster analysis begins.
6. The blue Tasks Completed bar grows longer during this process. The bar stays blue up until it fills completely, then it turns green. Click Next.
NOTE: If the Tasks Completed bar turns red at any time during this process, an error occurred and cluster analysis was aborted. See the log file to locate the source of the problem.
Debug or reconfigure as necessary and try again.
7. Enter the cluster IP address and click Next. The cluster IP address must be on the same subnet as the public network.
8. Enter the Cluster User Account Name and Password and click Next.
9. A detailed summary of the proposed cluster displays. Review this information, then click Next to begin the cluster creation process.
10. The blue Tasks Completed bar grows longer during this process. The bar stays blue up until it fills completely, then it turns green. Click Next.
NOTE: If the Tasks Completed bar turns red at any time during this process, an error occurred and cluster analysis was aborted. See the log file to locate the source of the problem.
Debug or reconfigure as necessary and try again.
11. When the wizard finishes creating the cluster, click Next→Finish.
Joining Node 2+ to the Cluster
NOTE: Microsoft Windows Server 2003 supports a maximum of eight cluster nodes. Repeat the following steps for each additional node. These steps can be completed from node 1 or node 2+.
To join node 2+ to the cluster, complete the following steps:
1. Power on node 2, and log into the domain.
2. Select Start→Programs→Administrative Tools→Cluster Administrator.
3. Select File→Open Connection.
4. In the Action menu list, select Add Nodes to Cluster and click OK.
5. In the Welcome to Add Nodes wizard, click Next.
6. Enter the name of the node you want to add under Computer Name, click Add, then click Next. Cluster analysis begins.
NOTE: You can list all the nodes at the same time by entering the name of each one and clicking Add. This adds all nodes to the cluster in a single step. However, there is a risk with this method. If there is any kind of problem during the add process that causes it to abort, it is much more difficult to determine which node caused the problem. For this reason, HP recommends that you add the nodes one at a time.
7. The blue Tasks Completed bar grows longer during this process. The bar stays blue up until it fills completely, then it turns green. Click Next.
NOTE: If the Tasks Completed bar turns red at any time during this process, an error occurred and cluster analysis was aborted. See the log file to locate the source of the problem.
Debug or reconfigure as necessary and try again.
8. Enter the Password for the cluster service account and click Next.
9. A detailed summary of the proposed cluster displays. Review this information, then click Next to begin the node addition process.
10. The blue Tasks Completed bar grows longer during this process. The bar stays blue up until it fills completely, then it turns green. Click Next.
NOTE: If the Tasks Completed bar turns red at any time during this process, an error occurred and cluster analysis was aborted. See the log file to locate the source of the problem.
Debug or reconfigure as necessary and try again.
11. When the wizard finishes adding the node, click Next, then click Finish.
12. Repeat Step 1 through Step 11 for each node you want to add to the cluster.
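The recommendation to add nodes one at a time, so that a failure can be traced to a specific node, can be sketched as a loop that stops at the first error. This is an illustration of the reasoning only: `add_node` is a hypothetical caller-supplied function standing in for one run of the Add Nodes wizard, not a real API.

```python
MAX_CLUSTER_NODES = 8  # limit stated in this guide for Windows Server 2003


def add_nodes_one_at_a_time(existing, new_nodes, add_node):
    """Add nodes sequentially; stop at the first failure and report which node failed.

    add_node is a caller-supplied callable (hypothetical) that returns True on success.
    Returns (members, error), where error is None if every node was added.
    """
    members = list(existing)
    for node in new_nodes:
        if len(members) >= MAX_CLUSTER_NODES:
            return members, f"{node}: cluster already has {MAX_CLUSTER_NODES} nodes"
        if not add_node(node):
            return members, f"{node}: add failed; check the log file before continuing"
        members.append(node)
    return members, None


# Simulate a failure on node3 to show that the culprit is identified immediately.
members, error = add_nodes_one_at_a_time(
    ["node1"], ["node2", "node3", "node4"], add_node=lambda n: n != "node3"
)
```

Because the loop returns as soon as one add fails, the error names exactly one node, which is the diagnostic advantage that the NOTE in step 6 describes.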
Configuring Private and Public Network Role and Priority Settings
To configure the private and public network role settings, complete the following steps:
1. Select Start→Programs→Administrative Tools→Cluster Administrator.
2. In the left pane under Cluster Configuration→Networks, right-click Private and select Properties.
3. Click the General tab, and click Internal Cluster Communications only (private network).
Click Apply→OK.
4. In the same network folder (Cluster Configuration→Networks), right-click Public and select Properties.
5. Click the General tab, and click All communications (mixed network). Click Apply→OK to apply the changes.
To configure the network Priority settings, complete the following steps:
1. Select Start→Programs→Administrative Tools→Cluster Administrator.
2. In the left pane, right-click the cluster name and select Properties.
3. In the Network Priority tab, locate the Networks used list. Be sure the line labeled Private is at the top of this list, above the line labeled Public. If you need to change the order, select Private and click Move Up to move it upward. Click Apply→OK to apply the changes.
Validating Cluster Operation
To validate your cluster installation, use one or both of the following methods from any node in the cluster.
Method 1: Simulate a Failover
To simulate a failover, complete the following steps:
1. Select Start→Programs→Administrative Tools→Cluster Administrator and connect to the cluster.
2. If your cluster has only two nodes, right-click one of the cluster groups and select Move Group. If there are more than two nodes in your cluster, select Move Group, then choose which node you want to fail over to.
3. Verify that the group fails over and all resources come online.
4. Right-click the same cluster group and select Move Group.
5. Verify that the group fails over to the previous node and all resources come online.
6. Repeat Step 1 through Step 5 for each resource group in the cluster, if desired.
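The two verifications in this method (the group changes owner, and all of its resources come online) can be written down as a check on before-and-after snapshots. The sketch below assumes you record the owner and resource states from Cluster Administrator by hand; the snapshot layout, the resource names, and the `verify_move` function are hypothetical.

```python
def verify_move(before, after, expected_node=None):
    """Check that a group changed owner and that every resource is online afterwards.

    before/after: dicts with 'owner' (node name) and 'resources' ({name: state}).
    """
    problems = []
    if after["owner"] == before["owner"]:
        problems.append("group did not move to another node")
    if expected_node is not None and after["owner"] != expected_node:
        problems.append(f"group is on {after['owner']}, expected {expected_node}")
    offline = sorted(n for n, state in after["resources"].items() if state != "Online")
    if offline:
        problems.append("resources not online: " + ", ".join(offline))
    return problems


before = {"owner": "node1", "resources": {"IP Address": "Online", "Network Name": "Online"}}
after = {"owner": "node2", "resources": {"IP Address": "Online", "Network Name": "Online"}}
result = verify_move(before, after, expected_node="node2")
```

Running the same check a second time with the snapshots swapped covers the move back to the original node in steps 4 and 5.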
Method 2: Run the Cluster Diagnostics and Verification Tool
This method of cluster validation is optional.
The Cluster Diagnostics and Verification tool (ClusDiag.exe) is a GUI utility that helps you perform diagnostic tests on preproduction or production clusters, then view the resulting log files to debug any problems or failures. You can also use it to generate cluster-related reports from the information gathered during the diagnostic process.
To use the Cluster Diagnostics and Verification Tool, complete the following steps:
1. Install the Microsoft cluster diagnostics tool (ClusDiag.exe). Download this tool at:
http://www.microsoft.com/downloads/details.aspx?FamilyID=b898f587-88c3-4602-84de-b9bc63f02825&DisplayLang=en
2. Select Start→Programs→Cluster Diagnostics Tool.
3. Select Online, select a cluster name from the menu, and click OK.
4. Click Tools→Run Test, and in the Run Test dialog box select the type of test you want to run from the dropdown list. Click Launch. The two test types are as follows:
Basic: Tests for a fixed period of time.
Regular: Tests for a user-defined period of time.
5. Testing begins. Upon completion, click OK to see the results.
6. If desired, see the log file for more detailed information.
NOTE: If you experience problems validating your cluster using either of the methods listed above, see the Microsoft “MS Cluster Server Troubleshooting and Maintenance” information at:
http://www.microsoft.com/technet/archive/winntas/support/mscstswp.mspx
Also see the “Troubleshooting cluster node installations” section at:
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/library/ServerHelp/79c8164e-ee17-4e6d-a46f-f3db9869d9ea.mspx
Upgrading Individual Nodes
After your initial installation and configuration, you can upgrade the software and drivers installed on each node in your cluster and add the latest system updates and security fixes. This task must be done regularly to keep your Integrity servers up-to-date and secure.
With clustered systems, you can do maintenance even when users are online. Wait until a convenient, off-peak time when one of the nodes in the cluster can be taken offline for maintenance and its workload distributed among the remaining nodes. Before the upgrade, however, you must evaluate the entire cluster to verify that the remaining nodes can handle the increased workload.
Pick the node you want to upgrade, then use the Cluster Administrator to move all the clustered resources onto one or more of the remaining nodes. You can also use scripts to move resources.
Once all the resources have been failed over to the other nodes, the selected node is ready to upgrade. For more information about how to upgrade your Integrity servers with the latest drivers and QFEs, see the latest Smart Setup Guide for Integrity servers at:
http://docs.hp.com/en/hw.html#Windows%2064-bit%20on%20HP%20Integrity%20Servers
Once the upgrade to the first node is complete, reboot it if necessary and move the resources back to it. As soon as possible, repeat this process to upgrade the other nodes in the cluster. This minimizes the amount of time the nodes are operating with different versions of software or drivers.
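Before taking a node offline, the procedure above asks you to confirm that the remaining nodes can absorb its workload. One way to reason about this is a small planning function that assigns each group on the node being upgraded to the remaining node with the most free capacity. This is a sketch under stated assumptions: the load and capacity numbers are abstract units you estimate yourself, and `plan_drain` is a hypothetical helper, not part of Cluster Administrator.

```python
def plan_drain(groups, capacity, node_to_upgrade):
    """Assign each group on node_to_upgrade to the remaining node with the most free capacity.

    groups: {group_name: (owner_node, load)}; capacity: {node: max_load}.
    Returns {group_name: target_node}; raises ValueError if a group does not fit anywhere.
    """
    # Free capacity on every node except the one being upgraded.
    free = {n: c for n, c in capacity.items() if n != node_to_upgrade}
    if not free:
        raise ValueError("no remaining nodes to host the groups")
    for owner, load in groups.values():
        if owner in free:
            free[owner] -= load
    plan = {}
    to_move = [(name, load) for name, (owner, load) in groups.items() if owner == node_to_upgrade]
    # Place the largest groups first so they are not squeezed out by small ones.
    for name, load in sorted(to_move, key=lambda item: item[1], reverse=True):
        target = max(free, key=free.get)
        if free[target] < load:
            raise ValueError(f"remaining nodes cannot absorb {name}")
        free[target] -= load
        plan[name] = target
    return plan


groups = {"IIS": ("A", 30), "SQL": ("A", 50), "File": ("B", 40)}
capacity = {"A": 100, "B": 100, "C": 100}
plan = plan_drain(groups, capacity, node_to_upgrade="A")
```

If the function raises `ValueError`, the remaining nodes cannot carry the extra load as estimated, and the upgrade should wait for a quieter period or a different distribution of groups.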