After initial deployment of a virtualized infrastructure, you must often provide high availability for the services running within the environment. In its simplest form, you can provide a high-availability solution by configuring multiple Windows servers into a failover cluster. This style of configuration can cluster up to 16 physical servers with Windows Server 2008 R2 and up to 64 physical servers with Windows Server 2012. The virtual machines that are configured on shared SAN storage then become resources that can be moved among the nodes.
You can implement Windows failover clustering for use with Hyper-V virtual machines in the same way as implementing the Windows cluster environment for other
applications such as SQL Server or Exchange Server. Virtual machines become another form of application that failover clustering can manage and protect.
Use the failover cluster management wizard to configure a new application that converts an existing virtual server instance into a highly available configuration. Use the option to configure a virtual machine as shown in Figure 20. Shut down the virtual machine before you configure it for high availability, and place all storage objects, including items such as ISO images that are mounted to the virtual machine, on SAN storage.
Failover clustering with Windows 2008 R2 assumes that access to storage objects from all nodes within the cluster is symmetrical. This means that all drive mappings, file locations, and mount points are identical, and during configuration, checks are made to ensure that this condition is met.
Failover clustering with Windows Server 2012 supports asymmetrical storage configurations, where the same storage is not connected to all nodes in the cluster. Such configurations are possible in many geographically dispersed cluster scenarios. In this case, the cluster validation wizard validates storage only against nodes in a common site. The wizard fails when mandatory requirements are not met. You receive warnings when failover clustering cannot verify some of these aspects, or when failure is likely. Read the warnings for information about how to fix the problems.
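The difference between symmetrical and site-scoped storage validation can be illustrated with a small model. The following is a minimal sketch only, not the logic of the cluster validation wizard: the function name, the node dictionary layout, and the site labels are all hypothetical, and the check is reduced to comparing which disks each node can see.

```python
from collections import defaultdict


def validate_storage(nodes, site_aware):
    """Return a list of problems; an empty list means the check passes.

    nodes: dict of node name -> {"site": str, "disks": set of disk IDs}
    site_aware: False models the symmetrical rule (every node must see the
                same disks); True compares only nodes that share a site.
    This is an illustrative model, not the real validation wizard.
    """
    groups = defaultdict(dict)
    for name, info in nodes.items():
        key = info["site"] if site_aware else "all-nodes"
        groups[key][name] = set(info["disks"])

    problems = []
    for key, members in sorted(groups.items()):
        # Every node in the group is expected to see every disk in the group.
        expected = set.union(*members.values())
        for name, disks in sorted(members.items()):
            missing = expected - disks
            if missing:
                problems.append(f"{name} ({key}) cannot see {sorted(missing)}")
    return problems


# Hypothetical three-node cluster stretched across two sites.
cluster = {
    "node1": {"site": "siteA", "disks": {"lun1", "lun2"}},
    "node2": {"site": "siteA", "disks": {"lun1", "lun2"}},
    "node3": {"site": "siteB", "disks": {"lun3"}},
}

symmetric_problems = validate_storage(cluster, site_aware=False)
site_scoped_problems = validate_storage(cluster, site_aware=True)
```

In this model the stretched cluster fails the symmetrical check on every node but passes when nodes are compared only within their own site.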
Windows failover clustering for Hyper-V servers
Figure 20. High Availability Wizard
After you import a virtual machine into failover clustering, manage and maintain the virtual machine through the failover cluster management interface. Avoid starting and stopping the virtual machine outside of the control of failover clustering. If the virtual machine shuts down outside of the control of failover clustering, the clustering software assumes that the virtual machine has failed and restarts the virtual machine.
Failover Cluster Manager, where necessary, launches the required virtual machine management interfaces. Use failover clustering to manage all availability options and state changes for the virtual machine.
When you import a virtual machine instance into a high-availability configuration, the machine must include all related storage disk devices so that you can manage the virtual machine correctly. The High Availability Wizard fails if it is unable to include all storage configured for the virtual machine within the cluster environment. Configure all shared storage correctly across the cluster nodes. When you add disk storage devices, correctly configure the devices as shared storage within the cluster.
The primary goal of Windows Server failover clustering is to maintain availability of the virtual machine when the virtual machine becomes unavailable due to
unforeseen failures; however, this protection does not always maintain the virtual machine state through such transitions. As an example of this style of protection, consider the case of a physical node failure where one or more virtual machines were running. Windows failover clustering detects that the virtual machines are not
operational and that a node is no longer available and attempts to restart the virtual machines on a remaining node within the cluster configuration.
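The node failure example can be sketched as a simple placement model. This is an illustration under stated assumptions, not the algorithm that failover clustering uses: the function name is hypothetical, and the "least-loaded surviving node" rule is an assumption chosen for simplicity (real clusters also consider settings such as preferred owners). The virtual machines are restarted rather than resumed, which reflects that their running state is not preserved.

```python
def restart_after_node_failure(placement, failed_node, surviving_nodes):
    """Return (new_placement, restarted_vms) after a node failure.

    placement: dict of virtual machine name -> node currently hosting it.
    Target selection (least-loaded survivor) is an assumed policy.
    """
    if not surviving_nodes:
        raise RuntimeError("no surviving node can host the virtual machines")

    # Count virtual machines already running on each surviving node.
    load = {node: 0 for node in surviving_nodes}
    for node in placement.values():
        if node in load:
            load[node] += 1

    new_placement = dict(placement)
    restarted = []
    for vm in sorted(placement):
        if placement[vm] != failed_node:
            continue
        # Pick the survivor with the fewest virtual machines (ties by name).
        target = min(sorted(load), key=load.get)
        new_placement[vm] = target
        load[target] += 1
        restarted.append(vm)  # restarted from disk; memory state is lost
    return new_placement, restarted


placement = {"vm1": "node1", "vm2": "node1", "vm3": "node2"}
new_placement, restarted = restart_after_node_failure(
    placement, failed_node="node1", surviving_nodes=["node2", "node3"]
)
```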
EMC Storage with Microsoft Hyper-V Virtualization
Availability for the virtual machine resources is ensured through the use of Windows failover clustering at the parent level; however, protection at the virtual machine level may not provide high availability for the applications running within the virtual machines. For example, a server instance cannot start if a virtual machine instance has corrupted files. The high-availability protection for the virtual machine can ensure that the virtual machine is running, but cannot ensure that the operating system itself, or the applications installed on the server, are accessible.
Windows failover clustering checks at the application level to ensure that services are accessible. For example, a clustered SQL Server instance continually undergoes “LooksAlive” and “IsAlive” checks to ensure that the SQL Server instance is accessible to user connections. Implementing clustering within the virtual machines can provide this additional level of protection.
You cannot configure a failover cluster within virtual machines that are running Windows Server 2008 R2 or Windows Server 2012 using virtual disks or pass-through disks. This limitation exists because the necessary SCSI-3 Persistent Reservation commands are filtered. However, you can form Windows cluster configurations with virtual machines that are running Windows Server 2008 R2 with iSCSI shared storage devices. In such configurations, the iSCSI initiator is implemented within the child virtual machines, and the shared storage is defined on the iSCSI LUNs.
With Windows Server 2012, you can use both iSCSI and virtual Fibre Channel as shared storage within a virtual machine cluster. You can also use SMB file share storage with certain clustered applications, such as SQL Server. If you use SMB file share storage, use SMB 3.0-based file shares.
With Windows Server 2012 R2, you can also use VHDs as shared storage between virtual machines that run Windows failover clustering. “Windows Server 2012 R2 new VHD features” on page 25 provides more information about the shared virtual hard disk feature.
Movement of virtual machines within a cluster was different for systems before Windows 2008 R2. When an administrator or an automated management tool requested a move, the virtual machine state was saved to a disk and then resumed after disk resources were moved to the target node. This move, or quick migration operation, took so long that outages often occurred, even though the virtual machine state would then resume.
With Windows Server 2008 R2 and Windows Server 2012 for failover cluster nodes, you can use the live migration functionality available with the clustering environment.
Live migrations move virtual machines transparently between nodes. Unlike quick migration move requests, there is no outage for a client application, and the migration between nodes is completely transparent. To achieve this level of client transparency, live migrations copy the memory state representing the virtual machine from one server to another so as to mitigate any loss of service.
Live migration configurations require a robust network configuration between the nodes within the cluster. This network configuration optimizes the memory copy between the nodes and enables an efficient virtual machine transition. For such live migration configurations, you must have at least one dedicated 1 Gb (or greater) network between cluster nodes to enable the memory copy. We also recommend that you dedicate specific private networks exclusively to live migration traffic, as shown in Figure 21. Networks that are disabled for cluster communication can still be used for live migration traffic. Deselect networks within the live migration settings if you do not want to use them.
Figure 21. Live Migration Settings window
When you use a live migration, failover clustering replicates the virtual machine configuration and memory state to the target node of the migration. Multiple cycles of replicating the memory state occur to reduce the number of changes that need to be sent on subsequent cycles.
You can use live migration operations for virtual machines that contain virtual disks, pass-through disks, virtual Fibre Channel storage, or iSCSI storage as presented directly to the virtual machine. We recommend using CSVs for virtual disks, but they are not required for live migrations. You can migrate a virtual machine with dedicated storage devices that are used for virtual disk access. If you migrate a virtual machine, the virtual disks transition from offline to online on the target cluster node during the live migration process.
Network connectivity allows for the timely transfer of state. As a final phase, the migration process momentarily suspends the machine instance and switches all disk resources to the target node. After this process, the virtual machine immediately resumes processing. The transition of the virtual machine must complete within a TCP/IP timeout interval so that client applications experience no loss of connectivity.
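The repeated memory replication cycles described above can be illustrated with a simplified model. This is a sketch under stated assumptions, not Hyper-V's implementation: the function name and all parameters are hypothetical, and the model assumes that the number of pages changed during a cycle is a fixed percentage of the pages sent in that cycle.

```python
def simulate_precopy(total_pages, dirty_percent, stop_threshold, max_cycles=10):
    """Model iterative memory replication for a live migration.

    total_pages: memory pages that represent the virtual machine state.
    dirty_percent: assumed percentage of the pages sent in a cycle that the
                   running virtual machine changes while that cycle runs.
    stop_threshold: once this few pages remain, the virtual machine is
                    momentarily suspended and the remainder is sent.
    Returns (pages sent in each cycle, pages sent during the final suspend).
    """
    pending = total_pages
    sent_per_cycle = []
    while pending > stop_threshold and len(sent_per_cycle) < max_cycles:
        sent_per_cycle.append(pending)
        # Pages changed during this cycle must be sent again next cycle.
        pending = pending * dirty_percent // 100
    return sent_per_cycle, pending


cycles, final_pages = simulate_precopy(
    total_pages=1_000_000, dirty_percent=10, stop_threshold=1_000
)
```

With these example values, each cycle sends one tenth of the previous cycle's pages, so only a small remainder is left to transfer while the virtual machine is suspended. A higher change rate or a slower network leaves more pages for the final phase, which is why a dedicated migration network matters.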
Note: The live migration process is different from the quick migration process because no suspension of virtual machine state to disk occurs. Failover clustering still provides support for quick migrations.
If the migration of the virtual machine cannot execute successfully, the migration process reverts the virtual machine back to the originating node. This also maintains the availability of the virtual machine to ensure that client access is not impacted.
You can also terminate a live migration by using the Cancel in progress Live Migration option in the Cluster Manager console.
Shared-nothing live migration
Windows Server 2012 introduces a new type of live migration referred to as a shared-nothing live migration. This form of live migration allows for the movement of non-clustered virtual machines between Hyper-V hosts when there is no shared storage.
The migration can occur between hosts using local storage, SAN storage or SMB 3.0 file shares. If both hosts have access to the SMB file share, then no storage movement is necessary. When non-shared storage is used, Hyper-V uses these steps to initiate a storage live migration:
1. Throughout most of the migration, reads and writes are serviced from the source virtual disks, while the contents of the source are copied, over the network, to the new destination VHDs.
2. Following the initial full copy of the source, writes are mirrored to the source and destination VHDs. Outstanding changes to the source are also replicated to the target.
3. When the source and target VHDs are synchronized, the virtual machine live migration begins, following the same process used for shared storage live migrations.
4. When the live migration completes, the virtual machine runs from the destination server and the original source VHDs are deleted.
Offloaded Data Transfer can be used as a part of the migration. “Storage live migration” on page 34 provides more details.
Virtual Machine Live Migration Overview, in Microsoft TechNet, provides more information about shared-nothing live migrations.
Storage live migration
Starting with Windows Server 2012 you can migrate the virtual hard disk storage of a virtual machine between LUNs non-disruptively. You can migrate storage on stand-alone hosts or on Hyper-V clusters where virtual hard disks reside or will reside on CSVs or SMB 3.0 file shares. You can start the storage migration process from Hyper-V Manager for stand-alone hosts, from Failover Cluster Manager for clustered hosts (as shown in Figure 22) or from PowerShell, by using the Move-VMStorage cmdlet. If SCVMM exists in the environment, you can start migrations from the SCVMM console or from PowerShell.
If the virtual machine that is being migrated is offline, the machine remains offline and the virtual hard disks are moved between the source and target. If the virtual machine that is being migrated is online, a live storage migration occurs, using the following process:
1. Throughout most of the migration, reads and writes are serviced from the source virtual disks while the contents of the source are copied to the new destination VHDs.
2. Following the initial full copy of the source, writes are mirrored to the source and destination VHDs. Outstanding changes to the source are also replicated to the target.
3. When the source and target VHDs are synchronized, the virtual machine begins using the target VHDs.
4. The original source VHDs are then deleted.
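The live storage migration steps can be modeled in a few lines. The following is a minimal sketch, not Hyper-V's implementation: the function name is hypothetical, a virtual disk is reduced to a Python list of blocks, and guest writes are supplied as (block index, value) pairs. The model assumes one guest write arrives after each block is copied.

```python
def migrate_storage(source, writes_during_copy, writes_after_copy):
    """Model the copy, mirror, and switch phases of a live storage migration.

    source: list of block values; it is updated in place by guest writes.
    writes_during_copy / writes_after_copy: lists of (block_index, value).
    Returns the destination blocks that the virtual machine switches to.
    """
    dest = [None] * len(source)
    outstanding = set()
    pending_writes = list(writes_during_copy)

    # Step 1: initial full copy. Guest I/O is still serviced by the source,
    # so any block written in this phase may be stale on the destination.
    for index in range(len(source)):
        dest[index] = source[index]
        if pending_writes:
            target, value = pending_writes.pop(0)
            source[target] = value
            outstanding.add(target)
    for target, value in pending_writes:
        source[target] = value
        outstanding.add(target)

    # Step 2: new writes are mirrored to both copies, and the outstanding
    # changes from step 1 are replicated to the destination.
    for target, value in writes_after_copy:
        source[target] = value
        dest[target] = value
    for target in sorted(outstanding):
        dest[target] = source[target]

    # Step 3: switch to the destination only when both copies match.
    if dest != source:
        raise RuntimeError("source and destination are not synchronized")
    # Step 4 (deleting the source) is left to the caller in this model.
    return dest


source_blocks = ["a", "b", "c", "d"]
target_blocks = migrate_storage(
    source_blocks,
    writes_during_copy=[(0, "A"), (3, "D")],
    writes_after_copy=[(1, "B")],
)
```

The write to block 0 arrives after that block was copied, so it reaches the destination only through the replication of outstanding changes in step 2.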
Figure 22. Storage Migration within a Hyper-V cluster
You can accelerate the storage migration process with ODX. If the storage array where the migration occurs supports ODX, the storage migration automatically runs ODX.
Using ODX greatly enhances the speed of the initial copy operation between the source and target devices. For EMC Symmetrix® VMAX, EMC VNX® and EMC VNXe® systems where ODX is supported, both the source and target must reside in the same storage array. EMC environments also require a Windows hotfix for ODX support with Windows Server 2012. The hotfix ensures that, if ODX copy operations are rejected, the host-based copy engages and resumes from where the ODX copy left off. The hotfix also corrects an issue with clustered storage live migration that can lead to data loss.
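The fallback behavior that the hotfix provides can be illustrated as follows. This is a minimal sketch with hypothetical names, not the Windows copy engine: the offload callback stands in for the storage array accepting or rejecting an offloaded copy of a block range, and the model assumes that once a request is rejected, the rest of the copy stays on the host-based path.

```python
def copy_blocks(total_blocks, chunk, offload):
    """Copy a range of blocks, preferring offloaded copy over host copy.

    offload: callable(offset, length) -> bool, True when the array accepted
             and completed the offloaded copy of that range (hypothetical).
    Returns a log of (method, offset, length) tuples.
    """
    offset = 0
    log = []
    use_offload = True
    while offset < total_blocks:
        length = min(chunk, total_blocks - offset)
        if use_offload and offload(offset, length):
            log.append(("offload", offset, length))
        else:
            # The host-based copy resumes at the current offset rather than
            # restarting from block 0. Staying on the host path afterwards
            # is an assumption of this model.
            use_offload = False
            log.append(("host", offset, length))
        offset += length
    return log


# Hypothetical array that rejects offloaded copies from block 200 onward.
copy_log = copy_blocks(500, 100, offload=lambda offset, length: offset < 200)
```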
You can download the hotfix, “Update that improves cloud service provider resiliency in Windows Server 2012,” from Microsoft Support at http://support.microsoft.com/kb/2870270. “Windows Server 2012 Offloaded Data Transfer” on page 47 provides more details.
Windows failover clustering with Cluster Shared Volumes
You can use Windows Server 2008 R2 and Windows Server 2012 to configure shared SAN storage volumes so that all nodes within a given cluster configuration can access the volume concurrently. In this configuration, the volume is mounted as read/write to all nodes at the same time. The new model for allowing direct read/write access from multiple cluster nodes is called Cluster Shared Volumes (CSVs). CSV supports running multiple virtual machines on different nodes where the VHD storage devices are located on a commonly accessible storage device.
CSVs help make the transition process for VHD ownership during live migrations more efficient, as no transition of ownership and subsequent mounting is required, as is typical for cluster storage devices. The SAN storage configured as CSVs is mounted and accessible by all cluster nodes.
The CSV feature is enabled by default in Windows Server 2012. You must enable the CSV feature in Windows Server 2008 R2. To enable the feature on a Windows Server 2008 R2 cluster, select Enable Cluster Shared Volumes from Failover Cluster Manager, as shown in Figure 23.
Figure 23. Windows 2008 R2 CSV from Failover Cluster Manager
After you enable CSV in Windows Server 2008 R2, a new Cluster Shared Volumes option appears in Failover Cluster Manager. In Windows Server 2012, you can access CSVs at Storage > Disks in Failover Cluster Manager. As shown in Figure 24, you can use this option to convert any disk within the available storage group to a CSV.
Figure 24. Add available storage to CSVs
For Windows Server 2008 R2 and Windows Server 2012, you must format a disk with NTFS before you can add it as a CSV. Resilient File System (ReFS) is not supported for CSV use on Windows Server 2012. For Windows Server 2012, the CSV file system is called “CSVFS.” Although the name has changed, the underlying file system is still NTFS. If a CSV is removed from a cluster, the file system designation returns to NTFS, with all data on the file system remaining intact.
After you convert a SAN device to be used as a CSV volume, you can access the storage device on all cluster nodes. The CSV volume is mounted to a common, but local, location on all nodes, which ensures that the namespace to VHD objects is identical on all cluster nodes. The namespace attributed for each CSV volume is based on the system drive location, which must be the same for all cluster nodes. The namespace includes a ClusterStorage location, in which the volumes are
physically mounted on each node. The mount location is a sequentially generated name of the form Volume1 where the appended numeric value is incremented for each subsequent volume.
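The resulting namespace can be sketched as follows. This is an illustration only: the function names and the node-to-drive dictionary are hypothetical, and the code simply builds the default mount paths (for example, C:\ClusterStorage\Volume1) after checking that every node uses the same system drive.

```python
import ntpath


def csv_mount_path(system_drive, volume_number):
    """Build the default CSV mount path for a volume number (1-based)."""
    root = system_drive.rstrip("\\") + "\\"
    return ntpath.join(root, "ClusterStorage", f"Volume{volume_number}")


def csv_namespace(node_system_drives, volume_count):
    """Return the mount paths shared by all nodes.

    node_system_drives: dict of node name -> system drive such as "C:".
    Raises ValueError when nodes disagree, because the namespace is based
    on the system drive location and must be identical on all nodes.
    """
    drives = {drive.upper().rstrip("\\") for drive in node_system_drives.values()}
    if len(drives) != 1:
        raise ValueError(f"system drive differs across nodes: {sorted(drives)}")
    drive = drives.pop()
    return [csv_mount_path(drive, n) for n in range(1, volume_count + 1)]


paths = csv_namespace({"node1": "C:", "node2": "c:"}, volume_count=2)
```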
Note: You can rename the mount points assigned to CSVs. To rename the specified volume based mount point, select Rename from Windows Explorer. The new name appears on all nodes of the cluster.
All CSV devices list the current owner for the resource. The owner must coordinate access to the various VHD devices that represent virtual machine storage within the cluster. Virtual machines continue to run on only a single physical server at any time.
When a virtual machine that is deployed on CSV storage configured within the cluster is to be brought online, the node that is starting the virtual machine communicates with the CSV owner to request permission to generate I/O to the VHD device when the virtual machine is brought into operation. The node that starts the virtual machine locks the VHD device to ensure that no other process can write to the VHD from any other node. If the VHD has already been locked by another node, then the request is denied. When the CSV owner grants permission, the node generates direct I/O to the VHD on the storage device as needed by the virtual machine.
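The coordination between a starting node and the CSV owner can be modeled as a simple lock table. This is a minimal sketch, not the CSV protocol: the class and method names are hypothetical, and the model folds the permission request and the VHD lock into a single call for brevity.

```python
class CsvOwner:
    """Toy model of the CSV owner arbitrating write access to VHDs."""

    def __init__(self):
        self._locks = {}  # VHD path -> node that currently holds the lock

    def request_io(self, node, vhd):
        """Grant direct I/O to the VHD unless another node has locked it."""
        holder = self._locks.get(vhd)
        if holder is not None and holder != node:
            return False  # request denied: the VHD is locked elsewhere
        self._locks[vhd] = node
        return True

    def release(self, node, vhd):
        """Release the lock, for example after the virtual machine stops."""
        if self._locks.get(vhd) == node:
            del self._locks[vhd]


owner = CsvOwner()
first = owner.request_io("node1", "vm1.vhd")    # node1 starts the VM
second = owner.request_io("node2", "vm1.vhd")   # node2 is denied
owner.release("node1", "vm1.vhd")
third = owner.request_io("node2", "vm1.vhd")    # granted after release
```

Only the permission request passes through the owner in this model. After the request is granted, the node performs I/O directly against the storage device, as the text describes.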
CSVs also protect against external failure scenarios, such as physical connectivity loss from a given node. If connectivity from a node is lost to the underlying storage, I/O operations are redirected over the CSV network to the current owning node. This functionality prevents the failure of a virtual machine as a result of the loss of storage connectivity. While this functionality allows the virtual machine to continue operating, this indirection should not be relied on to provide ongoing access to the virtual machine. Because performance is affected when running in redirected mode, resolve the loss of connectivity or execute a live migration.
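The choice between direct and redirected I/O can be expressed as a small decision function. This is a hedged sketch with hypothetical names; it models only the path selection described above and ignores how the cluster detects the loss of connectivity.

```python
def select_io_path(node, csv_owner, nodes_with_storage_access):
    """Return how a node reaches a CSV in this simplified model.

    nodes_with_storage_access: set of nodes with a working physical path
    to the underlying storage.
    """
    if node in nodes_with_storage_access:
        return "direct"
    if csv_owner in nodes_with_storage_access:
        # I/O travels over the CSV network to the owning node. This keeps
        # the virtual machine running but at reduced performance.
        return "redirected"
    raise RuntimeError("no node in the request path can reach the storage")


healthy = select_io_path("node2", "node1", {"node1", "node2"})
degraded = select_io_path("node2", "node1", {"node1"})
```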
CSVs are NTFS volumes and have the same limits as NTFS. Like any NTFS volume, a CSV has a theoretical maximum size of 256 TB. You can determine appropriate sizing for CSV volumes based on the cumulative workload expected from the VHD files located in the CSV.
The CSV is physically represented by a single LUN presented from a storage array. The LUN is supported by some number of physical disks within the array. Use the typical sizing for both storage allocation and I/O capacity to ensure that both the storage allocation for a given CSV and the I/O requirements are adequately met.
Undersizing the LUN for I/O load results in poor performance for all VHDs located on the CSV, and for all applications installed in the virtual machines that use the VHDs.
We recommend adding multiple CSVs to distribute workloads across available resources.
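The sizing guidance above comes down to two sums: capacity and I/O load. The following sketch shows the arithmetic with hypothetical names and example figures; the VHD sizes, IOPS values, and LUN capabilities are invented for illustration and are not measurements or EMC recommendations.

```python
NTFS_MAX_TB = 256  # theoretical maximum NTFS (and therefore CSV) volume size


def check_csv_sizing(vhds, lun_capacity_tb, lun_iops):
    """Compare the cumulative VHD workload against one CSV LUN.

    vhds: list of (size_tb, peak_iops) tuples, one per VHD on the CSV.
    lun_capacity_tb / lun_iops: what the backing LUN can provide.
    """
    total_tb = sum(size for size, _ in vhds)
    total_iops = sum(iops for _, iops in vhds)
    return {
        "total_tb": total_tb,
        "total_iops": total_iops,
        "capacity_ok": total_tb <= lun_capacity_tb <= NTFS_MAX_TB,
        "iops_ok": total_iops <= lun_iops,
    }


# Example figures only: three VHDs on a 4 TB LUN rated for 3,000 IOPS.
result = check_csv_sizing(
    [(0.5, 800), (1, 1200), (2, 1500)], lun_capacity_tb=4, lun_iops=3000
)
```

In this example the capacity fits, but the combined peak of 3,500 IOPS exceeds what the LUN can deliver, so moving one of the VHDs to a second CSV would be a reasonable response.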
Windows Server 2012 includes Hyper-V Replica, a native replication technology for virtual machines. You can use Hyper-V Replica to enable asynchronous host-based replication of VHDs between standalone hosts or clusters. You can also use Hyper-V Replica to enable virtual machine replication between sites without shared storage.
Hyper-V Replica is useful for branch offices and for replicating virtual machines to hosted cloud providers.
When using Hyper-V Replica, you can enable or disable replication for each VHD. You can select data that you do not want to replicate, such as an operating system page file, and create a separate virtual disk that you configure for that workload and