One of the primary functions of the failover feature is to provide path management. If there is more than one path from the server to the controller, some multipath drivers can also spread the I/Os (input/output operations) across the paths. Check the documentation that comes with the multipath failover driver to see whether it supports this.
Note: The connections between the hosts and the storage subsystems in the following figures are meant to illustrate the concept of multipath drivers. These are not recommendations.
Figure 13 on page 106 and Figure 14 on page 107 show how I/Os flow in optimal single-path and two-path environments from server to controller.
Figure 13. I/O flow in an optimal single path
Figure 14 also illustrates that I/Os to the logical drive can be distributed round-robin across all of the available paths if the multipath driver supports it.
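The round-robin distribution mentioned above can be pictured with a minimal sketch. It is not the algorithm of any particular multipath driver; the function name and the path labels are hypothetical and chosen only for illustration.

```python
from itertools import cycle


def round_robin_dispatch(ios, paths):
    """Assign each I/O to the next available path in turn."""
    if not paths:
        raise ValueError("no available paths")
    next_path = cycle(paths)
    return [(io, next(next_path)) for io in ios]


# Hypothetical labels: two HBAs, each with a healthy path to controller A.
assignments = round_robin_dispatch(
    ["io-0", "io-1", "io-2", "io-3"],
    ["hba1->ctrlA", "hba2->ctrlA"],
)
for io, path in assignments:
    print(io, "via", path)
```

Each path receives every second I/O, so the load is spread evenly as long as both paths stay available.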
Failover
The multipath drivers monitor the data paths to the storage subsystem for paths that are not working correctly and for multiple link errors. If a multipath driver detects either of these conditions, it checks the path table for redundant paths and controllers. The failover driver performs a path failover if alternate paths to the same controller are available. Figure 15 on page 108 shows the multipath driver using only one of the two paths to the controller because the other path has failed.
If all of the paths to a controller fail, the multipath driver performs a controller failover, as shown in Figure 16 on page 109 and Figure 17 on page 110. Here, when controller A fails, the multipath driver moves the ownership of the logical drive from controller A to controller B. Controller B then receives and processes all I/Os to the logical drive.
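The choice between a path failover and a controller failover can be sketched as a lookup in a path table. This is only an illustration under assumed data structures: the table layout, the function name, and the action strings are hypothetical and do not come from any driver.

```python
def choose_failover(path_table, owner):
    """Return (action, controller, usable_paths) for one logical drive.

    path_table maps a controller name to {path_name: is_healthy}.
    owner is the controller that currently owns the logical drive.
    """
    healthy = {
        ctrl: [p for p, ok in paths.items() if ok]
        for ctrl, paths in path_table.items()
    }
    if healthy.get(owner):
        # At least one path to the owning controller still works.
        all_ok = len(healthy[owner]) == len(path_table[owner])
        action = "none" if all_ok else "path_failover"
        return action, owner, healthy[owner]
    # No path to the owner is left: look for another reachable controller.
    for ctrl, paths in healthy.items():
        if ctrl != owner and paths:
            return "controller_failover", ctrl, paths
    return "no_path", None, []


one_path_down = {"A": {"hba1": False, "hba2": True},
                 "B": {"hba3": True, "hba4": True}}
all_paths_to_a_down = {"A": {"hba1": False, "hba2": False},
                       "B": {"hba3": True, "hba4": True}}

print(choose_failover(one_path_down, "A"))
print(choose_failover(all_paths_to_a_down, "A"))
```

The first call corresponds to the situation in Figure 15 (one path lost, same controller), the second to a controller failover.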
Figure 14. I/O flow in optimal two path
Figure 15. Use of one path when the other fails.
Figure 16. Failover of I/O in a single path environment
Depending on the controller firmware and the multipath driver code, the multipath driver performs different actions for a controller failover, according to the failover mode that is enabled by selecting the appropriate host type as listed in Table 22 on page 114. There are three controller failover modes, depending on the version of the controller firmware:
1. Automatic Volume Transfer (AVT/ADT) failover mode - If the host type is set to enable AVT/ADT failover mode, the multipath driver redirects I/Os to the surviving controller. The surviving controller then takes ownership of the logical drive and processes the I/Os. It takes ownership regardless of whether the other controller is up and running, as in the cases where all paths to a controller fail or the controller itself fails. Controller firmware version 7.77.xx.xx or earlier supports this failover mode.
2. RDAC failover mode - If the host type is set to disable AVT/ADT or to non-ALUA, the multipath driver issues a mode page 2C to the surviving controller to move the ownership of the logical drive to the surviving controller. The surviving controller then takes ownership of the logical drive and processes I/Os to it. The surviving controller takes ownership regardless of whether the other controller is up and running, as in the cases where all paths to the controller fail or the controller itself has failed. This failover mode is supported with all versions of controller firmware.
3. Asymmetric Logical Unit Access (ALUA) mode - With controller firmware version 7.83.xx.xx and later, if the host type is set to enable ALUA, the multipath driver just redirects I/Os to the surviving controller. If the “FAILED” controller is up and running, as in the case where the paths to the controller failed but the controller itself is still optimal, the surviving controller ships the I/Os to the “FAILED” controller for processing instead of taking ownership of the logical drive and processing the I/Os itself. If this condition persists for more than 5 minutes, the surviving controller stops shipping I/Os to the other controller and takes over both the ownership of the logical drive and the processing of its I/Os.
Figure 17. Failover of I/O in a multipath environment
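The three failover modes can be compared side by side in a small sketch. It only restates the behavior described in the list above as a decision function; the enum, the function name, and the returned strings are hypothetical, and the 5-minute limit is the value given in the text.

```python
from enum import Enum

IO_SHIPPING_LIMIT_S = 5 * 60  # the 5-minute limit stated in the text


class Mode(Enum):
    AVT_ADT = "AVT/ADT"
    RDAC = "RDAC"
    ALUA = "ALUA"


def surviving_controller_action(mode, other_controller_up, seconds_since_failure):
    """Describe what happens once all paths to the owning controller are lost."""
    if mode is Mode.AVT_ADT:
        return "redirect I/O; surviving controller takes ownership"
    if mode is Mode.RDAC:
        return "issue mode page 2C; surviving controller takes ownership"
    if mode is Mode.ALUA:
        # I/O shipping only applies while the other controller is still
        # optimal and the condition has not lasted longer than the limit.
        if other_controller_up and seconds_since_failure <= IO_SHIPPING_LIMIT_S:
            return "redirect I/O; surviving controller ships I/O to the owner"
        return "redirect I/O; surviving controller takes ownership"
    raise ValueError(mode)


print(surviving_controller_action(Mode.ALUA, True, 60))
print(surviving_controller_action(Mode.ALUA, True, 301))
```

In AVT/ADT and RDAC modes the state of the other controller makes no difference; only ALUA distinguishes a path failure from a controller failure.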
The advantages of ALUA are:
- A “Boot from SAN” server does not fail during boot when the boot LUN is not on the path, or is not owned by the controller on the path, that the server scans first during the boot process. A Boot from SAN server is a server whose operating system disk resides on one of the logical drives of the storage subsystem instead of inside the server chassis.
- Unnecessary logical drive failovers and failbacks are eliminated when there are intermittent, short-duration (less than 5 minutes) path interruptions.
- “LUN ping-pong” is prevented in certain conditions where the logical drives are mapped to servers in a cluster environment.
- The logical drive operates as active-active in a dual controller configuration. I/Os can be sent to both controllers for processing, irrespective of which controller owns the logical drive. In RDAC or AVT/ADT failover mode, only the controller that owns the logical drive can process I/Os to that logical drive. This is also referred to as active-passive operating mode in a dual controller configuration.
Figure 18 illustrates the failover when all paths to the controller fail but the controller itself is optimal, in AVT/ADT and RDAC failover modes. In this failover scenario, ownership of the logical drive is transferred to controller B, and controller B processes all I/Os to the logical drive even though controller A is up and optimal and the failure is caused only by the path failure from the host to controller A.
Figure 18. All paths to controller fail in AVT/ADT and RDAC failover modes
Figure 19 and Figure 20 on page 113 illustrate failover when all paths to the controller fail but the controller is itself optimal in ALUA failover mode.
During the first five minutes of the failover, the I/Os to the logical drive are shipped internally to Controller A for processing, as shown in Figure 19. Controller A is still the owner of the logical drive. After five minutes, if the path to Controller A is still failed, Controller B takes over the ownership of the logical drive and the processing of its I/Os, as shown in Figure 20 on page 113.
Figure 19. All paths to controller fail in ALUA failover mode. First five minutes of the failover.
When operating in failover modes 1 and 2 above, the dual controllers in the storage subsystem operate as an active-passive combination from the mapped LUN perspective. This means that I/Os can be sent for processing only to the controller that owns the mapped LUN. The other controller is in standby mode until either the LUN-owning controller fails or all paths to the LUN-owning controller fail. I/Os sent to the controller that does not own the mapped LUN either cause the LUN to fail over to that controller (AVT/ADT mode) or are failed by the controller (RDAC mode). In ALUA failover mode, the dual controllers operate as an active-active combination from the mapped LUN perspective. I/Os can be sent to both controllers for processing instead of just to the owning controller.
The controller that does not own the LUN does not have to operate in standby/passive mode until the LUN-owning controller fails. The I/Os are automatically routed internally to the controller that owns the LUN for processing. In addition, in ALUA mode the LUN ownership changes only when one controller processes more than 75% of the I/Os to the LUN within a 5-minute period.
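The 75% rule can be illustrated with a short sketch. It assumes that the percentage is measured over the I/Os each controller receives from hosts for the LUN during one 5-minute window; that reading, the function name, and the counter layout are assumptions made for illustration, not a description of the controller firmware.

```python
OWNERSHIP_THRESHOLD = 0.75  # "more than 75% of the I/Os", from the text


def alua_owner_after_window(current_owner, io_counts):
    """Return the LUN owner after one 5-minute window.

    io_counts maps each controller to the number of I/Os it received
    for the LUN during the window (assumed measurement).
    """
    total = sum(io_counts.values())
    if total == 0:
        return current_owner
    for controller, count in io_counts.items():
        if controller != current_owner and count / total > OWNERSHIP_THRESHOLD:
            return controller  # ownership moves to the busier controller
    return current_owner


print(alua_owner_after_window("A", {"A": 100, "B": 900}))  # B receives 90%
print(alua_owner_after_window("A", {"A": 300, "B": 700}))  # B receives 70%
```

Because the threshold must be exceeded, a brief burst of I/Os to the non-owning controller does not move the LUN.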
Figure 20. All paths to the controller fail in ALUA mode. Five minutes into the failure
Controller firmware versions 7.77.xx.xx and earlier support AVT/ADT and RDAC failover modes. Controller firmware versions 7.83.xx.xx and later support only RDAC and ALUA failover modes. AVT/ADT mode is not supported in controller firmware version 7.83.xx.xx or later. Note that the same controller NVSRAM bit in the host type region is used to enable AVT/ADT or ALUA. Depending on the version of the controller firmware, that bit enables either AVT/ADT or ALUA failover mode. To enable a given failover mode, the appropriate host type must be selected for the server host partition. The following table lists the host types for various operating systems and the failover mode that is enabled for each host type:
Table 22. Failover mode for each Operating System
Host Index  Host type (full name)   Host type (short name) (1)  ADT/AVT (2)  RDAC  ALUA (2)
0           Default                 Base                        No           Yes   No
1           MacOS                   MacOS                       No           Yes   No
2           Windows Server
10          Unused 10/Irix (3)      Unused10/Irix               No           Yes   No
11          Unused
17          HP-UX TPGS              HPXTPGS                     No           Yes   No
18          Linux <Linux Non-ADT>   LNX <Linux Non-ADT>         No           Yes   No
19          IBM I/Os                IBM i                       No           Yes   No
20          Onstor                  Onstor                      Yes          No    No
21          Windows ALUA            W2KALUA                     No           No    Yes
22          Linux ALUA              LNXALUA                     No           No    Yes
23          AIX ALUA w/
1. The actual name might be slightly different depending on the version of the NVSRAM file loaded. However, the host type index should be the same across all versions.
2. Even though the same NVSRAM bit enables either ADT/AVT or ALUA depending on the controller firmware version, only the ALUA host types (host index 21-27) must be used to enable ALUA failover mode, because additional ALUA-specific settings are required.
3. Irix and Netware failover host types are defined in the NVSRAM files for controller firmware version 7.77.xx.xx or earlier. For controller firmware 7.83.xx.xx or later, Netware and Irix servers are not supported as host attachments; therefore, these host types were changed to 'Unused'.
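The relationship between firmware level, host type, and failover mode can be expressed as a simple lookup. The sketch below is illustrative only: the function names are hypothetical, only the table rows that are complete above are included, and firmware levels between 7.77 and 7.83 are rejected because the text does not describe them.

```python
def supported_failover_modes(firmware):
    """Return the failover modes for a firmware level such as '7.83.xx.xx'.

    Only the major and minor numbers are compared.
    """
    major, minor = (int(part) for part in firmware.split(".")[:2])
    if (major, minor) <= (7, 77):
        return {"AVT/ADT", "RDAC"}
    if (major, minor) >= (7, 83):
        return {"RDAC", "ALUA"}
    raise ValueError("firmware level not covered by the text: " + firmware)


# Host index -> (short name, failover mode); complete rows of Table 22 only.
HOST_TYPE_MODE = {
    0: ("Base", "RDAC"),
    1: ("MacOS", "RDAC"),
    10: ("Unused10/Irix", "RDAC"),
    17: ("HPXTPGS", "RDAC"),
    18: ("LNX <Linux Non-ADT>", "RDAC"),
    19: ("IBM i", "RDAC"),
    20: ("Onstor", "AVT/ADT"),
    21: ("W2KALUA", "ALUA"),
    22: ("LNXALUA", "ALUA"),
}


def host_type_mode_available(index, firmware):
    """True if the host type's failover mode exists at this firmware level."""
    _, mode = HOST_TYPE_MODE[index]
    return mode in supported_failover_modes(firmware)


print(sorted(supported_failover_modes("7.77.xx.xx")))
print(host_type_mode_available(22, "7.83.xx.xx"))
print(host_type_mode_available(22, "7.77.xx.xx"))
```

A check like this only confirms that the mode exists at a given firmware level; the host type itself must still be selected for the server host partition.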
Failback
The multipath drivers also periodically monitor the status of the failed paths and fail back the logical drive to the preferred controller once the failed path is restored.
If some of the multiple paths to the controller fail and are then restored, the multipath driver starts using the restored paths again to send I/Os. The multipath driver uses the same mode (AVT/ADT, RDAC, or ALUA) as described in the failover section to move the logical drive back to the preferred controller.
The automatic logical drive failback feature of the multipath driver can be disabled in clustered server configurations to prevent the 'LUN ping-pong' problem between controllers in certain failover scenarios.
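The failback decision, including the option to disable automatic failback, can be sketched as follows. The function and parameter names are hypothetical and do not correspond to settings of any particular multipath driver.

```python
def failback_target(current_owner, preferred_owner, preferred_paths_ok,
                    auto_failback=True):
    """Return the controller that should own the logical drive."""
    if current_owner == preferred_owner:
        return current_owner  # nothing to fail back
    if auto_failback and preferred_paths_ok:
        return preferred_owner  # path restored: move the drive back
    return current_owner  # stay put (path still down, or failback disabled)


print(failback_target("B", "A", preferred_paths_ok=True))
print(failback_target("B", "A", preferred_paths_ok=True, auto_failback=False))
```

With automatic failback disabled, as might be done in a cluster, the logical drive stays on controller B until it is moved back explicitly.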