
Remote device connections verification

In document Practical Guide for SAN with pseries (Page 159-162)

Chapter 4. Problem determination guide

4.4 Checking the pSeries server

4.4.2 Remote device connections verification

At this point, we are reasonably confident that no problems exist with the HBA, its associated drivers, and the SCSI I/O protocol devices. The next series of actions further refines the scope and nature of the problem by systematically checking all storage resources that are supposed to be available to the pSeries server, as shown in Figure 4-14 on page 142, Map 6. To complete this check, however, we make several assumptions:

• All storage resources that should be available are known.

• The problem is not due to the addition or modification of storage resources.

• If a redundant path application (such as an SDD) is installed, the application was properly installed and previously working correctly.

To determine whether resources are missing, we must have a good idea of what is supposed to be available. If resource allocations have recently been added or modified, a configuration error has most likely been introduced, and troubleshooting can proceed directly to the area where the modifications were made. For example, a change in LUN masking in the IBM 2105 ESS will quickly be noticed, and the masking changes can then be carefully reviewed.

If a redundant path application, such as an SDD, is in use and having problems, the methodology and troubleshooting flows will naturally progress to checking that area of operations. As discussed earlier, the scope of the problem is determined methodically, and any missing resource is a crucial data item. Even if our assumptions are not valid, the methods presented here will eventually reach the items on which those assumptions are based.

Figure 4-14 Map 6 - Advanced troubleshooting (continued)

Step 1 for checking remote device connections uses the command:

lsdev -C | grep hdisk

This command lists the logical hard disks (hdisks) that are associated with any Fibre Channel adapters in the pSeries server. If there are no disk devices in the server configuration, skip this step and proceed to Step 2. Otherwise, the result should be similar to Example 4-5. The key item to monitor is the status of each hdisk. A known hdisk has an Available status, while a problem with an hdisk results in the Defined status. Care must be taken to not confuse disk subsystems that

Example 4-5 Sample of available hdisks

# lsdev -C | grep hdisk | pg

hdisk0   Available 10-88-00-8,0 16 Bit LVD SCSI Disk Drive
hdisk1   Available 10-78-01     IBM FC 2105E20

hdisk236 Available 10-78-01     EMC Symmetrix FCP Raid1
hdisk237 Available 10-78-01     EMC Symmetrix FCP Raid1

If any disk resources are missing, or listed with a Defined status, this information is a clue about the problem and its scope. The next action is to consult the SAN documentation to determine whether the missing disks all reside in the same physical disk storage unit.
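The check described above can be sketched as a small filter. This is a minimal illustration, not part of the original guide: it assumes lsdev -C output in the four-column layout of Example 4-5, and the function name check_hdisks and the list of expected disks are hypothetical stand-ins for a site's own storage inventory.

```shell
#!/bin/sh
# Sketch: report hdisks that are missing or not in the Available state.
# Reads `lsdev -C`-style lines (name, status, location, description) on stdin.
# ASSUMPTION: the arguments are a hypothetical list of hdisk names that the
# site inventory says should be present.

check_hdisks() {
    awk -v expected="$*" '
        # Record the status of every hdisk line; print any that is not Available.
        $1 ~ /^hdisk[0-9]+$/ {
            status[$1] = $2
            if ($2 != "Available") print $1 " " $2
        }
        # After all input is read, report expected disks that never appeared.
        END {
            n = split(expected, want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in status)) print want[i] " Missing"
        }
    '
}

# On an AIX host this could be run as:
#   lsdev -C | check_hdisks hdisk0 hdisk1 hdisk236 hdisk237
```

An empty result suggests that every expected hdisk is listed and Available. Any line of output names a disk to look up in the SAN documentation.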

Situations involving a missing subset of disks in a single device indicate a configuration issue within the server or the storage device itself. Thus, the scope of the problem has been localized to just two prime candidates for further investigation. In situations involving more than one physical disk subsystem with missing resources, the server requires additional troubleshooting in its various configurations. If none of the logical disks from a given storage subsystem are available, then the server or fabric is the possible cause.
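The scoping rules above can be approximated with a per-subsystem summary. The following is a rough sketch under stated assumptions, not a method from the original guide: it uses the description text of each hdisk as a stand-in for the storage subsystem, which identifies a device type rather than a specific physical unit, and it only sees disks that lsdev lists (disks absent from the listing still require the SAN documentation). The function name and the scope labels are hypothetical.

```shell
#!/bin/sh
# Sketch: summarize hdisk availability per device description.
# Reads `lsdev -C`-style lines on stdin and prints one line per description:
#   description|unavailable/total|scope
# ASSUMPTION: fields 4..NF (the description) approximate the storage
# subsystem, and every hdisk line carries a location code in field 3.

summarize_by_subsystem() {
    awk '
        $1 ~ /^hdisk[0-9]+$/ {
            # Rebuild the multi-word description from field 4 onward.
            desc = $4
            for (i = 5; i <= NF; i++) desc = desc " " $i
            total[desc]++
            if ($2 != "Available") down[desc]++
        }
        END {
            for (d in total) {
                bad = down[d] + 0
                if (bad == 0)             scope = "ok"
                else if (bad == total[d]) scope = "all-unavailable"
                else                      scope = "partial"
                printf "%s|%d/%d|%s\n", d, bad, total[d], scope
            }
        }
    ' | sort
}

# On an AIX host this could be run as:
#   lsdev -C | summarize_by_subsystem
```

Read against the text above, a "partial" line points toward the server or storage device configuration, an "all-unavailable" line points toward the server or fabric, and several affected descriptions at once point back toward the server.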

After checking the hdisks available to the pSeries server, we proceed to Step 2 of Map 6. If tape systems have never been assigned to the pSeries server, this step is omitted. To check installed tape devices, issue the lsdev -Cc tape command.

Check the listing of devices to be sure that the tape units supposedly connected to the server do appear in the resulting output of the command (refer to Example 4-6).

Example 4-6 Sample listing of available tape drive units

# lsdev -Cc tape

rmt0 Available 10-70-01 IBM 3590 Tape Drive and Medium Changer (FCP)
rmt1 Available 10-70-01 IBM 3590 Tape Drive and Medium Changer (FCP)
rmt2 Available 10-70-01 IBM 3590 Tape Drive and Medium Changer (FCP)
rmt3 Available 10-78-01 IBM 3590 Tape Drive and Medium Changer (FCP)
rmt4 Available 10-78-01 IBM 3590 Tape Drive and Medium Changer (FCP)
rmt5 Available 10-78-01 IBM 3590 Tape Drive and Medium Changer (FCP)

To compare the WWN of the attached tape unit with what is expected, issue the tapeutil -f /dev/rmt# qrypath command (where rmt# designates the device in question). A valid response contains data indicating that the device is a 3590, along with its WWN.
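When several tape units are attached, the per-device query can be generated from the listing. The sketch below is illustrative only: the function names are hypothetical, the input is assumed to follow the layout of Example 4-6, and the commands are printed for review rather than executed, so the comparison of each reported WWN with the expected value remains a manual step.

```shell
#!/bin/sh
# Sketch: list Available tape devices and build the qrypath command for each.
# Reads `lsdev -Cc tape`-style lines on stdin.

available_tapes() {
    # Print the names of rmt devices whose status is Available.
    awk '$1 ~ /^rmt[0-9]+$/ && $2 == "Available" { print $1 }'
}

qrypath_commands() {
    # Dry run: print, rather than execute, one tapeutil command per device.
    while read -r dev; do
        printf 'tapeutil -f /dev/%s qrypath\n' "$dev"
    done
}

# On an AIX host with tapeutil installed, this could be used as:
#   lsdev -Cc tape | available_tapes | qrypath_commands        # review
#   lsdev -Cc tape | available_tapes | qrypath_commands | sh   # execute
```

Printing the commands first keeps the step easy to audit before anything is run against the tape drives.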

Now that the remote device connections have been investigated, the reader should have a much better idea of where to investigate next. In situations that still indicate the server as a possible cause, the next logical move is to investigate the SDD.

One main concept of the troubleshooting process is that the more information is gathered about a problem, the better the problem and its cause are understood. With this improved understanding, the reader can make more efficient decisions about which portions of the problem determination process can be quickly verified, while the most likely locations of the problem receive detailed examination. The reader should understand that we have provided a basic blueprint of sorts that can and should remain flexible as more data about the problem is collected.
