Chapter 1. Components of the logical volume manager
2.4 Mirroring of the rootvg
This section explains in detail why mirroring the rootvg receives special treatment in AIX system management.
                         Add a Logical Volume

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

[TOP]                                              [Entry Fields]
  Logical volume NAME                              [mirror2lv]
* VOLUME GROUP name                                 mirrorvg
* Number of LOGICAL PARTITIONS                     [10]            #
  PHYSICAL VOLUME names                            []              +
  Logical volume TYPE                              []
  POSITION on physical volume                       middle         +
  RANGE of physical volumes                         minimum        +
  MAXIMUM NUMBER of PHYSICAL VOLUMES               []              #
    to use for allocation
  Number of COPIES of each logical                  2              +
    partition
  Mirror Write Consistency?                         yes            +
  Allocate each logical partition copy              yes            +
    on a SEPARATE physical volume?
  RELOCATE the logical volume during                yes            +
    reorganization?
  Logical volume LABEL                             []
  MAXIMUM NUMBER of LOGICAL PARTITIONS             [512]           #
  Enable BAD BLOCK relocation?                      yes            +
  SCHEDULING POLICY for reading/writing             parallel       +
    logical partition copies
  Enable WRITE VERIFY?                              no             +
  File containing ALLOCATION MAP                   []
  Stripe Size?                                     [Not Striped]   +
[BOTTOM]
2.4.1 Brief explanation about the AIX boot sequence
Before describing the AIX boot sequence, several key terms are defined here:

blv       The abbreviation for boot logical volume, also known as hd5. It contains the minimal file system needed to begin booting the system. This file system is held in compressed format, and only a limited number of commands, such as savebase and bosboot, can manipulate it. An important part of this minimal file system is the mini-ODM.

bootrec   The first disk block on the boot disk, which the Initial Program Load Read Only Storage (IPL ROS) reads to find the reference to the blv. It is defined in /usr/include/sys/hd_psn.h as follows:

root@pukupuku:/ [320] # grep IPL /usr/include/sys/hd_psn.h
#define PSN_IPL_REC 0 /* PSN of the IPL record */

savebase  This command updates the mini-ODM that resides on the same disk as /dev/ipldevice.
The relationship between the bootrec, the blv, and the mini-ODM is illustrated in Figure 44.
Figure 44. Three important components for booting rootvg
(Figure labels: bootrec; X bytes; hd5; hdisk0; mini-ODM; savebase will update the mini-ODM.)
The bootrec is read by the IPL ROS code. It tells the ROS that it needs to jump X bytes into the disk platter to read the boot logical volume, hd5.
This is the first reason why the rootvg mirroring requires special treatment. The bootrec is not synchronized by AIX LVM, since it is not a logical volume.
The processor starts reading in the mini file system to start the AIX system boot. Note that this is a very simplified view of the boot process, but enough for a basic understanding of the upcoming conflicts.
While the mini-AIX file system is being processed, a mini-ODM is read into RAM. When the real rootvg file system comes online, AIX merges the data held in the mini-ODM with the real ODM held in /etc/objrepos in the root file system.
Whenever an LVM command that changes the mini-ODM completes successfully, a command called savebase is run (represented as a) in Figure 44 on page 133). savebase takes a snapshot of the main ODM, picks out only what is needed for boot, such as the LVM information concerning the logical volumes in rootvg, and compresses it. It then takes this compressed boot image and looks at /dev/ipldevice to find out which disk it represents. On most systems, /dev/ipldevice is a hard link to /dev/rhdisk0, the raw device of hdisk0 (which contains the rootvg volume group):
root@pukupuku:/ [323] # ls -l /dev/ipldevice /dev/*hdisk0
brw--- 1 root system 15, 1 Sep 03 07:50 /dev/hdisk0
crw--- 2 root system 15, 1 Apr 24 18:38 /dev/ipldevice
crw--- 2 root system 15, 1 Apr 24 18:38 /dev/rhdisk0

The /dev/ipldevice entry is needed to indicate which disk holds hd5.
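In the listing, /dev/ipldevice and /dev/rhdisk0 show the same major and minor numbers and a link count of 2, which is what two hard links to one device node look like. The following is a minimal sketch for checking this from a script. The function name same_inode is hypothetical (it is not an AIX command), and it assumes that both paths live on the same file system, since inode numbers are only unique within one file system.

```shell
#!/bin/sh
# same_inode (hypothetical helper): prints "same" when both paths share one
# inode number, that is, when they are hard links to the same file or
# device node, and "different" otherwise.
# Assumption: both paths are on the same file system (as entries in /dev are).
same_inode() {
    a=$(ls -id "$1" | awk '{ print $1 }')   # inode number of the first path
    b=$(ls -id "$2" | awk '{ print $1 }')   # inode number of the second path
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo same
    else
        echo different
    fi
}

# On an AIX system this might be used as:
#   same_inode /dev/ipldevice /dev/rhdisk0
```

If the result is "different", /dev/ipldevice refers to some other disk (or is missing), and it is worth checking which disk really holds hd5 before relying on savebase.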
2.4.2 Contiguity of the boot logical volume
As described earlier, the blv (boot logical volume, or hd5) is read directly by the IPL ROS code during the boot phase. The IPL ROS code completely bypasses the LVM access routines and reads the blv sequentially, assuming that all of this code is stored in one contiguous area. As the IPL code grows, one partition may no longer be enough. If you add a new partition to the blv, you must make sure that the second partition is adjacent to the first one. The IPL code cannot jump from partition 1 to partition 10, for example.
To confirm that the partitions of the blv are contiguous, use the following command:
root@pukupuku:/ [128] # lslv -m hd5
hd5:N/A
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0001 hdisk0
In our case, the blv is made of one logical partition and resides in the first physical partition on hdisk0. If this output shows multiple lines (meaning that the blv is composed of multiple logical partitions), you may have to reorganize it so that it occupies contiguous physical partitions.
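When the blv has several logical partitions, the check can be automated. The following is a minimal sketch, not an official AIX tool: the function name blv_is_contiguous is hypothetical, and it assumes the lslv -m layout shown above (one title line, one column-header line, then one line per logical partition carrying the LP number followed by PP/PV pairs). For each mirror copy it verifies that the physical partition numbers increase by one and stay on the same physical volume.

```shell
#!/bin/sh
# blv_is_contiguous (hypothetical helper): reads `lslv -m <lv>` output on
# stdin and prints "contiguous" or "not contiguous".
# Assumption: the first two lines are the "<lv>:<mount>" title and the
# column header; every later line is "LP PP1 PV1 [PP2 PV2 [PP3 PV3]]".
blv_is_contiguous() {
    awk '
        NR <= 2 { next }        # skip the title and the column header
        NF < 3  { next }        # ignore blank or incomplete lines
        {
            # Walk the (PP, PV) pairs: fields 2-3, 4-5 and 6-7.
            for (i = 2; i + 1 <= NF; i += 2) {
                pp = $i + 0
                pv = $(i + 1)
                if (seen[i] && (pv != last_pv[i] || pp != last_pp[i] + 1))
                    bad = 1
                seen[i] = 1
                last_pp[i] = pp
                last_pv[i] = pv
            }
        }
        END {
            print (bad ? "not contiguous" : "contiguous")
            exit (bad ? 1 : 0)
        }
    '
}

# On an AIX system this might be used as:
#   lslv -m hd5 | blv_is_contiguous
```

The function also returns a non-zero exit status when a gap is found, so it can be used directly in an if statement.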
2.4.3 Dump device
The dump device is a raw logical volume (usually with sysdump as its logical volume type) to which the AIX kernel writes its memory image (also known as a core image) when the system crashes.
In AIX, the dump device should reside in rootvg, and it is set to /dev/hd6 by default. /dev/hd6 is also the initial paging device. After a system crash, the boot procedure copies the dump image during the reboot from the dump device to the /var/adm/ras directory as a file named vmcore, to prevent the VMM (virtual memory manager) from overwriting its space. For further details about the system dump, consult the AIX Version 4.3 Problem Solving Guide and Reference, SC23-4123.
In this section, the following assumptions are made (these assumptions are reasonable from a system management perspective):
• The rootvg is mirrored using two physical volumes (hdisk0 and hdisk1) of the same size.
• You have made /dev/hd7 the primary dump device on hdisk0, and /dev/sysdumpnull the secondary dump device, using the following commands:
# sysdumpdev -p /dev/hd7 -P
# sysdumpdev -s /dev/sysdumpnull -P
# sysdumpdev -K (enables the user-initiated system dump)
# shutdown -Fr (reboots so that the above changes take effect)
Then, you will see the following system dump configuration:
root@pukupuku:/ [336] # sysdumpdev -l
primary              /dev/hd7
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE
dump compression     OFF
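A script that depends on the assumptions above can confirm the primary dump device before proceeding. The following is a minimal sketch: the function name primary_dump_device is hypothetical, and it assumes the sysdumpdev -l layout shown above, in which the line for the primary device starts with the word primary followed by the device path.

```shell
#!/bin/sh
# primary_dump_device (hypothetical helper): reads `sysdumpdev -l` output on
# stdin and prints the path of the primary dump device.
# Assumption: that line has the form "primary <device path>".
primary_dump_device() {
    awk '$1 == "primary" { print $2 }'
}

# On an AIX system this might be used as:
#   [ "$(sysdumpdev -l | primary_dump_device)" = "/dev/hd7" ] || echo "unexpected dump device"
```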
Before AIX Version 4.3.3, there was no complete support for a mirrored dump device. When the system dumped, the dump image was written to the first mirror copy only, bypassing the LVM device driver, and the other copies were not marked stale. The dump image was certainly written to the first mirror copy, but when it was read back, LVM simply returned a randomly selected mirror copy for each logical partition, so the data came back in a corrupted state.
Actually, these work-arounds have been performed automatically by the mirrorvg command since AIX Version 4.2.1. Before this release, no mirrorvg command was provided, so you had to follow the official step-by-step method shown in Appendix A, “Mirroring of the rootvg” on page 353.
With AIX Version 4.3.3, you can use the readlvcopy command to read an individual mirror copy of any logical volume. Hence, readlvcopy is now used in the snap command to copy the dump data from the first mirror copy only, and thus obtain the original dump data without corruption.
Chapter 3. Striping
This chapter describes the striping function included in the AIX Logical Volume Manager. We will see the concepts and study real examples. The mirroring and striping function, introduced in AIX Version 4.3.3, is also described. The performance issues of using striped logical volumes are covered in Chapter 7, “Performance” on page 303.