Chapter 2. Performance tuning getting started

2.3 Memory performance overview

Memory in AIX is handled by the Virtual Memory Manager (VMM). The Virtual Memory Manager is a facility that makes real memory appear larger than its physical size. The virtual memory system is composed of real memory plus physical disk space where portions of memory that are not currently in use are stored.

The physical part of the virtual memory is divided into three types of segments that reflect where the data is stored. This is symbolized in Figure 5.

Figure 5. VMM segments from a client perspective

The three types of segments are as follows:

• Persistent segment

A persistent segment persists after use by a process and has (and uses) permanent storage locations on disks. Files containing data or executable programs are mapped to persistent segments. AIX accesses all files as mapped files. This means that a program or a file access starts with only a few initial disk pages, which are copied into virtual storage segments. Further pages are page-faulted in on demand.

• Working segment

A working segment is transitory and only exists during use by the owning process. It has no permanent disk storage location and therefore is stored to paging space if free page frames in real memory are needed. For example, kernel text segments and the process stack are mapped onto working segments.

• Client segment

A client segment is where the pages are brought in by CDRFS, NFS, or any other remote file system.

A process can use all of these segments, but from a process perspective, the VMM is logically divided into:

• Code and Data segments

The code segment is the executable. This could be placed in a persistent (local) or a client (remote executable) segment. The data segment is data needed for the execution, for example, the process environment.

• Private and Shared segments

The private segment can be a working segment containing data for the process, for example, global variables, allocated memory, and the stack. Segments can also be shared among processes, for example, processes can share code segments, yet have private data segments.

Figure 6. VMM segments from a process perspective

From a process point of view, the memory is further divided into 16 segments, each pointed to by a segment register. These segment registers are hardware registers located on the processor. When a process is active, the registers contain the addresses of the 16 segments addressable by that process. Each segment contains a specific set of information, as shown in Figure 7.

Figure 7. VMM memory registers
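As a rough illustration of how 16 segment registers partition a process address space, the following Python sketch splits an effective address into a segment register number and an offset within that segment. It assumes 32-bit effective addresses, in which case the top 4 bits select one of the 16 registers and each segment spans 256 MB. The function and constant names are hypothetical and do not correspond to any AIX interface.

```python
SEGMENT_BITS = 4                    # 2**4 = 16 segment registers
OFFSET_BITS = 32 - SEGMENT_BITS     # assumption: 32-bit effective addresses


def split_effective_address(addr):
    """Return (segment_register, offset_in_segment) for a 32-bit address."""
    if not 0 <= addr < 2**32:
        raise ValueError("expected a 32-bit address")
    segment_register = addr >> OFFSET_BITS          # top 4 bits
    offset = addr & ((1 << OFFSET_BITS) - 1)        # low 28 bits
    return segment_register, offset


# Example: an address whose top hex digit is 2 falls in segment register 2.
register, offset = split_effective_address(0x2FF22FF0)
print(register, hex(offset))
```

Under this assumption the first hexadecimal digit of an address identifies the segment register directly, which is convenient when reading addresses in a debugger.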

Each segment is further divided into 4096-byte pages of information. Each page sits on a 4 KB partition of the disk known as a slot. The VMM is responsible for allocating real memory page frames and resolving references to pages that are not currently in memory. In other words, when the system needs to reference a page that is not currently in memory, the VMM is responsible for finding the page on disk and resolving the reference. The VMM maintains a list of free page frames that is used to accommodate pages that must be brought into memory. In memory-constrained environments, the VMM must occasionally replenish the free list by moving some of the current data out of real memory. This is called page stealing. A page fault is a request to load a 4 KB data page from disk. A number of places are searched in order to find data.

First, the data and instruction caches are searched. Next, the Translation Lookaside Buffer (TLB) is searched. This is an index of recently used virtual addresses with their page frame IDs. If the data is not in the TLB, the Page Frame Table (PFT) is consulted. This is an index for all real memory pages, and this index is held in pinned memory. The table is large; therefore, there are indexes to this index. The Hash Anchor Table (HAT) links pages of related segments, in order to get a faster entry point to the main PFT.
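The search order described above can be sketched as a toy model in Python. This is only an illustration of the idea (a small cache of recent translations in front of a hashed table of resident pages); the class, its dictionary and list layout, and the modulo hash are assumptions made for the example and do not reflect the actual hardware or kernel data structures.

```python
PAGE_SIZE = 4096  # bytes per page, as stated in the text


class ToyTranslator:
    """Toy model: a TLB in front of a page frame table reached via hash anchors."""

    def __init__(self, hash_buckets=8):
        self.tlb = {}                                   # virtual page -> frame
        self.hat = [[] for _ in range(hash_buckets)]    # anchor -> PFT entries

    def map_page(self, vpage, frame):
        """Record that virtual page `vpage` is resident in real frame `frame`."""
        self.hat[vpage % len(self.hat)].append((vpage, frame))

    def translate(self, vaddr):
        """Return (frame, offset, where_found); frame is None on a page fault."""
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        if vpage in self.tlb:                           # recently used translation
            return self.tlb[vpage], offset, "TLB"
        for entry_vpage, frame in self.hat[vpage % len(self.hat)]:
            if entry_vpage == vpage:                    # found via the hash anchor
                self.tlb[vpage] = frame                 # remember for next time
                return frame, offset, "PFT"
        return None, offset, "page fault"               # page is not in real memory


translator = ToyTranslator()
translator.map_page(5, 42)
print(translator.translate(5 * PAGE_SIZE + 10))   # first lookup walks the table
print(translator.translate(5 * PAGE_SIZE + 10))   # second lookup hits the TLB
print(translator.translate(9 * PAGE_SIZE))        # unmapped page: page fault
```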

To the page stealer, memory is divided into Computational memory and File memory.

• Computational memory consists of pages that belong to the working segment or program text segment.

• File memory consists of the remaining pages. These are usually pages from the permanent data file in persistent memory.

The page stealer tries to balance these two types of memory usage when stealing pages. The page replacement algorithm can be manipulated. When starting a process, a slot is assigned, and when a process references a virtual memory page that is on the disk, the referenced page must be paged in and probably one or more pages must be paged out, creating I/O traffic and delaying the start up of the process. AIX attempts to steal real memory pages that are unlikely to be referenced in the near future, using a page replacement algorithm. If the system has too little memory, no RAM pages are good candidates to be paged out, as they will be reused in the near future. When this happens, continuous pagein and pageout occurs. This condition is called thrashing.
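The effect of thrashing can be demonstrated with a small simulation. The sketch below counts page faults for a process that cycles over a fixed set of pages, using a least-recently-used (LRU) policy as a stand-in; LRU is chosen only for simplicity and is not claimed to be the algorithm AIX uses. The function name and the reference string are made up for the example.

```python
from collections import OrderedDict


def count_page_faults(references, frames):
    """Count faults for a page reference string under LRU with `frames` frames."""
    resident = OrderedDict()                 # resident pages, least recent first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)       # mark as most recently used
            continue
        faults += 1                          # page must be brought in
        if len(resident) >= frames:
            resident.popitem(last=False)     # steal the least recently used page
        resident[page] = None
    return faults


# A process that loops over a working set of 5 pages, 20 times.
refs = [0, 1, 2, 3, 4] * 20
for frames in (5, 4):
    print(frames, "frames:", count_page_faults(refs, frames), "faults")
```

With five frames only the first touch of each page faults (5 faults). With four frames every stolen page is exactly the one needed next, so all 100 references fault, which is the continuous page-in and page-out pattern the text calls thrashing.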

When discussing memory, the allocation algorithm is commonly mentioned. The following is a discussion from System Management Concepts: Operating System and Devices, SC23-4311, on the allocation algorithm:

The operating system uses the PSALLOC environment variable to determine the mechanism used for memory and paging space allocation. If the PSALLOC environment variable is not set, is set to null, or is set to any value other than early, the system uses the default late allocation algorithm.

The late allocation algorithm does not reserve paging space when a memory request is made; it approves the request and assigns paging space when pages are touched. Some programs allocate large amounts of virtual memory and then use only a fraction of the memory. Examples of such programs are technical applications that use sparse vectors or matrices as data structures. The late allocation algorithm is also more efficient for a real-time, demand-paged kernel such as the one in the operating system.

For Version 4.3.2 and later, the late allocation algorithm is modified to further delay the allocation of paging space. As mentioned previously, before Version 4.3.2, paging space was allocated when a page was touched. However, this paging space may never be used, especially on systems with large real memory where paging is rare. Therefore, the allocation of paging space is delayed until it is necessary to page out the page, which results in no wasted paging space allocation. This does result, however, in additional overcommitment of paging space. On a system where enough virtual memory is accessed that paging is necessary, the amount of paging space required may be as much as was required on previous releases.

It is possible to overcommit resources when using the late allocation algorithm for paging space allocation. In this case, when one process gets the resource before another, a failure results. The operating system attempts to avoid complete system failure by killing processes affected by the resource overcommitment. The SIGDANGER signal is sent to notify processes that the amount of free paging space is low. If the paging space situation reaches an even more critical state, selected processes that did not receive the SIGDANGER signal are sent a SIGKILL signal.

The user can use the PSALLOC environment variable to switch to an early allocation algorithm for memory and paging space allocation. The early allocation mechanism allocates paging space for the executing process at the time the memory is requested. If there is insufficient paging space available at the time of the request, the early allocation mechanism fails the memory request.

The new paging space allocation algorithm introduced with Version 4.3.2 is also named Deferred Page Space Allocation (DPSA). After a page has been paged out to paging space, the disk block is reserved for that page if that page is paged back into RAM. Therefore, the paging space percentage-used value may not necessarily reflect the number of pages only in the paging space, because some of them may be back in the RAM. If the page that was paged back in is the working storage of a thread, and if the thread releases the memory associated with that page or if the thread exits, then the disk block for that page is released. This affects the output for the ps command and the svmon command on Version 4.3.3. For more information on the differences between Version 4.3.2 and Version 4.3.3, refer to Commands Reference - Volume 5, SBOF-1877 and System Management Concepts: Operating System and Devices, SC23-4311.
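The three allocation policies discussed above differ mainly in the moment at which paging space is charged: at the memory request (early), when a page is first touched (late, before Version 4.3.2), or when a page actually has to be paged out (deferred, Version 4.3.2 and later). The following Python sketch is a simplified accounting model of that difference; the event names and the function are hypothetical, and the model ignores the release of disk blocks.

```python
def paging_space_used(events, policy):
    """Return the paging-space pages allocated after `events` under `policy`.

    events: list of (action, pages), action in {"request", "touch", "pageout"}.
    policy: "early", "late" (before 4.3.2), or "deferred" (4.3.2 and later).
    """
    # Each policy allocates paging space on a different kind of event.
    trigger = {"early": "request", "late": "touch", "deferred": "pageout"}[policy]
    return sum(pages for action, pages in events if action == trigger)


# A program requests 1000 pages, touches 200 of them, and 50 get paged out.
events = [("request", 1000), ("touch", 200), ("pageout", 50)]
for policy in ("early", "late", "deferred"):
    print(policy, paging_space_used(events, policy))
```

For a sparse workload like this one, the later the allocation happens, the less paging space is consumed, at the cost of the overcommitment risk described in the quoted text.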

When working with memory performance tuning, the first command to use is usually vmstat.

The vmstat command

The vmstat command summarizes the total active virtual memory used by all of the processes running on the system, as well as the number of real-memory page frames on the free list. Active virtual memory is defined as the number of virtual-memory working segment pages that actually have been touched. This number can be larger than the number of real page frames in the machine, because some of the active virtual-memory pages may have been written out to paging space.

When determining if a system is short on memory or if some memory tuning is required, use the vmstat command over a set interval and examine the pi and po columns on the resulting report. These columns indicate the number of paging space page-ins per second and the number of paging space page-outs per second. If the values are constantly non-zero, there may be a memory bottleneck. Occasional non-zero values are not a concern, because paging is the main principle of virtual memory.

From a VMM tuning perspective, the middle (highlighted) columns are the most interesting. They provide information about the use of virtual and real memory and information about page faults and paging activity.

# vmstat 2 4
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 16590 14475   0   0   0   0    0   0 101    9   8 50  0 50  0
 0  1 16590 14474   0   0   0   0    0   0 408 2232  48  0  0 99  0
 0  1 16590 14474   0   0   0   0    0   0 406   43  40  0  0 99  0
 0  1 16590 14474   0   0   0   0    0   0 405   91  39  0  0 99  0

The highlighted columns are described in Table 3.

Table 3. VMM related output from the vmstat command

Column  Description
avm     Active virtual pages.
fre     Size of the free list.
re      Pager input/output list.
pi      Pages paged in from paging space.
po      Pages paged out to paging space.
fr      Pages freed (page replacement).

For more information about the vmstat command, see Section 3.2, “The vmstat command” on page 60.
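The check on the pi and po columns can be automated. The following Python sketch parses a vmstat report like the sample in this section and flags a possible memory bottleneck only when every interval shows paging-space activity. Treating "constantly non-zero" as "non-zero in every sampled interval" is an interpretation made for this example, and the function names are hypothetical.

```python
SAMPLE = """\
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 16590 14475   0   0   0   0    0   0 101    9   8 50  0 50  0
 0  1 16590 14474   0   0   0   0    0   0 408 2232  48  0  0 99  0
 0  1 16590 14474   0   0   0   0    0   0 406   43  40  0  0 99  0
 0  1 16590 14474   0   0   0   0    0   0 405   91  39  0  0 99  0
"""


def paging_activity(report):
    """Return a list of (pi, po) pairs, one per interval line of the report."""
    rows = [line.split() for line in report.splitlines() if line.strip()]
    # The column header is the line that names both pi and po.
    header_at = next(i for i, f in enumerate(rows) if "pi" in f and "po" in f)
    header = rows[header_at]
    pi_col, po_col = header.index("pi"), header.index("po")
    return [(int(f[pi_col]), int(f[po_col])) for f in rows[header_at + 1:]]


def looks_memory_constrained(samples):
    """True only if every interval shows paging-space page-ins or page-outs."""
    return bool(samples) and all(pi > 0 or po > 0 for pi, po in samples)


samples = paging_activity(SAMPLE)
print(samples)
print(looks_memory_constrained(samples))
```

For the sample report all pi and po values are zero, so the check reports no sign of a memory bottleneck.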

Another tool used in the initial phase of VMM tuning is the ps command.

The ps command

The ps command can also be used to monitor the memory usage of individual processes. The ps v PID command provides the most comprehensive report on memory-related statistics for an individual process, as discussed in Section 3.3, “The ps command” on page 68.

In the previous discussion, the paging space function of VMM was mentioned. The lsps command is a useful tool to check paging space utilization.

The lsps command

The lsps command displays the characteristics of paging spaces, such as the paging-space name, physical-volume name, volume-group name, size, percentage of the paging space used, whether the space is active or inactive, and whether the paging space is set to be automatically initiated at system boot. The following is an example of the lsps command using the -a flag. The -s flag is useful when a summary and total percentage used over several disks is required.

# lsps -a
Page Space  Physical Volume   Volume Group    Size   %Used  Active  Auto  Type
hd6         hdisk2            rootvg        1024MB       1     yes   yes    lv
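Because lsps reports usage as a percentage, it can be helpful to convert it to an absolute figure. The following Python sketch does this for output in the layout shown above; it assumes that the first five whitespace-separated fields of each data line are name, physical volume, volume group, size in MB, and percentage used, and the function name is hypothetical.

```python
SAMPLE = """\
Page Space  Physical Volume   Volume Group    Size   %Used  Active  Auto  Type
hd6         hdisk2            rootvg        1024MB       1     yes   yes    lv
"""


def paging_space_usage(report):
    """Return {name: (size_mb, used_mb)} from `lsps -a` style output."""
    usage = {}
    for line in report.strip().splitlines()[1:]:        # skip the header line
        name, _pv, _vg, size, pct_used = line.split()[:5]
        if not size.endswith("MB"):
            raise ValueError("unexpected size field: " + size)
        size_mb = int(size[:-2])                        # drop the "MB" suffix
        usage[name] = (size_mb, size_mb * int(pct_used) / 100)
    return usage


print(paging_space_usage(SAMPLE))
```

For the sample, 1 percent of a 1024 MB paging space corresponds to roughly 10 MB in use.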

When finding problems with memory usage, the svmon command provides a more detailed report on which processes are using which segments of memory.

The svmon command

The svmon command provides a more in-depth analysis of memory usage. It is more informative, but also more intrusive, than the vmstat and ps commands. The svmon command captures a snapshot of the current state of memory.

Table 3. VMM related output from the vmstat command (continued)

Column  Description
sr      Pages scanned by page-replacement algorithm.
cy      Clock cycles by page-replacement algorithm.

A large portion of real memory is utilized as a cache for file system data. It is not unusual for the size of the free list to remain small.

There are some significant changes in the flags and in the output from the svmon command between AIX Version 4.3.2 and Version 4.3.3. This is discussed in more detail in Section 3.5, “The svmon command” on page 77.

The command to use when tuning memory management is vmtune.

The vmtune command

The memory management algorithm tries to keep the size of the free list and the percentage of real memory occupied by persistent segment pages within specified bounds. These bounds can be altered with the vmtune command, which can only be run by the root user. Changes made by this tool remain in effect until the next system reboot. More information on the vmtune command can be found in Section 6.5, “The vmtune command” on page 193.

To test how much (or, perhaps, little) memory is needed for a certain server load, use the rmss command.

The rmss command

The rmss command simulates a system with various sizes of real memory, without having to extract and replace memory boards. By running an application at several memory sizes and collecting performance statistics, you can determine the memory needed to run an application with acceptable performance. The rmss command can be invoked for the following purposes:

• To change the memory size and then exit. This lets you experiment freely with a given memory size.

• To function as a driver program. In this mode, the rmss command executes a specified command multiple times over a range of memory sizes, and displays important statistics describing command performance at each memory size. The command can be an executable or shell script file, with or without command line arguments.
