Kernel Internals Student Notes

AIX 5L Kernel Internals

(Course Code BE0070XS)

Student Notebook

ERC 4.0

IBM Certified Course Material

eServer UNIX Technical Education


The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

© Copyright International Business Machines Corporation 2001, 2003. All rights reserved.

This document may not be reproduced in whole or in part without the prior written permission of IBM.

Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions

Trademarks

The reader should recognize that the following terms, which appear in the content of this training document, are official trademarks of IBM or other companies:

IBM® is a registered trademark of International Business Machines Corporation.

The following are trademarks or registered trademarks of International Business Machines Corporation in the United States, or other countries, or both:

AIX® AIX 5L™ AS/400® Chipkill™ DB2® DFS™ Electronic Service Agent™ IBM® iSeries™ LoadLeveler® NUMA-Q® PowerPC® pSeries™ PTX® RS/6000® S/370™ Sequent® SP™ zSeries™

ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel Corporation in the United States, other countries, or both.

Intel is a trademark of Intel Corporation in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a registered trademark of Linus Torvalds in the United States and other countries.

Other company, product and service names may be trademarks or service marks of others.

Contents

Trademarks

Course Description

Agenda

Unit 1. Introduction to the AIX 5L Kernel
  Unit Objectives
  Operating System and the Kernel
  Kernel Components
  Address Space
  Mode and Context
  Context Switches
  Interrupt Processing
  AIX 5L Kernel Characteristics
  AIX 5L Execution Environment
  System Header Files
  Conditional Compile Values
  Checkpoint
  Exercise
  Unit Summary

Unit 2. Kernel Analysis Tools
  Unit Objectives
  What tools will you be using in this class?
  The Major Functions of KDB are:
  Enabling the Kernel Debugger
  Verifying the Debugger is Enabled
  Starting the Debugger
  System Dumps
  kdb
  Checkpoint
  Exercise
  Unit Summary

Unit 3. Process Management
  Unit Objectives
  Parts of a Process
  Threads
  1:1 Thread Model
  M:1 Thread Model
  M:N Thread Model
  Creating Processes
  Creating Threads
  Process State Transitions
  The Process Table
  pvproc
  pv_stat
  Table Management
  Extending the pvproc
  PID Format
  Finding the Slot Number
  Kernel Processes
  Thread Table
  pvthread Elements
  TID Format
  u-block
  Six Structures
  Thread Scheduling Topics
  Thread State Transitions
  Thread Priority
  Run Queues
  Dispatcher and Scheduler Functions
  Dispatcher
  Scheduler
  Preemption
  Preemptive Kernels
  Scheduling Algorithms
  SMP - Multiple Run Queues
  NUMA
  Memory Affinity
  Global Run Queues
  Checkpoint
  Exercise
  Unit Summary

Unit 4. Addressing Memory
  Unit Objectives
  Memory Management Definitions
  Pages and Frames
  Address Space
  Translating Addresses
  Segments
  Segment Addressing
  32-bit Hardware Address Resolution
  64 Bit Hardware Address Resolution
  Segment Types
  Shared Memory
  shmat Memory Services
  Memory Mapped Files
  32-bit User Address Space
  32-bit Kernel Address Space
  64-bit User/Kernel Address Space
  Checkpoint
  Exercise
  Unit Summary

Unit 5. Memory Management
  Unit Objectives
  Virtual Memory Management (VMM)
  Object Types
  Demand Paging
  Data Structures
  Hardware Page Mapping
  Page not in Hardware Table
  Page on Paging Space
  External Page Table (XPT)
  Loading Pages From the File System
  Object Type / Backing Store
  Paging Space Management Process
  Paging Space Allocation Policy
  Free Memory
  Clock Hand Algorithm
  Fatal Memory Exceptions
  Checkpoint
  Exercise
  Unit Summary

Unit 6. Logical Partitioning
  Unit Objectives
  Partitioning
  Physical Partitioning
  Logical Partitioning
  Components Required for LPAR
  Operating System Interfaces
  Virtual Memory Manager
  Real Address Range
  Real Mode Memory
  Operating System Real Mode Issues
  Address Translation
  Allocating Physical Memory
  Partition Page Tables
  Translation Control Entries
  Hypervisor
  Dividing Physical Memory
  Checkpoint
  Unit Summary

Unit 7. LFS, VFS and LVM
  What is the Purpose of LFS/VFS?
  Kernel I/O Layers
  Major Data Structures
  Logical File System Structures
  User File Descriptor
  The file Structure
  vnode/vfs Interface
  vnode
  vfs
  root (/) and usr File Systems
  vmount
  File and File System Operations
  gfs
  vnodeops
  vfsops
  gnode
  kdb devsw Subcommand Output
  kdb volgrp Subcommand Output
  AIX lsvg Command Output
  kdb lvol Subcommand Output
  AIX lslv Command Output
  kdb pvol Subcommand Output
  AIX lspv Command Output
  Checkpoint (1 of 2)
  Checkpoint (2 of 2)
  Exercise
  Unit Summary

Unit 8. Journaled File System
  Unit Objectives
  JFS File System
  Reserved Inodes
  Disk Inode Structure
  In-core Inodes
  Direct (No Indirect Blocks)
  Single Indirect
  Double Indirect
  Checkpoint
  Unit Summary

Unit 9. Enhanced Journaled File System
  Unit Objectives
  Numbers
  Aggregate and Fileset
  Aggregate
  Allocation Group
  Fileset
  Inode Allocation Map
  Extents
  Increasing an Allocation
  Binary Tree of Extents
  Inodes
  Inline Data
  Binary Trees
  More Extents
  Continuing to Add Extents
  Another Split
  fsdb Utility
  Exercise
  Directory
  Directory Root Header
  Directory Slot Array
  Small Directory Example
  Adding a File
  Adding a Leaf Node
  Adding an Internal Node
  Checkpoint
  Exercise
  Unit Summary

Unit 10. Kernel Extensions
  Unit Objectives
  Kernel Extensions
  Relationship With the Kernel Nucleus
  Global Kernel Name Space
  Why Export Symbols?
  Kernel Libraries
  Configuration Routines
  Compiling and Linking Kernel Extensions
  How to Build a Dual Binary Extension
  Loading Extensions
  sysconfig() - Loading and Unloading
  sysconfig() - Configuration
  sysconfig() - Device Driver Configuration
  The loadext() Routine
  System Calls
  Sample System Call - Export/Import File
  Sample System Call - question.c
  Sample System Call - Makefile
  Argument Passing
  User Memory Access
  Checkpoint
  Exercise
  Unit Summary

Appendix A. Checkpoint Solutions

Appendix B. KI Crash Dump
  Unit Objectives
  Crash Dumps
  Process Flow
  About This Exercise


Course Description

AIX 5L Kernel Internals Concepts

Duration: 5 days

Purpose

This is a course in basic AIX 5L Kernel concepts. It is designed to provide background information useful to support engineers and AIX development/application engineers who are new to the AIX 5L Kernel environment as implemented in AIX releases 5.1 and 5.2. This course also provides background knowledge helpful for those planning to attend the AIX 5L Device Driver (Q1330) course.

Audience

— AIX technical support personnel

— Application developers who want to achieve a conceptual understanding of AIX 5L Kernel Internals

Prerequisites

Students are expected to have programming knowledge in the C programming language, working knowledge of AIX system calls, and user-level working knowledge of AIX/UNIX, including editors, shells, pipes, and Input/Output (I/O) redirection. In addition, basic system administration skills are required, such as using SMIT, configuring file systems, and configuring dump devices. These skills can be obtained by attending the following courses or through equivalent experience:

- Introduction to C Programming - AIX/UNIX (Q1070)
- AIX 5L System Administration II: Problem Determination (AU16/Q1316)

In addition, the following courses are helpful:

- KornShell Programming (AU23/Q1123)

Objectives

At the end of this course you will be able to:

- List the major features of the AIX 5L kernel
- Quickly traverse the system header files to find data structures
- Use the kdb command to examine data structures in the memory image of a running system or system dump
- Understand the structures used by the kernel to manage processes and threads, and the relationships between them
- Describe the layout of the segmented addressing model, and how logical to physical address translation is achieved
- Describe the operation of the VMM subsystem and the different paging algorithms
- Describe the mechanisms used to implement logical partitioning
- Understand the purpose of the logical file system and virtual file system layers and the data structures they use
- List and describe the components and function of the JFS2 and JFS file systems

Agenda

Day 1
  Welcome
  Unit 1 - Introduction to the AIX 5L Kernel lecture
  Exercise 1 - Introduction to the AIX 5L Kernel
  Unit 2 - Kernel Analysis Tools lecture
  Exercise 2 - Kernel Analysis Tools

Day 2
  Daily review
  Unit 3 - Process Management lecture
  Exercise 3 - Process Management
  Unit 4 - Addressing Memory lecture

Day 3
  Daily review
  Exercise 4 - Addressing Memory
  Unit 5 - Memory Management lecture
  Exercise 5 - Memory Management
  Unit 6 - Logical Partitioning lecture

Day 4
  Daily review
  Unit 7 - LFS, VFS and LVM lecture
  Exercise 6 - LFS, VFS and LVM
  Unit 8 - Journaled File System lecture
  Unit 9 - Enhanced Journaled File System - Topic 1 lecture
  Exercise 7 - Enhanced Journaled File System - Topic 1
  Unit 9 - Enhanced Journaled File System - Topic 2 lecture
  Exercise 8 - Enhanced Journaled File System - Topic 2

Day 5
  Daily review
  Unit 10 - Kernel Extensions lecture
  Exercise 9 - Kernel Extensions

Unit 1. Introduction to the AIX 5L Kernel

What This Unit Is About

This unit describes the purpose, concepts and features of the AIX 5L kernel.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe the role the kernel plays in an operating system

• Define user and kernel mode and list the operations that can only be performed in kernel mode

• Describe when the kernel must make a context switch

• Describe the role of the mstsave area in a context switch

• Name the execution environments available on each of the platforms supported by AIX 5L

• Using the system header files, identify data element types for each of the available kernels in AIX 5L

How You Will Check Your Progress

Accountability:

• Exercises using your lab system

• Checkpoint activity

• Unit review

References

The Design of the UNIX Operating System, by Maurice J. Bach, ISBN: 0132017997

AIX Online Documentation:

Figure 1-1. Unit Objectives

Figure 1-2. Operating System and the Kernel

Notes:

Operating system

The principal purpose of the AIX operating system is to provide an environment where application programs can be executed. This mainly involves the management of hardware resources including memory, CPU and IO.

Kernel

The kernel is the base program of the operating system. It acts as intermediary between the application programs and the computer hardware. It provides the system call interface allowing programs to request use of the hardware. The kernel prioritizes these requests and manages the hardware through its hardware interface.

Visual: Operating System and the Kernel (processes, system call interface, kernel, hardware interface, tty, CPUs)

The kernel is the key program

The operating system is made up of many programs including the kernel. It is safe to say that the kernel is the most important part of the operating system; if the kernel is not running nothing else in the operating system can function. This class discusses the internal working of the kernel in the AIX 5L operating system.

Figure 1-3. Kernel Components

Notes:

Introduction

The kernel may be broken up into several sections based on the services provided to application programs. Each of these sections is discussed in this class. The kernel components are shown in the visual above.

Process management

The process management function of the kernel is responsible for the creation and termination of processes and threads, along with scheduling threads on CPUs.

Virtual memory management

The Virtual Memory Management (VMM) function of the kernel is responsible for managing all aspects of virtual and physical memory use by processes and the kernel. This includes allocating physical page frames to virtual pages, providing space for file system buffering, and keeping track of which process memory is resident in physical memory and which is stored on disk.

Visual: Kernel Components (applications in user space; in the kernel: process management, virtual memory management, file systems, buffered I/O, raw I/O, disk space management (LVM), I/O subsystem, device drivers; hardware: tty, disk, CPU)

I/O subsystem

Parts of the kernel that interact directly with I/O devices are called device drivers. Typically each type of device installed on the system will require its own device driver. Device drivers are covered in detail in a separate class on writing device drivers.

Disk space management

The management of disk space in AIX is handled by a layer above the disk device drivers. The Logical Volume Manager (LVM) provides the function of disk space management.

File system

AIX supports several types of file systems including JFS, JFS2, NFS and several CD-ROM file systems. The file system software interacts with the disk space management software. This class covers the JFS and JFS2 file systems.

Figure 1-4. Address Space

Notes:

Introduction

AIX implements a virtual memory system. Addresses referenced by a user program do not directly reference physical memory; instead they reference a virtual address.

Virtual address space

By using the concept of virtual memory, each process on the system can appear to have its own address space that is separate and isolated from other processes. A process’ address space contains both user- and kernel-memory addresses.

Memory management

Virtual addresses are mapped by the hardware to a physical memory address.

Translation tables are used by the hardware to map virtual to physical addresses. The address translation tables are controlled by the kernel. One set of address translation tables is kept for each process. To switch from one process’ address space to another, the kernel loads the appropriate address translation table into the hardware.

Visual: Address Space (three address spaces; user and kernel regions)

Figure 1-5. Mode and Context

Notes:

Introduction

Two key concepts of mode and environment are described in this section.

Mode

The computer hardware provides two modes of execution: a privileged kernel mode and a less-privileged user mode. Application programs must run in user mode and are therefore given limited access to the hardware. The kernel, as you would expect, runs in kernel mode. The following table compares these two modes.

User mode | Kernel mode
Memory access is limited to the user’s private memory. Kernel memory is not accessible. | Can access all memory on the system.
I/O instructions are blocked. | All I/O is performed in kernel mode.
Can’t modify hardware registers related to memory management. | Memory management registers may be modified.
 | Interrupts must be handled in kernel mode.

Visual: Mode and Environment

Mode | Process environment | Interrupt environment
User mode | Application code | Invalid combination - interrupts always run in kernel mode
Kernel mode | Kernel code (system call) | Kernel code (hardware interrupt)

Environment

The AIX kernel may execute in one of two environments: process environment or interrupt environment. In process environment, the kernel is running on behalf of a user process. This generally occurs when a user program makes a system call, although it is also possible to create a kernel-mode only process. When the kernel responds to an interrupt, it is running in the interrupt environment. In this context the kernel cannot access the user address space or any kernel data related to the user process that was running on the processor just before the interrupt occurred.

Figure 1-6. Context Switches

Notes:

Introduction

A context switch is the action of exchanging one thread of execution on a CPU for another.

Thread of execution

Threads of execution are simply logical paths through the instructions of a program. The AIX kernel manages many threads of execution by switching the CPUs between the different threads on the system.

Visual: Context Switches (Thread 1 and Thread 2 each have an mstsave holding the saved CPU registers, stack pointer, and instruction pointer; a context switch exchanges one thread for the other on the CPU)

Context switches

Context switches can occur at two points:

a. A hardware interrupt occurs.

b. Execution of the thread is blocked waiting for the completion of an event.

mstsave

The context of the running thread must be saved when a context switch occurs. This context includes information such as the values of the CPU registers, the instruction address register and stack pointer. This information is saved in a structure called the mstsave (machine state save) structure. Each thread of execution has an associated mstsave structure.

Restoring a context

When a thread is restored (switched in), the system register values stored in the mstsave of the thread are loaded into the CPU. The CPU then performs a branch instruction to the address of the saved instruction pointer.

Figure 1-7. Interrupt Processing

Notes:

Introduction

A hardware interrupt results in a temporary context switch. Each time an interrupt occurs, the current context of the processor must be saved so that processing can be continued after handling the interrupt.

mstsave pool

Interrupts can occur while the CPU is already processing an interrupt; therefore, multiple mstsave areas are needed to save the context of each interrupt. Each thread structure has its own mstsave structure, but an interrupt is a transient entity and has no thread structure of its own, so AIX keeps a pool of mstsave areas for interrupts to use.

Visual: Interrupt Processing (csa, the current save area, pointing to an unused mstsave where the next interrupt goes; mstsave areas for the high priority interrupt, the low priority interrupt, and the base interrupt level; threads)

csa pointer

Each processor has a pointer to the mstsave area it should use when an interrupt occurs. This pointer is called the current save area, or csa pointer.

Interrupt history

When AIX receives an interrupt of higher priority than the one it is currently handling, it must save the current state in a new mstsave area, linking the new save area to the previous one. This forms a history of interrupt processing.

Interrupt processing

Saving context

When an interrupt occurs, the steps AIX takes to save the currently running context are:

Step Action

1. Save the current context in the mstsave area pointed to by the CPU’s csa.

2. Get the next available mstsave area from the pool.

3. Link the just used mstsave to the new mstsave.

4. Update the CPU’s csa pointer to point to the new mstsave area.

Unwinding the interrupts

As the processing of each interrupt is completed, the chain of mstsave areas is unlinked, working backwards from the highest priority interrupt to the lowest and finally to the base-level mstsave. The last or base-level mstsave in the chain is the mstsave of the thread that was running when the first interrupt occurred. The steps to restore a context are shown in this table.

Step Action

1. If returning to the base interrupt level and the interrupt has made a thread runnable, invoke the dispatcher. The dispatcher will move the thread originally on the end of the MST chain back to the run queue, and place the best runnable thread at the end of the MST chain.

2. Return the current mstsave area to the pool.

3. Set the CPU’s csa pointer to the previous mstsave area.

4. Reload the registers from that mstsave area and resume processing the saved context.

Finding the current mstsave

The csa always points to an unused mstsave area. This mstsave will be used if a higher-priority interrupt occurs. The data in this mstsave will not be valid except for its pointer to the next mstsave in the chain. The last used mstsave area can be located by following the prev pointer from the mstsave pointed to by the csa.

Figure 1-8. AIX 5L Kernel Characteristics

Notes:

Introduction

AIX was the first mainstream UNIX operating system whose kernel implemented several important features: it is preemptable, its memory is pageable, and it is dynamically extensible.

Preemptable

Preemptable means that a thread running in kernel mode (running a system call, for example) can be interrupted by another, more important task. Preemption causes a context switch to another thread inside the kernel. Many other UNIX kernels will not allow preemption to occur when running in kernel mode. This can result in long delays in the processing of real-time threads. AIX improves real-time processing by allowing for preemption in kernel mode. As an example, the Linux kernels available when this course was written did not support preemption in kernel mode.

AIX 5L Kernel Characteristics

Preemptable kernel

Pageable kernel memory

Dynamically extensible kernel


Pageable

Not all of the kernel’s virtual memory space needs to be resident in physical memory at all times. Portions of the kernel memory may be paged out to disk when not needed. This allows for better utilization of physical memory. The ability to page kernel memory is a feature not found in all UNIX kernels. Most kernels support paging of the user virtual address space only. AIX supports paging both the user and kernel address spaces. As an example, the kernel memory of the Linux operating system is resident in physical memory at all times.

Pinning memory

Some areas of the kernel’s memory must stay resident, meaning they may not be paged to disk. Areas of memory that are not subject to paging are called pinned memory; for example, portions of device drivers must be pinned in memory.

Extensible

The AIX kernel is dynamically extensible. This means that not all the code required for the kernel needs to be included in a single binary (/unix). Portions of the kernel’s code can be loaded at runtime. Dynamically loaded modules are called kernel extensions. Kernel extensions typically add functionality that may not be needed by all systems. This keeps the kernel smaller and requires less memory. Kernel extensions can include:

- Device drivers

- Extended system calls

- File systems

Figure 1-9. AIX 5L Execution Environment

Notes:

Introduction

AIX 5L supports both 32-bit and 64-bit execution environments. On 32-bit hardware platforms only the 32-bit environment can be used, but on 64-bit platforms either can be used. The key to this 64-bit platform flexibility is that a 64-bit VMM (Virtual Memory Manager) is run in both cases, using left zero fill of addresses for the 32-bit kernel environment.

32-bit and 64-bit kernel

The primary advantage of the 64-bit kernel is the increased kernel address space. This allows systems to support increased workloads. However, there is an added cost to managing a 64-bit address space. Not all applications will require the increased address space of the 64-bit kernel. In these cases, a 32-bit kernel is provided.

Visual: AIX 5L Execution Environment

Hardware | Kernel | User applications
32-bit hardware | 32-bit kernel | 32-bit applications
64-bit hardware | 32-bit kernel | 32-bit and 64-bit applications
64-bit hardware | 64-bit kernel | 32-bit and 64-bit applications

Selecting a kernel

The file /unix is a link to the kernel image file that is loaded at boot time. Depending on the hardware type and kernel type (32-bit or 64-bit) the link will point to the appropriate file as shown in this table.

Hardware platform | Kernel type | Kernel file
32-bit or 64-bit | 32-bit | /usr/lib/boot/unix_mp, /usr/lib/boot/unix_up
64-bit | 64-bit | /usr/lib/boot/unix_64

User applications

Both 32-bit and 64-bit applications are supported when running on 64-bit hardware, regardless of the kernel that is running.

User commands

User level commands included with the AIX 5L operating system are designed to work with either the 32-bit or 64-bit kernel. However, some commands require both a 32-bit and a 64-bit version. These are typically commands that must work directly with the internal structures of the kernel. For these commands, the 32-bit version of the command will determine if a 32-bit or 64-bit kernel is running. If a 64-bit kernel is detected, then a 64-bit version of the command is started. The steps are shown in this table.

Step Action

1. 32-bit version of command is run by user.

2. The 32-bit command checks the kernel type (32- or 64-bit).

3. If a 64-bit kernel is detected, then the 64-bit version of the command is run. For example, under the initial release of AIX 5.1 the command vmstat would run the command vmstat64. In later versions of AIX 5.1, and in AIX 5.2, vmstat (along with other performance commands) uses a performance tools API.

Kernel extensions

Only 64-bit kernel extensions are supported under the 64-bit kernel. Only 32-bit kernel extensions are supported under the 32-bit kernel. All kernel extensions must be SMP safe. Earlier versions of AIX supported running non-SMP safe kernel extensions on SMP hardware using a mechanism called funneling. Funneling is not supported on the 64-bit AIX 5L kernel.

Figure 1-10. System Header Files

Notes:

Introduction

The system header files contain the definition of structures that are used by the AIX kernel. We will reference these files throughout this class, since they contain the C language definitions of the structures we will be describing.

Finding header files

The drawing above shows the location of the system header files.

Visual: System Header Files (directory tree from / (root) through usr and include to the subdirectories sys, jfs, and j2; header files shown: stdio.h, fcntl.h, mode.h, signal.h, dir.h, filsys.h, ino.h, inode.h, jfsmount.h, proc.h, thread.h, types.h, user.h, uthread.h, j2-btree.h, j2-dinode.h, j2-inode.h, j2-types.h)

Location of header files

The /usr/include directory contains several sub-directories containing header files. Some of the sub-directories are described in this table.

Header file directories | Description
/usr/include | General program header files
/usr/include/sys | Header files dealing directly with the operations of the system
/usr/include/jfs | Header files for the JFS file system

Figure 1-11. Conditional Compile Values

Notes:

Conditional compile values

Several conditional compiler directives are used in the system header files to select the platform and environment (32-bit or 64-bit kernel). This is because certain data types have different sizes depending on the execution environment (for example, 32-bit or 64-bit).

Example

Shown here is a portion of the definition of a struct thread. The compiler directive #ifndef __64BIT_KERNEL is used to create different definitions for the 32-bit and 64-bit kernels.

Conditional Compile Values

Value            Meaning
_POWER_MP        Code is being compiled for a multiprocessor machine. This value should always be used for 64-bit kernel extensions and device drivers.
_KERNSYS         Enable kernel symbols in header files. This value should always be used when compiling kernel code.
_KERNEL          Compiling kernel extension or device driver code. This value should always be used when compiling kernel code.
_64BIT_KERNEL    Code is being compiled for a 64-bit kernel.
_64BIT           Code is being compiled in 64-bit mode. This value is automatically defined by the compiler if the -q64 option is specified.

struct thread {
        /* identifier fields */
        tid_t           t_tid;          /* unique thread identifier */
        tid_t           t_vtid;         /* Virtual tid */

        /* related data structures */
        struct pvthread *t_pvthreadp;   /* my pvthread struct */
        struct proc     *t_procp;       /* owner process */
        struct t_uaddress {
                struct uthread  *uthreadp;  /* local data */
                struct user     *userp;     /* owner process' ublock (const)*/
        } t_uaddress;

        /* user addresses */
#ifndef __64BIT_KERNEL
        uint            t_ulock64;      /* high order 32-bits */
        uint            t_ulock;        /* user addr - lock or cv */
        uint            t_uchan64;      /* high order 32-bits */
        uint            t_uchan;        /* key of user addr */
        uint            t_userdata64;   /* high order 32-bits if 64-bit mode */
        int             t_userdata;     /* user-owned data */
        uint            t_cv64;         /* high order 32-bits if 64-bit mode */
        int             t_cv;           /* User condition variable */
        uint            t_stackp64;     /* high order 32-bits if 64bit mode */
        char            *t_stackp;      /* saved user stack pointer */
        uint            t_scp64;        /* high order 32-bits if 64bit mode */
        struct sigcontext *t_scp;       /* sigctx location in user space*/
#else
        long            t_ulock;        /* user addr - lock or cv */
        long            t_uchan;        /* key of user addr */
        long            t_userdata;     /* user-owned data */
        long            t_cv;           /* User condition variable */
        char            *t_stackp;      /* saved user stack pointer */
        struct sigcontext *t_scp;       /* sigctx location in user space*/
#endif
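To make the effect of the conditional concrete, the following minimal sketch mimics the same pattern in portable C. The macro DEMO_64BIT_KERNEL, struct demo_uaddr and demo_get_ulock() are hypothetical names chosen for illustration and are not part of the AIX headers. When the macro is not defined, a 64-bit user address is held as two 32-bit halves, as in the 32-bit kernel branch of struct thread above.

```c
#include <stdint.h>

/*
 * DEMO_64BIT_KERNEL is a hypothetical stand-in for the real kernel macro.
 * Compile with -DDEMO_64BIT_KERNEL to select the single-field layout.
 */
struct demo_uaddr {
#ifndef DEMO_64BIT_KERNEL
    uint32_t ulock64;   /* high-order 32 bits of the user address */
    uint32_t ulock;     /* low-order 32 bits of the user address */
#else
    uint64_t ulock;     /* full 64-bit user address */
#endif
};

/* Return the full 64-bit user address regardless of which layout is built. */
uint64_t demo_get_ulock(const struct demo_uaddr *u)
{
#ifndef DEMO_64BIT_KERNEL
    return ((uint64_t)u->ulock64 << 32) | u->ulock;
#else
    return u->ulock;
#endif
}
```

Code that goes through an accessor like this does not need to know which layout was compiled; only the structure definition and the accessor contain the conditional.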


Figure 1-12. Checkpoint BE0070XS4.0

Notes:

Checkpoint

1. The ______ is the base program of the operating system.

2. The processor runs interrupt routines in ______ mode.

3. The AIX kernel is _______, ________ and __________.

4. The 64-bit AIX kernel supports only _______ kernel extensions, and only runs on _______ hardware.

5. The 32-bit kernel supports 64-bit user applications when running on ________ hardware.


Figure 1-13. Exercise BE0070XS4.0

Notes:

Turn to your lab workbook and complete exercise one.

Exercise

Complete exercise one

Consists of theory and hands-on

Ask questions at any time

Activities are identified by a

What you will do:


Figure 1-14. Unit Summary BE0070XS4.0

Notes:

Unit Summary

Describe the role the kernel plays in an operating system

Define user and kernel mode and list the operations that can only be performed in kernel mode

Describe when the kernel must make a context switch

Describe the role of the mstsave area in a context switch

Name the execution environments available on each of the platforms supported by AIX 5L

Using the system header files, identify data element types for each of the available kernels in AIX 5L


Unit 2. Kernel Analysis Tools

What This Unit Is About

This unit describes the different tools that are available to debug the AIX 5L kernel.

What You Should Be Able to Do

After completing this unit, you should be able to:

• List the tools available for analyzing the AIX 5L kernel

• Use KDB to display and modify memory locations and interpret a stack trace

• Use basic kdb navigation to explore a crash dump and a live system

How You Will Check Your Progress

Accountability:

• Exercises using your lab system

References

AIX Documentation: Kernel Extensions and Device Support Programming Concepts


Figure 2-1. Unit Objectives BE0070XS4.0

Notes:

Unit Objectives

At the end of this unit you should be able to:

List the tools available for analyzing the AIX 5L kernel

Use KDB to display and modify memory locations and interpret a stack trace

Use basic kdb navigation to explore a crash dump and a live system


Figure 2-2. What tools will you be using in this class? BE0070XS4.0

Notes:

Kernel Analysis Tools

Several tools are available in AIX 5L to examine and debug the kernel. This table lists the primary tools we will be covering in this unit.

Typographic conventions

In this class an uppercase KDB will be used when referring to the kernel debugger, and lowercase kdb is used when referring to the image analysis command.

Description                                   Tool
Kernel debugger for live system debugging     KDB
Used for system image analysis                kdb


Figure 2-3. The Major Functions of KDB are: BE0070XS4.0

Notes:

Introduction

This section describes the kernel debugger available in AIX 5L.

Overview

The kernel debugger is built into the AIX 5L production kernel. For the debugger to be used it must be enabled prior to booting.

Interfacing with the debugger

Once started the kernel debugger is operated from a terminal connected to a native serial port of the system. The debugger cannot be operated from the LFT graphics display, or from a serial terminal connected via an 8-port or 128-port adapter.

The Major Functions of KDB are:

Set breakpoints within the kernel or kernel extensions

Execution control through various forms of step execution commands

Format display of selected kernel data structures

Display and modification of kernel data

Display and modification of kernel instructions

Modify the machine state through alteration of system registers


Concept

When KDB is invoked, it is the only running program until you exit the debugger. All processes are stopped and interrupts are disabled. The kernel debugger runs with its own Machine State Save Area (mst) and a special stack. In addition, the kernel debugger does not run operating system routines. Though this requires that kernel code be duplicated within the debugger, this means it is possible to set breakpoints anywhere within the kernel code. When exiting the kernel debugger, all processes continue to run unless the debugger was entered via a system halt.


Figure 2-4. Enabling the Kernel Debugger BE0070XS4.0

Notes:

Kernel flags

The kernel debugger feature is enabled by setting flags in the boot image prior to booting the kernel. After changing these flags you must create a new boot image and reboot the system to use this new image.

Building a new boot image

The bosboot command is used to build boot images. Arguments supplied to the bosboot command will set flags in the boot image causing the kernel debugger to be enabled or disabled. After the boot image has been built the system must be re-booted for the new options to take effect.

Enabling the Kernel Debugger

Perform these steps to enable the kernel debugger:

1. Set kernel boot flags (bosdebug -D)

2. Build a new boot image (bosboot -ad /dev/ipldevice)

3. Boot the new image (shutdown -Fr)


bosboot syntax

The syntax of the bosboot command is:

bosboot -a [-D | -I] -d device

Example

The following command will build a new boot image with the kernel debugger loaded:

# bosboot -a -D -d /dev/ipldevice

The system must be rebooted for the change to take effect.

bosdebug

Attributes in the SWservAt ODM database can be set so that bosboot will enable the kernel debugger regardless of the command line argument used when building the boot image. The bosdebug command is used to view or set these attributes. To view the setting of the debug flags in the ODM database use the command:

# bosdebug
Memory debugger         off
Memory sizes            0
Network memory sizes    0
Kernel debugger         on
Real Time Kernel        off

To set the kernel debugger attribute on use the command:

# bosdebug -D

To set the kernel debugger attribute off use the command:

# bosdebug -o

Note: The bosdebug command only sets attributes in the SWservAt ODM database. The bosboot command reads these values and sets up the boot image accordingly.

Argument     Description
-d device    Specifies the boot device. The current boot disk is represented by the device: /dev/ipldevice
-D           Loads the kernel debugger. The kernel debugger will not automatically be invoked when the system boots.
-I           Loads and invokes the kernel debugger. The kernel debugger will be invoked immediately on boot.


Figure 2-5. Verifying the Debugger is Enabled BE0070XS4.0

Notes:

Verifying the kernel debugger is enabled

Once the kernel is booted, you can use the following procedure to verify that the kernel debugger has been enabled.

Verifying the Debugger is Enabled

Step  Action
1     Start the kdb command:
      # kdb
2     View the dbg_avail memory flag:
      (0)> dw dbg_avail 1
      dbg_avail + 000000: 00000002
3     Compare the value of dbg_avail against the mask value in this table.

Mask          Description
0x00000000    Do invoke at bootup.
0x00000001    Don't invoke at boot, but debugger is still invokable.


Figure 2-6. Starting the Debugger BE0070XS4.0

Notes:

Invoke vs. load only

When the kernel debugger is configured to be invoked (the -I option) the debugger will start immediately after booting. If configured to be loaded but not invoked (the -D option) one of the conditions listed above must occur after the system is booted for the debugger to be started.

Starting the Debugger

From a native serial port, type the key sequence: Ctrl-\

From the LFT keyboard, type the key sequence: Ctrl-alt-Numpad4

A kernel extension or application makes a call to brkpoint()

A breakpoint previously set using the debugger has been reached


Figure 2-7. System Dumps BE0070XS4.0

Notes:

What is in a system dump

Typically, an AIX 5L dump includes all of the information needed to determine the nature of the problem. The dump contains:

- Operating system (kernel) code and data

- Some data from the current running application

- Most of the kernel extensions code and data

Paged memory

The dump facility cannot page in memory, so only what is currently in physical memory can be dumped. Normally this is not a problem since most of the kernel data structures are in memory. The process and thread tables are pinned, and the uthread and ublock structures of the running thread are pinned as well.

System Dumps

A dump image is not actually a full image of the system memory but a set of memory areas copied out by the dump routines.

What is in a system dump?

What is the effect of kernel paging?

What is the role of the Master Dump Table?

What tools are used to analyze system dumps?


The master dump table

The system dump function captures data areas by processing information returned by routines registered in the Master Dump Table. Kernel extensions can specify a routine to be called to include data in a system dump. On AIX 5.1 this is done with the dmp_add() kernel service; AIX 5.2 uses the dmp_ctl() kernel service. Kernel-specific areas to be included in the dump are pre-loaded at kernel initialization.

Analyzing dumps

System dumps can be examined using the kdb command.

Dump Creation Process

Introduction

This section describes the dump process.

Process overview

The following steps are used to write a dump to the dump device:

Step  Action
1.    Interrupts are disabled.
2.    0c9 or 0c2 is written to the LED display, if present.
3.    Header information about the dump is written to the dump device.
4.    The kernel steps through each entry in the Master Dump Table, calling each Component Dump routine twice:
      • Once to indicate that the kernel is starting to dump this component (1 is passed as a parameter).
      • Again to say that the dump process is complete (2 is passed as a parameter).
      After the first call to a Component Dump routine, the kernel processes the CDT that was returned. For each CDT entry, the kernel:
      • Checks every page in the identified data area to see if it is in memory or paged out
      • Builds a bitmap indicating each page's status
      • Writes a header, the bitmap, and those pages which are in memory to the dump device
5.    Once all dump routines have been called, the kernel enters an infinite loop, displaying 0c0 or flashing 888.
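The per-entry work in step 4 (check each page, build a bitmap, count the pages to write) can be sketched as follows. This is a simplified model under stated assumptions, not the kernel's dump code: struct demo_area and demo_build_bitmap() are hypothetical, page residency is supplied by the caller as an array, and the bit order within the bitmap is an arbitrary choice made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one data area named by a CDT entry. */
struct demo_area {
    const unsigned char *resident; /* resident[i] != 0 if page i is in memory */
    size_t npages;                 /* number of pages in the area */
};

/*
 * Build the page bitmap for one area: bit i is set when page i is in memory.
 * bitmap must hold at least (npages + 7) / 8 bytes.
 * Returns the number of pages that would be written to the dump device.
 */
size_t demo_build_bitmap(const struct demo_area *a, unsigned char *bitmap)
{
    size_t i, written = 0;

    memset(bitmap, 0, (a->npages + 7) / 8);
    for (i = 0; i < a->npages; i++) {
        if (a->resident[i]) {
            bitmap[i / 8] |= (unsigned char)(1u << (i % 8));
            written++;
        }
    }
    return written;
}
```

A tool reading the dump can use such a bitmap to tell which pages of an area are present in the image and which were paged out when the dump was taken.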


Figure 2-8. kdb . BE0070XS4.0

Notes:

kdb

Command

Files needed

The kdb command requires both a memory image (dump device, vmcore or /dev/mem) and a copy of /unix to operate. The /unix file provides the necessary symbol mapping needed to analyze the memory image file. It is imperative that the /unix file supplied is the one that was running at the time the memory image was created. The memory image (whether a device such as /dev/dumplv or a file such as vmcore.0) must not be compressed.

kdb

The kdb command allows examination of an operating system image

Requires system image and /unix

Can be run on a running system using /dev/mem

Typical invocations:

# kdb -m vmcore.X -u /usr/lib/boot/unix

or


Parameters

The kdb command may be used with the following parameters:

Example

To run kdb against a vmcore file use the following command line:

# kdb -m vmcore.X -u /unix

To run kdb against the live (running) kernel no parameters are required:

# kdb

Parameter               Description
no parameter            Use /dev/mem as the system image file and /usr/lib/boot/unix as the kernel file. In this case root permissions are required.
-m system_image_file    Use the image file provided
-u kernel_file          Use the kernel file. This is required to analyze a system dump on a different system.
-k kernel_modules       Add the kernel_modules listed
-w                      View XCOFF object
-v                      Print CDT entries
-h                      Print help
-l                      Disable in-line more, useful when running non-


Figure 2-9. Checkpoint BE0070XS4.0

Notes:

Checkpoint

1. _____is used for live system debugging.

2. _____is used for system image analysis.

3. The value of the _______ kernel variable indicates how the debugger is loaded.

4. A system dump image contains everything that was in the kernel at the time of the crash. True or False?


Figure 2-10. Exercise BE0070XS4.0

Notes:

Introduction

Turn to your lab workbook and complete exercise two.

Read the information blocks included with the exercises. They will provide you with information needed to do the exercise.

Exercise

Complete exercise two

Consists of theory and hands-on

Ask questions at any time

Activities are identified by a

What you will do:

Enable and start the kernel debugger

Display and interpret stack traces

Display and modify variables in kernel memory

Perform basic kdb navigation on a live system and a crash dump


Figure 2-11. Unit Summary BE0070XS4.0

Notes:

Unit Summary

List the tools available for analyzing the AIX 5L kernel

Use KDB to display and modify memory locations and interpret a stack trace

Use basic kdb navigation to explore a crash dump and a live system


Unit 3. Process Management

What This Unit Is About

This unit describes how processes and threads are managed in AIX 5L.

What You Should Be Able to Do

After completing this unit, you should be able to:

• List the three thread models available in AIX 5L

• Identify the relationship between the six internal structures: pvproc, proc, pv_thread, thread, user and u_thread

• Use the kernel debugging tools in AIX to locate and examine a process’ proc, thread, user and u_thread data structures

• Identify the states of processes and threads on a live system and in a crash dump

• Analyze a crash dump caused by a run-away process

• Identify the features of AIX scheduling algorithms

• Identify the primary features of the AIX scheduler supporting SMP and large system architectures

• Identify the action the threads of a process will take when a signal is received by the process

How You Will Check Your Progress

Accountability:

• Exercises using your lab system

• Check-point activity

• Unit review

References

AIX Documentation: Performance Management Guide

AIX Documentation: System Management Guide: Operating System and Devices


Figure 3-1. Unit Objectives BE0070XS4.0

Notes:

Unit Objectives

At the end of this unit you should be able to:

List the three thread models available in AIX 5L

Identify the relationship between the six internal structures: pvproc, proc, pv_thread, thread, user and u_thread

Use the kernel debugging tools in AIX to locate and examine a process’ proc, thread, user and u_thread data structures

Identify the states of processes and threads on a live system and in a crash dump

Analyze a crash dump caused by a run-away process

Identify the features of AIX scheduling algorithms

Identify the primary features of the AIX scheduler supporting SMP and large system architectures

Identify the action the threads of a process will take when a signal is received by the process


Figure 3-2. Parts of a Process BE0070XS4.0

Notes:

Processes and threads

A process is a self-contained entity that consists of the information required to run a single program, such as a user application.

Process

A process can be divided into two components:

- A collection of resources

- A set of one or more threads

[Figure: Parts of a Process. A process containing shared resources (address space, open file pointers, user credentials, management data) and three threads, each with its own stack and CPU registers.]


Resources

The resources making up a process are shared by all threads in the process. The resources are:

- Address space (program text, data and heap)

- A set of open file pointers

- User credentials

- Management data

Threads

A thread can be thought of as a path of execution through the instructions of the process. Each thread has a private execution context that includes:

- A stack


Figure 3-3. Threads BE0070XS4.0

Notes:

Threads

Threads provide the execution context to the process.

Kernel threads

Kernel threads are not associated with a user process and therefore have no user context. Kernel threads run completely in kernel mode and have their own kernel stack. They are cheap to create and manage, and are therefore typically used to perform a specific function such as asynchronous I/O.

Threads

Three types of threads are available in AIX:

Kernel

Kernel-managed

User

Three thread programming models are available for user threads:

1:1

M:1

M:N


Kernel-managed threads

Kernel-managed threads are sometimes called “Light Weight Processes” (LWPs) and are the fundamental unit of execution in AIX. Each user process contains one or more kernel-managed threads.

The scheduling and running of kernel-managed threads is managed by the kernel. Each thread is scheduled to run on a CPU independent of the other threads of the process. On SMP systems, the threads of one process can run concurrently.

User threads

User threads are an abstraction entirely at the user level. The kernel has no knowledge of their existence. They are managed by a user-level threads library and their scheduling and execution are managed at the user level.

Programming models

AIX 5L provides three models for mapping user threads on top of kernel-managed threads. The application developer can choose between the 1:1, M:1 and M:N models.


Figure 3-4. 1:1 Thread Model BE0070XS4.0

Notes:

1:1 Model

In the 1:1 model, each user thread is mapped to a single kernel-managed thread:

[Figure: 1:1 Thread Model. Three user threads, each mapped through the thread library to its own kernel-managed thread.]

Figure 3-5. M:1 Thread Model BE0070XS4.0

Notes:

M:1

In the M:1 model all user threads are mapped to one kernel-managed thread. The scheduling and management of the user threads are completely handled by the thread library.

[Figure: M:1 Thread Model. User threads mapped by the thread library and its library scheduler onto one kernel-managed thread.]

Figure 3-6. M:N Thread Model BE0070XS4.0

Notes:

M:N

In the M:N model, user threads are mapped to a pool of kernel-managed threads. A user thread may be bound to a specific kernel-managed thread. An additional “hidden” user scheduler thread may be started by the library to handle mapping user threads onto kernel-managed threads.

Thread model for this unit

This unit focuses on the management and scheduling of kernel-managed threads. Primarily, the 1:1 model is discussed. Unless specified, the term “thread” refers to a kernel-managed thread.

[Figure: M:N Thread Model. User threads mapped by the thread library and its library scheduler onto a pool of kernel-managed threads.]

Note that the thread model is selectable. The default for AIX 4.3.1 and higher is the M:N model. Using the 1:1 model can improve performance. The following will select the 1:1 model:

# export AIXTHREAD_SCOPE=S

# <your_program>

There are many similar options available for thread tuning. See the Performance Management Guide in the AIX online documentation.
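The environment variable changes the default for a whole program. POSIX also defines a per-thread contention scope attribute, and the sketch below shows how a program might request system scope for an individual thread through pthread_attr_setscope(). The helper name init_system_scope_attr() is hypothetical, and whether an explicit attribute interacts with AIXTHREAD_SCOPE exactly as expected on a given AIX release is an assumption to verify against the AIX documentation.

```c
#include <pthread.h>

/*
 * Initialize *attr and request system contention scope, so that a thread
 * created with it competes for the CPU with all threads on the system and
 * is scheduled by the kernel.
 * Returns 0 on success or an error number; on success the caller must
 * eventually call pthread_attr_destroy(attr).
 */
int init_system_scope_attr(pthread_attr_t *attr)
{
    int rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM);
    if (rc != 0)
        pthread_attr_destroy(attr);   /* do not leak the attribute object */
    return rc;
}
```

The initialized attribute object is then passed as the second argument of pthread_create() in place of NULL.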


Figure 3-7. Creating Processes BE0070XS4.0

Notes:

Creating processes

A new process is created when an existing process executes a fork() system call. The new process is called a child process; the creating process is the child’s parent.

Exec

When a process is first created it is running the same program as its parent. One of the exec() class of system calls is normally used to load a new program into the process’ address space.

Creating Processes

When a process is created it is given:

A process table entry

Process identifier (PID)

An address space (its contents are copied from the parent process)

User-area

Program text

Data

User and kernel stacks

A single kernel-managed thread (even if the parent process had many threads)


Example

Here is an example of fork and exec to start a new program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    pid_t child;

    if ((child = fork()) == -1) {
        perror("could not fork a child process");
        exit(1);
    }

    if (child == 0) {               /* child */
        /* exec a new program; the argument after the path becomes argv[0] */
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        perror("error on execl");   /* reached only if execl fails */
        exit(1);
    } else {                        /* parent */
        wait(NULL);                 /* Ensure parent terminates after child */
    }
    return 0;
}


Figure 3-8. Creating Threads BE0070XS4.0

Notes:

Creating threads

When a process is first created it contains a single kernel-managed thread. A process can create additional threads using the thread_create() system call.

Thread library

AIX provides a thread library to assist programmers with the creation and management of threads. Typically, the library function pthread_create() is used to create threads rather than calling thread_create() directly. The thread library allows for creation and management of both kernel-managed threads and user threads using the same interface.

Creating Threads

A new thread is created by the thread_create() system call. When created the thread is assigned:

A thread table entry

A thread identifier


pthread_create example

Here is an example of creating a new thread using pthread_create:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void *new_thread(void *arg);

int main(void)
{
    pthread_t threadId;
    int rc;

    /* start up a new thread; pthread_create returns an error number
       rather than setting errno */
    rc = pthread_create(&threadId, NULL, new_thread, NULL);
    if (rc != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
        exit(rc);
    }
    /* main thread code here */
    return 0;
}

void *new_thread(void *arg)
{
    /* new thread code here */
    return NULL;
}
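In a program shaped like the example above, returning from main ends the process, so the new thread may never get to run. A common remedy is to wait for the thread with pthread_join(). The following sketch, using the hypothetical names add_one() and run_and_join(), also shows how an argument is passed to the thread and how its return value is collected.

```c
#include <pthread.h>

/* Thread start routine: adds one to the int that arg points to. */
static void *add_one(void *arg)
{
    int *value = arg;
    (*value)++;
    return value;
}

/*
 * Create a thread that runs add_one() on a copy of input, wait for it with
 * pthread_join(), and store the result in *result.
 * Returns 0 on success or the error number from the failing pthread call.
 */
int run_and_join(int input, int *result)
{
    pthread_t tid;
    void *ret = NULL;
    int value = input;      /* stays valid until the thread is joined */
    int rc;

    rc = pthread_create(&tid, NULL, add_one, &value);
    if (rc != 0)
        return rc;

    rc = pthread_join(tid, &ret);
    if (rc != 0)
        return rc;

    *result = *(int *)ret;
    return 0;
}
```

Because run_and_join() does not return before the join completes, it is safe for the thread to work on a variable that lives on the caller's stack.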


Figure 3-9. Process State Transitions BE0070XS4.0

Notes:

Process states

The illustration above shows the states of a process during its life. In AIX a process can be in one of five states:

- Idle

- Active

- Stopped

- Swapped

- Zombie

[Figure: Process State Transitions. States shown: Non-existent, Idle, Active, Stopped, Swapped and Zombie. Process creation with fork() leads to the Idle state.]

States

The five process states are described in this table:

State      Description
Idle       A process is started with a fork() system call. During creation the process is in the idle state. This state is temporary until all of the necessary resources have been allocated.
Active     Once the creation of the process is done it is placed in the active state. This is the normal process state. The threads of the process can now be scheduled to run on a CPU.
Stopped    When a process receives a SIGSTOP signal, it is placed in the stopped state. If a process is stopped, all its threads are stopped and will not be scheduled on a CPU. A stopped process can be restarted by the SIGCONT signal.
Swapped    A swapped process has lost its memory resources and its address space has been moved onto disk. It cannot run until swapped back into memory.
Zombie     When a process terminates, some of its resources are not automatically released. A process is placed in the zombie state until its parent cleans up after it and frees the resources. The parent must execute a wait() system call to retrieve the process’ exit status before the process will be removed from the process table.

Zombie process

Sometimes a Zombie process will stay in the process list for a long time. One example of this situation could be that a process has exited, but the parent process is busy or waiting in the kernel and unable to read the return code. If the parent process no longer exists when a child process exits, the init process (PID 1) frees the remaining resources held by the child.
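The parent's part in ending the zombie state can be sketched with standard POSIX calls. In this minimal example the function name fork_and_reap() is hypothetical; it creates a child that exits at once and then collects the exit status with waitpid(), which is the point at which the child's process table entry can be released.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately with exit_code, then reap it.
 * Returns the child's exit status, or -1 if fork() or waitpid() fails
 * or the child did not exit normally.
 */
int fork_and_reap(int exit_code)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0)
        _exit(exit_code);       /* child: terminate at once */

    /* Until waitpid() collects the status, the exited child is a zombie. */
    if (waitpid(child, &status, 0) != child)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A parent that never makes this call, for example because it is blocked elsewhere, leaves its exited children in the zombie state.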


Process state on a running system

The state of a process can be found on a running system using the ps command.

# ps -l
     F S UID   PID  PPID C PRI NI ADDR  SZ WCHAN   TTY  TIME CMD
240001 A 201 17670 16390 0  60 20 61f4 496       pts/3  0:00 ksh
200001 A   0 19172 17670 0  60 20 59da 496       pts/3  0:00 ksh
200001 A   0 19392 19172 3  61 20 2605 308       pts/3  0:00 ps
200011 T   0 19928 19172 0  60 20 4dff 436       pts/3  0:00 vi
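The T (stopped) state shown for vi in the listing can be reproduced from a program with the SIGSTOP and SIGCONT signals. The sketch below uses only standard POSIX calls; the function name stop_and_resume_child() is hypothetical. It stops a child process, confirms through waitpid() with WUNTRACED that the child entered the stopped state, resumes it, and then cleans up.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, stop it with SIGSTOP, resume it with SIGCONT, then
 * terminate it. Returns 1 if the child was observed in the stopped state,
 * 0 if it was not, and -1 on error.
 */
int stop_and_resume_child(void)
{
    int status, was_stopped;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        for (;;)
            pause();            /* child: sleep until a signal arrives */
    }

    /* Stop the child and wait until the state change is reported. */
    if (kill(child, SIGSTOP) == -1 ||
        waitpid(child, &status, WUNTRACED) != child) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return -1;
    }
    was_stopped = WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP;

    kill(child, SIGCONT);       /* stopped -> active */
    kill(child, SIGKILL);       /* end the demonstration child */
    waitpid(child, &status, 0); /* reap it so no zombie is left behind */

    return was_stopped;
}
```

While the child is stopped, a ps -l listing taken from another terminal would be expected to show it with the T flag.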

Process state in a crash dump

The state of a process can also be found in a crash dump using kdb:

# kdb
(0)> proc *
              SLOT NAME     STATE    PID  PPID  PGRP   UID  ADSPACE
pvproc+000000    0 swapper  ACTIVE 00000 00000 00000 00000 00004812
pvproc+000200    1 init     ACTIVE 00001 00000 00000 00000 0000342D
pvproc+000400    2 wait     ACTIVE 00204 00000 00000 00000 00004C13
pvproc+000600    3 netm     ACTIVE 00306 00000 00000 00000 0000282A

S Flag    State
O         Nonexistent
I         Idle
A         Active
T         Stopped
W         Swapped
Z         Zombie


Figure 3-10. The Process Table BE0070XS4.0

Notes:

The process table

The kernel maintains a table entry for each process on the system. This table is called the process table. Each process is represented by one entry in the table. Each entry contains:

- A process identifier

- The process state

- A list of threads

- A description of the process’ address space

- Other process management data

[Figure: The Process Table. An array of pvproc slots numbered 0 to NPROC; the pv_procp field of each in-use slot points to a proc structure.]

Process table

The process table is a fixed-length array of pvproc structures allocated from kernel memory. For the 64-bit kernel, this table is divided into a number of sections called zones. At system startup, one zone is allocated on each SRAD (see later topic, Table Management).

proc structure

The proc structure is an extension of the pvproc structure. The pv_procp field in the pvproc points to its associated proc structure. The proc and pvproc structures are split to accommodate large system architectures.
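The relationship between a table slot and its extension structure can be modeled in a few lines of C. Everything in this sketch is hypothetical and greatly simplified: the demo_ structures, their fields and the table size are invented for illustration and do not reflect the real pvproc or proc layouts, and the linear search says nothing about how AIX itself locates a process.

```c
#include <stddef.h>

#define DEMO_NPROC 8            /* hypothetical table size */

enum demo_state { DEMO_NONE = 0, DEMO_IDLE, DEMO_ACTIVE, DEMO_ZOMBIE };

struct demo_proc {              /* extension data, kept separately */
    int nthreads;
};

struct demo_pvproc {            /* one slot of the process table */
    int pid;
    enum demo_state state;
    struct demo_proc *pv_procp; /* points to the associated proc data */
};

/* Fixed-length table; unused slots have state DEMO_NONE. */
static struct demo_pvproc demo_table[DEMO_NPROC];

/* Return the slot number holding pid, or -1 if no in-use slot has it. */
int demo_find_slot(int pid)
{
    int slot;

    for (slot = 0; slot < DEMO_NPROC; slot++) {
        if (demo_table[slot].state != DEMO_NONE &&
            demo_table[slot].pid == pid)
            return slot;
    }
    return -1;
}

/* Follow pv_procp from a slot to its proc data; NULL for a bad or free slot. */
struct demo_proc *demo_slot_to_proc(int slot)
{
    if (slot < 0 || slot >= DEMO_NPROC ||
        demo_table[slot].state == DEMO_NONE)
        return NULL;
    return demo_table[slot].pv_procp;
}
```

Keeping the small, fixed-size slot separate from the larger extension structure is the same split the text describes for pvproc and proc.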

Slot number
