Object Tracker - Data-Type-Aware Fault Injection on Multiple Computer Systems

Chapter 4. Data-Type-Aware Fault Injection on Multiple Computer Systems

4.3. Tool

4.4.1. Object Tracker

The object tracker in each injector node is used to select a fault injection target in the dynamic memory space. The object tracker tracks various types of dynamic memory objects. The tracked information, stored in the object location table and the OS kernel memory, is used to translate the symbolic identifier of a fault injection target to the corresponding virtual address (see Figure 4.2). The symbolic identifier specifies the data type of an injection target and the index of an instance of data of the targeted type. Note that when the object tracker is used, the control server sends a symbolic identifier instead of a virtual address to specify a fault injection target.

The object tracker has three different tracking granularities:

(i) Page granularity. The object tracker tracks all physical pages belonging to each memory region. The following extension is made to calculate the virtual address of a specific page in a memory region. We extend the metadata of a physical page frame with two fields: the type of memory region and a node for a linked list (i.e., a list maintained for all page frames in each memory region). In page-granularity tracking, the symbolic identifier consists of the region name and the index of a page in the region (i.e., its allocation order). By searching the linked list of the target region, the page frame structure is obtained, and the virtual address of the page is computed. For regions that belong directly to a specific memory allocator, the list is maintained by the allocation and free functions of that allocator. For the remaining pages, whose region type changes over time, we instrument the functions that can change the region type. Table 4.1 summarizes the instrumentation points (see footnote 18).
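The page-granularity lookup described above can be illustrated with a minimal Python sketch. It is a user-space model, not the kernel implementation: the names `PageFrame`, `PageTracker`, `grow`, `shrink`, and `resolve` are hypothetical, and the address computation assumes a directly mapped low-memory page on i386 with the common 3 GB/1 GB split (`PAGE_OFFSET = 0xC0000000`), which the text does not state.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096           # 4 kB page frames, as in the text
PAGE_OFFSET = 0xC0000000   # assumption: i386 3G/1G split, directly mapped low memory


@dataclass(eq=False)       # compare frames by identity, like kernel page structs
class PageFrame:
    pfn: int                        # page frame number
    region: Optional[str] = None    # added field: type of memory region


class PageTracker:
    """Keeps one allocation-ordered list of page frames per memory region."""

    def __init__(self):
        # region name -> list of frames; stands in for the per-region linked list
        self.regions = {}

    def grow(self, region, frame):
        """Hook for a point where a page joins a region (a 'Grow' event)."""
        frame.region = region
        self.regions.setdefault(region, []).append(frame)

    def shrink(self, frame):
        """Hook for a point where a page leaves its region (a 'Shrink' event)."""
        self.regions[frame.region].remove(frame)
        frame.region = None

    def resolve(self, region, index):
        """Translate the symbolic identifier (region, index) to a virtual address."""
        frame = self.regions[region][index]
        return PAGE_OFFSET + frame.pfn * PAGE_SIZE


tracker = PageTracker()
first, second = PageFrame(0x100), PageFrame(0x2A0)
tracker.grow("page_cache", first)
tracker.grow("page_cache", second)
print(hex(tracker.resolve("page_cache", 1)))  # 0xc02a0000
```

Because the index is the allocation order within the list, removing an earlier page shifts the index of every later page; a symbolic identifier is therefore only meaningful at the moment it is resolved.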

Footnote 18. UNIX-variant OSes use similar dynamic memory management techniques. The Linux kernel has four types of dynamic memory allocators (see Figure 4.4) [BC00][Vah96]: (i) Buddy allocator. The buddy allocator is the top-level allocator for physical memory. It (de)allocates contiguous physical memory whose size is a multiple of the page frame size (e.g., 4 kB). The other allocators rely on this allocator to obtain page frames and to return the obtained frames. The buddy allocator is used for the cache regions. The page cache region contains pages originating from files. The buffer cache region keeps pages that are being transferred from/to a storage device. The swap cache region holds pages read from swap areas, where a swap area keeps pages evicted from memory when memory runs short. This allocator is also used directly for the page table region when allocating pages for page table entries. (ii) Slab allocator. The slab allocator reduces the overhead of the buddy system when small objects are frequently allocated and freed. It has a set of slab caches, where each cache keeps a set of slab objects of the same size. For example, a slab cache can store up to 128 objects of 32 bytes each in one 4 kB page frame. This solves the internal fragmentation problem of the buddy allocator. This allocator serves the kmalloc allocator and the slab region that contains the common kernel data structures. A part of the page table region also uses the slab allocator (e.g., page global/middle directory). (iii) Kmalloc. The kmalloc allocator is useful in managing variable-size

Table 4.1. Instrumentation points to track allocation/free in each memory region.

Region        Event   Instrumentation Location
Buffer Cache  Grow    drivers/md/raid5.c: grow_buffers(…)
Buffer Cache  Shrink  drivers/md/raid5.c: shrink_buffers(…)
Page Cache    Grow    mm/filemap.c: add_to_page_cache(…)
Page Cache    Grow    mm/page_alloc.c: page_alloc_cpu_notify(…)
Page Cache    Shrink  mm/filemap.c: __remove_from_page_cache_nocheck(…)
Swap Cache    Grow    mm/swap_state.c: __add_to_swap_cache(…)
Swap Cache    Shrink  mm/swap_state.c: __delete_from_swap_cache_nocheck(…)
Anon          Grow    mm/rmap.c: __page_set_anon_rmap(…)
Anon          Grow    mm/rmap.c: page_add_file_rmap(…)
Anon          Shrink  mm/rmap.c: page_remove_rmap(…) if PageAnon(page)
Mapped        Grow    mm/highmem.c: kmap_high(…)
Mapped        Shrink  mm/rmap.c: page_remove_rmap(…) if !PageAnon(page)
Page Table    Grow    arch/i386/mm/pgtable.c: pte_alloc_one_kernel(…)
Page Table    Grow    arch/i386/mm/pgtable.c: pte_alloc_one(…)
Page Table    Shrink  include/asm-i386/pgalloc.h: pte_free_kernel(…)
Page Table    Shrink  include/asm-i386/pgalloc.h: pte_free(…)

(ii) Object granularity. The object tracker classifies the tracked dynamic memory objects by using the type of memory allocator and the call stack signature of the function that called the allocator function. For example, a memory object allocated for kernel modules can be specified by sys_init_module() as the first-level caller of the vmalloc() allocator. In our framework, up to 30 nested callers can be specified to point to an object type. According to our measurements, the nested call depth from the system call entry is smaller than 30 in more than 95% of cases. The call stack signature is obtained by tracking the function frame pointers. Specifically, in the x86 ISA, the program counter (EIP) and frame pointer (EBP) registers are saved on the call stack when a function is called. EBP points to the old EBP in the stack. Because the old EIP of the caller is stored in the 4 bytes above the old EBP, the virtual address of the caller function is obtained. By searching the symbol tables of the kernel and modules with the old EIP, the symbol name of the caller is obtained. This lookup is repeated up to 30 times or until the bottom of the stack is reached.
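The frame-pointer walk described above can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the stack is represented as a dictionary from addresses to 32-bit words, a saved EBP of 0 is taken to mark the bottom of the stack, and all addresses and the contents of the symbol table are made up for illustration. The function names `walk_frames` and `symbolize` are hypothetical.

```python
import bisect

MAX_DEPTH = 30  # maximum number of nested callers, as in the text


def walk_frames(stack, ebp, max_depth=MAX_DEPTH):
    """Collect return addresses (old EIPs) by following the saved-EBP chain.

    `stack` maps an address to the 32-bit word stored there.
    Assumption of this model: a saved EBP of 0 marks the bottom of the stack.
    """
    callers = []
    while ebp != 0 and len(callers) < max_depth:
        callers.append(stack[ebp + 4])  # old EIP sits 4 bytes above the old EBP
        ebp = stack[ebp]                # EBP points to the caller's saved EBP
    return callers


def symbolize(symbols, addr):
    """Return the name of the last symbol that starts at or before `addr`.

    `symbols` is a list of (start_address, name) pairs sorted by address.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    return symbols[i][1] if i >= 0 else None


# Illustrative data only: two frames above an allocator call.
symbols = [
    (0xC0100000, "system_call"),
    (0xC0130000, "sys_init_module"),
    (0xC0150000, "vmalloc"),
]
stack = {
    0x1000: 0x1020, 0x1004: 0xC0130040,  # allocator frame: called from sys_init_module
    0x1020: 0x0000, 0x1024: 0xC0100080,  # caller frame: entered from system_call
}
signature = [symbolize(symbols, eip) for eip in walk_frames(stack, 0x1000)]
print(signature)  # ['sys_init_module', 'system_call']
```

The walk relies on every function in the chain maintaining EBP as a frame pointer; code compiled without frame pointers would break the chain, which is a limitation of the technique rather than of this sketch.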

All allocation and free functions are instrumented. The instrumentation routines (a set of paired allocator and free functions) are enabled to extract the caller signature and to find an object that belongs to the specified object type. When an instrumentation routine finds the specified object, the allocated virtual address is sent to the injector for fault injection. If the object is freed before the breakpoint (set by the injector) fires, the breakpoint is unset, and the control client (executing in user space) is notified of the event.

Footnote 18, continued: memory objects. Its interface is similar to malloc() and free(). Internally, it uses the slab allocator and creates a set of slab caches whose object sizes are geometrically distributed from 32 bytes to 128 kB. When an object is requested, it forwards the request to the slab cache that best fits the requested object size. (iv) Vmalloc. The vmalloc allocator can allocate variable-size memory objects that are contiguous in the virtual address space but not always in the physical address space. It gets page frames directly from the buddy allocator and maps the page frames into a contiguous virtual address region. It serves the vmalloc region, which contains buffers used to copy code pages from files to kernel modules and I/O buffers for some device drivers and file systems. In addition, two more memory regions are identified in the dynamic memory of Linux. The memory-mapped region contains pages mapped into the last 128 MB of virtual address space, into which the physical memory above 896 MB is dynamically mapped. The anonymous region contains pages for user-level data.

[Figure 4.4 (diagram labels only). Allocators: Buddy System, Slab Allocator, kmalloc(), vmalloc(). Memory regions: Page Table, Buffer Cache, Swap Cache, Page Cache, Memory Mapped, Slab, Kmalloc, Vmalloc, Anonymous.]
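The allocation and free hooks used for object-granularity tracking might be organized roughly as follows. This is a simplified sketch, not the instrumentation code of the framework: the class `ObjectTracker`, its method names, and the two callback parameters are hypothetical, and the call stack is passed in as a list of symbol names with the first-level caller first.

```python
class ObjectTracker:
    """Finds objects whose allocation matches an (allocator, caller chain) signature."""

    def __init__(self, allocator, callers, notify_injector, notify_client):
        self.allocator = allocator              # e.g. "vmalloc"
        self.callers = list(callers)            # first-level caller first
        self.notify_injector = notify_injector  # called with the target address
        self.notify_client = notify_client      # called when a target disappears
        self.armed = set()                      # addresses with a pending breakpoint

    def on_alloc(self, allocator, call_stack, addr):
        """Hook run after an allocator returns `addr`."""
        prefix = call_stack[:len(self.callers)]
        if allocator == self.allocator and prefix == self.callers:
            self.armed.add(addr)
            self.notify_injector(addr)          # injector sets a breakpoint on addr

    def on_free(self, addr):
        """Hook run when `addr` is freed."""
        if addr in self.armed:                  # freed before the breakpoint fired
            self.armed.discard(addr)            # models unsetting the breakpoint
            self.notify_client(addr)

    def on_breakpoint(self, addr):
        """Called once the breakpoint fires and the fault is injected."""
        self.armed.discard(addr)


to_injector, to_client = [], []
tracker = ObjectTracker("vmalloc", ["sys_init_module"],
                        to_injector.append, to_client.append)
tracker.on_alloc("kmalloc", ["sys_init_module", "system_call"], 0x1000)  # wrong allocator
tracker.on_alloc("vmalloc", ["sys_init_module", "system_call"], 0x2000)  # matches
tracker.on_free(0x2000)                                                  # freed too early
print(to_injector, to_client)  # [8192] [8192]
```

Matching on a prefix of the caller chain mirrors the idea that a signature may name anywhere from one to 30 nested callers; selecting a particular instance by index is not modeled here.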

(iii) Variable granularity. Variable-granularity tracking is realized by analyzing the source code of callers. The analysis extracts the data types of the internal variables of a target memory object. The extracted information is used to match each internal variable type to its offset in a tracked memory object. For memory regions that use the slab allocator, variable-granularity tracking is easily implemented. The symbolic identifier is the name of a slab cache and an object index. All active slab objects in the specified slab cache are identified by scanning the data structure of the slab cache in the OS kernel memory, and the index is used to select a specific object. The offset inside the object reveals the variable type because the object data type for a slab cache is fixed. The source code analysis (which takes into account the memory alignment applied by the compiler) is used to match the offset with the variable type.
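The offset-to-variable matching can be illustrated with Python's `ctypes`, which lays out structure fields with the platform's native C alignment. The structure `Example` and the helper `field_at` are hypothetical and stand in for whatever layout information the source code analysis produces; `Example` is not a real kernel data type.

```python
import ctypes


class Example(ctypes.Structure):
    """Hypothetical slab object layout; not a real kernel structure."""
    _fields_ = [
        ("flags", ctypes.c_uint8),   # offset 0, followed by 3 bytes of padding
        ("count", ctypes.c_uint32),  # aligned to 4 bytes, so offset 4
        ("state", ctypes.c_uint16),  # offset 8
    ]


def field_at(struct_type, offset):
    """Return (name, ctype) of the field covering `offset`, or None for padding."""
    for name, ctype in struct_type._fields_:
        desc = getattr(struct_type, name)  # field descriptor with .offset and .size
        if desc.offset <= offset < desc.offset + desc.size:
            return name, ctype
    return None


for name, _ in Example._fields_:
    desc = getattr(Example, name)
    print(name, desc.offset, desc.size)
print(field_at(Example, 5)[0])  # count
print(field_at(Example, 1))     # None (padding inserted by alignment)
```

The padding case shows why alignment has to be considered: an offset computed by simply summing field sizes would place `count` at offset 1 instead of 4 and attribute an injected fault to the wrong variable.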

In document From experiment to design – fault characterization and detection in parallel computer systems using computational accelerators (Page 83-86)