2.4 Valgrind
2.4.3 Shadow Memory
Dynamic Binary Analysis (DBA) tools, which are used to debug programs and improve software quality, analyze client programs at the level of machine code at run time. Many of them maintain shadow memory, which lets the tool record the history of every memory location and/or value in memory, so that it can tell which memory is in use and what is stored there. The DBA tool updates the shadow memory as the client program runs and checks it for violations, so that any misuse or critical error is reported to the programmer.
Memcheck shadows all client program memory, tracking the addressability of every byte and the validity of every bit [43]. Every bit of memory has an associated validity bit (V bit) indicating whether it contains a valid value, and every byte has an associated addressability bit (A bit) indicating whether the byte is accessible. V bits and A bits are tracked at all times and checked when the corresponding part of memory is accessed. For example, when a word-sized (four-byte) variable is read from memory, its four A bits and 32 V bits are checked for addressability and validity. Memcheck remembers all allocation and deallocation operations that have been issued on each memory location, and can thus detect accesses to unreachable or already deallocated memory. It also remembers which values are defined and which are not, in order to detect uses of undefined values. When the client program is launched, all global data areas are marked as accessible. When the program executes a memory allocation operation, the A bits for that area of memory are set to mark it as accessible. When the program accesses memory, Memcheck checks the A bits associated with the address to ensure that the access does not refer to an invalid address. Furthermore, it checks the V bits of that memory for any undefined values. During program execution, A bits are also updated on Stack Pointer (SP) register movements, which automatically marks local variables as accessible on function entry and as inaccessible on exit. The stack is marked as accessible from SP up to the stack base, and as inaccessible below SP. Operations on registers are also handled by Memcheck, which stores and computes the relevant A/V bits in the simulated CPU until the register is written back to memory. The A/V bits are consulted and checked when values in CPU registers are used to generate a memory address or to decide a conditional branch.
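The bookkeeping described above can be illustrated with a toy model. The following C sketch is not Memcheck's implementation: the names (`toy_a_bits`, `toy_check_load`, and so on), the 64-byte client memory, and the convention that a set bit means accessible or valid are all assumptions made for illustration. For simplicity it flags undefined data at the load itself, whereas Memcheck defers such reports until the value is actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEM_SIZE 64   /* hypothetical, tiny client memory */

/* Shadow state (assumed convention: 1 = accessible / valid). */
uint8_t toy_a_bits[TOY_MEM_SIZE];   /* one A bit per client byte (stored as 0 or 1) */
uint8_t toy_v_bits[TOY_MEM_SIZE];   /* eight V bits per client byte, one per bit */

enum toy_result { TOY_OK, TOY_INVALID_ADDRESS, TOY_UNDEFINED_VALUE };

/* Allocation: the range becomes accessible, but its contents are undefined. */
void toy_on_alloc(size_t addr, size_t len) {
    assert(addr + len <= TOY_MEM_SIZE);
    memset(&toy_a_bits[addr], 1, len);
    memset(&toy_v_bits[addr], 0x00, len);
}

/* Store of fully defined data: every written bit becomes valid. */
void toy_on_store(size_t addr, size_t len) {
    assert(addr + len <= TOY_MEM_SIZE);
    memset(&toy_v_bits[addr], 0xFF, len);
}

/* Deallocation: the range becomes inaccessible again. */
void toy_on_free(size_t addr, size_t len) {
    assert(addr + len <= TOY_MEM_SIZE);
    memset(&toy_a_bits[addr], 0, len);
}

/* Load of len bytes: check the A bits first, then the V bits. */
enum toy_result toy_check_load(size_t addr, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (addr + i >= TOY_MEM_SIZE || !toy_a_bits[addr + i])
            return TOY_INVALID_ADDRESS;
    }
    for (size_t i = 0; i < len; i++) {
        if (toy_v_bits[addr + i] != 0xFF)
            return TOY_UNDEFINED_VALUE;
    }
    return TOY_OK;
}
```

For a four-byte load, the first loop inspects four A bits and the second inspects 32 V bits, matching the word-sized example in the text.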
Because Memcheck tracks the state of all memory and registers, it can detect and report most memory problems at run time. However, there are also cases that Memcheck detects but does not report as errors. For example, merely copying undefined values around does not cause a report; Memcheck reports only when a value is used in a way that might affect the result of the application, e.g. when the value is written to a file or stdout, or when it is the argument of a conditional jump instruction. This also avoids long chains of error messages. Another case in which Memcheck does not complain is the use of low-level operations such as add. The V bits of the result are computed from the V bits of the operands, and the result may be partially or wholly undefined. The V bits are checked for definedness only when a value is used to generate a memory address, to make a control flow decision, or in a system call. Only then, once undefinedness is detected, is an error message issued.
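The deferred reporting described above can be sketched in C as follows. This is an illustrative model under stated assumptions, not Memcheck's code: the `struct shadowed` type and all function names are hypothetical, a set bit in `undef` is taken to mean "undefined", and the rule used for addition (an undefined bit taints its own position and every higher one, because of carries) is a conservative approximation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit value together with its shadow.
   Assumed convention: a set bit in `undef` marks that bit of `val` as undefined. */
struct shadowed {
    uint32_t val;
    uint32_t undef;
};

/* Copying never reports: the shadow simply travels with the value. */
struct shadowed shadow_copy(struct shadowed x) {
    return x;
}

/* Addition: combine the operand shadows, then smear upward from the
   lowest undefined bit, since a carry may reach any higher position. */
struct shadowed shadow_add(struct shadowed a, struct shadowed b) {
    uint32_t u = a.undef | b.undef;
    struct shadowed r;
    r.val = a.val + b.val;
    r.undef = u | (0u - u);
    return r;
}

/* Only a use that can affect behaviour, here a conditional branch,
   looks at the shadow and decides whether an error is reported. */
bool shadow_branch_would_report(struct shadowed cond) {
    return cond.undef != 0;
}

/* Walk-through: copy and add stay silent; the branch check fires. */
void shadow_walkthrough(void) {
    struct shadowed known  = { 5, 0 };
    struct shadowed partly = { 7, 0x4 };          /* bit 2 is undefined */
    struct shadowed sum = shadow_add(known, shadow_copy(partly));
    assert(sum.undef == 0xFFFFFFFCu);             /* bits 2..31 tainted */
    assert(shadow_branch_would_report(sum));
}
```

Because only the final use is checked, a single report is produced at the point where the undefined value matters, rather than one per intermediate operation.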
As described above, every byte of memory has eight V bits, one per bit, and one A bit for the whole byte, i.e. 9 bits in total. To limit the memory overhead, Memcheck stores these bits in compressed maps. On 32-bit machines a two-level map structure is used: each entry of the top level (the primary map) is a pointer to a second-level map, and the second level stores the accessibility and validity state (A bits and V bits) of the corresponding memory region. The top-level map is indexed by the top 16 bits of the address, and the second level by the lower 16 bits. Each level thus has 2^16 = 65536 entries, and a second-level map occupies 65536*2/8 = 16384 bytes, i.e. two bits of compressed state per shadowed byte. The 4 GB address space is consequently divided into 64 K chunks of 64 KB each, as shown in Figure 2.5. Because many of the 64 KB chunks have the same state throughout, a primary map entry can instead point to one of three distinguished predefined maps, indicating not accessible, undefined, or defined, which reduces the amount of stored state. In fact, when running a real application, more than half of the addressable memory is either defined or undefined [43].
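A minimal C sketch of such a two-level lookup is given below. It is a simplified model, not Memcheck's source: the type and function names, the numeric encoding of the three states, and the copy-on-write policy for distinguished maps are assumptions made for illustration, and partially defined bytes are not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Two bits of compressed state per client byte (encoding is an assumption). */
enum { ST_NOACCESS = 0, ST_UNDEFINED = 1, ST_DEFINED = 2 };

#define SM_BYTES (65536 * 2 / 8)   /* 16384 bytes of shadow per 64 KB chunk */

typedef struct { uint8_t bits[SM_BYTES]; } SecMap;

static SecMap dist_sm[3];          /* distinguished maps, shared by many chunks */
static SecMap *primary[65536];     /* indexed by the top 16 address bits */

static int is_distinguished(const SecMap *sm) {
    return sm == &dist_sm[0] || sm == &dist_sm[1] || sm == &dist_sm[2];
}

/* Fill the distinguished maps and point every chunk at NOACCESS. */
void shadow_init(void) {
    static const uint8_t fill[3] = { 0x00, 0x55, 0xAA };  /* state repeated 4x */
    for (int s = 0; s < 3; s++)
        memset(dist_sm[s].bits, fill[s], SM_BYTES);
    for (uint32_t i = 0; i < 65536; i++)
        primary[i] = &dist_sm[ST_NOACCESS];
}

/* Look up the state of one client byte. */
int shadow_get(uint32_t addr) {
    const SecMap *sm = primary[addr >> 16];       /* top 16 bits */
    uint32_t off = addr & 0xFFFF;                 /* lower 16 bits */
    return (sm->bits[off >> 2] >> ((off & 3) * 2)) & 3;
}

/* Set the state of one client byte, copying a distinguished map on first write. */
void shadow_set(uint32_t addr, int state) {
    assert(state >= ST_NOACCESS && state <= ST_DEFINED);
    SecMap **slot = &primary[addr >> 16];
    if (is_distinguished(*slot)) {
        SecMap *copy = malloc(sizeof *copy);
        if (!copy) abort();
        memcpy(copy, *slot, sizeof *copy);
        *slot = copy;
    }
    uint32_t off = addr & 0xFFFF;
    unsigned shift = (off & 3) * 2;
    uint8_t *cell = &(*slot)->bits[off >> 2];
    *cell = (uint8_t)((*cell & ~(3u << shift)) | ((unsigned)state << shift));
}

/* True if the chunk containing addr still shares a distinguished map. */
int chunk_is_shared(uint32_t addr) {
    return is_distinguished(primary[addr >> 16]);
}
```

In this model a chunk costs no extra memory until one of its bytes first deviates from the uniform state, which is the saving the distinguished maps are meant to provide.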
On 64-bit machines the implementation is more complicated. A four-level structure could be used, but it would make the number of memory accesses per lookup too large. Instead, an improved two-level structure is implemented to reduce the cost. The top-level map is enlarged to 2^19 entries, indexed by bits 16 to 34 of the memory address. This top-level map covers the bottom 32 GB of memory. Accesses to addresses above 32 GB are handled by a sparse auxiliary table.
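The arithmetic behind these figures can be checked with a short C sketch: 19 index bits on top of 16 offset bits give 2^35 bytes, i.e. 32 GB. The macro and function names are hypothetical, and the sparse auxiliary table is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_BITS    16   /* lower 16 bits: offset inside a 64 KB chunk */
#define PRIMARY_BITS  19   /* bits 16 to 34: index into the primary map */
#define PRIMARY_LIMIT ((uint64_t)1 << (PRIMARY_BITS + CHUNK_BITS))   /* 2^35 */

/* Addresses below 32 GB are served directly by the primary map. */
bool covered_by_primary(uint64_t addr) {
    return addr < PRIMARY_LIMIT;
}

/* Extract bits 16 to 34 of the address as the primary map index. */
uint32_t primary_index(uint64_t addr) {
    assert(covered_by_primary(addr));   /* higher addresses need the auxiliary table */
    return (uint32_t)((addr >> CHUNK_BITS) & ((1u << PRIMARY_BITS) - 1));
}
```

An address that fails `covered_by_primary` would have to be looked up in the auxiliary table instead, which this sketch does not attempt to model.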
Figure 2.5: A/V bits addressing mechanism (the upper 16 bits of a memory address index the 2^16-entry primary map, whose entries point either to a secondary map indexed by the lower 16 bits or to one of the distinguished maps DEFINED, UNDEFINED and NOACCESS)
Client request mechanism
The client request mechanism is provided by Valgrind to let user applications interact with it. Client requests are not normal function invocations; they are macros that can be used directly in user applications. These requests affect the application only when it runs under the control of Valgrind. Table 2.1 lists several powerful client requests. The client request macros expand into processor instructions that do not otherwise change the semantics of the application. By looking for the special instruction preamble inserted this way, Valgrind detects commands that steer the instrumentation; apart from that, these instructions have no effect on registers, flags or other processor state.
The special instruction preamble rotates a register several times. On the x86 architecture, the left-rotation instruction rol (written roll in AT&T syntax) rotates the 32-bit register edi by 3, 13, 29 and 19 bits, 64 bits in total, which leaves the value in edi unchanged. The special instruction preamble is defined as follows:
#define __SPECIAL_INSTRUCTION_PREAMBLE \
"roll $3, %%edi ; roll $13, %%edi\n\t" \
"roll $29, %%edi ; roll $19, %%edi\n\t"
The actual command to be executed is then encoded with a register-exchange instruction (xchgl) that exchanges a register with itself (in this case ebx). The complete client request assembly code macro is defined as:
#define VALGRIND_DO_CLIENT_REQUEST( \
_zzq_rlval, _zzq_default, _zzq_request, \
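Returning to the preamble: since 3 + 13 + 29 + 19 = 64, its four rotations amount to two full turns of a 32-bit register, so the register ends up unchanged. The following C sketch checks that property in portable code; the function names are hypothetical, and the model only mirrors the arithmetic of the rotations, not the inline assembly itself. The direction of rotation does not matter for the identity.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits, for 0 < n < 32. */
uint32_t rotl32(uint32_t x, unsigned n) {
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32u - n));
}

/* Apply the four rotation amounts used by the preamble, in order. */
uint32_t apply_preamble_rotations(uint32_t edi) {
    static const unsigned amounts[4] = { 3, 13, 29, 19 };
    for (int i = 0; i < 4; i++)
        edi = rotl32(edi, amounts[i]);
    return edi;   /* total rotation is 64 bits, so the input value returns */
}
```

Such a sequence is very unlikely to occur in ordinary compiled code, which is what makes it usable as a marker.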
VALGRIND_MAKE_MEM_NOACCESS                Marks an address range as inaccessible
VALGRIND_MAKE_MEM_UNDEFINED               Marks an address range as undefined
VALGRIND_MAKE_MEM_DEFINED                 Marks an address range as defined
VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE  Marks an address range as defined where it is addressable
VALGRIND_DISCARD                          Stops reporting errors on user-defined blocks
VALGRIND_CHECK_MEM_IS_ADDRESSABLE         Checks whether an address range is addressable
VALGRIND_CHECK_MEM_IS_DEFINED             Checks whether an address range is defined
VALGRIND_CHECK_VALUE_IS_DEFINED           Checks whether a value is defined
VALGRIND_DO_LEAK_CHECK                    Performs a full memory leak check immediately
VALGRIND_DO_QUICK_LEAK_CHECK              Performs a memory leak check immediately, reporting only a summary
VALGRIND_COUNT_LEAKS                      Returns the number of bytes in each leak category; useful in test harness code after calling VALGRIND_DO_LEAK_CHECK or VALGRIND_DO_QUICK_LEAK_CHECK
VALGRIND_COUNT_LEAK_BLOCKS                Returns the number of blocks in each leak category; useful in test harness code after calling VALGRIND_DO_LEAK_CHECK or VALGRIND_DO_QUICK_LEAK_CHECK
VALGRIND_GET_VBITS                        Gets the V bits for an address range
VALGRIND_SET_VBITS                        Sets the V bits for an address range
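As an illustration of how such requests might be used, the sketch below applies two of the requests listed above, VALGRIND_MAKE_MEM_NOACCESS and VALGRIND_MAKE_MEM_UNDEFINED, to a small custom allocator so that Memcheck can track memory it would otherwise see as one static array. The pool allocator (`pool_reset`, `pool_alloc`, `POOL_SIZE`) is entirely hypothetical. The header path `<valgrind/memcheck.h>` is the usual install location, and the fallback stubs are an assumption added so that the example also builds on systems without the Valgrind headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use the real client requests when the Valgrind headers are installed. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif

#ifndef HAVE_MEMCHECK_H
/* Fallback stubs (assumption): the requests become no-ops. */
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len)  ((void)(addr), (void)(len), 0)
#  define VALGRIND_MAKE_MEM_UNDEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

#define POOL_SIZE 256   /* hypothetical pool capacity in bytes */

static unsigned char pool[POOL_SIZE];
static size_t pool_used;

/* Empty the pool and tell Memcheck that none of it may be touched. */
void pool_reset(void) {
    pool_used = 0;
    (void)VALGRIND_MAKE_MEM_NOACCESS(pool, POOL_SIZE);
}

/* Hand out len bytes; they become accessible but hold undefined data. */
void *pool_alloc(size_t len) {
    assert(len > 0);
    if (len > POOL_SIZE - pool_used)
        return NULL;                      /* pool exhausted */
    void *p = &pool[pool_used];
    pool_used += len;
    (void)VALGRIND_MAKE_MEM_UNDEFINED(p, len);
    return p;
}
```

Outside Valgrind the requests do nothing, so the allocator behaves identically; under Memcheck, reading a pool block before writing it, or touching the pool beyond what has been handed out, would be expected to show up as an error.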