• No results found

Dimensions of analysis

In document Effective Bug Finding (Page 49-52)

- 6252547b8a7 -

type: Null-pointer dereference

descr: Null pointer on !OF_IRQ gets dereferenced if IRQ_DOMAIN

In TWL4030 driver, attempt to register an IRQ domain with a NULL ops structure: ops is de-referenced when registering an IRQ domain, but this field is only set to a non-null value when OF_IRQ.

config:TWL4030_CORE && !OF_IRQ

bugfix:

repo: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

hash: 6252547b8a7acced581b649af4ebf6d65f63a34b

layer: model, mapping

trace: . dyn-call drivers/mfd/twl-core.c:1190:twl_probe() . 1235: irq_domain_add(&domain); .. call kernel/irq/irqdomain.c:20:irq_domain_add() ... call include/linux/irqdomain.h:74:irq_domain_to_irq() ... ERROR 77: if (d->ops->to_irq) links: * [I2C](http://cateee.net/lkddb/web-lkddb/I2C.html) * [TWL4030](http://www.ti.com/general/docs/marketurl.tsp?name=twl4030) * [IRQ domain](http://lxr.gwbnsh.net.../IRQ-domain.txt)

Figure 3.4: VBDb record for bug 6252547b8a7.

3.4

Dimensions of analysis

I begin by selecting a number of properties of bugs to understand, an- alyze and document in bug reports. These are described below and ex- emplified by data from the VBDb database. Figure 3.4 shows an example record for a NULL-pointer dereference bug found in a Linux driver, which was traced back to errors both in the feature model and the mapping. Figure 3.5 shows the simplified version of this bug.

Type of bug. In order to understand the diversity of bugs I establish

the type of each bug (field type) according to the Common Weakness Enu- meration (CWE)8—a taxonomy of numbered software weaknesses and vul- nerabilities. However, since CWE is mainly concerned with security, I had

38 Chapter 3. A Qualitative Study of Bugs in Linux

to add a few additional types of bugs, including type errors, among oth- ers. The types of bugs that I found are listed in Fig. 3.1—my additions lack an identifier in the CWE column.

Bug description. Understanding a bug requires rephrasing its nature

in general software engineering terms (field descr), so that the bug be- comes understandable for non-experts. I add a one-line header to the description, to help identification and listing of bugs. I obtain such a description by studying the bug in depth, and following additional available resources (such as mailing list discussions, available books, commit messages, documentation and online articles). Whenever use of domain-specific terminology is unavoidable, I provide links to the necessary background. Obtaining the description is often non-trivial.

Presence condition. I investigate under what presence condition the

bug appears (field config). This enables further investigation of variabil- ity properties of the bug, for example, the number of features and nature of dependencies that enable the bug. Obtaining the presence condition of a bug requires examining the source code and the Makefiles; as well as the Kconfig files to check for feature dependencies. For instance, Linux commit 6252547b8a7a (cf. Figure 3.4) fixes a 2-degree bug that occurs when enabling TWL4030_CORE, and disabling OF_IRQ.

Bug-fix layer(s). I analyze the bug-fixing commit to establish whether

the source of the bug is in the code, in the feature model, or in the mapping (field layer). The bug of Fig. 3.4 has been fixed both in the model and in the mapping. The commit message asserts that: first, TWL4030_CORE should not depend on IRQ_DOMAIN (fixed in the model); and, second, that the assignment of the variableopsto&irq_domain_simple_opsis part of the IRQ_DOMAIN code and not of OF_IRQ (fixed in the mapping).

Bug trace. I analyze and document the execution trace that leads to

the bug (field trace). Reconstructing the trace is key in understanding the nature and complexity of the bug; and, once documented, it allows other researchers to understand the bug much faster.

There are two types of entries in bug traces: function calls and state- ments. Function call entries can be either static (tagged call), or dynamic (dyn-call) if the function is called via a function pointer. A statement en- try highlights relevant changes in the program state. Every entry starts

3.4. Dimensions of analysis 39 1 # d e f i n e NULL ( void * ) 0 2 3 # i f d e f CONFIG_TWL4030_CORE 4 # d e f i n e CONFIG_IRQ_DOMAIN 5 # e n d i f 6 7 # i f d e f CONFIG_IRQ_DOMAIN // ENABLED 8 i n t irq_domain_simple_ops = 1 ; 9

10 void irq_domain_add ( i n t * ops ) 11 { 12 i n t i r q = * ops ; // ( 4 ) ERROR 13 } 14 # e n d i f 15 16 # i f d e f CONFIG_TWL4030_CORE // ENABLED 17 i n t twl_probe ( ) 18 { 19 i n t * ops = NULL; // ( 2 ) 20 21 # i f d e f CONFIG_OF_IRQ // DISABLED 22 ops = &irq_domain_simple_ops ; 23 # e n d i f 24 25 irq_domain_add ( ops ) ; // ( 3 ) 26 } 27 # e n d i f 28 29 i n t main ( ) 30 { 31 # i f d e f CONFIG_TWL4030_CORE // ENABLED 32 twl_probe ( ) ; // ( 1 ) 33 # e n d i f 34 r e t u r n 0 ; 35 }

Figure 3.5: Simplified version for bug 6252547b8a7.

with a non-empty sequence of dots indicating the nesting of function calls, followed by the location of the function definition (file and line) or statement (only the line). The statement in which the bug is manifested is marked with an ERROR label.

Simplified bug. I synthesize a simplified version of the bug, a small

C program, that exhibits the same essential behavior, and the same es- sential problem. Simplified bugs are self-contained C files that do not depend on Linux kernel code, nor any other library than libc. The entire set of simplified bugs constitute an easily accessible benchmark suite derived from real bugs, which can be used to evaluate proof-of-concept tools on a smaller scale.

40 Chapter 3. A Qualitative Study of Bugs in Linux

Simplified bugs are derived systematically from the bug trace. Along this trace, I preserve relevant statements and control-flow constructs, CPP conditionals, feature mapping information, and function calls. I keep the original identifiers for features, functions and variables. However, I abstract away complex code patterns (e.g. dynamic dispatching via function pointers) whenever this is not essential to capture the bug. For this reason, if a tool finds one of these simplified bugs, that does not imply that it will find the real bug too. When there exist dependen- cies between features, these are encoded operationally, using #define directives (cf. lines 4–6 of Figure 3.5). This makes the simplified bugs independent of any particular configuration modeling notation, such as Kconfig.

Traceability. The record includes the URL of the repository in which

the bug fix is applied (field repo), the commit id (field hash), and links to relevant context information about the bug (field links), in order to support independent verification of my analysis.

In document Effective Bug Finding (Page 49-52)