Parallel Symbolic Execution for
Automated Real-World Software Testing
Stefan Bucur, Vlad Ureche, Cristian Zamfir, George Candea
Cloud9
Automated Software Testing
[Figure: landscape of testing approaches: Industrial SW Testing, Manual Testing, Static Analysis, Fuzzing, Model Checking, and Symbolic Execution; labels: Automated Techniques, Scalability, Applicability, Usability]
Cloud9 - The Big Picture
• Parallel symbolic execution
• Linear scalability on commodity clusters
• Full symbolic POSIX support
• Applicable to real-world systems
• Platform for writing test cases
Automated Systems Testing
[*] C. Cadar, D. Dunbar, D. Engler, “KLEE: Unassisted and automatic generation
• Promising for systems testing: KLEE [*]
• High-coverage test cases
• Found new bugs
• ... But applied only to small programs
Memcached
GNU Coreutils
Apache
Symbolic Execution in a Nutshell
    void proc_pkt(packet_t* pkt) {
      if (pkt->magic != 0xC9) {
        err(pkt);
        return;
      }
      if (pkt->cmd == GET) {
        ...
      } else if ...
      ...
    }
Path constraints on the symbolic input λ at each branch:
• First branch: λ.magic != 0xC9 or λ.magic == 0xC9
• Second branch: λ.cmd == GET or λ.cmd != GET
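To make the branch constraints concrete, here is a minimal sketch in C that mirrors the branch structure of proc_pkt and builds one concrete packet per path. The packet_t layout, the value of GET, and the names path_id, classify_path, and witness_for are assumptions made for illustration; the slide elides the real definitions. The witnesses are picked by hand, whereas a symbolic execution engine would obtain them from a constraint solver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet layout and command code; the slide does not show them. */
typedef struct { uint8_t magic; uint8_t cmd; } packet_t;
enum { GET = 1 };

/* One identifier per path through the two branches shown on the slide. */
typedef enum { PATH_BAD_MAGIC, PATH_GET, PATH_OTHER_CMD } path_id;

/* Follows the same branches as proc_pkt and reports which path was taken. */
path_id classify_path(const packet_t *pkt) {
    assert(pkt != NULL);
    if (pkt->magic != 0xC9) return PATH_BAD_MAGIC; /* magic != 0xC9 */
    if (pkt->cmd == GET)    return PATH_GET;       /* magic == 0xC9 && cmd == GET */
    return PATH_OTHER_CMD;                         /* magic == 0xC9 && cmd != GET */
}

/* Hand-picked input satisfying the constraints of the requested path. */
packet_t witness_for(path_id p) {
    packet_t pkt = { 0xC9, GET };                  /* satisfies PATH_GET */
    if (p == PATH_BAD_MAGIC) pkt.magic = 0x00;     /* violates the magic check */
    if (p == PATH_OTHER_CMD) pkt.cmd = GET + 1;    /* any command other than GET */
    return pkt;
}
```

Each witness is a test case that drives execution down exactly one path, which is how path constraints turn into high-coverage tests.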
$\text{paths} \sim 2^{\text{program size}}$
CPU Bottleneck
Parallel Tree Exploration
[Figure: execution tree explored in parallel by workers W1, W2, and W3]
Key research problem:
Linear Solution to Exponential Problem
[Figure: time to test vs. program size, with curves for 1, 2, 4, and 8 workers and a horizontal testing-target line]
Bring testing time down to practical values
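One way to read this slide, as a back-of-the-envelope sketch rather than a result from the talk: assume the single-worker testing time grows exponentially with program size $n$, and that $W$ workers give ideal linear speedup. Both assumptions, and the symbols $T_1$, $T_W$, $c$, and $T^{*}$ (the testing target), are introduced here for illustration only.

```latex
\[
T_1(n) \approx c \cdot 2^{n},
\qquad
T_W(n) \approx \frac{T_1(n)}{W},
\qquad
T_W(n) \le T^{*} \iff W \ge \frac{T_1(n)}{T^{*}}
\]
```

Under these assumptions, doubling the number of workers halves the testing time, so a fixed target is reached by adding workers rather than by waiting longer.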
Scalability Challenges
Tree structure not known a priori
Outline
• Scalable Parallel Symbolic Execution
• POSIX Environment Model
Cloud9 Architecture
Global
W1’s Local Tree
W2’s Local Tree
W3’s Local Tree
Each worker runs a local
Candidate nodes
Fence nodes
• Candidate nodes are selected for exploration
Load Balancing
[Figure: load balancer (LB) connected to workers W1, W2, and W3]
Work Transfer
[Figure: work transfer between workers W1 and W2; node labels: Virtual, Materialized]
Path-based Encoding
• Nodes are encoded as paths in the tree
• Compact binary representation
• Two paths can share a common prefix
• Small encoding size
• For a tree of $2^{100}$ leaves, a path fits in
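The encoding described on this slide can be sketched in a few lines of C: one bit per branch decision from the root, packed into bytes. This is an illustration under assumptions, not Cloud9's actual format; the type tree_path, the 128-bit capacity, the MSB-first packing, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATH_MAX_BITS 128  /* arbitrary capacity chosen for this sketch */

typedef struct {
    uint8_t bits[PATH_MAX_BITS / 8]; /* packed branch decisions, MSB first */
    size_t  len;                     /* number of decisions from the root */
} tree_path;

void path_init(tree_path *p) { memset(p, 0, sizeof *p); }

/* Append one branch decision: 0 = first child, 1 = second child. */
void path_push(tree_path *p, int bit) {
    assert(p->len < PATH_MAX_BITS);
    if (bit) p->bits[p->len / 8] |= (uint8_t)(0x80u >> (p->len % 8));
    p->len++;
}

/* Read back the i-th decision along the path. */
int path_get(const tree_path *p, size_t i) {
    assert(i < p->len);
    return (p->bits[i / 8] >> (7 - i % 8)) & 1;
}

/* Length of the shared prefix, i.e. the depth of the deepest common ancestor. */
size_t path_common_prefix(const tree_path *a, const tree_path *b) {
    size_t n = a->len < b->len ? a->len : b->len;
    size_t i = 0;
    while (i < n && path_get(a, i) == path_get(b, i)) i++;
    return i;
}

/* Bytes needed to send this path to another worker. */
size_t path_size_bytes(const tree_path *p) { return (p->len + 7) / 8; }
```

With one bit per decision, a path of depth d occupies d bits, so the encoding grows with the depth of the tree rather than with its number of nodes.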
Load Balancing in Practice
[Figure: work done (% of total instructions, 0 to 100) vs. time (0 to 10 minutes) for three configurations: LB stops after 1 min, LB stops after 4 min, and continuous load balancing]
Calls into the Environment
    if (fork() == 0) {
      ...
      if ((res = recv(sock, buff, size, 0)) > 0) {
        pthread_mutex_lock(&mutex);
        memcpy(gBuff, buff, res);
        pthread_mutex_unlock(&mutex);
      }
      ...
    } else {
      ...
      pid_t pid = wait(&stat);
      ...
    }
Environment Model
[Figure: the Program Under Test calls fork() into the Environment (C Library / OS); with an environment model, the call is served by Model Code that runs on the Symbolic Execution Engine]
Equivalent functionality
Executable symbolically
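The two requirements on model code, equivalent functionality and symbolic executability, can be illustrated with a small sketch: a hypothetical in-memory pipe written in plain C with no system calls, so that an engine could interpret every instruction. The names model_pipe, MODEL_PIPE_CAP, and the three functions are assumptions for illustration and are not taken from Cloud9's model. Unlike a real pipe, this sketch returns a short count instead of blocking when the buffer is full or empty.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PIPE_CAP 64  /* hypothetical fixed capacity */

typedef struct {
    unsigned char data[MODEL_PIPE_CAP]; /* circular byte buffer */
    size_t start;                       /* index of the oldest unread byte */
    size_t count;                       /* number of unread bytes */
} model_pipe;

void model_pipe_init(model_pipe *p) { p->start = 0; p->count = 0; }

/* Stands in for write(): accepts up to the free space, returns bytes accepted. */
size_t model_pipe_write(model_pipe *p, const void *buf, size_t n) {
    assert(p != NULL && buf != NULL);
    const unsigned char *src = buf;
    size_t room = MODEL_PIPE_CAP - p->count;
    if (n > room) n = room;
    for (size_t i = 0; i < n; i++)
        p->data[(p->start + p->count + i) % MODEL_PIPE_CAP] = src[i];
    p->count += n;
    return n;
}

/* Stands in for read(): delivers up to the buffered amount, returns bytes read. */
size_t model_pipe_read(model_pipe *p, void *buf, size_t n) {
    assert(p != NULL && buf != NULL);
    unsigned char *dst = buf;
    if (n > p->count) n = p->count;
    for (size_t i = 0; i < n; i++)
        dst[i] = p->data[(p->start + i) % MODEL_PIPE_CAP];
    p->start = (p->start + n) % MODEL_PIPE_CAP;
    p->count -= n;
    return n;
}
```

Because the model is ordinary code over ordinary memory, the bytes flowing through it may be symbolic without any special handling in the model itself.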
Starting Point
[Figure: Symbolic Execution Engine with POSIX support for Files and Network Stubs; target labels: single-threaded utilities, single-threaded isolated nodes]
POSIX Environment Model
[Figure: Symbolic Execution Engine and POSIX model]
• Modeled components: Files, Network (TCP/UDP/UNIX), Pipes, Threads (pthread_*), Processes, Signals
• Target labels: single-threaded utilities, multi-threaded programs, servers and clients, distributed systems, message passing, asynchronous events, IPC
Key Changes in Symbolic Execution
Multithreading and Scheduling
• Deterministic or symbolic scheduling
• Non-preemptive execution model
Address Space Isolation
• Copy on Write (CoW) between processes
Symbolic Engine System Calls
• Symbolic engine support needed for threads/processes
1. Thread/process lifecycle: thread_create, thread_terminate, process_fork, process_terminate, get_context
2. Synchronization: thread_preempt, thread_sleep, thread_notify, get_wait_list
3. Shared memory: make_shared
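The non-preemptive scheduling model and the sleep/notify style of synchronization can be pictured with a small stand-alone simulation of the bookkeeping involved. This is a sketch under assumptions, not the engine's implementation: the sched type, the wait-list identifiers, the round-robin policy, and every function name are hypothetical, and only the deterministic variant of scheduling is shown.

```c
#include <assert.h>

#define MAX_THREADS 8
#define NO_WLIST 0  /* reserved id meaning "not on any wait list" */

typedef enum { T_UNUSED, T_RUNNABLE, T_SLEEPING } tstate;

typedef struct {
    tstate state[MAX_THREADS];
    int    wlist[MAX_THREADS]; /* wait list a sleeping thread is parked on */
    int    current;            /* last scheduled thread, -1 before the first pick */
} sched;

void sched_init(sched *s) {
    for (int i = 0; i < MAX_THREADS; i++) {
        s->state[i] = T_UNUSED;
        s->wlist[i] = NO_WLIST;
    }
    s->current = -1;
}

/* Lifecycle: register a new runnable thread; returns its id, or -1 if full. */
int sched_create(sched *s) {
    for (int i = 0; i < MAX_THREADS; i++)
        if (s->state[i] == T_UNUSED) { s->state[i] = T_RUNNABLE; return i; }
    return -1;
}

/* Synchronization: park a runnable thread on a wait list. */
void sched_sleep(sched *s, int tid, int wlist) {
    assert(tid >= 0 && tid < MAX_THREADS && wlist != NO_WLIST);
    assert(s->state[tid] == T_RUNNABLE);
    s->state[tid] = T_SLEEPING;
    s->wlist[tid] = wlist;
}

/* Synchronization: wake one thread (lowest id) or all threads on a wait list. */
int sched_notify(sched *s, int wlist, int all) {
    int woken = 0;
    for (int i = 0; i < MAX_THREADS; i++) {
        if (s->state[i] == T_SLEEPING && s->wlist[i] == wlist) {
            s->state[i] = T_RUNNABLE;
            s->wlist[i] = NO_WLIST;
            woken++;
            if (!all) break;
        }
    }
    return woken;
}

/* Deterministic round-robin pick at a preemption point; -1 if nothing can run. */
int sched_pick_next(sched *s) {
    for (int k = 1; k <= MAX_THREADS; k++) {
        int i = (s->current + k) % MAX_THREADS;
        if (s->state[i] == T_RUNNABLE) { s->current = i; return i; }
    }
    return -1;
}
```

Since threads switch only at explicit points, the schedule is fully determined by the order of sleep, notify, and pick calls. A pick that returns -1 while threads are still sleeping corresponds to a deadlock in the simulated program.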
Testing Real-World Software
Memcached
GNU Coreutils
Apache
Time to Reach Target Coverage
[Figure: time to achieve target coverage (0 to 60 minutes) for printf with 1, 4, 8, 24, and 48 workers; one series each for 60%, 70%, 80%, and 90% coverage]
Faster time-to-cover, higher coverage values
Increase in Code Coverage
[Figure: additional code covered (% of program LOC, 0 to 50) per tested Coreutil, indexed and sorted by additional coverage; Coreutils suite, 12 workers, 10 min.]
Exhaustive Exploration
[Figure: scalability of exhaustive path exploration: time to complete exhaustive test (0 to 6 hours) vs. number of workers (2, 4, 6, 12, 24, 48)]
Instruction Throughput
[Figure: useful work done (number of instructions, 0 to 1.8e+10) vs. number of workers (1, 4, 6, 12, 24, 48) for memcached, measured after 4, 6, 8, and 10 minutes]
Experimental Setup
Execute the “whole world” symbolically
[Figure: client process and server (memcached/Apache/lighttpd) connected by a TCP stream carrying symbolic commands and server responses, all within one symbolic state]
Symbolic Test Cases
• Easy-to-use API for developers to write symbolic test cases
• Basic symbolic memory support
• POSIX extensions for environment control
• Network conditions, fault injection, symbolic
Symbolic Test Cases
Testing HTTP header extension
    make_symbolic(hdrData);
    // Append symbolic header to request
    strcat(req, "X-NewExtension: ");
    strcat(req, hdrData);
    // Enable fault injection on socket
    ioctl(ssock, SIO_FAULT_INJ, RD | WR);
// Symbolic stream fragmentation