ABSTRACT
MULUKUTLA, VIKRAM. Wolfsting: Extending Online Dynamic Malware Analysis Systems by Engaging Malware. (Under the direction of Dr. Douglas S. Reeves).
Wolfsting: Extending Online Dynamic Malware Analysis Systems by Engaging Malware
by
Vikram Mulukutla
A thesis submitted to the Graduate Faculty of North Carolina State University
in partial fulfillment of the requirements for the Degree of
Master of Science
Computer Science
Raleigh, North Carolina
2010
APPROVED BY:
Dr. Douglas S. Reeves, Chair of Advisory Committee
Dedication
Biography
Vikram Mulukutla did most of his schooling in Bangalore, a city in the southern state of Karnataka
in India. He received his Bachelor of Engineering in Computer Science degree from the BMS College of
Engineering at Bangalore. He then worked for one year as an associate software engineer with IBM’s
India Software Lab, before arriving at the North Carolina State University, where he is currently pursuing
a Master of Science degree in Computer Science. His interests lie in low level programming, security and
Acknowledgements
I thank Dr. Douglas S. Reeves, my advisor, for his inspiring thoughts and ideas, unwavering guidance,
and most of all for the creative freedom he granted in allowing me to come up with an idea and pursue it
to completion. I thank Dr. Peng Ning and Dr. Xuxian Jiang for agreeing to be on my thesis committee. I
thank all the members of the Cyber Defense Lab at NCSU, especially Young Hee Park, whose mentorship
and guidance have proven invaluable to me. I thank the Department of Computer Science and NCSU for
creating and maintaining such a beautiful and professional environment that is conducive to work and
research.
I thank my friends for their support and the good times we’ve had throughout this process. I
thank Priyank Kumar, who is also pursuing the Master’s thesis option here at NCSU, for all the lively
discussions during coffee and lunch breaks during the time when we were writing our thesis drafts. I
thank all those who have been involved with the creation and maintenance of LyX, the LaTeX front end
that made writing this thesis a breeze.
I sincerely acknowledge the work of giants whose shoulders this thesis stands upon.
I especially thank my family for their support and commitment to my education all the way to my
Table Of Contents
List Of Tables
List Of Figures
Chapter 1 Introduction
1.1 Malware and Malware Defense
1.2 Online Dynamic Malware Analysis Systems
1.3 Definitions
1.3.1 Malware
1.3.2 Malware Behavior
1.3.3 Honeypots
1.3.4 Malware Analysis
1.3.4.1 Static Analysis
1.3.4.2 Dynamic Analysis
1.3.5 Online Dynamic Malware Analysis Systems
1.3.6 Virtualization terminology
1.4 Thesis Organization
Chapter 2 Background And Related Work
2.1 Overview
2.2 Recent and related work in static analysis
2.3 Recent and related work in dynamic analysis
2.3.1 Categories of dynamic analysis
2.3.1.1 Monitoring un-tampered malware execution
2.3.1.2 Tampering with malware execution to force malicious behavior
2.3.2 Using the output of dynamic analysis
2.4 Online Dynamic Malware Analysis Systems and Wolfsting
Chapter 3 Wolfsting Motivation and Methodology
3.1 Overview
3.2 Observations
3.2.1 Malware looking for certain resources
3.2.2 Malware exploiting outdated or very recent software
3.2.3 Malware persisting on the user’s machine
3.2.4 Focus on certain system calls
3.2.5 Advantages of multiple executions
3.2.5.1 Randomized identifier strings
3.2.5.2 Process injection behavior
3.3 Engaging Malware
3.3.1 Simulating a user attempting to remove the malware
3.3.2 Creating fake resources and settings
Chapter 4 Wolfsting Design
4.1 Virtual Environment
4.2 Guest OS Components
4.3 Analysis Engine
4.3.1 New Behavior Extractor
4.3.2 Housekeeping/Future-Success System Call Filtering Component
4.4 Summary
Chapter 5 Wolfsting Execution
5.1 Wolfsting Processing
5.2 Two Runs
Chapter 6 Wolfsting Implementation
6.1 Virtualized Environment
6.2 Guest OS Components
6.2.1 Kernel device driver
6.2.2 Userland program to simulate user actions
6.3 Analysis Engine
6.3.1 New Behavior Extractor
6.3.2 Filtering Component
6.3.3 Resource Creator
Chapter 7 Results
7.1 Overview
7.2 Trojan Dropper
7.2.1 Analysis by Resource Creation
7.2.2 Simulating user actions
7.3 AVKiller (Agent2)
7.3.1 Analysis by Resource Creation
7.4 Vundo
7.4.1 Analysis by Resource Creation
7.4.2 Random Identifiers
7.5 TDSS
7.6 Zeus/ZBot
7.6.2 Random Identifiers
7.7 Observations and Summary
7.7.1 Overhead analysis
7.7.2 Trace output
7.7.3 Registry activity bias
7.7.4 Summary
Chapter 8 Limitations
8.1 Limitations Of Dynamic Analysis
8.2 Limitations Of Wolfsting
8.3 Techniques to thwart Wolfsting
Chapter 9 Conclusions And Future Work
9.1 Conclusions
9.1.1 Observations
9.1.2 Summary
9.2 Future Work
9.2.1 Scope for future work
9.2.2 Usefulness in Corporate Environments
9.2.3 Usefulness to Individual or Home users
List of Tables
Table 7.1 Windows registry keys requested by ZBot
Table 7.2 New behavior, i.e., new registry values (boldface) accessed by ZBot after
List of Figures
Figure 1.1 A sample CWSandbox Analysis Report
Figure 1.2 A sample Anubis Report
Figure 1.3 Registry changes recorded in CWSandbox
Figure 1.4 File changes recorded in Anubis
Figure 1.5 Host, Virtual Machine, and the Guest OS
Figure 3.1 CWSandbox output displaying queries made by malware for information on certain folders
Figure 4.1 Design overview of Wolfsting
Figure 4.2 ZBot injects code into different processes on every execution, thus system calls from two different threads may be identical otherwise
Figure 4.3 Certain parameters of the NtCreateFile Windows API are used as part of a key to differentiate between two NtCreateFile calls
Figure 5.1 Wolfsting Processing
Figure 5.2 A Wolfsting VM guest with the malware binary, userland component and kernel driver loaded into memory
Figure 5.3 Two or more system calls with the same purpose
Figure 7.3 XML Output from Wolfsting shows how Trojan-Dropper.Win32.VB modifies a registry key to prevent Task Manager from executing
Figure 7.2 Trojan-Dropper.Win32.VB prevents task manager from executing by modifying a registry key that sets a dummy debugger process for task manager. The dummy debugger process does absolutely nothing; task manager is never launched.
Figure 7.4 Security descriptor of the Windows folder, displaying permissions for users in the system
Figure 7.5 Wolfsting XML output displaying some of the folders queried (in the <filename> tags) by the Trojan.Win32.Agent2 malware
Figure 7.6 Wolfsting XML output shows Agent2 setting bad security descriptors for antivirus folders, making them inaccessible. The exact security descriptor has not been shown for brevity.
Figure 7.7 Security descriptor of an antivirus (AVG) folder nullified after modification by the malware instance
Figure 7.8 Registry key modified by Vundo to disable KAVP’s update service (Value of zero disables updates)
Figure 7.9 Wolfsting output displaying modifications made to Internet Explorer settings, specifically the way IE connects to the network and the Internet
Figure 7.10 Random identifier string (boldface) used by Vundo - the string changes during every execution
Figure 7.11 Wolfsting analysis approach for TDSS
Figure 7.12 Trace output (filtered to show only relevant activity) when running TDSS
Figure 7.13 ZBot Infection Rate in terms of number of reported infections worldwide
Chapter 1
Introduction
1.1 Malware and Malware Defense
Figure 1.1: A sample CWSandbox Analysis Report
1.2 Online Dynamic Malware Analysis Systems
The purpose of online dynamic malware analysis systems is primarily to assist human malware analysts in reverse engineering malware binaries. They consist of closely monitored, virtualized or emulated environments in which user submitted binaries are executed for a fixed amount of time. These systems trace the execution of the binaries in their instrumented environments. The output of these systems consists of human and/or machine readable reports that include comprehensive information on all traceable actions performed by the malware, from low level native API calls to high level network activity. More recent tools have also begun to represent information graphically to further aid analysts. Two well-known examples of such tools are CWSandbox [1] and Anubis [2]. Figs. 1.1 and 1.2 show sample output traces from these two systems respectively.
Figure 1.2: A sample Anubis Report
While the information provided in these reports is invaluable, the analysis systems themselves are vanilla machines that usually have bare-bones installations of operating systems. The motivation of this work is that more useful work can be done within the isolated analysis system, providing tangible benefits to malware analysts and end users.
1.3 Definitions
1.3.1 Malware
This thesis defines malware as any program that, without the user’s knowledge or permission, executes on the user’s machine with the following intent:
1. Stealing information: With the rapid expansion of personal computing and the Internet, users frequently store their personal data on their machines. This includes usernames, passwords, authentication tokens, financial and credit information. All such data is targeted by malware; each new malware family is being increasingly targeted to a particular type of data.
some destructive variants are designed to disable or completely destroy the functionality of a system.
3. Using the system’s resources without authorization: Malware may also use a computer’s resources for its own purposes without the knowledge of the user. A large network of such compromised machines is called a botnet. Botnets are now listed as the top threat to security on the Internet. Hundreds to thousands of machines are forcefully taken over by malware and then slaved to either a peer to peer or centralized command network.
1.3.2 Malware Behavior
All programs interact with the operating system (OS) using the Application Programming Interface (API) provided by the OS. We define malware behavior as the unique set of system calls made by malware processes and threads on a system. These calls represent the interaction of the malware with the system and user environment. Malicious intent, as described in this thesis, is hence the set of system calls that enable malware processes to extract information without permission, to persist on the user’s system, to attempt to infect the network, to download other malware, or to use, modify or delete resources and settings without permission.
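As a rough illustration of this definition, behavior can be modeled as the set of distinct system calls seen in a trace. The record layout (a call name plus the resource it targets) and the sample trace below are assumptions made for this sketch; they are not Wolfsting’s actual trace format.

```python
from typing import NamedTuple


class SysCall(NamedTuple):
    # Hypothetical record layout: the call name and the resource it acts on.
    name: str
    target: str


def behavior(trace: list[SysCall]) -> frozenset[SysCall]:
    """Collapse a raw trace into the unique set of system calls it contains."""
    return frozenset(trace)


# Made-up trace: the second NtOpenKey repeats the first and adds no new behavior.
trace = [
    SysCall("NtOpenKey", r"HKLM\Software\Example"),
    SysCall("NtCreateFile", r"C:\example.dll"),
    SysCall("NtOpenKey", r"HKLM\Software\Example"),
]
unique_calls = behavior(trace)
```

Treating behavior as a set rather than a sequence makes two traces comparable with ordinary set operations, which is the kind of comparison the later chapters rely on.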
1.3.3 Honeypots
1.3.4 Malware Analysis
1.3.4.1 Static Analysis
Static analysis involves analyzing malware binaries without executing them, using binary disassembly or some other technique to reveal the intent of the malware binary code. Mathematical and statistical tools may be used to glean some pattern out of binary code to help in malware classification or detection. Recent work using these methods is detailed in section 2.2. The implementation of static analysis systems may require the knowledge of the following:
1. Underlying Hardware: Object code - when disassembled - is a set of processor instructions and data. For any static analysis method, comprehensive knowledge of the instruction set of the processor that the malware binary has been compiled for is required.
2. Operating System Executable Format: Comprehensive knowledge of the executable format, including storage layouts and data structures, as well as precise knowledge of how the loader component in the OS loads the binary image into memory, is essential to static analysis systems.
Disassembly of object code is inherent to most static analysis implementations. This presents a challenging problem, namely that of binary packing and encryption. Binary packing, as defined in [14], is the compression and/or encryption of executable code in malware binaries using various algorithms. These algorithms, which are called packers when implemented, may also include anti-debugging and anti-virtualization techniques that prevent easy analysis. Unpacking is the reverse process, wherein the compressed executable code is restored to its original form, as defined in [13]. A lot of research has been done to discover a generic method to unpack binaries that are packed with any algorithm, as discussed in section 2.2. Static analysis may involve the use of statistical algorithms to define parameters and indicators whose measurement is done to assess the system.
1.3.4.2 Dynamic Analysis
are instrumented to provide detailed reports or traces of the binary’s operation. Data tainting may also be used to track data as it passes between function calls, over the network, etc. There are several types of dynamic analysis methods; some categories are listed below:
1. Un-tampered execution tracing: These methods involve running malware instances and monitoring their execution, generating a trace report that details every operation performed by the malware instance. The malware program’s execution path is not tampered with or modified in any way once the binary instance begins execution. However, the environment in which the malware is run (for example, network settings) may be varied, as discussed in section 2.3.1.1.
2. Extracting or forcing malicious behavior: These methods involve forcing malware instances to execute more of their binary code than they would in a normal execution environment. This involves modifying the execution path of malware instances; several approaches have been defined and are discussed in section 2.3.1.2. For example, this may be accomplished by executing every branch path in a control flow graph of the malware’s execution.
The output of dynamic analysis, i.e. the traces generated by the above approaches may then be converted to an intermediate format and stored in databases that can be used to classify malware or detect new malware. These systems may be categorized as follows:
1. Clustering: The output of dynamic analysis approaches as mentioned above is fed to clustering algorithms that create clusters according to certain parameters. These clusters may then be used to detect and/or classify new malware.
2. Graph Matching: The traces output by dynamic analysis approaches above are used to construct call flow graphs (CFGs) that are subsequently used to detect new malware using graph comparison algorithms.
Figure 1.3: Registry changes recorded in CWSandbox
to generate signatures for host based or network based intrusion detection, as discussed in section 2.3.1.1.
A more detailed discussion of examples of these systems is presented in section 2.3.2. It is the aim of this thesis to provide better traces to improve the accuracy of these systems.
1.3.5 Online Dynamic Malware Analysis Systems
Figure 1.4: File changes recorded in Anubis
1.3.6 Virtualization terminology
1. Host: The underlying physical hardware and operating system on which the virtualization software runs one or more virtualized machines.
2. Guest OS: The operating system installed in the virtual machines.
3. Snapshot: A virtual machine snapshot is an instantaneous image of the guest OS memory and the state of the virtual hardware that is saved on the host machine. The snapshot can be reloaded onto the virtual machine to resume execution from when it was taken.
[Diagram labels: Host Operating System; three Virtual Machines, each containing a Guest Operating System; Hardware]
Figure 1.5: Host, Virtual Machine, and the Guest OS
1.4 Thesis Organization
Chapter 2
Background And Related Work
2.1 Overview
The field of malware analysis and defense has seen exponential growth in this decade. Advances in the area of static and dynamic analysis have resulted in better detection and classification of malware, as well as making reverse engineering malware easier for security professionals. This chapter will explore the current state of the art in the field and the relevance of recent work to this thesis. We first delve into static and dynamic analysis, show how they have led to the evolution of online dynamic malware analysis systems and finally bring Wolfsting into the context, providing a basis for the next chapter, which details the motivation for this work.
2.2 Recent and related work in static analysis
Most static analysis methods today analyze malware binaries directly, due to the unavailability of malware source code. Christodorescu et al. [23] discussed a code-obfuscation resilient architecture to detect malware - the detection mechanism takes as its input a generic automaton model of malicious code together with an annotated call flow graph of an executable program and determines if the same malicious patterns exist within the automaton and the CFG. In a subsequent work, Christodorescu et al. [24] use static analysis to create semantic behavior models that are immune to code obfuscation techniques; these models are then used for detection. In her PhD thesis, Zhang [36] uses static analysis to create behavioral patterns out of instruction sequences in malware binaries and then performs clustering to provide detection capabilities.
This research has also led malware authors to use increasingly complex techniques to counter the aforementioned methods by employing techniques listed in section 2.2.1. Static analysis is an important tool simply because it does not require the binary to be executed at any point; thus the analysis system is completely trustworthy.
Static analysis has also been used in non-malware research. With regard to fingerprinting, for example, Brumley et al. [46] translate binary code to symbolic equations, which are then solved to find deviations between implementations of the same protocol.
2.2.1 Techniques to counter static analysis - and countering methods
Berg [50] detect malicious code inside network flows by tracing network traffic looking for typical patterns such as NOP sleds; polymorphic code is detected by looking for evidence of cycles, registers initialized outside and referenced inside loops etc. Li et al. [51] use a heuristic approach to generate signatures for polymorphic worms by analyzing invariant content in the worm binaries. [24], mentioned previously, describes a generic template to capture the decrypting loop of a polymorphic malware instance. C. Kruegel et al. [52] use structural information in executables to detect polymorphic worms; they construct CFGs from executables and use graph isomorphism concepts to match these with CFGs constructed from binary traffic in network flows.
2. Metamorphic code: As defined in [48], these techniques involve a malware program rewriting itself - and its payload if present - by inserting NOP instructions, swapping registers, reordering control flow with indirect jumps and other such tweaks; thus each time a malware program copies itself to another location, it will look different both as a binary and in memory during runtime, preventing signature based detection and hampering other forms of static analysis. Chouchane and Lakhotia [53] use code scoring to identify the metamorphic engine - i.e., the code that produces metamorphic variants of the same binary code; their method works with rule based engines. Zhang and Reeves [35] describe MetaAware, a system that identifies metamorphic malware by using an algorithm that quantitatively compares the similarity of instruction sequences that lead to a system call.
Fascinatingly, a lot of research effort has gone into using dynamic analysis to accurately disassemble binaries that employ the above methods. Kang et al. [13] describe Renovo, an unpacking system that monitors real time malware execution to determine when a binary has completed its unpacking routine. Martignoni et al. [14] describe OmniUnpack, a similar system. Dinaburg et al. [44] describe Ether, an analysis system completely external to the guest OS in which the malware executes and that is capable of unpacking binaries.
discussed in section 2.3.
2.3 Recent and related work in dynamic analysis
2.3.1 Categories of dynamic analysis
Dynamic analysis has been defined in section 1.3.4.2. Tremendous work has gone into this area of malware analysis, driven by the limits of static analysis, the advent of virtualization technology and the easy availability of cheap commodity hardware. We first discuss two categories of malware analysis as defined in section 1.3.4.2.
2.3.1.1 Monitoring un-tampered malware execution
In this section we discuss methods that do not directly tamper with or modify the execution path of the malware instance once such execution has begun. CWSandbox and Anubis fall into this category as they run malware binaries in monitored virtualized and emulated environments respectively and do not tamper with the malware execution.
Rieck et al. [16] execute botnet binaries repeatedly with varying network and host settings, allowing the binaries to phone home, i.e., contact the command server specified in their configuration. Signatures for such communication are then generated and used for intrusion detection at the network level. Repetitive execution is done with the reasoning that a large spectrum of the kinds of bot traffic will be captured. The fundamental idea is that the host execution environment is modified to create new variations in the network traffic environment. This thesis emphasizes a similar concept in terms of how only the environment in which the malware binary is executed is modified; Wolfsting tries to modify the system to render to the malware process those resources and settings it is looking for, so that additional malicious behavior can be traced directly on the host.
as the first connection made by worms to the Internet, by tracking user actions such as mouse and keyboard input and matching them to network connections. Thus the system attempts to infer user intent to detect mismatching process behavior; Wolfsting, on the other hand, tries to infer malware intent by simulating user actions and presenting the malware with the environment it is looking for.
[16, 20] are examples of work that force malicious activity without modifying the execution of the malware instance, without brute force disassembly, static trigger-discovery or branch exploration - techniques that may lead to possibly impractical overheads. While the aim of these papers was to create better network intrusion detection signatures and provide advanced malware detection capabilities at the end host, the same concept of forcing malicious activity only by modifying the environment in which it executes, while retaining practical and autonomous aspects, is used by Wolfsting to augment online dynamic analysis systems, as discussed in section 1.2.
2.3.1.2 Tampering with malware execution to force malicious behavior
This section discusses methods that directly tamper with or modify the execution path (in memory) of a malware program once it has begun execution. Extensive research has gone into automatically executing all possible code paths in a malware process. An oft cited work by Moser et al. [11] involves creating snapshots of malware processes when they encounter, for example, a branch instruction. To explore both branches, the process snapshot is restored to the original branch point after executing a branch. This allows for a comprehensive execution of the malware binary code, with gains of up to 3000% in terms of code base coverage.
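The snapshot-and-restore idea can be sketched on a toy control flow graph. This is only an illustration of the general technique, not the implementation described in [11]: the block names, the `PROGRAM` table and the use of `copy.deepcopy` as a stand-in for a process snapshot are all assumptions made for this example.

```python
import copy

# Toy control flow: each block lists the blocks reached when its branch is
# taken and not taken; None marks a block with no further branch.
PROGRAM = {
    "entry": ("check_av", "exit"),
    "check_av": ("disable_av", "exit"),
    "disable_av": None,
    "exit": None,
}


def explore(block, state, covered):
    """Visit a block, then explore both branch outcomes from a saved snapshot."""
    covered.add(block)
    state["path"].append(block)
    successors = PROGRAM[block]
    if successors is None:
        return
    for nxt in successors:
        # Save the process state at the branch point and run one outcome on
        # the saved copy; `state` itself stays untouched, which plays the
        # role of restoring the snapshot before the other outcome is tried.
        snapshot = copy.deepcopy(state)
        explore(nxt, snapshot, covered)


covered = set()
root_state = {"path": []}
explore("entry", root_state, covered)
```

Every block is reached even though a single natural execution would follow only one path, and each branch point costs a snapshot, which is where the overhead of this family of approaches comes from.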
with the overhead experienced by implementations of these approaches is greatly attractive, and complete code coverage is the aim of many dynamic analysis systems; this thesis is offered as a cheaper, more practical means of offering more insight into malware activity.
The above approaches exhibit a high overhead in terms of both time and processing. [11] uses up to 20 seconds as the timeout for a single branch of execution; there are several hundred such branches in a single malware executable. [10] reports an analysis time of 28 minutes for a single binary of the MyDoom virus. Wolfsting executes binaries with a set time limit of 2 minutes of execution time (with a little extra for overhead as explained in section 7.7.1); note, however, that these systems provide far greater code coverage than Wolfsting can possibly provide. Also, [11] uses virtual machine snapshotting to restore the process to its original state at every branch after subsequent paths have been explored; Wolfsting needs to restore the guest snapshot only once. Therefore, Wolfsting offers a more practical means of eliciting malware behavior while compromising on code coverage.
2.3.2 Using the output of dynamic analysis
We now provide a discussion on the research effort put into using the output of the dynamic analysis approaches in section 2.3.1. Kolbitsch et al. extend their work [15] on Anubis to construct behavior graphs out of malware execution traces output by the emulation environment of Anubis. Their work uses data taint analysis to track arguments between system calls, and graph comparison algorithms that match behavior graphs to detect and classify malware. In a similar work [12] by Hu et al., the authors construct function call graphs and develop a database that uses graph matching algorithms - with optimizations such as pruning - to detect malware. This thesis may contribute to these systems by providing alternative execution paths that lead to more unique call flow graphs or function call graphs that would help increase the accuracy of malware classification as well as reduce false positive rates.
clusters that are later used to detect and classify malware. In a technical report that extends their work on CWSandbox, K. Rieck et al. [19] transform the trace output by CWSandbox to an intermediate instruction set. The set of instructions for a single malware binary is termed a behavioral profile, and clustering is performed on behavioral profiles for large samples of malware binaries to generate clusters that can be used for malware detection and classification. Wolfsting may contribute to such a system by exposing more malicious behavior in the trace output, increasing the accuracy of classification as well as reducing false positives.
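To make the idea of clustering behavioral profiles concrete, the sketch below groups profiles by Jaccard similarity with a greedy single pass. Both the similarity measure and the grouping rule are stand-ins chosen for brevity, not the algorithm used in [19], and the sample profiles and threshold are invented for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Share of instructions common to both profiles (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(profiles: dict, threshold: float) -> list:
    """Greedy single pass: join the first cluster whose first member is similar enough."""
    clusters = []
    for name, profile in profiles.items():
        for members in clusters:
            if jaccard(profile, profiles[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([name])
    return clusters


# Hypothetical behavioral profiles: sets of intermediate instructions per sample.
profiles = {
    "sample1": {"open_key", "create_file", "connect"},
    "sample2": {"open_key", "create_file", "delete_file"},
    "sample3": {"load_driver", "hide_process"},
}
groups = cluster(profiles, threshold=0.5)
```

A trace that exposes more of a sample’s behavior enlarges its profile, which changes these similarity scores; that is the sense in which richer traces could affect the accuracy of such a system.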
Both graph matching and clustering are highly scalable and easy to implement; they do not focus on a single malware instance’s particular execution run. We now delve into how the above-mentioned systems may be made more accurate and efficient by improving the dynamic analysis approach that they depend on.
2.4 Online Dynamic Malware Analysis Systems and Wolfsting
This thesis is built upon the ideas and work specifically in the area of dynamic malware analyzers that operate in a virtualized or emulated environment, specifically providing a method to force more malicious behavior out of an executable. We have already introduced CWSandbox - a tool that is capable of running malware in an isolated environment, providing detailed traces and reports on the malware binary’s execution and Anubis - similar to CWSandbox except that binaries are run in an emulated environment built using QEmu [9]. CWSandbox inserts hooks into Windows user API runtime dynamic libraries and loads these libraries into virtual machines forming part of a cluster. User submitted binaries are run on these virtual machines in which the hooks trace a wide range of calls in the Windows User API. There is no interaction between the analysis system and the malware’s execution in Anubis and CWSandbox; this is logical, as the primary goal of these systems is to be completely automatic.
without any sort of modifications to the execution or environment, by setting up the resources, settings and other objects on a system that a malware instance is looking for.
Chapter 3
Wolfsting Motivation and Methodology
3.1 Overview
As discussed in Chapter 1, online dynamic malware analyzers have proven useful to both analysts and end users by providing a complete trace of a malware binary’s execution. These traces have also been successfully used as input to subsequent malware detection methods such as the clustering algorithm described in [19]. The motivation behind this work is to improve these systems by actively attempting to engage malware to extract malicious behavior from it. This will not only prove useful to analysts in terms of possessing more relevant information about the malware’s behavior, but will also provide better input to subsequent malware analysis mechanisms. This chapter will describe the motivation behind and the methodology used by Wolfsting to achieve this goal.
3.2 Observations
Figure 3.1: CWSandbox output displaying queries made by malware for information on certain folders
and Anubis using certain malware samples. Examples of these observations that led to the motivation for Wolfsting are:
3.2.1 Malware looking for certain resources
Malware authors write malicious software with specific payloads that target certain users, software and private information. This translates to malware processes looking for certain resources (files, registry keys, processes), which can be observed in the output of the online dynamic malware analyzers; subsequent malware behavior is dependent on these objects being present on the system. To bring out this additional behavior, the system calls that attempt to locate these resources need to succeed, to allow new system calls to be invoked. For example, Fig. 3.1 shows the output of CWSandbox for a malware sample that attempts to locate certain antivirus installation folders.
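The observation above suggests a simple first step: scan a trace for lookups that failed and treat their targets as candidates for creation before a second run. The sketch below assumes a hypothetical trace record (`TraceEntry`) with a success flag; the field names and sample paths are illustrative and are not taken from CWSandbox or Wolfsting output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    call: str        # e.g. "NtOpenFile"
    resource: str    # file path or registry key the call asked for
    succeeded: bool  # whether the lookup found the resource


def missing_resources(trace):
    """Resources whose lookups failed and never succeeded, in first-seen order."""
    found = {e.resource for e in trace if e.succeeded}
    missing = []
    for e in trace:
        if not e.succeeded and e.resource not in found and e.resource not in missing:
            missing.append(e.resource)
    return missing


# Made-up trace: two lookups for an antivirus folder and one for its registry
# key fail, while a lookup for a system folder succeeds.
trace = [
    TraceEntry("NtOpenFile", r"C:\Program Files\ExampleAV", False),
    TraceEntry("NtOpenKey", r"HKLM\Software\ExampleAV", False),
    TraceEntry("NtOpenFile", r"C:\WINDOWS\system32", True),
    TraceEntry("NtOpenFile", r"C:\Program Files\ExampleAV", False),
]
to_create = missing_resources(trace)
```

Creating the listed resources and executing the binary again would let the previously failing calls succeed, so that any system calls depending on them can be observed.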
3.2.2 Malware exploiting outdated or very recent software
malware family. This implies that malware binaries frequently look for older versions of software that contain exploitable vulnerabilities. This makes sense from the perspective of the malware author as well; exploitable systems are frequently ones that are not updated with the latest software or software updates.
On the other hand, malware may also exploit features in recent software or recent software updates. Analysis systems may not have this software or these updates installed. This can be observed in CWSandbox traces; for example, a trojan called Zeus looks for the phishing filter setting of Internet Explorer version 8, whereas most analysis systems will have Internet Explorer 6 installed since this version has the largest number of vulnerabilities.
3.2.3 Malware persisting on the user’s machine
The aim of malware is to steal information or use the resources on a victim machine for as long as it possibly can without being eliminated or, in some cases, even without being detected. To this end, malware must react to attempts made by the user or antivirus software on the user’s machine to remove the malware instance from the victim machine. Several malware binaries have payloads that are specifically designed to hide malware activities and to disable antivirus functionality and security components in browsers. This behavior can only be observed if the target software (or those specific software artifacts that the malware accesses) is present on the system.
3.2.4 Focus on certain system calls
processes etc., and read-queries on certain paths in the registry and certain locations on the file system. If an analysis system could be made to focus on such resource-interaction system calls, by filtering out other calls, unique behavior may be extracted. This was partly the motivation for creating the filtering component as discussed in section 4.3.2.
3.2.5 Advantages of multiple executions
CWSandbox, Anubis and other such systems execute malware once and generate a trace report. By submitting the same malware binary more than once, some behavior was observed that could not have been seen in one execution. This behavior is listed below:
3.2.5.1 Randomized identifier strings
Malware uses randomized identifier names to prevent users and antivirus software from locating the resources it creates. For example, the Vundo trojan uses a random 6 character name for a Dynamic-Link Library (DLL) that it creates. This name is different each time the Vundo instance executes. Wolfsting is capable of comparing two executions of a malware binary to determine which settings and resources that the malware creates have random names or identifiers, i.e., these settings and resources are usually identical in content between
different executions, but have different names or identifiers. This is accomplished by allowing the analysis engine to trace the difference in system calls between two identical unmodified malware executions. This information will benefit analysts and end users in tracing malware activity on their machines.
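The comparison of two unmodified executions described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout `(kind, name, content_digest)` and the function name `find_random_names` are hypothetical and are not taken from Wolfsting's trace format.

```python
from collections import defaultdict

def find_random_names(run_a, run_b):
    """Pair created resources that share type and content but differ in name.

    Each run is a list of (kind, name, content_digest) tuples. This layout is
    a hypothetical stand-in for a real trace, chosen for illustration only.
    """
    # Index the second run by (kind, digest) so lookups are constant time.
    names_b = defaultdict(set)
    for kind, name, digest in run_b:
        names_b[(kind, digest)].add(name)

    randomized = []
    for kind, name, digest in run_a:
        candidates = names_b.get((kind, digest), set())
        # The same name in both runs means the identifier is stable, not random.
        if candidates and name not in candidates:
            randomized.append((kind, name, sorted(candidates)))
    return randomized
```

A resource is reported only when its content recurs under a different name, which mirrors the observation that such resources are identical in content but differ in identifier.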
3.2.5.2 Process injection behavior
activity goes undetected. The traces of CWSandbox showed that some malware vary the distribution of such malicious code amongst processes in each run - in one run a bit of code may be injected into process A, while in another the same code is injected into another process B. An example of this behavior seen in the ZBot trojan is discussed in section 4.3.1.
3.3 Engaging Malware
Online malware analyzers such as CWSandbox and Anubis have proven useful to analysts by providing comprehensive behavioral reports detailing almost every system call executed by malware in their virtualized environments. The only variation in the traces across multiple runs is usually caused by malware using randomized strings for filenames, process names and other object names. Wolfsting is intended to further improve these systems by engaging malware during its execution. The term ‘Engagement’ as used here translates to the following operations:
• Allowing malware to execute more of its code base by invoking everyday applications simulating a user attempting to remove the malware.
• Forcing malware to execute more of its code base by creating resources that it attempts to locate but that are not usually found on bare bones analysis systems
Wolfsting also attempts to recognize identifiers with random names. This and the above operations are explained in more detail in the following subsections.
3.3.1 Simulating a user attempting to remove the malware
malicious behavior that Wolfsting aims to invoke by simulating the user’s or antivirus actions. The current implementation of Wolfsting - during the malware instance’s execution - simply launches programs that a user would launch to monitor the system or to attempt to remove malware processes, files and other objects from the system.
3.3.2 Creating fake resources and settings
Chapter 4
Wolfsting Design
The goal of Wolfsting’s design is to modularize each component to allow the use of multiple platforms and tools while implementing the system. To this end, this chapter describes each component of the Wolfsting environment separately, while Chapter 5 will illustrate how the components fit together to create a working system.
4.1 Virtual Environment
The Wolfsting environment consists of a set of virtual machines each of which has a virtual machine snapshot image that can be restored once a run is complete. Wolfsting uses one virtual machine to perform a baseline run in which the execution path of the malware instance is not modified in any way. Wolfsting may then be configured to run the malware instance in one or more virtual machines in which the malware execution may be modified as necessary. The virtual machines are controlled by a centralized component that performs the following operations.
1. Starts and stops the guest operating systems
to a snapshot image after a single run completes.
3. Installs the Wolfsting guest OS components into each virtual machine: The Wolfsting guest OS components including the kernel module, logger, configuration and userland component are loaded into the guest OS before malware is copied and executed.
4.2 Guest OS Components
The components of Wolfsting that operate in the guest OS running on the virtual machines are as follows.
1. Kernel Hook Module: This is the core module of Wolfsting consisting of system call hooks that are loaded into the guest OS kernel memory by the controller. The guest OS snapshot does not have this module pre-loaded, allowing an updated version to be loaded every time a new malware binary is to be processed. The Kernel hook module should be capable of tracking malware process injection and creation, to separate malware activity from benign process or system process activity.
2. Logger Component: This is a simple logger that also resides in kernel memory. It traps messages from the kernel hook module and writes them to a file that is copied out of the guest once the malware execution has completed. The logger component separately
records requests for resources or settings that do not exist during the baseline execution.
This is an important step, since it is these calls that will later result in resources and settings being created for the second run.
4.3 Analysis Engine
The analysis engine consists of two main components:
1. New behavior extractor: Various combinations of traces may be fed into the analyzer. For example, one trace may include output of a virtual machine with no modifications to the malware instance’s execution path, while another trace may be from a run where a certain set of resources were created beforehand. The new behavior extractor compares these traces and extracts new system calls that were generated as a result of modifying the malware’s execution path.
2. Housekeeping/Future-success system call filtering component: Not all resource requests or settings requests will lead to malicious behavior. Calls that are made with slightly differing arguments that fail at one stage and succeed later (see section 4.3.2), housekeeping calls and other non-malware specific calls should not be considered by the resource creation component. These calls must be filtered out to ensure the quality and uniqueness of the extracted new behavior.
The above components are illustrated in Fig. 4.1 and explained in greater detail in the following subsections.
4.3.1 New Behavior Extractor
[Figure 4.1 labels: VM (Unmodified Execution) and VM (Modified Execution), each running the Wolfsting driver and producing a Trace (XML); Analysis Engine containing the Future-Success/Housekeeping system call filter, the manually created filter list (system calls that succeeded at a later point or whose exclusive purpose is housekeeping), the New Behavior Extractor, the Resource/Setting Creator (create/simulate resources/settings) and the No-create List; Controller (vmrun scripts, userland programs).]
Figure 4.1: Design overview of Wolfsting
as a baseline. The algorithm for new behavior extraction is not trivial, due to the following factors:
1. Multi-threaded, divide-and-work strategies employed by modern malware. For example, ZBot distributes several tasks into several different threads, and each thread is executed in a random process’s address space. Thus Wolfsting must recognize that system calls made by different processes in different executions may actually indicate the same behavior. Fig. 4.2 illustrates the challenge of recognizing that two system calls are the same.
[Figure 4.2 labels: in one execution, the malware injects code into Explorer.exe, which makes the system call to set the autorun key (ZwRegistrySetValue); in another execution, the malware injects code into Svchost.exe, which makes the same system call.]
Figure 4.2: ZBot injects code into different processes on every execution, thus system calls from two different threads may be identical otherwise
particular malware instance.
3. Changes in environment: This is the most difficult factor in identifying behavior that is not new, i.e., additional system calls invoked due to a software environment or the creation of a fake resource or setting. While the memory image of the OS, the hard disk contents and even the system time information is exactly the same in two runs of the same malware, interaction between the malware instance and the network or the Web itself may cause some change in the behavior of the malware. Wolfsting cannot currently recognize the difference between new behavior caused by this factor and new behavior due to Wolfsting-caused modifications. This problem has not been observed in most of the experiments carried out with the Wolfsting prototype; malware samples have a far longer lifetime than the servers they connect to since malware supporting domains are taken down daily; and those that do connect to servers do not change their behavior between two executions that are spaced only a few minutes apart.
KEY_FILE_CREATE(ObjectAttributes->ObjectName, DesiredAccess, FileAttributes, ShareAccess, CreateDisposition, CreateOptions)
NtCreateFile(FileHandle, DesiredAccess, ObjectAttributes, FileAttributes, ShareAccess, CreateDisposition, CreateOptions);
Figure 4.3: Certain parameters of the NtCreateFile Windows API are used as part of a key to differentiate between two NtCreateFile calls
execution. For example, in the current implementation of Wolfsting, for the Windows NtCreateFile call, Wolfsting uses a combination of parameters as shown in Fig. 4.3 to create a unique key that allows Wolfsting to differentiate a system call between two isolated executions. Thus, to extract new behavior, the algorithm simply compares keys to see if any parameter is different, i.e., it identifies new system calls invoked during an execution when compared against a baseline, unmodified execution.
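A minimal sketch of such a key, assuming each traced call is available as a dictionary of parameter names and values. The dictionary layout, the function name `make_key`, the fallback to the object name for other calls, and the lower-casing of values are all assumptions made for illustration; only the choice of NtCreateFile parameters follows Fig. 4.3.

```python
# Parameters that contribute to the key, per system call. Only the
# NtCreateFile entry follows Fig. 4.3; the table layout itself is assumed.
KEY_FIELDS = {
    "NtCreateFile": ("ObjectName", "DesiredAccess", "FileAttributes",
                     "ShareAccess", "CreateDisposition", "CreateOptions"),
}

def make_key(call):
    """Build a comparison key for one traced system call.

    `call` is a dict with hypothetical field names. Process and thread ids
    are deliberately left out so the same action matches across executions.
    """
    # Assumed fallback: calls without an entry are keyed on the object name.
    fields = KEY_FIELDS.get(call["api"], ("ObjectName",))
    # Lower-casing is an assumption: Windows object names are typically
    # compared case-insensitively.
    return (call["api"],) + tuple(str(call.get(f, "")).lower() for f in fields)
```

Two calls that differ only in the issuing process produce the same key, while a change in any listed parameter produces a different one.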
4.3.2 Housekeeping/Future-Success System Call Filtering Component
The Analysis engine requires a filtering component that determines which resources and settings need to be created and which do not. While the initial motivation for this component was discussed in section 3.2.4, other factors necessitate its existence as follows:
ample, operating systems have configurable paths where files and settings may be located. A program may follow a search path order to locate a setting or resource. It is necessary for the filtering component to recognize that a resource that has not been found in one location may have been found in another at some point later in the trace, as the program parses all possible search paths. The filtering component simply parses the trace output of a baseline run and eliminates calls that fit into the categories explained above, allowing the resource creation component to create only those resources or settings that may lead to new malware execution behavior.
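The search-path case can be illustrated with a short sketch. It assumes the trace is an ordered list of `(api, path, succeeded)` tuples and treats two lookups as the same request when they use the same call and the same final path component; both the layout and this matching rule are simplifying assumptions, not the rule used by Wolfsting.

```python
def drop_future_successes(trace):
    """Return failed lookups that do not succeed later under another path.

    `trace` is an ordered list of (api, path, succeeded) tuples; this layout
    is hypothetical. Lookups are matched on (api, last path component).
    """
    def leaf(path):
        # Last component of a backslash-separated file or registry path.
        return path.rstrip("\\").rsplit("\\", 1)[-1].lower()

    # Remember the position of the last successful lookup for each leaf name.
    last_success = {}
    for i, (api, path, ok) in enumerate(trace):
        if ok:
            last_success[(api, leaf(path))] = i

    # Keep a failure only if no success for the same leaf follows it.
    return [
        (api, path)
        for i, (api, path, ok) in enumerate(trace)
        if not ok and last_success.get((api, leaf(path)), -1) < i
    ]
```

Only the failures that survive this step would be candidates for resource creation.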
It is difficult to recognize which calls are housekeeping calls and which are not. The filtering component contains a manually created list against which it matches system calls found in the traces. This list was created by observing calls made during the execution of benign applications such as those found in the Windows system32 folder, browsers etc., and through some research on Microsoft’s Developer Network [54]. Note that this list is minimalistic in the current implementation of Wolfsting; very deep expertise in Windows would be required to create an accurate list, due to the sheer number of documented and undocumented Windows system calls.
4.3.3 Resource Creation
most programs run with default settings that apply to all programs when no additional options are specified. Wolfsting must not create fake entries in these settings; the malware’s execution may be unnecessarily modified without producing any new malicious behavior. Wolfsting must deal with requests for various types of files, directories, processes and other OS objects. The assumption here is that most malware will request and manipulate resources without checking their authenticity. As this falls into the category of implementation issues, a more detailed description can be found in section 6.3.3.
4.4 Summary
Chapter 5
Wolfsting Execution
5.1 Wolfsting Processing
The Wolfsting process consists of at least two executions of the malware binary as shown in Fig. 5.1. After a single execution, the guest OS snapshot is restored, wiping all traces of the execution. During the first execution, the malware binary is run without any modification to the environment; this is termed a baseline run. This is a data collection run where the logging
component records all system calls made by malware processes and threads and outputs them in a trace file. The second execution is preceded by the activation of the resource creation component of Wolfsting in the guest OS. This component creates all the resources (registry keys, files, directories) requested by the malware in the first run.
1. The Wolfsting driver is copied into the guest virtual machine and is loaded into memory. The system call hooks are installed and the logging mechanism is set up to record system call information.
2. If this is a second run for the malware instance, all non-existent resources requested in
the previous run are created or simulated.
local network is protected by rendering access to the gateway and DNS server alone.
4. Various tools are run to simulate user activity by the Wolfsting userland program launcher.
5. All malware process activity is recorded, including threads injected into other processes. If this is the first run, all resources requested but not present on the system are noted in the trace.
6. All calls leading to kernel code execution, such as the NtLoadDriver API on Windows machines, are rejected. Wolfsting may be configured so that the NtLoadDriver call is allowed to succeed in the baseline run and is then rejected in the second run, if it is necessary to observe the difference in the traces due to the malicious driver being loaded into memory.
7. The trace output is communicated to the host machine and the guest machine is reset to its clean snapshot.
8. If this is the first run, the trace output is run through an analysis engine that decides what resources need to be created to evoke additional responses from the malware instance. A program is invoked on the guest machine to create or simulate these resources and steps 1 to 7 are repeated as a second run.
9. The trace output from two runs is fed into the new behavior extractor that is part of the analysis engine.
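The overall control flow of the steps above can be summarized in a short sketch. All four names below (`two_run_analysis`, `run_once`, `select_resources`, `extract_new_behavior`) are hypothetical; `run_once` stands in for steps 1 to 7 (load the driver, create resources, execute, record, restore the snapshot) and is supplied by the caller.

```python
def two_run_analysis(sample, run_once, select_resources, extract_new_behavior):
    """Drive a baseline run and a second run, then compare the two traces.

    All callables are hypothetical placeholders supplied by the caller:
      run_once(sample, resources) -> trace   (steps 1 to 7)
      select_resources(trace) -> iterable    (step 8, filtering included)
      extract_new_behavior(base, second)     (step 9)
    """
    # Baseline run: no resources are created, the environment is unmodified.
    baseline_trace = run_once(sample, resources=())
    # Decide which missing resources are worth creating for the second run.
    resources = select_resources(baseline_trace)
    # Second run: same clean snapshot, plus the selected resources.
    second_trace = run_once(sample, resources=tuple(resources))
    return extract_new_behavior(baseline_trace, second_trace)
```

Passing the steps in as callables keeps the sketch independent of any particular virtualization tool or trace format.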
[Figure 5.1 labels: Controller; Guest Driver (Hooks) with Malware in the baseline execution; Guest Driver (Hooks) with Malware in the second execution; Analysis Engine. Numbered steps: 1. Execute Malware; 2. Driver XML Output; 3. Resource Creation Information; 4. Restore Snapshot; 5. Create Resources and Re-execute Malware; 6. Driver XML Output; 7. New behavior (New system calls).]
Figure 5.1: Wolfsting Processing
[Figure 5.2 labels: Guest OS Userland; Guest OS Kernel; Malware; Userland program launcher; System Calls; Driver; Hook Functions; Logger; XML Output; Load Driver Attempt (NtLoadDriver in Windows): BLOCKED.]
Figure 5.2: A Wolfsting VM guest with the malware binary, userland component and kernel driver loaded into memory
5.2 Two Runs
Wolfsting executes the malware twice serially because it needs to collect certain information whose state may change over the course of a single execution. Figs. 5.3a and 5.3b illustrate the necessity of a second run. Most operating systems use a set of environment settings that system calls need to parse before loading libraries or executing other programs. For instance, to load a certain library into memory, the system needs to first locate the library in secondary storage. This requires the parsing of possible locations where the library may be located. To Wolfsting, this means that a system call that fails due to a missing resource may succeed subsequently; however, whether it does succeed and the timing of this success are impossible to predict. This is shown in Fig. 5.3. The Windows operating system uses a store of system and application information and settings called a registry. Applications need to parse the registry - organized as
possible to determine which of the calls will succeed and when; two non-parallel executions are required.
Having more than two runs in the Wolfsting process seems like a good idea; at each step, more malicious behavior may be extracted. However, looking at traces from multi-run experiments with real world malware, we noted that most resource or setting queries are terminal, i.e. the first query for a registry key or a file or other resources usually has all the information to locate the resource. Once the resource has been found in the second execution, all behavior due to the presence of the resource is captured, rendering little benefit from additional runs. This point is discussed in further detail in section 8.3.
[Figure 5.3(a) labels: an application thread attempting to locate a system library (Dynamically Linked Library) tries <Application Directory>\<SystemDLL.DLL> (Failed), Windows\<SystemDLL.DLL> (Failed) and Windows\System32\<SystemDLL.DLL> (Success).]
(a) An application attempts to load a library file into memory. It searches sequentially for the DLL in the paths listed in the PATH environment variable. Each attempt to locate the DLL is actually a single system call that tries to open the DLL (ZwOpenFile) using a possible path. Wolfsting must note the successful third call and ignore the first two. Thus during the first execution run, Wolfsting should not attempt to create the DLL, because the actual DLL is present and will be loaded later on.
[Figure 5.3(b) labels: an application thread attempting to locate a certain registry key queries HKCU\Software\Policies\Microsoft\Windows\Safer\CodeIdentifiers\TransparentEnabled (Failed) and then HKLM\Software\Policies\Microsoft\Windows\Safer\CodeIdentifiers\TransparentEnabled.]
(b) An application attempts to locate a registry key which may exist either in the HKEY_CURRENT_USER or HKEY_LOCAL_MACHINE subtrees of the Windows Registry. The second call to locate and open a handle to the key succeeds. If Wolfsting attempted to create the key during the first run, it would cause unnecessary deviation in the execution path of the malware binary.
Chapter 6
Wolfsting Implementation
The current implementation of Wolfsting consists of the following components, corresponding to the design specified in Chapter 4.
6.1 Virtualized Environment
The host machine had an Intel Core 2 Duo 3GHz processor with 4GB RAM and ran Windows Vista as the host OS. VMware images with Windows XP SP3 were used as guest machines. This was the hardware and software used to evaluate Wolfsting.
6.2 Guest OS Components
6.2.1 Kernel device driver
calls that fail during the baseline run due to a resource or setting not being present in the guest
OS are recorded separately (in separate XML tags) to allow the resource creation component
to create them for the second execution.
Most modern malware programs spawn multiple processes and threads and even inject code into other processes. The driver hooks all the native API calls that can lead to the creation of new processing contexts, such as NtCreateProcessEx and NtCreateThread, allowing Wolfsting to track malware execution across all these threads. Comparing system calls between process and thread contexts from two different, isolated executions is accomplished by abstracting away process and thread information and using only resource identifiers (file names, registry key names) to compare system calls - this is explained in section 4.3.1. Process injection in Windows is almost invariably done using the CreateRemoteThread userland API, which in the kernel translates to NtCreateThread with a target process id that differs from the process id of the process that invoked the userland API call. This information is used to trace remotely injected threads.
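The remote-thread rule described above can be sketched over a list of recorded events. The event dictionaries and the field names `caller_pid` and `target_pid` are hypothetical; the real driver makes this decision inside its kernel hooks rather than over a recorded list.

```python
def injected_threads(events, malware_pids):
    """List (caller_pid, target_pid) pairs for remotely created threads.

    `events` is an ordered list of dicts with the hypothetical fields
    "api", "caller_pid" and "target_pid".
    """
    tracked = set(malware_pids)
    injections = []
    for ev in events:
        # Only thread creation by an already tracked process is of interest.
        if ev["api"] != "NtCreateThread" or ev["caller_pid"] not in tracked:
            continue
        # A target process other than the caller indicates a remote thread.
        if ev["target_pid"] != ev["caller_pid"]:
            injections.append((ev["caller_pid"], ev["target_pid"]))
            # Coarse simplification: follow the whole host process from now
            # on; a finer tracer would follow only the injected thread.
            tracked.add(ev["target_pid"])
    return injections
```

Adding the target process to the tracked set lets the sketch follow chains of injection, at the cost of also attributing that process's benign threads to the malware.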
6.2.2 Userland program to simulate user actions
This component is a Win32 program that runs applications that would normally be executed by a user trying to analyze or disinfect a malware-infected machine. Using a separate program to launch these applications makes it easy for the kernel hooks to isolate system calls made by these applications that are not malware related. Section 7.2 details a malware binary that prevents such applications from launching using a subtle Windows registry trick.
6.3 Analysis Engine
6.3.1 New Behavior Extractor
following algorithm to accomplish this task.
1. The XML output trace of the first execution consists of entries each of which corresponds to a single system call. As explained in section 4.3.1, these system calls are converted to keys that are unique to a single execution of the malware instance.
2. All the keys generated in step 1 are added to a hashtable H1.
3. The XML output trace of the second execution is converted to keys and added to a hashtable H2.
4. For each key Ki in H2, if Ki is not present in H1, the system call represented by Ki is
added to the output as new behavior.
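The four steps above amount to a set difference over keys. A minimal sketch, assuming each trace has already been parsed into a list of call records and that a key function such as the one described in section 4.3.1 is supplied by the caller; the parameter names are hypothetical.

```python
def extract_new_behavior(baseline_calls, second_calls, key_of):
    """Return calls from the second run whose key is absent from the baseline.

    `key_of` maps a call record to a hashable key; the record layout and the
    key function are assumptions supplied by the caller.
    """
    # Steps 1 and 2: keys of the first execution go into hashtable H1.
    h1 = {key_of(call): call for call in baseline_calls}
    # Step 3: keys of the second execution go into hashtable H2, keeping the
    # first call seen for each key.
    h2 = {}
    for call in second_calls:
        h2.setdefault(key_of(call), call)
    # Step 4: every key in H2 that is missing from H1 is new behavior.
    return [call for key, call in h2.items() if key not in h1]
```

Because Python dictionaries preserve insertion order, the new calls are reported in the order in which they first appear in the second trace.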
6.3.2 Filtering Component
This component is implemented as a .NET executable that parses the XML trace output of a baseline Wolfsting execution trace and eliminates housekeeping system calls that look for non-existent resources, or calls that succeed at a later point in the trace as illustrated in figure 5.3. This is accomplished using the following algorithm:
1. The XML output trace of the first execution consists of entries each of which corresponds to a single system call. As explained in section 4.3.1, these system calls are converted to keys that are unique to a single execution of the malware instance. The keys also include the return status of the system calls.
2. All keys generated in step 1 are added to a hashtable H1.
3. For each key Ki in H1, a check is made if the key matches a system call present in a
manually created filter list F. String matching and regular expression matching are used to accomplish comparison between the parameters in keys. If the system call corresponding to Ki is found in the filter list, Ki is deleted from H1.
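The matching step against the filter list F can be sketched as follows. The key shape `(api, name, status)` and the two filter entries are hypothetical examples chosen for illustration; the actual list used by Wolfsting is not reproduced here, and the original component is a .NET executable rather than Python.

```python
import re

# Hypothetical filter list F: (system call name, pattern on the object name).
# The first entry is modeled on the registry lookup shown in Fig. 5.3(b);
# the second is an invented example.
FILTER_LIST = [
    ("ZwOpenKey", re.compile(r"\\Safer\\CodeIdentifiers", re.IGNORECASE)),
    ("ZwOpenFile", re.compile(r"\.manifest$", re.IGNORECASE)),
]

def apply_filter_list(h1, filter_list=FILTER_LIST):
    """Delete every key in h1 that matches an entry of the filter list.

    Keys are assumed to be (api, name, status) tuples; h1 is modified in
    place and also returned for convenience.
    """
    # Iterate over a copy of the keys because entries are deleted in the loop.
    for key in list(h1):
        api, name, _status = key
        if any(api == f_api and pattern.search(name)
               for f_api, pattern in filter_list):
            del h1[key]
    return h1
```

Whatever remains in H1 after this step is what the resource creation component would consider for the second run.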
6.3.3 Resource Creator
Chapter 7
Results
7.1 Overview
The hardware and software in the experimental setup used to evaluate Wolfsting are described in section 6.1. This chapter discusses the additional behavior discovered by Wolfsting in some malware families. Over 100 malware binaries belonging to various malware categories such as botnets, trojans, worms etc. were run through the Wolfsting process. These binaries were mostly sourced from VxHeavens [37] and Offensive Computing [38]. VirusTotal.com [39] was used to ensure that the malware samples were labeled accurately. The following families were selected as a comprehensive set encompassing the type of results obtained from the experiments.
7.2 Trojan Dropper
Trojan Droppers are a class of malware that act as droppers, i.e., they download other
Figure 7.1: New behavior recorded in Trojan-Dropper.Win32.VB after Wolfsting created the COM3\Debug key.
7.2.1 Analysis by Resource Creation
Microsoft’s Component Object Model (COM), as defined in Wikipedia [27], is a binary-interface standard for software componentry, and is used to enable interprocess communication and dynamic object creation in a large range of programming languages. COM is used in browsers and other Windows applications to interact with system or network wide objects, and possesses a debugging component that allows call tracing and security logging. The settings for debugging are located at HKLM\Software\Microsoft\COM3\Debug, a key in the Windows registry.
Trojan-Dropper.Win32.VB uses COM to interact with Internet Explorer, to enable malicious activity that would allow recording of user activity on banking sites, as well as injecting additional fields into web forms to extract more personal confidential data from unsuspecting users. Wolfsting determined that Trojan-Dropper.Win32.VB attempts to locate debugging settings by recording - in the first, baseline run - the Windows Native API call ZwQueryRegistryKey, which opens a handle to a registry key. The call failed since the debugging key did not exist in the registry of the analysis machine, i.e. the guest OS. In the second run, Wolfsting created this key in the registry and observed new behavior as illustrated in Fig. 7.1.
missing registry key. Clustering based on the system call trace obtained from Wolfsting would result in more accurate clusters, and behavioral graphs would have more malicious activity recorded, if constructed from the new trace.
7.2.2 Simulating user actions
Trojan-Dropper.Win32.VB uses several methods to persist on a victim’s system. This includes preventing the execution of certain processes that users run to attempt to kill malware processes or investigate modifications made by malware to the system. For example, in Windows systems, a program called the task manager lists running processes; users may run the task manager to attempt to kill Trojan-Dropper.Win32.VB processes. Thus it is in Trojan-Dropper.Win32.VB’s interest to prevent this user action from completing successfully. Wolfsting was able to extract this new behavior by running the task manager and other processes during Trojan-Dropper.Win32.VB’s execution. This is illustrated in Fig. 7.2; Trojan-Dropper.Win32.VB sets a registry key to configure win32d.exe as the debugger program for the Windows Task Manager (taskmgr.exe). This information is retrieved during the first call to query the file attributes of taskmgr.exe, and prevents Task Manager from executing.
Trojan-Dropper.Win32.VB also prevents other programs from launching, such as a program that allows users to edit the registry and another that provides an interface to edit startup settings for Windows. Wolfsting is able to record the actions taken by the user level component to launch task manager, and how Trojan-Dropper.Win32.VB is able to intercept and modify those actions to disable the program as shown in Fig. 7.2. The registry entry modified is shown in Fig. 7.3.
[Figure 7.2 labels: two panels of system call sequences (ZwQueryFileAttributes, ZwOpenFile, ZwCreateSection, ZwCreateProcessEx). Panel "Running Task Manager without Zbot infection": file attribute queries and file opens for Taskmgr.exe, ending in Create Process: Taskmgr.exe. Panel "Running Task Manager during Zbot execution": file attribute queries and file opens for Taskmgr.exe and win32d.exe (including C:\Windows\Temp\), ending in Create Process: win32d.exe.]
7.3 AVKiller (Agent2)
AVKiller is a generic name given to any family of malware that attempts to disable or destroy antivirus installations. Several methods are used to accomplish this; an analysis by Wolfsting of an instance of this type of malware, labeled as Trojan.Win32.Agent2.cqgi by Kaspersky antivirus [55], is presented in the following subsections.
7.3.1 Analysis by Resource Creation
The Windows operating system has a security model that allows objects on the system to have Access Control Lists or ACLs. Files, registry keys, processes and other objects have security descriptors associated with them containing ACLs that specify user permissions. Fig. 7.4 shows the security descriptor for the Windows folder on the guest. The upper white box displays the list of users and groups in the system that have permissions assigned for the Windows folder.
This particular instance of malware attempts to locate certain antivirus installations on the filesystem - these queries are intercepted and recorded by Wolfsting. These folders have associated security descriptors that contain ACLs for users in the system. The API used by this malware instance is ZwQueryAttributesFile (succeeds only if the queried folder exists) that retrieves various types of information including the security descriptor associated with the installation folder. Fig. 7.5 shows the folders queried by the malware.
Wolfsting creates the folders listed in Fig. 7.5 in the guest OS before the second execution of the malware instance. The output of Wolfsting as shown in Fig. 7.6 shows additional behavior displayed by the malware instance.
The calls to ZwSetSecurityObject - a native Windows API responsible for setting security
descriptors on OS objects - are intercepted by Wolfsting’s hook, called HookSetSecurityObject,
Figure 7.4: Security descriptor of the Windows folder, displaying permissions for users in the system
Figure 7.6: Wolfsting XML output shows Agent2 setting bad security descriptors for antivirus folders, making them inaccessible. The exact security descriptor has not been shown for brevity.
security descriptor associated with the folder after the ZwSetSecurityObject system call made by the malware instance.
7.4 Vundo
Vundo is a family of trojan horses that are known to infect a user’s computer through websites that exploit vulnerabilities. Vundo then proceeds to present a fake antivirus UI to the user, claiming that there are several malware infections on the user’s system. If the user clicks on a link on the UI, she is taken to a page where she may be asked to pay for antivirus software that is either fake or completely unnecessary. Vundo has several other payloads and exhibits other malicious actions as described in [33]. Wolfsting analyzed several Vundo variants and found the following new behavior.
7.4.1 Analysis by Resource Creation
Kaspersky’s Antivirus Pro software (KAVP), like most other antivirus software, has an update mechanism that allows it to update its virus signatures. This allows KAVP to detect the latest malware samples. KAVP uses the Windows registry to store its configuration, with registry keys indicating where KAVP is installed, how frequently KAVP should scan the machine for viruses and other settings. One particular registry key determines if KAVP should enable the component responsible for updating its signature database. This key is requested in the baseline execution by the Vundo instance. Fig. 7.8 displays the registry key that the Vundo instance requests and that Wolfsting subsequently creates. In the second execution, Vundo proceeds to set a value in the key that disables KAVP’s update service. This allows Vundo to drop other new malware onto the victim’s machine without being detected by KAVP.
Figure 7.9: Wolfsting output displaying modifications made to Internet Explorer settings, specifically the way IE connects to the network and the Internet
The Vundo malware sample also attempts to access settings that configure how Internet Explorer, a popular browser from Microsoft, connects to the Internet. These settings do not exist in every version of Internet Explorer. Wolfsting observes these queries during the first execution and creates the keys in the registry that the malware instance is attempting to access. The keys listed in Fig. 7.9 in the <key> tag were created by Wolfsting. Benign applications
Figure 7.10: Random identifier string (boldface) used by Vundo - the string changes during every execution
7.4.2 Random Identifiers
Vundo creates registry keys with a random identifier string and a DLL with the same random identifier string as illustrated in Fig. 7.10. This behavior was captured by feeding two unmodified baseline execution traces to the Wolfsting new behavior extractor. Note that it is not necessary that the same processes or threads create these files or keys in the two executions of the Vundo instance.
7.5 TDSS
TDSS is a recently developed stealthy trojan horse that displays unsolicited advertising and redirects websites by hijacking browsers on the victim’s machine. TDSS is a good example of how malware is evolving to use stealth (rootkit) techniques to hide from or even disable anti-malware software on end-user machines today. We include this trojan in our results not to
Figure 7.11: Wolfsting analysis approach for TDSS. Diagram labels: baseline execution (TDSS driver allowed to load); second execution (TDSS driver not allowed to load; NtLoadDriver call is blocked by Wolfsting); Wolfsting Analysis Engine.
block malware from loading Windows drivers and help an analyst get an initial idea of some of the operations that the rootkit is performing. This is done by comparing the execution of TDSS in the baseline run with a second execution in which the rootkit's Windows driver is not allowed to load, as illustrated in Fig. 7.11.
During an execution of malware in the Wolfsting guest OS, the driver loaded by Wolfsting has its hooks loaded into memory for the entire duration of the execution. Trace output is therefore generated for this entire period, and the last timestamp in the trace output should correspond to the time when the snapshot of the clean guest OS is restored in the virtual machine. However, during the execution of TDSS in the baseline run, the timestamp of the last trace output was much earlier than the time when the VM snapshot was restored. This implied that the Wolfsting driver was unable to generate trace output for the whole duration of TDSS’s execution; something disabled its tracing functionality. This problem was not seen in the second execution, where the last timestamp of the trace matched the time when the VM snapshot was restored.
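The timestamp check described above amounts to a simple comparison, sketched here under stated assumptions. The function name `tracing_cut_short`, the five-second tolerance, and the example times are hypothetical and are not taken from the Wolfsting implementation or from the TDSS traces.

```python
from datetime import datetime, timedelta


def tracing_cut_short(last_trace: datetime,
                      snapshot_restored: datetime,
                      tolerance: timedelta = timedelta(seconds=5)) -> bool:
    """True if the trace stopped well before the clean snapshot was restored."""
    return snapshot_restored - last_trace > tolerance


# Made-up times for illustration only.
restored = datetime(2010, 1, 1, 12, 10, 0)
baseline_last = datetime(2010, 1, 1, 12, 2, 30)  # trace ends minutes early
second_last = datetime(2010, 1, 1, 12, 9, 58)    # trace runs to the restore

print("baseline tracing cut short:", tracing_cut_short(baseline_last, restored))
print("second run tracing cut short:", tracing_cut_short(second_last, restored))
```

A small tolerance is used because the final trace record and the snapshot restore are unlikely to carry exactly the same timestamp even in a healthy run.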
Fig. 7.12 displays the last moments of the trace output during the first baseline execution of the TDSS instance as well as the second execution, during which the driver is blocked from loading. The </scan> tag indicates that the Wolfsting trace has ended. The call that loads the
(a) Last moments of trace output of TDSS instance - Driver allowed to load (result = 0 on line 13 indicates success)
(b) Last moments of trace - TDSS driver blocked from loading. (result != 0 on line 30 indicates failure)