
Our GUI signature approach targets the Windows platform due to both the number of deployments [29] and the number of malware campaigns targeting Windows [20]. In Windows, applications with an on-screen presence communicate with the operating system via message passing, as shown in Figure 4.2. Messages sent to and from an application dictate how the GUI should be created and what inputs are received. This level of abstraction makes it simpler to observe the objects an application creates, its mouse input, and the translated version of raw keyboard input.

4.2.1 GUI Signature Components

GUI signatures can be used to detect when an application (1) is known but has no on-screen presence (i.e., a known empty signature), (2) is unknown and has an empty or non-empty signature, or (3) is known and has a valid GUI signature. We both build and enforce GUI signatures based on the information received by the collector. If the signature is known, we can determine whether the application's system behavior, in our case network access, corresponds to some path in the signature. Signatures are composed of zero or more paths. Each path is composed of the following:

• GUI objects: These are objects that ultimately create the visual representation of a user interface. This may include windows, menus, buttons, text boxes, or other types of objects. As a user interacts with an object, other objects may be created dynamically, such as creating a new window, and then dynamically destroyed. While they exist, a hierarchy of objects can be determined. For example, a text object may be the child of a button and the button the child of a window.

• Interactions: These are the interactions generated by the user. For an application to function, the user must interact with the GUI. Interactions can occur through the mouse or keyboard. As the user interacts with the application, the application transitions state, such as when new windows are created.

• Sink: The sink is the final node in a given path which represents the targeted system resource that is requested. In our work, the sink is always network interaction.
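The three components above can be sketched as a small data model. This is only an illustrative sketch, not the implementation used in this work: the names `Step`, `Path`, `Signature`, and `permits` are hypothetical, and matching an observed interaction sequence against the end of a stored path is an assumed matching rule.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SINK_NETWORK = "network"  # the only sink type considered in this work


@dataclass(frozen=True)
class Step:
    """One transition: a user interaction applied to a GUI object."""
    object_id: str    # identifier of the GUI object receiving the input
    interaction: str  # e.g. "click" or "key" (hypothetical labels)


@dataclass(frozen=True)
class Path:
    """An ordered sequence of steps that ends in a sink node."""
    steps: Tuple[Step, ...]
    sink: str = SINK_NETWORK


@dataclass
class Signature:
    """Zero or more paths recorded for one application."""
    application: str
    paths: List[Path] = field(default_factory=list)

    def is_empty(self) -> bool:
        # An empty signature models an application with no on-screen presence.
        return not self.paths

    def permits(self, observed: Tuple[Step, ...]) -> bool:
        # Assumed rule: network access is expected only if the most recent
        # observed steps equal the steps of some stored path.
        for path in self.paths:
            n = len(path.steps)
            if n > 0 and observed[-n:] == path.steps:
                return True
        return False
```

Under this sketch, an application with an empty signature never has a permitted path, so any new network flow it generates would not correspond to a path in its signature.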

4.2.2 Signature Paths

As described, signatures are comprised of zero or more paths, which are themselves composed of GUI objects, interactions (transitions), and a sink node. We now describe how we collect each of these to form paths.

GUI Objects

A graphical user interface on a user's screen maintains a hierarchical structure defined by each individual application. Windows, menus, buttons, and text labels are some of the more common objects. When an object is needed, an application uses the Windows API to request that it be created. The application then receives a WM_CREATE message upon the object's instantiation and a WM_DESTROY message when the object is destroyed. When created, objects have a known parent, such as the Desktop for newly launched applications, or may be children of previously created objects. As such, monitoring Windows messages allows us to understand the structure of a graphical interface. Windows also maintains handles to objects, which allow us, given a handle, to access the object, traverse up or down the hierarchy from it, and inspect its internal state.

To understand objects and their structure over time, we must be able to uniquely identify objects across executions. Although Windows maintains unique handles for objects, these handles are unique per instantiation. A handle will be different even within the same process execution if a window is destroyed and recreated. We create unique identifiers using the following naming convention:

object_ident = text:class:parent_text:parent_class:depth

where class is the class type of the object and depth is the object's depth from the root object of the process. Under this naming convention, it is possible to have ambiguously labeled objects, but in our experience, such collisions occur infrequently. A collision could allow an attacker to follow a path that has no network sink yet ultimately generates network traffic; our approach would not detect this because a legitimate path with the same labels also exists.
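The naming convention can be illustrated with a short sketch. The `GuiObject` class and its field names are hypothetical stand-ins for whatever the collector reads from a window handle, and the choices that the root object has depth 0 and an empty parent text and class are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GuiObject:
    """Hypothetical in-memory view of a GUI object and its parent."""
    text: str                             # text associated with the object
    cls: str                              # class type of the object
    parent: Optional["GuiObject"] = None  # None for the process root object


def depth(obj: GuiObject) -> int:
    """Number of parent links between obj and the root object (root is 0)."""
    d = 0
    while obj.parent is not None:
        obj = obj.parent
        d += 1
    return d


def object_ident(obj: GuiObject) -> str:
    """Build text:class:parent_text:parent_class:depth for one object."""
    parent_text = obj.parent.text if obj.parent else ""
    parent_class = obj.parent.cls if obj.parent else ""
    return ":".join([obj.text, obj.cls, parent_text, parent_class, str(depth(obj))])


# Usage: a Send button inside a Compose window.
window = GuiObject("Compose", "Window")
button = GuiObject("Send", "Button", parent=window)
print(object_ident(button))  # Send:Button:Compose:Window:1
```

Because the identifier depends only on text, class, parent, and depth, it stays stable when a window is destroyed and recreated with a new handle. For the same reason, two sibling objects with identical text and class receive the same identifier, which is the kind of collision discussed above.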

Interactions

GUI objects are created when an application launches and can be dynamically instantiated as the user interacts with the application. Understanding interactions is critical to determining when sink nodes (e.g., network traffic) are reached. In Windows, user input such as mouse and keyboard events is also relayed to applications through the message passing system. In particular, events such as left clicks and keystrokes are passed to the application as messages, together with the handle of the object receiving the input. For example, a click on the Print button would result in a Windows message specifying the event type BM_CLICK and a handle to the Print button itself.
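As a rough sketch of how such messages could be turned into interaction records, the following filters a message stream down to user input events and replaces each per-instantiation handle with a stable object identifier. The `Message` type, the `to_interactions` function, and the handle-to-identifier mapping are hypothetical, and the set of message names treated as user input is an illustrative assumption.

```python
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Tuple


class Message(NamedTuple):
    event_type: str  # message name, e.g. "BM_CLICK"
    handle: int      # handle of the object receiving the message


# Assumed set of message names that count as user input for this sketch.
INPUT_EVENTS: FrozenSet[str] = frozenset({"BM_CLICK", "WM_KEYDOWN"})


def to_interactions(
    messages: Iterable[Message],
    ident_by_handle: Dict[int, str],
) -> List[Tuple[str, str]]:
    """Return (object identifier, event type) pairs for user input messages.

    Messages that are not user input, or whose handle has no known
    identifier, are skipped.
    """
    interactions = []
    for msg in messages:
        if msg.event_type in INPUT_EVENTS and msg.handle in ident_by_handle:
            interactions.append((ident_by_handle[msg.handle], msg.event_type))
    return interactions
```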

By monitoring user-generated events and the objects receiving them, we can understand how a user interacts with an application. During the signature generation phase, we monitor all potential interaction paths that may lead to a sink node. We later prune paths that do not result in reaching a sink node in order to reduce the total number of paths in the signature.
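The pruning step can be sketched as a simple filter over the recorded paths. The `RecordedPath` type and the `prune` function are hypothetical, and dropping exact duplicate paths is an added assumption that the text does not state.

```python
from typing import Iterable, List, NamedTuple, Tuple


class RecordedPath(NamedTuple):
    steps: Tuple[str, ...]  # object identifiers visited, in order
    reached_sink: bool      # True if a new network flow followed the path


def prune(paths: Iterable[RecordedPath]) -> List[RecordedPath]:
    """Keep only paths that reached a sink node, preserving first-seen order."""
    kept = []
    seen = set()
    for path in paths:
        if path.reached_sink and path.steps not in seen:
            seen.add(path.steps)  # assumed: identical paths are stored once
            kept.append(path)
    return kept
```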

Sink Nodes

The final component of a path in a signature is a sink node. The focus of our work is to determine when an application should or should not be requesting network access based on a user's specific interactions. For this, we must link an application's network traffic to user interactions. Figure 4.2 shows that a local collector application monitors GUI events in real time and also monitors network access in the form of new flow generation. Our approach focuses on detecting new network flows that were not initiated in response to the user's interaction, rather than monitoring activity on existing flows. As such, we avoid the added overhead and complexity of per-packet analysis.

The collector links GUI events and network flows to applications by process ID and records the data as a time series. As such, whenever the collector sees an application generating a new network flow, we know that a sink node has been reached. Applications do not necessarily generate a single network connection per interaction path. For instance, a user clicking to send an email may result in multiple network connections, such as a DNS request for the mail server followed by the TCP connection that transmits the email to the outgoing mail server. To account for these scenarios, we group such a cluster of network connections into a single sink node using time-based heuristics.
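One possible time-based heuristic is gap-based clustering per process: consecutive flows from the same process ID are merged into one sink node as long as the time between them stays below a threshold. The sketch below is an assumption about how such a heuristic could look, not the heuristic used in this work; the `Flow` type, the `group_flows` function, and the one-second `max_gap` default are all hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple


class Flow(NamedTuple):
    pid: int           # process ID that opened the flow
    timestamp: float   # seconds since the start of the recording
    destination: str   # free-form label, e.g. "dns" or "smtp"


def group_flows(flows: Iterable[Flow], max_gap: float = 1.0) -> List[List[Flow]]:
    """Cluster new flows per process; each cluster stands for one sink node.

    A flow joins the latest cluster of its process if it starts within
    max_gap seconds of that cluster's most recent flow. Otherwise it
    starts a new cluster.
    """
    clusters: List[List[Flow]] = []
    latest: Dict[int, List[Flow]] = {}  # most recent cluster per process ID
    for flow in sorted(flows, key=lambda f: f.timestamp):
        cluster = latest.get(flow.pid)
        if cluster is not None and flow.timestamp - cluster[-1].timestamp <= max_gap:
            cluster.append(flow)
        else:
            cluster = [flow]
            clusters.append(cluster)
            latest[flow.pid] = cluster
    return clusters
```

With this sketch, the DNS request and the subsequent mail server connection from the email example would fall into one cluster if they occur within `max_gap` of each other, while a flow from the same process several seconds later would be treated as a separate sink node.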