3.2.8 Malware analysis

Malware analysis is used to understand how malware works, create malware signatures so that it can be detected, identify systems that need to be quarantined and patch systems so that they are not infected again. Victims may identify malware during or after an attack, and this malware can then be inspected using malware analysis techniques. Other technical attribution techniques, such as Monitor Host (Section 3.2.7, page 50), may also identify malware. Analysis of malware may identify clues or indicators that lead to or assist with attribution. Some clues are immediately obvious, such as individuals' names, IP addresses, domain names and development environment variables. Others are more subtle, such as linguistic properties, unique programming styles and code re-use amongst disparate malware.

A variety of approaches exist for malware analysis, including static analysis, dynamic analysis and tools that semi- or fully automate these approaches. Static analysis approaches analyse malware without executing it, using tools such as disassemblers. Dynamic approaches analyse malware during execution, using sandboxes and debuggers. This section reviews these approaches in light of attribution goals.

Approaches

Sikorski and Honig (2012) identify four approaches for malware analysis: i) basic static analysis, ii) basic dynamic analysis, iii) advanced static analysis and iv) advanced dynamic analysis. These approaches are best used sequentially, as each step increases in complexity and in the expertise and time required.

Basic static analysis includes scanning malware with anti-virus (AV), generating hashes and running tools such as “strings” against malware to extract sequences of characters. This can reveal, for example, IP addresses and domain names that the malware communicates with and Windows DLLs that hint at the functionality and purpose of the malware. Detecting packers is also a basic static analysis technique. A packer obfuscates the malicious binary to subvert signature-based AV and to make the binary more difficult to analyse. Malware is scanned with a program that recognises packer signatures, e.g. PEiD (2014). Basic static analysis is useful for a first glance at malware, to understand roughly what it might do and what it might connect to. The disadvantage is that at this point there is no way of knowing what the malware will actually do. This is where dynamic analysis helps.
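The hashing and string extraction steps described above can be illustrated with a minimal Python sketch. It is not a replacement for AV scanning or for the “strings” tool. The function names (extract_strings, triage), the minimum string length of four characters and the regular expressions are assumptions made for illustration only.

```python
import hashlib
import re

# Candidate IPv4 addresses and DLL names; both patterns are deliberately simple.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DLL_RE = re.compile(r"\b[\w.-]+\.dll\b", re.IGNORECASE)


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII of at least min_len bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def triage(data: bytes) -> dict:
    """Hash a sample and list the IPv4 addresses and DLL names it embeds."""
    text = "\n".join(extract_strings(data))
    # Keep only dotted quads whose octets are all in the range 0-255.
    ips = {
        ip for ip in IPV4_RE.findall(text)
        if all(int(octet) <= 255 for octet in ip.split("."))
    }
    dlls = {name.lower() for name in DLL_RE.findall(text)}
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "ipv4": sorted(ips),
        "dlls": sorted(dlls),
    }


# Hypothetical sample bytes, used only to demonstrate the functions.
sample = (
    b"\x00\x01MZ\x90\x00KERNEL32.dll\x00\x00"
    b"http://10.0.0.5/gate.php\x00ab\x00WS2_32.DLL\x00"
)
report = triage(sample)
```

Strings recovered this way are only hints. A packed or encrypted binary may expose few or none of them, which is one reason packer detection is performed at the same stage.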

Basic dynamic analysis involves executing malware in a sandbox environment and analysing changes made to the environment while it is running. The environment is either an air-gapped non-production physical system or a Virtual Machine (VM). The simplest approach is to submit malware to a service such as Anubis (2014). This approach requires no analysis on the part of the victim. Once the malware has been analysed by the service, the victim usually receives a report that describes it. Anubis runs the malware in an emulated environment and provides a heuristic description, such as the modifications made to the registry and file system and the network traffic generated. This approach does have some drawbacks. The sandbox environment might be the wrong operating system or version, preventing the malware from running. If the executable requires command line arguments, these are not supplied.

Dynamically executing malware locally is usually the next step. Malware is executed in a sandbox and system analysis tools are used to monitor volatile data. This typically comprises four actions: i) observe a healthy baseline with system analysis tools, ii) execute the malware, iii) see what changed, and optionally iv) create simulated network services for the malware to interact with. The state of a VM can often be saved using “snapshot” functionality and recovered with “revert” functionality, so that executing the malware for the first time can be replayed. One collection of analysis tools is the Microsoft Sysinternals suite (Russinovich, 2009). It includes process monitors such as ProcMon, Process Explorer and ProcDump, registry entry monitors such as Regmon and a rootkit revealer. Tools for packet sniffing and for simulating services, e.g. INetSim, are also useful (Hungenberg and Eckert, 2013). An example of this approach and of basic static analysis in practice is Russinovich (2011), the author of Sysinternals, who analyses Stuxnet using the Sysinternals suite. Overall, this approach confirms some of the initial suspicions identified in basic static analysis. Drawbacks are that full code paths are not explored and that simulated protocols will only provide non-interactive results.
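The baseline, execute and compare actions described above can be sketched for the file system alone. The following Python is illustrative only and is no substitute for tools such as the Sysinternals suite. The function names snapshot and diff_snapshots are hypothetical, and the sketch assumes that the sandbox file system can be read before and after the malware runs.

```python
import hashlib
import os


def snapshot(root: str) -> dict[str, str]:
    """Map each file under root (relative path) to the SHA-256 of its content."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files created, deleted or modified between two snapshots."""
    common = before.keys() & after.keys()
    return {
        "created": sorted(after.keys() - before.keys()),
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }


# Hypothetical snapshots, standing in for snapshot() output taken
# before and after the sample was executed in the sandbox.
baseline = {"hosts": "aa", "notes.txt": "bb", "old.log": "cc"}
infected = {"hosts": "ff", "notes.txt": "bb", "dropper.exe": "dd"}
changes = diff_snapshots(baseline, infected)
```

In practice the same comparison would also cover the registry, running processes and network connections, which is what the monitoring tools named above provide.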

The next approach, advanced static analysis, encompasses the analysis of disassembled code. The malware is opened in a disassembler, such as IDA Pro (2014), to view the machine code instructions, i.e. what the processor interprets when the malware is executing. Using this approach, full code paths can be explored and an understanding of the high-level logic of the program, including how it behaves, is gained.

The final step, advanced dynamic analysis, involves the use of a debugger to analyse malware while it executes. An assembly-level debugger, such as OllyDbg (Yuschuk, 2007), shows the memory addresses and processor registers while malware is executing, allows full control of execution with step through/step into commands and allows registers to be changed at will.

While advanced static analysis and advanced dynamic analysis require the most investment in time and specialist expertise, these approaches give a more complete understanding of the malware. This is especially useful for unknown malware that has been authored to target a specific victim. The manual approaches discussed above are time-consuming, even for experts. Parker (2010) notes that they are concerned with understanding what the malicious binary does, rather than with attribution questions such as who authored it or who used it. Several automated approaches have been proposed that are specifically targeted at answering attribution questions. Black Axon is one example (Parker, 2010). It aims to assess the likely technical skill of the author, identify the use of known and unknown techniques and match techniques with the modus operandi of known attack actors. A similar approach, Malware Analysis and Attribution using Genetic Information (MAAGI), uses advanced static analysis to extract disassembly code and compare it to a corpus of samples (Pfeffer et al., 2012). Both of these approaches identify code re-use amongst malware to derive characteristics of the adversary and the ancestry of the malware.
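The idea of detecting code re-use by comparing disassembly can be illustrated with a generic similarity measure. The sketch below is not the method used by Black Axon or MAAGI, whose internals are not described here. It assumes that each sample has already been disassembled into a list of opcode mnemonics, and it compares samples by the Jaccard similarity of their opcode n-grams. The function names and the opcode sequences are hypothetical.

```python
def ngrams(opcodes: list[str], n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of consecutive opcode n-grams in a sequence."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical opcode sequences from two disassembled samples.
sample_a = ["push", "mov", "xor", "call", "pop", "ret"]
sample_b = ["push", "mov", "xor", "call", "jmp", "ret"]

# Two of the six distinct 3-grams are shared, giving a similarity of 1/3.
similarity = jaccard(ngrams(sample_a), ngrams(sample_b))
```

A high score between otherwise unrelated samples suggests shared code, but it does not by itself establish common authorship, since code may be copied, bought or deliberately planted.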

Authors of malware attempt to evade analysis using various measures, e.g. obfuscating or encrypting code, checking for sandbox environments and checking for the presence of dynamic analysis and debugging tools (Chen et al., 2008).

Criteria

Attribution artifacts that are collected: Malware analysis can reveal IP addresses and domain names that can be investigated further, e.g. to determine whether they are part of a botnet. Malware stylistic properties, the choice of libraries used and development environment properties, including the software tools and version numbers used for compilation, can also be revealed. These can be compared with other malware to identify similarities. The type of malware, e.g. backdoor, botnet, information-stealing malware, rootkit, scareware or worm, is revealed. This information often helps to identify the intent of the adversary, which in turn helps to build a profile and rule out potential suspects. Analysis also reveals whether the malware is widely known or unique. Widely known malware is often part of a mass attack. Unique malware is more likely to be part of a targeted attack.
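One development environment property that can be read directly from a Windows executable is the link timestamp stored in its PE header. The following Python sketch, offered as an illustration only, reads the TimeDateStamp field of the COFF file header. The function name pe_compile_time is hypothetical, and the bytes used to demonstrate it are a hand-built header fragment rather than real malware.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Return the COFF TimeDateStamp of a PE file as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew, at offset 0x3C of the DOS header, points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF header follows the signature: Machine (2 bytes),
    # NumberOfSections (2 bytes), then TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Hand-built header fragment (an assumption for demonstration only):
# a DOS header whose e_lfanew is 0x40, followed by the PE signature
# and the first three fields of the COFF header.
fake_pe = (
    b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40)
    + b"PE\x00\x00" + struct.pack("<HHI", 0x14C, 1, 1262304000)
)
built = pe_compile_time(fake_pe)
```

The adversary controls this field and can set it to any value, so a timestamp should be treated as one indicator to be weighed with others rather than as evidence on its own.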

Technique reliability: Malware analysis is used on a case-by-case basis. Malware may not be recoverable, in which case this technique is of no use. Malware might never have been present, e.g. where a dictionary attack against the root user of a Secure Shell (SSH) server resulted in compromise, or it might have been deleted by the adversary once used. Also, similar to Monitor Host (Section 3.2.7, page 50), an adversary controls the malware and can therefore modify it to include false flags that implicate an innocent party. For these reasons malware analysis cannot be relied upon to offer reliable attribution on its own merit. It can be considered reliable under a consilience theory, i.e. when used with other techniques, because an adversary would then need to defeat many attribution techniques at once.

Technique limitations: In mass attacks with known malware, malware analysis can reveal the author of the malware in question, but not necessarily the instigator of the attack, since malware is often sold through underground channels. Additionally, the possibility of real-time attribution through malware analysis is slim, because the activity is often time-consuming when performed manually. False flags could be added to malware to implicate parties other than the true adversary. The effectiveness of this technique is reduced when malware deploys anti-analysis techniques such as anti-virtualisation, anti-debugging and anti-disassembly (Chen et al., 2008).

Legal and ethical issues: There are no known legal and ethical issues regarding the analysis of malware that is recovered from a victim system.

Deployment requirements: There are no particular deployment requirements; no systems need to be modified. However, specialists are required to perform malware analysis, and a combination of commercial and open source tools is often used. This specialist skill could optionally be outsourced.

Relevance outside of the laboratory: This technique is highly relevant outside of the laboratory and is in widespread use. An example is the Stuxnet malware, which has been analysed by many researchers and organisations (Symantec, 2011; Russinovich, 2011). Clues have been identified that tentatively link the malware to suspected parties (Parker, 2011). Many malware analysis services exist, e.g. Virus Total (2014), so that identified malware can be compared to see if it is already known.