Teaching Digital Forensics Techniques Within Linux Environments


Lucas McDaniel
Department of Computer Science, University of Alaska Fairbanks

Brian Hay
Department of Computer Science, University of Alaska Fairbanks


Abstract

Appropriately motivating digital forensics topics in an educational environment is a challenging task for a lecturer. Not only will the skill levels of the students vary widely, but designing a lab exercise that introduces a single concept runs the risk of requiring too much additional knowledge to appropriately describe the task or may easily devolve into a contrived example that does not allow the student to fully grasp the extent of the topic at hand. In some cases, this difficulty is compounded by the sheer amount of misinformation that results from years of common knowledge and research becoming invalid after changes to kernels and operating systems.

Last year, the Honeynet Project Challenge 12, "Hiding in Plain Sight", and a computer security workshop sought to introduce concepts of information and process hiding and disguising through a series of digital forensics labs. This paper describes the components of these labs that were successful at motivating a core concept, as well as those that were less successful and have subsequently been modified based upon feedback. These findings are presented through a suggested lecture-lab format and a series of scoped topics that can be used in other educational environments to motivate digital forensics and anti-forensics concepts. Scripts used to build each lab have also been provided to serve as a point of reference.

Keywords:

Digital Forensics, Malware Forensics, Process Hiding, Linux, Educational Module

1 INTRODUCTION

Within the last decade, the field of digital forensics has seen substantial advancements in terms of novel research, publicly available information and tools, educational curriculum, and standards [22]. In the midst of these advances, there are still many subfields that have not seen the appropriate level of improvement in one or more of these areas, as is the case with process control systems and live acquisition methods [24].

Tools and techniques for filesystem and network forensics, for example, have seen high levels of acceptance in the field. This is understandable given the pervasiveness of these components within an investigation, their relatively standardized nature, and their ease of accessibility. It is also partly due to practitioners not requiring or utilizing other areas of research during their investigations; however, this should not be viewed as delegitimizing the practical nature of other research [23].

For instance, most computers utilize network access for some aspect of their operation. Since the network stack and protocols have remained relatively unaltered after years of computer development, tools that seek to analyze network traffic will be applicable in a wide range of settings. This differs from executables on a device, which can differ greatly across types of systems (e.g. Windows, Linux, Android), distributions of those systems (e.g. XP, Debian, Fedora), and even versions of those distributions. Given how long research can be expected to remain relevant in each of these areas, the current discrepancy between the maturity of tools and techniques in some subfields seems reasonable.

As the field continues to advance, the breadth of research across its subfields should increase. This breadth will also become necessary, as a strategic change in direction, such as live analysis, can drastically improve a practitioner’s understanding of a system that would have been nearly impervious to disk analysis alone.

In order to achieve this increased breadth of understanding, it is necessary for labs and workshops to be equipped with both personnel and environments to help teach and motivate new techniques. It is the goal of this effort to reduce the burden for teachers and lecturers by providing a modern and practical means of motivating and demonstrating the techniques of process detection within a Linux environment through a series of lectures and labs.

2014 47th Hawaii International Conference on System Science

Section two of this paper motivates the suggested lecture-lab format based upon experimental evidence and participant feedback. Sections three, four, and five discuss techniques used by malware to hide processes or information during a forensics investigation through loadable kernel modules (LKM), file infection, and obfuscation methods respectively. The final two content sections discuss the difficulties that arose from developing each lab, and how our experiences can be useful to others trying to replicate similar work.

2 BACKGROUND

The Honeynet Project describes itself as “a leading international 501c3 non-profit security research organization, dedicated to investigating the latest attacks and developing open source security tools to improve Internet security” [1]. Routinely, the Honeynet Project produces a challenge on a given topic that is open to anyone, with prizes being offered to the people or groups who create the best solutions to the challenge. Since the format of the challenge (e.g. a pcap file with questions) normally disallows a great deal of interaction between the participants and challenge developers, it is suggested to have internally motivated tasks within each challenge.

Challenge 12, "Hiding in Plain Sight" [2], received three complete submissions (i.e., all questions were answered) and several partial submissions. Based upon comments received via social networking forums (e.g. Twitter), at least as many people attempted the first few questions but did not create a submission. While there are many reasons a participant might not submit partial work, a few statements could shed some light. Several participants voiced concerns about unclear questions, about what constituted sufficient evidence for their claims, and about 'bugs' within the challenge.

The challenge attempted to motivate the analysis of a malware-infested environment through a series of directed questions to determine the malware’s intent, where the successful completion of one question would reveal how the following questions might be answered. This proved to be a reasonable method for participants who were already familiar with such analysis (i.e. those that created complete submissions) as the questions served to highlight important sections where they should focus their efforts. However, this alienated several participants who had difficulty with a particular problem or incorrectly answered a question, and would not be able to proceed further.

During this summer, a security workshop was held for students ranging from undergraduates to Ph.D. candidates with a strong academic background in security. The format for this workshop included a series of lectures followed by hands-on labs further exploring the practical nature of the lecture topic. This differed from the Honeynet challenge by ensuring that each of the participants had a baseline of knowledge, and by allowing teachers to be consulted when a problem arose. During the lab portion, participants were also given additional reference material including lists of relevant tools and man pages. Not only did the set of background material ensure that a baseline of knowledge could be assumed by the labs, but the exploratory nature required to answer the questions resulted in great interest and enthusiasm, including several participants wishing to continue an incomplete lab even after the session had ended.

2.1 Instructional Structure

The suggested lecture-lab format is largely motivated by two key observations regarding the participants’ enthusiasm and level of success: a baseline level of knowledge regarding the environment and appropriate tools ensures enthusiasm for an undirected exploration of an environment, and a directed exploration creates high quality analysis. Thus, we feel the high points of both methods may be achieved by letting the students explore the lab environment prior to a lecture on the nature of the lab, and concluding the lecture with a second and more directed attempt at the lab. As such, each of these teaching modules will contain four components: malware, initial live analysis, lecture, and final informed analysis. The following describes the goals of each component:

The malware and hiding techniques should be set up within a virtualized environment allowing each student or small group to gain access to a single VM that has been infected with the malware. The nature of the malware is not directly relevant to the lab structure, but a selection of malware samples has been provided to deploy within an environment [25]. The scripts to build the environment were largely motivated by existing instructional tools that have been successful at teaching malware forensics concepts [3] [4]. The goal of these labs will not be to analyze the malware’s objective, as that is more closely related to a reverse engineering class. Instead these labs will focus on observing the techniques malware may use to exist within a system by seeking to detect its presence through those means.

The task for each student during the initial live analysis is to identify what malware is running on the system, and to attempt to pinpoint what processes are causing the malware to execute without being specifically instructed how to do so. Students can be given some form of direction in terms of questions that should be answered within the lab assignment, but the primary purpose of this discovery phase is for them to utilize their existing digital forensics skills to analyze a running system. It is not expected that the students will know how the malware is executing at the completion of this component; instead, the component is intended to generate curiosity and analytic thinking about the possible techniques being employed.

The lecture portion should allow students to share what they learned during the first analysis, and have them come up with a plan for determining what operations the malware might be using. This will provide the lecturer with an informed understanding of the students’ level of experience and perception of the system so that future material can be more appropriately delivered. The content of the lectures should elaborate on the general topic the malware is working within (e.g. how processes are handled and reported within Linux when the malware is being removed from the proc filesystem). It is not the goal at this point to describe exactly how the malware was working, but merely to provide a guided method of explaining the nature of the environment at hand.

Following the lecture portion, students should be returned to the original lab environment and asked to perform informed analysis. They should have all the knowledge needed to understand the malware operations, or at least the abilities to find that knowledge quickly. After they detect where the malware is residing, they should attempt to remove the malware from the system to fully demonstrate their understanding. Ideally, this step should conclude with a final lecture to allow the class as a whole to share the lessons they have learned and to ensure that each student has the desired understanding of that section.

3 KERNEL HOOKING

Malware will commonly seek to achieve the highest privilege level needed once a system has been compromised. This privilege level varies depending on what operation the malware is trying to achieve, and which privileges are available for it to exploit. For instance, a simple DDoS bot may perform its necessary operation (sending network traffic to a target host) without requiring root permissions, while a Trojan may require increased permissions to install itself on the infected machine. As such, rootkits are a good topic to explore when introducing malware because they commonly require root permissions, often utilize kernel-hooking methods to provide a wide range of interesting functionality, and are, at least conceptually, something with which students commonly have a basic level of familiarity.

Linux kernel hooking techniques often rely on a loadable kernel module (LKM) to modify the kernel in order to provide, remove, or modify some feature available to a user-space process. By design, the kernel has a higher level of privilege than user-space code, and malicious code running at this level can modify common interfaces into the kernel to efficiently alter the operation of any program that uses this interface. This style of alteration can increase or decrease the privilege level of a process, hide a process, intercept and redirect information, or perform any number of other tasks, all of which are hard if not impossible to detect purely from user-space. Due to this nature, many common user-space tools and techniques that students are familiar with will fail to detect the malware, which can stimulate a discussion of how these tools work.

Conceptually, the labs that fall into this category are easy to introduce, as most students should be familiar with the concept of interfaces into another body of code from use of libraries. This allows parallels between the system call table and library functions, or between procfs and a normal filesystem to be made providing a high level understanding of the goals of the anti-forensics technique without mandating a full technical knowledge of kernel operations. However, it is crucial that the lab developer and lecturer be familiar with the chosen kernel version, as these interfaces have changed with time. Some modifications may be purely cosmetic, but many changes (e.g. those to the scheduler and task_struct) prohibit those techniques from working on modern systems. Each of the example labs will work on Ubuntu 12.04 32-bit systems.

3.1 Editing Proc FS

The proc filesystem (procfs) is a virtual filesystem (VFS), meaning it does not physically exist on disk. It is primarily used to describe some characteristics of the system including statistics regarding running processes, and serves as the primary interface for many common user-space tools such as ps, top, netstat, and others [6]. The malware for this lab should involve creating an LKM that will modify the procfs in a variety of ways including [7]:

• Modifying /proc/modules to remove the record of the malicious kernel module in use.

• Modifying /proc/`pid` to prevent any user-space component of the malware from being reported by other system utilities.

• Modifying /proc/`pid` to add new entries of non-existent processes in an effort to confuse common system utilities into reporting misinformation.

The lecture that follows should focus on describing how the kernel manages the procfs [8]. This discussion should describe the methods that a kernel module can use to create a new proc entry, how this entry is written to and read from, and how proc directories are formed. By describing how these read and write functions are managed within the kernel, students should be more easily able to understand the concepts of how it might be possible to add or remove information from procfs.
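
As a user-space cross-check for the first modification listed above, the following is a minimal sketch that compares the module names in /proc/modules with those visible under /sys/module. It rests on an assumption that should be verified on the lab kernel: that loaded (non built-in) modules expose an `initstate` file in sysfs. The function names are illustrative, and a rootkit that also unlinks itself from sysfs would not be caught this way.

```python
import os


def parse_proc_modules(text):
    """Return the set of module names from /proc/modules-style text."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first column is the module name
    return names


def sysfs_loaded_modules(root="/sys/module"):
    """List modules that sysfs reports as loaded.

    Assumption: only loadable modules have an 'initstate' file;
    built-in entries normally expose just 'parameters'.
    """
    return [name for name in os.listdir(root)
            if os.path.exists(os.path.join(root, name, "initstate"))]


def hidden_modules(proc_modules_text, sysfs_names):
    """Names present in sysfs but missing from /proc/modules."""
    return sorted(set(sysfs_names) - parse_proc_modules(proc_modules_text))
```

On a live lab VM, a student could call `hidden_modules(open("/proc/modules").read(), sysfs_loaded_modules())` and investigate any name that is returned.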


with the abilities to understand the concepts and to be able to detect when such methods are being utilized by a system. By motivating the method of wrapping system functions, they should have the knowledge required to understand the process the malware is using, but will still lack the method of detecting when the process hiding technique is being actively applied. This can be remedied by providing a shell kernel module that will wrap the same functions the malicious kernel module has already wrapped, and allow the students to use this source code to identify the existence of the malware during the informed analysis phase of the lab. Even in cases where students lack full understanding of how the LKM is altering operations – which ended up being rather common in our experience – simply knowing the existence of such methods is still an important lesson to learn.
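
As a complement to the kernel module approach described above (and not a replacement for it), a cross-view check from user-space can sometimes reveal hidden processes: PIDs that answer a null signal but have no directory in /proc deserve a closer look. The sketch below is illustrative only; the function names are hypothetical, and false positives are possible because thread IDs may answer a signal without appearing in a /proc listing and because processes can start or exit between the two views.

```python
import os


def listed_pids(proc_root="/proc"):
    """PIDs that appear as numeric directories in procfs."""
    return {int(name) for name in os.listdir(proc_root) if name.isdigit()}


def pid_exists(pid):
    """Probe a PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_hidden_pids(listed, candidates, exists=pid_exists):
    """PIDs that respond to the probe but are absent from the /proc listing."""
    return sorted(p for p in candidates if p not in listed and exists(p))
```

A student might run `find_hidden_pids(listed_pids(), range(1, 32768))`, where the upper bound is an assumed default for the maximum PID, and then repeat the run to filter out short-lived processes.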

3.2 Scheduler Modifications

Following the motivations presented in the previous lab, a second lab could also contain a method to hide or otherwise alter processes as they appear to user-space utilities to reinforce the point that there are often multiple methods for malware to perform an action that appears the same to the end user [9]. For instance, this lab could consist of a kernel module that will modify the task_struct for a given process to dynamically alter permissions, hide the process, and edit other features provided by the scheduler or current task_struct.

The Linux scheduler has undergone several major revisions, with the latest major overhaul occurring during the 2.6 kernel series, when the Completely Fair Scheduler (CFS) was merged in version 2.6.23. It should be noted that much of the literature found regarding the Linux scheduler targets the older implementation and may not be relevant to newer systems. However, the two implementations provide different sets of features that could be exploited, and a comparison of scheduler implementations and a discussion of the design choices behind the “fair scheduler” used in modern versions of Linux are beneficial lecture topics that relate to this type of process hiding. Unfortunately, detecting that processes are hiding through scheduler modifications (where relevant) involves a greater level of knowledge than is reasonable to expect, but detecting processes that have had other characteristics modified, such as the process owner’s user ID, can be easily done through profiling techniques.
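
One possible form of the profiling mentioned above is to record each process's user IDs from /proc/`pid`/status and compare a later snapshot against that baseline. The following sketch assumes the standard `Uid:` line of the status file (real, effective, saved, and filesystem UIDs); the function names and the snapshot dictionaries are illustrative, and PID reuse between snapshots can cause false alarms.

```python
def parse_status_uids(status_text):
    """Return (real, effective, saved, filesystem) UIDs from status text."""
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            return tuple(int(value) for value in line.split()[1:5])
    raise ValueError("no Uid line found")


def uid_changes(baseline, current):
    """Compare two {pid: uid_tuple} snapshots.

    Returns {pid: (old, new)} for every PID present in both snapshots
    whose recorded UIDs differ.
    """
    return {pid: (baseline[pid], current[pid])
            for pid in baseline
            if pid in current and baseline[pid] != current[pid]}
```

A web server worker whose UIDs move from an unprivileged account to 0 between snapshots would be an obvious candidate for further analysis.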

3.3 General Kernel Hooking Techniques

By this point, the students should be comfortable writing kernel code and analyzing different interfaces that are exposed to kernel- and user-space, and as such we may explore the kernel in greater depth and present more practical methods. A lab to demonstrate how rootkits can more appropriately manipulate the interaction between user-space and kernel-space by replacing entries within the system call table is ideal for deeper understanding [10]. While this table is no longer exported in modern kernel versions, it can still be found in memory through profiling techniques. This lab could feature a more appropriate malware such as a keylogger outputting to dmesg that the student must identify.

The lecture for this module should focus on other interfaces that are exposed by the kernel either to user-space or to other kernel modules. The sys_call_table should be covered in depth since it is the technique being used by this malware sample (or the IDT if using a keylogger [11]), and it is interesting to discuss how system calls work. Other interfaces that could be discussed are those exposed to kernel modules such as the module loading and unloading functions, PCI drivers, network or other types of hardware drivers, file operations, and the dev driver structures [12]. Since these interfaces are normally described by or accessed via a kernel structure, discussing how these structures might be modified to produce functionality of interest to malware writers could also be covered. The purpose is to describe the breadth of interfaces that can be exploited by any given kernel module, while the labs should focus on how a specific interface is being used so the student can generalize how this can extend to other interfaces.

During the informed lab, the students should be given another kernel module that provides a pointer to the system call table, and potentially a few other tables that are easy to traverse (e.g. the net_dev structs, fb_ops structs, etc). The student should be able to analyze these tables of function pointers found on the infected system and compare them to those of a system that has not been infected by the malware in order to determine what has been changed. Fixing the infected tables should allow the malware module and user-space components to be discovered and removed from the system. This section could also serve as a good final project where students are asked to describe how malware might seek to abuse other kernel interfaces, and other techniques that can be used to identify and fix modifications to those interfaces.
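
The comparison step can be done offline once each table has been dumped to text. The sketch below assumes a hypothetical dump format of one entry per line, `index address [symbol]`, as might be printed by the provided analysis module; it is not the format of any particular tool. Addresses are only comparable when both systems run the same kernel build.

```python
def parse_table_dump(text):
    """Parse 'index hex_address [symbol]' lines into {index: address}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        table[int(fields[0])] = int(fields[1], 16)
    return table


def diff_tables(clean, infected):
    """Return (index, clean_address, infected_address) for changed entries."""
    indexes = sorted(set(clean) | set(infected))
    return [(i, clean.get(i), infected.get(i))
            for i in indexes
            if clean.get(i) != infected.get(i)]
```

An entry whose address falls outside the kernel text range and inside module memory is a strong hint that the corresponding call has been hooked.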

3.4 Kernel Hooking Lessons

These types of labs run the risk of requiring extensive kernel knowledge to set up, and for a student to properly analyze. In most cases, students have never written kernel code, so it can be difficult to motivate the use of LKMs in their analysis instead of simply relying on user-space tools. The Honeynet challenge demonstrated that providing a similar reference module (adore) greatly facilitated analysis, and the summer workshop took this a step further by providing example analysis code. This code would set up appropriate interfaces and present important data structures for manipulation while leaving the core analysis to be written by the student, which greatly reduced the technical knowledge that would otherwise be required.


As already mentioned, many participants did not present thorough understanding of the techniques being used to hide processes, and this should most likely be expected. Kernel code from a purely pragmatic point of view can be daunting to most students, and requiring a detailed understanding is not practical. Even the most thorough students and Honeynet challenge participants could not present our expected level of understanding, which serves as a good reminder that these kernel hooking labs – like all labs suggested in this paper – should be designed with these practical limitations in mind.

4 PROCESS INFECTION

Even in cases where the malware might have root permissions, it may choose to not utilize them fully when the malware’s end goal does not require them. For instance, a Browser Helper Object is a more appropriate method of intercepting encrypted bank information than a root process, and the infection of a service can ensure the longevity expected from a kernel module, while being harder to detect.

These sets of labs focus on how malware may interact with existing binaries to alter their intent, or simply to infect the process and utilize it as a place to live. Students will be introduced to the concepts of shared library loading, dynamic address resolution, and other runtime features of ELF executables. Since the emphasis is on digital forensics, the focus will remain on detecting that this type of compromise has occurred, both from within the executable (if possible) and external to the executable, in both dynamic and static analysis. There is an impressive breadth of possibilities for this to occur in real world environments [13]; as such, these labs will serve as a brief introduction to a few select methods.

Most students who have been exposed to security concepts are aware of potential problems from poorly validated user input, such as that received from stdin, sockets, and files. However, when an attack does not come through one of these methods, many security concepts are completely ignored. For instance, the Honeynet challenge ended with virtunoid, an attack on the emulation layer of KVM [27]. Every participant who attempted to answer the question regarding the nature of the attack was able to successfully identify it as virtunoid, and many even linked to articles on the attack. However, the majority were unable to describe the precise nature of the attack (unplugging an emulated device from a running system) and incorrectly attributed it to any number of other means since it did not follow a preconceived notion of how software gets attacked. Since the surfaces being altered for these labs fall outside the commonly expected attack surfaces, it is crucial to properly inform students of the complete nature of attack surfaces.

4.1 Bash Alias and Exec

Most times malware is seen within an environment, it is running at some privilege (e.g. root, www_user, etc.) and will use those privileges to monitor processes, or hide itself. There are several techniques that these malicious processes can utilize regardless of the process owner’s permissions, and due to their simplistic nature they are a useful way to introduce the topic of process hiding to students. Within a running environment, students should be given an account that has been maliciously modified such as with the following:

• Login and logout scripts – The files ~/.bash_profile and ~/.bash_logout are executed on user login and logout respectively, and modifying these to start or stop malware can serve as an introduction to how malware can be triggered through uncommon means.

• Bash alias – The file ~/.bashrc is executed whenever an interactive shell starts, and modifying this script to set up aliases for commands such as ls, ps, or cd that start, stop, or restart the malware as necessary can serve as another unanticipated method to trigger malware.

• Exec – The exec command can be used to start processes with different names, potentially making it hard to find or kill a process that keeps making new threads, or when results of common utilities are being searched through (e.g. ps aux | grep …).

• Cron – Most user accounts have the ability to manage a crontab that can be used to restart malware at specified time intervals.

The malware for this section can be a simple script that attempts to ensure there are several copies running at any given time and will restart additional processes if that number is not reached. This serves as a simple introduction to a Linux environment, which might not be well known by many students, and provides an introduction to several basic utilities.
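
For the alias technique in the list above, a first pass over the shell startup files can be automated. The following is a minimal sketch that flags alias definitions shadowing commonly used commands; the watched command list is an arbitrary choice, the function name is illustrative, and the pattern only handles one alias per line, so shell functions or multi-alias lines would need additional checks.

```python
import re

# Matches lines such as: alias ps='/tmp/.x/start; ps'
ALIAS_RE = re.compile(r"^\s*alias\s+([A-Za-z0-9_.-]+)=(.*)$")


def find_aliases(script_text, watched=("ls", "ps", "cd")):
    """Return (line_number, name, definition) for aliases of watched commands."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), 1):
        match = ALIAS_RE.match(line)
        if match and match.group(1) in watched:
            hits.append((lineno, match.group(1), match.group(2).strip()))
    return hits
```

Running this over the contents of ~/.bashrc and ~/.bash_profile gives students a short list of definitions to inspect by hand.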

4.2 Preloading Shared Libraries

When using dynamically loaded libraries, it is possible to alter the linking process to force alternate libraries to replace some, if not all, of the original library’s functionality [14]. A malicious use of this functionality could replace all, or a select set, of the functions within libc, for instance, allowing any program that uses libc (which is nearly every program written in C) to instead import malicious library code. Simply replacing libc with a modified version containing the malicious code can also achieve this.

The lecture for this section should include a description of how a Linux executable is loaded, including how the linker determines which shared libraries to load, and the nature of environment path variables such as the library path and binary paths. The focus here should be on detailing how a single command issued by a user gets interpreted and handled by the rest of the system to execute a process, and how each of these steps is subject to malicious tampering.
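
Two places where preloading is commonly configured are the LD_PRELOAD environment variable and the system-wide /etc/ld.so.preload file. The sketch below collects entries from both, taking the raw contents of /proc/`pid`/environ (NUL-separated, reflecting the environment at process start) and the text of the preload file as inputs. The function names are illustrative, and the handling of `#` comments in the preload file is an assumption about the dynamic linker's behavior.

```python
import re


def parse_environ(raw):
    """Parse the bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for item in raw.split(b"\0"):
        if b"=" in item:
            key, value = item.split(b"=", 1)
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def preload_entries(environ_raw=b"", ld_so_preload_text=""):
    """Return (source, library_path) pairs for every preloaded library found."""
    entries = []
    value = parse_environ(environ_raw).get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by colons or whitespace.
    entries += [("LD_PRELOAD", p) for p in re.split(r"[:\s]+", value) if p]
    for line in ld_so_preload_text.splitlines():
        line = line.split("#", 1)[0]  # assumed comment syntax
        entries += [("/etc/ld.so.preload", p)
                    for p in re.split(r"[:\s]+", line) if p]
    return entries
```

Any entry returned for a process that is not expected to use preloading is worth examining, particularly if the library lives in a temporary or hidden directory.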

4.3 GOT & PLT

Libraries are an integral part of modern software systems that provide a means for an application to utilize existing software in a simple and efficient manner. Within ELFs, library access is managed by a Global Offset Table (GOT) and Procedure Linkage Table (PLT), which together allow the dynamic linker to determine addresses of library functions at runtime. These tables can be patched by an attacker to point to other functions mapped into the process space during execution or, in some cases, the binaries themselves can be modified so that these lookups will call malicious versions of functions, similar to the LD_PRELOAD method.

The lecture portion for this section should describe the process the linker uses to find and update the GOT for a running program, as well as how a program can issue system calls. By giving a general overview of the means with which a program can access external code, students should gain increased awareness of the attack surfaces exposed by every binary as well as hardening attempts. Methods to detect this style of infection include identifying binary modifications, observing libraries mapped into the process’s address space, and determining irregularities within the GOT either by recording relative offsets of known good tables, or by observing which libraries are being referenced by each entry.
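
Of the detection methods above, observing mapped libraries is the easiest to script. The following sketch parses the text of /proc/`pid`/maps and reports file-backed executable mappings that are not on an expected list; the function names and the idea of a per-program allowlist are illustrative assumptions rather than part of the original labs.

```python
def mapped_files(maps_text):
    """Return file paths that back executable mappings in maps-style text."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional path
        fields = line.split(None, 5)
        if len(fields) == 6 and "x" in fields[1]:
            path = fields[5].strip()
            if path.startswith("/"):
                paths.add(path)
    return paths


def unexpected_libraries(maps_text, expected):
    """Executable mappings whose backing file is not in the expected list."""
    return sorted(mapped_files(maps_text) - set(expected))
```

The expected list can be built from a clean run of the same binary, which also gives students a concrete baseline to reason about.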

4.4 Process Infection Lessons

With the exception of techniques that alter the ELF structure, standard tracing tools are sufficient to detect that these operations are occurring. As such, these labs are useful at motivating the importance of these tools. The most challenging part of these lessons involves properly analyzing the different attack surfaces that an application has, such as configuration files, library paths, and other environment variables that are frequently taken for granted as always being safe and properly set up. During the workshop, one lab contained a backdoor within libc, and it was not apparent that any student identified where such a backdoor could have originated even though this was covered within the reading material. However, in other cases where there was a bug in the processing of user input through stdin or sockets, students were able to properly identify the nature of the attack, indicating that a lack of understanding of the breadth of the attack surface is the key misconception being faced.

In our experience, this lack of understanding can largely be attributed to students improperly placing trust in certain bodies of code while remaining skeptical of others: init scripts to start and stop services would be trusted whereas the services they controlled were properly analyzed; techniques for generating random numbers were deemed secure and blame placed on prime number generation; and guessable passwords were blamed as the cause of attack on unprivileged accounts. In each of these cases, students largely sought to analyze sections that adhere to their notion of where attacks can occur instead of observing less technical areas.

5 OBFUSCATION

An obfuscation technique is any method that a binary may utilize to hide or disguise its operation, including anti-disassembly, anti-debugging, encryption, and steganography. While detection of these techniques does not necessarily indicate malicious behavior (as many legitimate processes may utilize obfuscation methods), knowledge of how to properly detect and circumvent these techniques can be valuable in understanding high-level malware operations.

In cases where a system administrator cannot prevent or detect when malware is installed on a computer, whether through an exploit or otherwise, it is useful to be able to identify the communication channel (if one exists) or other data channels. These sets of labs focus on methods that hinder the detection of such channels, or prevent eavesdropping on such a channel in order to ensure students are familiar with methods that may be utilized within malware samples.

The steganography lab was taken directly from the Honeynet challenge, in which the participants were required to extract an encrypted payload from a series of BMP images. This was the most common point at which participants dropped out of the challenge, probably because it was the first time they were required to write code. While the code should have taken no longer than an hour to write, many attempts to use existing tools to analyze the images were made before attempting to write a custom utility. In fact, a common theme amongst all labs was the difficulty for a participant to change the mode of analysis once started (i.e. once analysis using existing tools proved to be successful for one step, such tools were tried first for all future steps even if a small bit of debugging or programming could have solved the problem more quickly). These labs have been changed from their original versions to be solvable exclusively through use of existing tools, as many students were not comfortable having to write their own analysis code under any circumstance.

5.1 Steganography

Steganography is the study of methods for hiding information within data structures. The most common application of steganography, and the method that is normally conjured upon hearing the term, is that of embedding arbitrary data inside an image file [15]. This lab consists of a program that requests resources from several different web servers, some of which happen to be images. The challenge for the student is to determine what, if anything, is happening with this program and whether it is malicious in nature. Since reverse engineering of binaries is not the focus, the lab is performed through analysis of the files the program receives from the network and saves to disk.

The lecturer should focus on describing the nature of steganography to the students at whatever level of mathematical knowledge the students are expected to have (e.g. graduate students could handle error correcting methods) [16]. Aside from least significant bit (LSB) image steganography, there are many image methods that are resilient against image modifications, as well as methods that target movie formats, music formats, word processing formats, and others. ImageMagick is a common image processing utility, available in many repositories, that has steganography operations built in [17].
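To make the LSB idea concrete for a lecture, the following is a minimal sketch and not the encoding used in the lab or the Honeynet challenge. It assumes one payload bit is stored in the lowest bit of each cover byte, most significant payload bit first, and that the payload length is known to the extractor. The function names embed_lsb and extract_lsb are illustrative only. It operates on a plain byte sequence, such as the pixel array of an uncompressed BMP once the file header has been skipped.

```python
def embed_lsb(cover: bytes, payload: bytes) -> bytearray:
    """Hide payload in the least significant bit of each cover byte."""
    # Flatten the payload into bits, most significant bit first (an assumption).
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the payload")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Recover n_bytes of payload from the low bits of stego."""
    if n_bytes * 8 > len(stego):
        raise ValueError("stego data is too short for the requested length")
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        # Reassemble eight consecutive low bits into one payload byte.
        for carrier in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (carrier & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    cover = bytes(range(256))          # stand-in for raw pixel data
    stego = embed_lsb(cover, b"key")
    print(extract_lsb(stego, 3))
```

Because only the lowest bit of each carrier byte changes, every byte differs from the cover by at most one, which is why the change is not visible in the image and why statistical methods such as those in [16] are needed to detect it.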

5.2 Encryption

This lab is very similar to the previous steganography lab in that the same client-server model is used, but rather than passing files, the program passes encrypted messages. In this case, it is the job of the student to determine how best to decrypt the messages, whether through hooking the libraries with the LD_PRELOAD method or performing memory dumps on the running process. The lecturer should focus on memory dumping techniques for extracting files and other information from memory, such as running strings on the memory dump [18].
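For students who want to see what running strings on a dump amounts to, the following is a rough sketch of the idea and not a replacement for the real utility. It assumes the dump has already been read into memory as bytes, treats only the ASCII range 0x20 to 0x7e as printable, and uses a minimum run length of four characters. The function name printable_strings and the sample bytes are made up for illustration.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII of at least min_len characters."""
    # Match min_len or more consecutive bytes in the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Fabricated bytes standing in for a process memory dump.
    dump = b"\x00\x01secret_key=abc\x00\xffhi\x00GET /index.html\x00"
    for text in printable_strings(dump):
        print(text)
```

Plaintext that the process holds before encrypting or after decrypting a message shows up as such runs, which is what makes the memory dump a useful way around the encrypted channel.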

5.3 Packers

While encryption techniques and steganography can prove to be problematic for digital forensics, binary packers significantly raise the level of effort required on the part of the analyst. Binary packers are additions to standard binaries that will encrypt, compress, or otherwise alter the original executable file in order to disguise its operation [19]. This could include simply compressing it to bypass static analysis, and decompressing once executed, or searching for known vulnerable applications and only decrypting once they have been identified. This lab will seek to introduce a binary packed piece of malware that will perform an operation whenever certain criteria have been met.

This lab has been designed so that reverse engineering of the binary is not necessary to complete this section. Instead, the lab focuses on identifying the digital artifacts that the malware creates within the system and the system calls that it requests before it immediately stops executing. Tools to discuss during the lecture include runtime analysis tools such as strace and ltrace [20] [21]. These allow students to identify what requirements the malware sample has during execution, and then modify the system to meet those requirements (e.g., install appropriate versions of libraries).
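One way to turn a saved trace into a list of unmet requirements is sketched below. It assumes the trace was captured to text with strace and that failed lookups appear as calls returning -1 ENOENT. The sample trace is fabricated for illustration and only follows the usual strace line layout; the paths in it, the regular expression, and the function name missing_files are all hypothetical and not taken from the lab.

```python
import re

# Calls that look up a path, followed by a return value of -1 ENOENT.
FAILED_LOOKUP = re.compile(
    r'^(?:\d+\s+)?(open|openat|access|stat)\((?:AT_FDCWD, )?"([^"]+)".*\)\s+= -1 ENOENT'
)

# Fabricated example in strace's line format; not output from a real sample.
SAMPLE_TRACE = '''\
execve("./sample", ["./sample"], [/* 20 vars */]) = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libexample.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
access("/tmp/.trigger", F_OK) = -1 ENOENT (No such file or directory)
exit_group(1) = ?
'''


def missing_files(trace_text: str) -> list:
    """List paths the traced program looked for but did not find, in order."""
    missing = []
    for line in trace_text.splitlines():
        match = FAILED_LOOKUP.match(line)
        if match and match.group(2) not in missing:
            missing.append(match.group(2))
    return missing


if __name__ == "__main__":
    for path in missing_files(SAMPLE_TRACE):
        print(path)
```

A list like this gives students a starting point for deciding which files or libraries to provide before running the sample again, without ever opening the packed binary in a disassembler.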

5.4 Obfuscation Lessons

As mentioned before, many students, when presented with a problem that could be solved with minor amounts of programming, instead sought to use existing tools in rather creative ways to achieve similar results. Perhaps requiring the students to design and implement a program to solve the problem could have yielded good results. However, programming is not the intention of these labs, so they have been changed to ensure that they can be solved by existing tools alone, allowing a larger number of students to gain insight into the underlying concepts.

6 DIFFICULTIES AND SOLUTIONS

One of the central difficulties presented by these projects, and the central motivation for this paper, was the challenge of finding existing techniques and source code that worked on a given kernel version or utility. As with many security-related concepts, the tools and techniques change over time as implementations are modified to add features or harden systems. The digital forensics topic of identifying malicious processes in hiding was no exception, and it proved challenging to identify which kernel versions a given rootkit was functional within, and how it might be ported to modern kernel versions when possible. This has been addressed by ensuring that the source code provided is functional within a vanilla install of Ubuntu 12.04 LTS server or desktop editions (which is scheduled to remain an available and supported distribution through 2017) without any additional modification. In addition, preconfigured virtual machines with these labs installed are also available for use in the NSF-funded RAVE lab [26].

The second challenge encountered was setting up the appropriate user-space tools that perform the desired actions for the malware, such as steganography libraries. While there are many examples of tools that are widely available and easy to use, few were appropriate to use within the lab (e.g. encryption libraries offering additional layers of protection against dumping memory). Developing in-house tools to perform such operations, and researching additional tools that more appropriately meet our requirements, has resolved this issue.

The third challenge encountered involved developing malware that was simple enough to be identified by the students, since these operations were not intended to be difficult, yet could still take advantage of the rootkit or hiding techniques being utilized. This was largely solved by designing a few simple malware samples that perform undesirable actions while remaining understandable to the students.

7 FUTURE WORK

We feel that this effort has been successful in the stated goal of designing educational modules for students that are relevant, useful, and easy for an instructor to set up and utilize. While the focus has largely been on techniques that allow malware to hide in more modern systems, this could also be extended to include systems that are most commonly found, as well as a larger emphasis on usual user-space utilities.

This effort can also benefit from better and more appropriate documentation. There are comments within the source to help identify the tasks that different techniques are achieving, but without appropriate background information on some of these systems the material might be too hard for an instructor to adequately understand. Aside from a better description of the background information the instructor is required to possess, it would also be useful to include suggested lecture material, such as PowerPoints, and homework question and answer documents. Some of the above suggestions are currently underway, and will be made available through the RAVE Lab when completed.

8 REFERENCES

[1] “About The Honeynet Project”. Retrieved June 01, 2013 from http://www.honeynet.org/about

[2] L. McDaniel, “Hiding in Plain Sight”. Retrieved June 01, 2013 from http://www.honeynet.org/node/906

[3] J. Corbet, “A new Adore root kit”. Retrieved June 01, 2013 from http://lwn.net/Articles/75990/

[4] “SucKIT”. Retrieved June 01, 2013 from http://packetstormsecurity.com/files/40690/suckit2priv.tar.gz.html

[5] Bunten, Andreas. "Unix and linux based rootkits techniques and countermeasures." 16th Annual FIRST Conference on Computer Security Incident Handling. 2004.

[6] Jones, K. "Loadable kernel modules." login: The Magazine of USENIX and SAGE, 26 (7) (2001).

[7] “Fun with Linux Kernel Modules”. Retrieved June 01, 2013 from http://commons.oreilly.com/wiki/index.php/Network_Security_Tools/Modifying_and_Hacking_Security_Tools/Fun_with_Linux_Kernel_Modules

[8] T. Jones, “Access the Linux kernel using the /proc filesystem”. Retrieved June 01, 2013 from http://www.ibm.com/developerworks/library/l-proc/index.html

[9] “Process Hiding & The Linux Scheduler”. Retrieved June 01, 2013 from http://www.phrack.org/issues.html?issue=63&id=18

[10] R. Peláez. “Linux Kernel Rootkits: Protecting the System’s ‘Ring-Zero’”. Retrieved June 01, 2013 from http://www.sans.org/reading_room/whitepapers/honors/linux-kernel-rootkits-protecting-systems_1500

[11] Baliga, Arati, Liviu Iftode, and Xiaoxin Chen. "Automated containment of rootkits attacks." Computers & Security 27.7 (2008): 323-334.

[12] P. Salzman, “The Linux Kernel Module Programming Guide”. Retrieved June 01, 2013 from http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html

[13] Al Daoud, Essam, Iqbal H. Jebril, and Belal Zaqaibeh. "Computer virus strategies and detection methods." Int. J. Open Problems Compt. Math 1.2 (2008): 12-20.

[14] T. Jones, “Access the Linux kernel using the /proc filesystem”. Retrieved June 01, 2013 from http://www.ibm.com/developerworks/library/l-dynamic-libraries/

[15] Roque, Juan Jose, and Jesus Maria Minguet. "SLSB: Improving the steganographic algorithm LSB." Proceedings The Ibero-American Congress on Information Security (CIBSI). 2009.

[16] Provos, Niels. "Defending against statistical steganalysis." 10th USENIX security symposium. Vol. 10. 2001.

[17] “ImageMagick”. Retrieved June 01, 2013 from http://www.imagemagick.org/script/index.php

[18] Pettersson, Torbjörn. "Cryptographic key recovery from linux memory dumps." Chaos Communication Camp (2007).

[19] Roundy, Kevin A., and Barton P. Miller. "Binary-Code Obfuscations in Prevalent Packer Tools." ACM Journal Name 1 (2012): 21.

[20] “strace”. Retrieved June 01, 2013 from http://linux.die.net/man/1/strace

[21] “ltrace”. Retrieved June 01, 2013 from http://linux.die.net/man/1/ltrace

[22] Raghavan, Sriram. "Digital forensic research: current state of the art." CSI Transactions on ICT 1.1 (2013): 91-114.

[23] Nance, Kara, Brian Hay, and Matt Bishop. "Digital forensics: defining a research agenda." System Sciences, 2009. HICSS'09. 42nd Hawaii International Conference on. IEEE, 2009.

[24] Beebe, Nicole. "Digital forensic research: The good, the bad and the unaddressed." Advances in digital forensics V. Springer Berlin Heidelberg, 2009. 17-36.

[25] http://www.securityworks.com/dl/hicss-47-samples.zip

[26] RAVE Lab, http://www.rave-lab.com

[27] Elhage, Nelson. “Virtunoid: Breaking out of KVM.” Black Hat USA (2011).