Securely Isolating Malicious OS Kernel Modules Using Hardware Virtualization Support ⋆

Available at http://www.Jofcis.com

Zhixian CHEN 1,∗, Jun CUI 2, Wei LIU 3, Bin XU 1

1 School of Computer Science and Information Engineering, Zhejiang Gongshang University, Hangzhou 310018, China

2 No. 28 Research Institute, China Electronic Technology Group Corporation, Nanjing 210007, China

3 State Key Laboratory for Novel Software, Nanjing University, Nanjing 210093, China

Abstract

Kernel-level attacks and rootkits that compromise the security of an operating system are among the most important concerns in system security at present. A variety of solutions have been proposed to defend against these attacks by enforcing data-flow integrity or control-flow integrity of an operating system. However, the added overhead likely makes those approaches unsuitable for many real-world applications. In this paper, we present SecEye, a lightweight hypervisor that protects kernel integrity by securely isolating malicious dynamically loadable kernel modules using hardware virtualization support. A preliminary prototype is implemented on Linux as a kernel module; it can detect kernel-level rootkits and attacks while resisting tampering by malicious code. Experimental results demonstrate the effectiveness and feasibility of SecEye and show that the overhead imposed is tolerable.

Keywords: Kernel Integrity; Rootkit; Hardware Virtualization; Hypervisor

1 Introduction

Kernel security is an essential part of the security of an operating system (OS). According to a recent survey, malware infection accounted for about two-thirds of security incidents in 2010 [1]. By subverting the victim OS kernel, a kernel-level rootkit embeds itself into the compromised kernel, modifies kernel-level code or sensitive data to hide various malicious activities, changes the behavior of the victim OS, or takes complete control of the compromised system [5, 12]. Once malicious or defective code obtains the highest OS privilege, all protected code and data are at risk of being modified. To protect OS kernels from kernel-level rootkits or attacks, there have been recent efforts to protect the integrity of OS kernels. These techniques can be roughly classified into two main categories: those that focus on protecting sensitive data and kernel-level code, such as Data-flow Integrity (DFI) [2], and those primarily designed to enforce Control-flow Integrity (CFI) [3, 4]. Unfortunately, they can be bypassed or disabled in some ingenious ways.

⋆ Project supported by the National High-Tech Research and Development Program of China (No. 2007AA01Z409), Zhejiang Provincial Natural Science Foundation of China (No. LY12F01018), Commonweal Project of Science and Technology Department of Zhejiang Province (No. 2012C33070), the Scientific Research Foundation of Zhejiang Gongshang University (No. 1130XJ2012012).

∗ Corresponding author.

Email address: czx [email protected] (Zhixian CHEN).

In recent years, along with the development of hardware-assisted virtualization, a new hardware enhancement, VMX (Virtual Machine eXtension), has been introduced to the processor. In Intel terminology, the privileged mode is called VMX root mode, whereas the unprivileged mode is called VMX non-root mode and can be regarded as a new mode of operation with reduced privileges. A hypervisor can run in VMX root mode and remain transparent to the guest OS running in VMX non-root mode. With the assistance of hardware virtualization, a virtual machine monitor (VMM) is able to collect real-time information about the guest OS, including memory, registers and instructions. Some approaches use a VMM to prevent unauthorized modification of code and data; however, the added overhead and the resulting performance loss are a major problem.

To address the above issues and protect the integrity of an OS kernel, we need to consider both system performance and the minimal security requirements that a security monitor or hypervisor must satisfy. In this paper, utilizing hardware virtualization features available in recent processors, we present SecEye, a lightweight in-kernel hypervisor that protects the kernel integrity of an operating system.

The rest of the paper is structured as follows. Section 2 introduces related work and our motivation. Section 3 describes the design goals, the architecture and the detailed implementation of SecEye. Section 4 reports an experimental evaluation of its security performance on Linux 2.4 and 2.6 kernels. Section 5 discusses current problems and further work, and concludes this paper.

2 Related Work

A variety of approaches have recently been proposed to actively or passively monitor and protect kernel integrity. These security monitoring approaches can be broadly divided into two categories.

a. The monitor resides in the same untrusted environment, i.e., inside the operating system it protects, and runs at the kernel privilege level. The security monitor retains the efficiency of accessing the system address space at native speed, so performance requirements are easy to satisfy in this case. However, the monitor itself can be compromised by kernel-level rootkits or attacks. Most current kernel-level rootkits and attacks can readily corrupt kernel-level code or sensitive data, which means that this kind of monitor lacks the ability to protect itself.

b. The monitor resides in a separate trusted environment, deployed outside the protected kernel to provide independent, trustworthy analysis of the state of the protected OS. Such systems cannot intercept the kernel's privileged instructions and can therefore fail to detect malicious kernel attacks. For example, Copilot [6], a coprocessor-based kernel integrity monitor for commodity systems, detects malicious modifications to a host's kernel by accessing kernel memory. Copilot's main advantage is that it is independent of the monitored kernel and is able to protect itself. However, the fundamental limitation of a coprocessor-based kernel monitor is its inability to interpose on the host's execution. For Copilot, the view of the monitor is limited to main memory; there is no means of suspending the host CPU's execution or examining its registers [6]. In other words, the monitor in this case lacks the ability to acquire kernel semantics.

A Xen virtual machine based integrity monitor is another representative approach; it is capable of analyzing virtualized guest operating systems running on top of the open-source Xen hypervisor [7]. Unlike Copilot, a VMM-based monitor does not require extra hardware and has full access to all of the target virtual machine's state, including registers. The disadvantages of this approach are the overhead incurred and the difficulty of reducing it. Another challenge is that Xen Dom0 faces many security threats. Furthermore, the data structures of Xen Dom0 are large and complicated, which makes it difficult to formally verify their correctness.

From the above analysis, we find that it is very difficult for current approaches to achieve reference monitoring, semantic acquisition and self-protection simultaneously.

In addition, we noticed that most current kernel-level attacks and rootkits are implemented as third-party device drivers or loadable kernel modules (LKMs), which are supported by most current Unix-like systems (e.g., Linux and Solaris) and by Microsoft Windows. LKMs are developed to extend the running kernel, the so-called base kernel, of an operating system and are allowed to run with the highest OS privilege. On a compromised system, an attacker can abuse this privilege to modify kernel-level code or sensitive data (such as the system service dispatch table, interrupt descriptor table, page tables, registers, and network ports) in order to hide various malicious activities, change OS behavior or take complete control of the system. Thus, securely isolating malicious LKMs can efficiently defend against kernel-level attacks and rootkits.

3 Architecture and Implementation of SecEye

With the assistance of hardware virtualization, many current approaches can monitor the behavior of LKMs effectively; however, fine-grained access control results in high performance overhead. In this paper, we present a novel approach based on hardware virtualization technology. The security mechanism resides in the same address space as the victim OS kernel, whose privileged instructions and accesses to predefined memory pages or registers are trapped into the security mechanism with the help of virtualization technology. Any attempt to attack the hypervisor is therefore captured and stopped by the security mechanism, so the hypervisor can protect itself from attacks. On the other hand, the hypervisor is able to obtain precise semantics of kernel objects because they share the same virtual address space. In order to monitor the operations of an LKM, we define different page tables for trusted kernel modules (TKMs) and LKMs, and remove the LKMs' write permission to kernel objects. Any attempt by an LKM to jump or call into kernel space is caught by the hypervisor, which restricts the set of function addresses that LKMs may call and thereby limits their power to damage the OS kernel.

3.1 Performance and security requirements

Based on the above discussion, the performance and security requirements of our proposed system can be defined as follows:


• TKM code and data must not be maliciously alterable. Every attempt to modify TKM code or data is captured by the hypervisor.

• If an LKM needs to call a code segment in TKM, it can only jump to the entry point of a function exported by TKM.

• When an LKM returns to TKM, it can only return to the instruction immediately following the call that invoked it.

• Read operations by an LKM are not monitored. They execute at native speed without any hypervisor intervention.
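The call and return rules above can be pictured with a small simulation. This is only a sketch under assumptions: the class name `TransferPolicy`, the example addresses, and the stack of expected return addresses are hypothetical illustration devices and are not structures described in this paper.

```python
class TransferPolicy:
    """Toy model of the LKM/TKM control-transfer rules (hypothetical)."""

    def __init__(self, exported_entries):
        # Entry addresses of functions exported by TKM. We assume the
        # monitor can obtain this set; the paper does not say how.
        self.exported_entries = frozenset(exported_entries)
        # Return addresses inside TKM that the monitor expects,
        # most recent call last.
        self.expected_returns = []

    def lkm_may_call(self, target):
        """An LKM may enter TKM code only at an exported entry point."""
        return target in self.exported_entries

    def tkm_calls_lkm(self, return_addr):
        """Remember the TKM instruction that follows the call into the LKM."""
        self.expected_returns.append(return_addr)

    def lkm_may_return(self, addr):
        """An LKM may return only to the instruction after the pending call."""
        if not self.expected_returns or self.expected_returns[-1] != addr:
            return False
        self.expected_returns.pop()
        return True
```

A call into the middle of an exported function, or a return to any address other than the recorded one, is rejected; reads are not modelled at all because they are never checked.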

3.2 Architecture overview

[Figure 1 shows the Trusted Kernel Module (TKM) with its TKM Code and TKM Data, the Dynamically Loadable Kernel Module (DLKM) with its DLKM Code and DLKM Data, and SecEye, the security monitor based on VMM.]

Fig. 1: Overall Design of SecEye

The goal of our SecEye is to enable security monitors that meet all the performance and security requirements discussed in Section 3.1. In this section, we will describe the design of the SecEye framework based on hardware virtualization features.

The overall design of SecEye is shown in Figure 1. The main idea of SecEye is to create two separate page tables, one for TKM and one for DLKM (such as device drivers or other kernel modules that can be added to a running system without rebooting the system or rebuilding the kernel), each of which maps virtual addresses to physical addresses. When an instruction of TKM or DLKM is executed, the corresponding page table is used by the hardware to perform address translation.

The page tables and memory mapping mechanism introduced by the SecEye framework are shown in Figure 2. In the figure, the TKM virtual address space at the left is the virtual address space defined by the operating system for trusted kernel modules. The virtual address space created for dynamically loadable kernel modules is shown at the right as the DLKM virtual address space. For each region in the virtual address spaces, the figure shows the access rights and mapping that the hypervisor sets on the relevant pages.

In the DLKM virtual address space, TKM data, user code and DLKM code are all marked as read-only, so every attempt to modify them results in a page-fault exception indicating a permission violation. TKM code is not mapped in the DLKM virtual address space, so every direct jump from DLKM to a TKM code segment also results in a page-fault exception. Both cases trap to our SecEye hypervisor. In the latter case, SecEye updates the content of the CR3 register to switch address spaces once the security checks succeed, so the function call is executed normally. This mechanism ensures that DLKM cannot modify or jump to any module in TKM directly unless it passes the necessary security check.

[Figure 2 shows the TKM virtual address space and the DLKM virtual address space side by side in kernel state, each with User Code, User Data, TKM Code, TKM Data, DLKM Code, DLKM Data, Monitor Code and Monitor Data regions, reached through the page table for TKM and the page table for DLKM respectively. The legend distinguishes three rights: not mapped, read only, and read and write.]

Fig. 2: Configuration of page tables

Similarly, in the TKM virtual address space, TKM code is marked as read-only and DLKM code is not mapped, so every attempt to modify TKM code and every direct jump from TKM to a DLKM code segment results in a page-fault exception and then traps to our SecEye hypervisor. In the latter case, once the security checks succeed, SecEye updates the content of the CR3 register to switch address spaces and the function call is executed normally.

In addition, in both the TKM and DLKM virtual address spaces, the monitor code and data are marked as read-only. They therefore cannot be maliciously altered: every attempt to modify the monitor results in a page-fault exception and traps to our SecEye hypervisor.
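The access rights described in this subsection can be collected into two small tables. The sketch below is an illustration only: the names `Access`, `TKM_SPACE`, `DLKM_SPACE` and `traps_to_monitor` are hypothetical, and the entries commented as assumed are rights that the text does not state explicitly.

```python
from enum import Enum


class Access(Enum):
    NOT_MAPPED = "not mapped"
    READ_ONLY = "read only"
    READ_WRITE = "read and write"


# Rights taken from the text; "assumed" marks entries the text leaves open.
TKM_SPACE = {
    "tkm_code": Access.READ_ONLY,
    "tkm_data": Access.READ_WRITE,    # assumed
    "dlkm_code": Access.NOT_MAPPED,
    "dlkm_data": Access.READ_WRITE,   # assumed
    "monitor_code": Access.READ_ONLY,
    "monitor_data": Access.READ_ONLY,
}
DLKM_SPACE = {
    "user_code": Access.READ_ONLY,
    "tkm_code": Access.NOT_MAPPED,
    "tkm_data": Access.READ_ONLY,
    "dlkm_code": Access.READ_ONLY,
    "dlkm_data": Access.READ_WRITE,   # assumed
    "monitor_code": Access.READ_ONLY,
    "monitor_data": Access.READ_ONLY,
}


def traps_to_monitor(space, region, operation):
    """Return True if `operation` ('read', 'write' or 'exec') on `region` faults.

    Any access to an unmapped region faults; a write to a read-only
    region faults; every other access proceeds without intervention.
    """
    right = space[region]
    if right is Access.NOT_MAPPED:
        return True
    return operation == "write" and right is Access.READ_ONLY
```

Under this model a DLKM write to TKM data and a DLKM jump into TKM code both reach the monitor, while a DLKM read of TKM data does not.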

3.3 Implementation

The implementation details of the module isolation mechanism are discussed below:

Step 1. Construction and maintenance of page tables

During operating system initialization, two reference page tables are constructed and maintained for the kernel virtual address space (3-4GB), with the same mapping from virtual addresses to physical addresses. The page tables are configured according to Figure 2.

For each process, the page tables for kernel space (3-4GB) are the same as the two reference page tables, so we need to keep only two copies. There is no need to generate two new copies for each process or to synchronize them among multiple processes. When a new process is created, two page tables are created for it. Their user space part (0-3GB) is identical, and their kernel space part (3-4GB) is copied from the two reference page tables.
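The sharing scheme can be sketched as follows. The sketch assumes a 32-bit, non-PAE layout in which each page-directory entry covers 4 MB, so the kernel range 3-4GB starts at entry 768; the names `REFERENCE_KERNEL` and `create_process_page_tables` are hypothetical and do not come from the prototype.

```python
ENTRIES = 1024             # page-directory entries (assumed 32-bit, non-PAE)
KERNEL_FIRST_ENTRY = 768   # 3 GB divided by 4 MB per entry

# Two reference kernel mappings, built once at initialization. Each
# element stands for one second-level page table (a plain dict here).
REFERENCE_KERNEL = {
    view: [{} for _ in range(ENTRIES - KERNEL_FIRST_ENTRY)]
    for view in ("tkm", "dlkm")
}


def create_process_page_tables(user_entries):
    """Build the TKM-view and DLKM-view page directories of a new process.

    The user part is copied into both directories; the kernel part
    refers to the shared reference tables, so no per-process copy of
    the kernel mappings has to be kept in sync.
    """
    if len(user_entries) != KERNEL_FIRST_ENTRY:
        raise ValueError("expected one entry per 4 MB of user space")
    return {
        view: list(user_entries) + REFERENCE_KERNEL[view]
        for view in ("tkm", "dlkm")
    }
```

Because every process refers to the same reference tables, a change made to a reference table is immediately visible in all processes.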

When a new kernel module is dynamically loaded by insmod or modprobe, the instruction CPUID, inserted at the entry point of the system call init_module, causes a trap into the SecEye hypervisor before that module is initialized. SecEye calculates the memory addresses at which the new module resides from module_init, init_size, module_core, core_text_size, and core_size, obtained from _this_module, and then sets the access rights according to Figure 2. When the module is removed with rmmod, the page table settings are restored correspondingly.
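The address calculation can be illustrated with the five fields named above. The sketch assumes 4 KB pages and assumes that the text section sits at the start of the core region, followed by data; `ModuleLayout`, `pages` and `module_regions` are hypothetical names, and the example addresses are made up.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed 4 KB pages


@dataclass
class ModuleLayout:
    module_init: int      # start of the init region
    init_size: int        # size of the init region in bytes
    module_core: int      # start of the core region
    core_text_size: int   # size of the code at the start of the core region
    core_size: int        # total size of the core region


def pages(start, end):
    """Page-aligned addresses of every page touched by [start, end)."""
    if end <= start:
        return []
    first = start - start % PAGE_SIZE
    return list(range(first, end, PAGE_SIZE))


def module_regions(m):
    """Split a module's memory into the regions the monitor would tag."""
    text_end = m.module_core + m.core_text_size
    return {
        "init": pages(m.module_init, m.module_init + m.init_size),
        "core_code": pages(m.module_core, text_end),
        "core_data": pages(text_end, m.module_core + m.core_size),
    }


layout = ModuleLayout(0xD0800000, 0x1800, 0xD0810000, 0x2000, 0x5000)
regions = module_regions(layout)
```

If core_text_size is not a multiple of the page size, the last code page and the first data page coincide in this sketch; a real implementation would have to decide how to treat such a shared page.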

Step 2. Isolation for code, data and stack

In order to isolate code, data and stack between TKM and DLKM, separate pages are assigned for code and static data, and a kernel stack needs to be temporarily assigned for DLKM in each process. In addition, a management link is added to the kernel to manage these pages.

Step 3. Page table switching

When DLKM jumps directly to the code region of TKM, or TKM jumps directly to the code region of DLKM, a page-fault exception occurs because the destination page is not mapped in the current page table. This causes a VM_Exit, which traps into the SecEye hypervisor. After the necessary security check, SecEye switches page tables by updating the content of the CR3 register. The hypervisor then returns to the kernel by executing the instruction VMRESUME, which causes a VM_Entry.

In addition, although the isolation mechanism allows DLKM to jump to the data region of TKM or DLKM, the access rights remain those of DLKM. Even if the executed code resides in the data region of TKM, it runs with the same rights as DLKM because the CR3 register is not switched. This makes it impossible for code in a data region to compromise the code and data of TKM.
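The switching decision can be sketched as a toy fault handler. Everything here is an assumption made for illustration: the CPU is a plain dictionary, the region table and the `check` callback are hypothetical, and the real prototype performs these steps in VMX root mode rather than in Python.

```python
def classify(addr, regions):
    """Return the name of the region that contains addr, or None."""
    for name, (start, end) in regions.items():
        if start <= addr < end:
            return name
    return None


def handle_fetch_fault(cpu, target, regions, check):
    """Toy handler for an instruction fetch that hit an unmapped page.

    cpu is a dict with the keys 'cr3' ('tkm' or 'dlkm') and 'eip'.
    check(current_view, target) stands in for the security check.
    Returns True if execution resumes at the target.
    """
    region = classify(target, regions)
    wanted = {"tkm_code": "tkm", "dlkm_code": "dlkm"}.get(region)
    if wanted is None or wanted == cpu["cr3"]:
        return False        # not a cross-domain code transfer
    if not check(cpu["cr3"], target):
        return False        # security check failed: transfer is blocked
    cpu["cr3"] = wanted     # switch address space
    cpu["eip"] = target     # resume at the target (VMRESUME in the real system)
    return True
```

A jump into a data region never reaches the switching branch, so the active page table, and with it the access rights, stays that of the caller.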

Step 4. Protection of dynamically allocated data space

The isolation of static code, data, page tables and stack has been discussed; next we describe how to isolate the dynamically allocated data space. To simplify management, a special code segment for requesting space, cmalloc, needs to be provided for DLKM. Whenever the system function kmalloc or vmalloc is called by DLKM, a trap from the kernel into the hypervisor occurs, as discussed earlier. Our hypervisor then calls cmalloc to request a full page from the kernel. When it receives another allocation request from DLKM, it allocates space directly within that page until the remaining space is insufficient, in which case it calls cmalloc to request another full page from the kernel.

When a new page is assigned, its access rights and synchronization are configured according to Figure 2.
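The page-at-a-time allocation policy resembles a simple bump allocator. The following sketch assumes 4 KB pages, ignores alignment and freeing, and rejects requests larger than one page, none of which the text specifies; `DlkmAllocator` and the `request_page` callback (a stand-in for cmalloc) are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed page size


class DlkmAllocator:
    """Toy bump allocator that carves DLKM requests out of whole pages."""

    def __init__(self, request_page):
        # request_page stands in for cmalloc: it returns the base
        # address of a fresh page obtained from the kernel.
        self.request_page = request_page
        self.page_base = None
        self.used = PAGE_SIZE   # forces a page request on first use

    def alloc(self, size):
        if not 0 < size <= PAGE_SIZE:
            raise ValueError("sketch handles requests of 1..PAGE_SIZE bytes only")
        if PAGE_SIZE - self.used < size:
            # Remaining space is too small: obtain another full page.
            self.page_base = self.request_page()
            self.used = 0
        addr = self.page_base + self.used
        self.used += size
        return addr
```

Keeping all DLKM allocations inside pages that belong only to DLKM is what lets the monitor assign access rights at page granularity.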

In summary, our approach can effectively isolate the code, data, stack and dynamically allocated data space of TKM and DLKM, and prevent DLKM from corrupting TKM code and data. Meanwhile, TKM and DLKM can directly read data from each other, which lowers the cost of message copying.

4 Experiments and Results

We have tested a collection of 17 rootkits targeting the Linux kernel, listed in Table 1. Among them, Adore, All-root, Knark, Linspy, Maxty, Modhide, Rial, Rkit and Shtroj2 were installed on a Linux 2.4 kernel, while Adore-ng, enyelkm, Mood-nt, Override, Superkit, SucKIT2 and Taskigt were installed on a Linux 2.6 kernel. We analyzed the rootkits and found that they modify function pointers, kernel objects or registers that must not be modified. By enforcing proper predefined policies, our security solution detected and deterred all of the rootkits.


Table 1: Tested kernel rootkits

Columns: Rootkit name; Function pointers modification; Kernel objects modification; Critical registers modification

Adore D
Adore-ng 0.56 D D
All-root D D
DR D D
enyelkm D
Knark D D
Linspy D
Maxty D
Modhide D
mood-nt D D
override D
Rial D
Rkit D D
Shtroj2 D D
SucKIT2 D D
superkit D D
Taskigt D

5 Conclusions

This paper discusses the concept of operating system integrity. We have proposed a lightweight hypervisor for OS kernel integrity monitoring, described the details of its monitoring mechanism and implemented a prototype system, SecEye, based on hardware virtualization features. By securely isolating malicious dynamically loadable kernel modules using hardware virtualization support, SecEye detected and deterred all the kernel rootkits we tested, and it can also detect possible implementation bugs.

References

[1] R. Richardson. CSI Computer Crime and Security Survey. Technical Report, Computer Security Institute, 2010.

[2] M. Castro, M. Costa and T. Harris. Securing software by enforcing data-flow integrity. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI’06), USENIX Association, Berkeley, CA, USA, 2006, pp. 147-160.

[3] M. Abadi, M. Budiu, Ú. Erlingsson and J. Ligatti. Control Flow Integrity: Principles, Implementations, and Applications. In Proceedings of ACM Conference on Computer and Communications Security (CCS’05), 2005.

[4] N. L. Petroni and M. Hicks. Automated Detection of Persistent Kernel Control-flow Attacks. In Proceedings of ACM Conference on Computer and Communications Security (CCS’07), 2007.

[5] M. Sharif, W. Lee, W. Cui and A. Lanzi. Secure In-VM Monitoring Using Hardware Virtualization. In Proceedings of ACM Conference on Computer and Communications Security (CCS’09), 2009.

[6] N. L. Petroni, T. Fraser, J. Molina and W. A. Arbaugh. Copilot-a coprocessor-based kernel runtime

[7] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt and A. Warfield. Xen and the Art of Virtualization. In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP’03), 2003.

[8] N. L. Petroni, T. Fraser, A. Walters and W. A. Arbaugh. An Architecture for Specification-based Detection of Semantic Integrity Violations in Kernel Dynamic Data. In Proceedings of 15th USENIX Security Symposium, 2006.

[9] A. Seshadri, M. Luk, E. Shi, A. Perrig, L. V. Doorn and P. Khosla. Pioneer: Verifying Integrity and Guaranteeing Execution of Code on Legacy Platforms. In Proceedings of ACM Symposium on Operating Systems Principles (SOSP’05), 2005.

[10] T. Garfinkel and M. Rosenblum. A Virtual Machine Introspection Based Architecture for Intrusion Detection. In Proceedings of the Network and Distributed System Security Symposium (NDSS’03), 2003.

[11] A. Seshadri, M. Luk, N. Qu and A. Perrig. SecVisor: A Tiny Hypervisor to Provide Lifetime Kernel Code Integrity for Commodity OSes. In Proceedings of ACM Symposium on Operating Systems Principles (SOSP’07), 2007.

[12] R. Riley, X. Jiang and D. Xu. Guest-Transparent Prevention of Kernel Rootkits with VMM-based Memory Shadowing. In Proceedings of the 11th Symposium on Recent Advances in Intrusion Detection, 2008.