Safe Internet Browsing using a Transparent Virtual Browser

Jeffrey Bickford

AT&T Security Research Center

New York, NY

Paul Giura

AT&T Security Research Center

New York, NY

Abstract

With the proliferation of Internet access across the globe, as well as the advancement of many new devices and next generation networks, it is no surprise that malware infection via web browsing is still one of the most significant threats to Internet users today. Over the past several years we have also seen an increase in advanced targeted attacks against corporations, which steal intellectual property and sensitive customer information. This problem is amplified as security is bypassed for work productivity and usability, while mobile devices increasingly access sensitive information. Though many organizations are beginning to invest significantly in securing their internal corporate network, users are typically given access to the Internet for web browsing purposes, leaving the enterprise vulnerable to drive-by downloads and data exfiltration attacks.

In this work we propose a new method to safely browse the Internet by redirecting web requests to a cloud-based Transparent Virtual Browser. Web browsing requests are automatically redirected to the Transparent Virtual Browser via a transparent network proxy, protecting against user configuration errors or malware running on the device. The virtual browsing session is streamed back to the user securely, while maintaining a seamless user experience. Experiments show that our architecture can isolate web-attacks from a user’s machine, protecting enterprises from most of the attacks to which they are vulnerable today. Using a small user trial we tested our solution with several popular web browsers on various operating systems and report on the participants’ feedback. Our testing also shows that our prototype only incurs a small initial delay when browsing to a webpage while maintaining a seamless browsing experience for the rest of the browsing session.

I. INTRODUCTION

Web-based attacks continue to be a significant threat to Internet users today. Users can be easily tricked into installing malicious software by downloading a file, opening an email attachment, or visiting a compromised website hosting a drive-by download attack. Though many users and organizations rely on malware protection through typical signature-based anti-malware systems, it is common knowledge that attackers can easily bypass these mechanisms through techniques such as encryption and packing [1], [2].

Over the past several years, enterprises have been the victims of advanced targeted attacks, called Advanced Persistent Threats (APTs), whose goal is to steal intellectual property and corporate data. Attackers typically gain entry into the enterprise using social engineering techniques to trick employees into browsing to a compromised website. By utilizing drive-by download techniques and zero-day exploits, these compromised websites infect the employee’s machine with customized malware, giving the attackers access to the enterprise network. Once inside, attackers use open outgoing ports for command and control (C&C) communications (e.g. HTTP on port 80), and stay below the radar while gaining a significant foothold within the network for long periods of time (e.g. months or years) [3], [4], [5], [6]. As more and more high profile attacks become exposed [7], [8], the top concern of an enterprise security team has been to protect its network against these attacks.

Current solutions to protect against APT attacks can be grouped into the following categories: application sandboxing [9], [10], malware analysis [11], [12], inbound/outbound traffic analysis [13], and anomaly detection using machine learning and data mining techniques [14]. Even though these methods might detect some of the attacks, they are not effective against more sophisticated APTs. On one hand, it was shown that well-crafted malware is able to avoid sandbox detection by delaying the execution of malicious code until the sandbox times out, rendering the malware detection ineffective. On the other hand, anomaly detection that uses data mining and machine learning techniques is severely limited because of the very small sample of confirmed attacks [15]. As enterprises move to a model of accessing corporate data from any device, on any network, at any time, various solutions look to access corporate data via virtualized desktop infrastructure (VDI) and application environments. Though these solutions provide some increased security for access from untrusted devices and networks, these desktop environments are typically still given access to the normal enterprise network and the Internet, leaving the VDI environment vulnerable to web-based malware infections and APT attacks.

Instead of designing a new mechanism for detecting these types of infections and attacks, we utilize a preventative approach to securely browse the Internet without the risk of malware infection. Instead of allowing an organization’s users to browse the Internet via their native browser on the device, external web pages are redirected and viewed in a separate Transparent Virtual Browser (TVB). The TVB will communicate with the user’s native browser, from where the initial request was made, via an RDP-like [16] protocol that will send back only images of the rendered webpage to the user’s browser. This results in a complete isolation between the potentially compromised website and the user’s corporate environment (i.e. laptop, mobile device, virtual desktop, etc). In this way any web-based malware will have no access to the data that resides in the protected user environment nor will it have an entry point into the enterprise network. Moreover, we utilize a transparent network-based proxy, as they exist in most enterprise networks, to decide when to redirect browsing requests to the TVB. By not requiring a proxy setting on the end device, our method is resilient against situations where, due to user configuration changes or malware via other infection vectors, the device bypasses the proxy and exposes the user’s device to infections or command and control.

When designing the TVB, our goal was not only to protect against web-based attacks, but also to provide a seamless user experience and an easy method for deployment within an enterprise architecture. A browsing session within the TVB looks as close as possible to a regular browsing session, only with slightly less functionality and a slight latency when scrolling. The TVB architecture can be easily integrated with any HTTP network proxy and is specifically built to work on cloud infrastructure. The network proxy gives the enterprise the ability to utilize the TVB for any set of URLs they wish, going all the way to restricting access to the whole Internet through the TVB. Internal websites render in the native browser as normal, maintaining a truly seamless experience for enterprise applications. Our prototype is fully network-based and only requires an HTML5 compatible browser on the user’s end device. Using a small user trial we tested the TVB with Firefox, Internet Explorer 9/10, Chrome, and Safari on various operating systems such as Windows, Mac OSX, and Linux. Testing shows that our prototype only incurs an initial delay of 2.5 to 3 seconds when browsing to a webpage, and then provides a seamless browsing experience for the rest of the browsing session. With our proposed solution we make the following key contributions:

– Eliminate the possibility of malware infection via browsing, while maintaining a seamless experience for the end user regardless of the native browser used (i.e. Internet Explorer, Firefox, Chrome, Safari, etc).

– Minimize the potential vectors of data exfiltration and command and control from/to the user’s working environment, i.e. restrict all access to the Internet from the working environment.

– Provide a network-based solution that is resilient to malware infection or user configuration changes on an end device.

The rest of the paper is structured as follows. Section II describes the threat model, the architecture of the proposed system along with the security analysis of our solution. Section III presents the implementation details and Section IV the evaluation results. Section V presents the discussion over several edge cases, Section VI summarizes the related work and, finally, in Section VII we present our conclusions.

II. PROPOSED SYSTEM

We propose a secure method to browse the Internet by redirecting the execution of a webpage to a separate isolated Virtual Machine (VM). We utilize a network-based proxy to seamlessly redirect web browsing sessions from a user’s native browser to a single-use VM in the network. Moreover, the rendering of the webpage within this VM is streamed back to the user’s native browser, where the initial request was made, via an HTML5 RDP-like [16] protocol that will send back only images of the rendered webpage. Using this approach, all of the webpage’s HTML, JavaScript, Flash, etc. is executed within the isolated VM, leaving the user’s native environment protected from any potential web-based attack. Our end goal is to maintain a truly seamless web browsing experience for the user, while maintaining the highest level of protection against web-based attacks.

Fig. 1. Architecture - User traffic is automatically routed through a proxy which decides to forward a web request either to the user’s native browser or through the TVB. The policy engine can decide where to route the web request based on various services such as blacklists, URL categorization services, and popularity.

A. Threat Model

This work provides a solution for web-based attacks that attempt to install some malicious software on a user’s end machine with or without their consent. Our goal is to protect against malware infection via drive-by download attacks. Though drive-by downloads can occur via various methods, such as prompting the user to download a trojan, arbitrary code execution via zero-day exploits, malicious JavaScript, cross-site scripting (XSS) attacks, etc., our solution does not differentiate between attack methodologies. For simplicity’s sake, we define a drive-by download as any arbitrary code, usually native and non-HTML, that is executed due to browsing a webpage and has some malicious intent. This covers a wide range of attacks such as botnet infections, targeted APT attacks, ransomware such as CryptoLocker, general trojans, adware, and more.

Our method does not protect against certain other web-based attacks that do not involve drive-by downloads. For example, since the user is still allowed to browse a webpage and enter information into forms, we do not protect against phishing attacks which attempt to steal login credentials, personally identifiable information, and banking information. We also do not protect against XSS session hijacking attacks, which steal the current cookie for a user’s logged-in web session. We note, though, that since our VMs are destroyed after each use by default, a session hijacking attack could only hijack the session for the webpage that the user is currently logged in to.

Because a user’s machine can be infected via vectors other than web browsing, such as email attachments and removable USB drives, we do not rely on the integrity of the user’s system to enable our virtual browsing solution. Our decision to use a transparent network-based proxy to redirect browsing sessions to our infrastructure places the solution completely within the network and not on the user’s device. We do not rely on any proxy setting or specialized software that could potentially be disabled by malware if the device is infected through other means. We understand that devices can be connected to other networks, therefore bypassing TVB protection, and we address this concern in Section V-B.

We do realize that there is still a possibility of the user’s system being compromised by a highly sophisticated attacker, but we believe this attack to be highly unlikely at this time and potentially detectable with more research in this area. An attacker would need to execute arbitrary code on the VM as well as exploit a vulnerability within both the RDP-like viewing protocol and the native browser’s HTML5 rendering engine. Though there have been exploits in remote viewing protocols, such as CVE-2008-5903, CVE-2012-0002, and CVE-2004-0962 [17], an attacker would have the additional step of also exploiting the native browser’s HTML5 representation of that remote viewing protocol. We leave an analysis of this potential threat vector as future work once our deployment architecture is finalized.

B. Architecture

Due to its seamless user experience and network-based design, we have named our solution the Transparent Virtual Browser (TVB). Our architecture, shown in Figure 1, has been designed for use within a private internal network, such as a corporate enterprise network, university network, or even a private home LAN. For the remainder of this paper, we will consider TVB usage within an enterprise network as the core motivation and driver of this work. Because users rely on many different devices today, and given the recent shift toward employees being allowed to “bring your own device” (BYOD) to work, our architecture is designed to handle any type of device as long as there is an HTML5 capable browser installed and it can access the internal network (e.g. via Ethernet, VPN, local wireless, etc.). By using a transparent network proxy, all web browsing traffic (typically HTTP) from a user’s device is automatically routed through the proxy.

When the proxy receives a new web request, this request is analyzed by a Policy Engine. The goal of the Policy Engine is to determine where the webpage should be rendered: either within the user’s native browser or remotely within a virtual browsing session. The Policy Engine must make a decision about the HTTP URL that the user has requested, based on some prior knowledge, reputation, popularity, and categorization of the URL. The Policy Engine can be an entity that exists within the network proxy itself or can be some other service, either managed internally or by a third party, that can be called via APIs. In some cases, the Policy Engine may be a mixture of multiple data sources and services that, when combined, can make a determination.

Because of cost, deployment feasibility, usability, and user adoption rate, it may not be feasible to utilize the TVB for every webpage on the Internet. From a usability standpoint, there will also be some additional latency and lack of features when compared to a native browser. The Policy Engine’s determination is up to the enterprise and how they decide to configure the policy based on this tradeoff. Figure 2 represents a methodology in which enterprises can choose to configure the Policy Engine. For example, on the left side of the security knob, the TVB provides no additional security when all Internet access occurs through the native browser. As we move to the right, the Policy Engine is configured to route more web requests through the TVB architecture instead of the native browser. For example, the Policy Engine could redirect requests when the URL matches a known suspicious URL or even unknown URLs that the Policy Engine has never seen before. As we move towards the right, every URL other than very popular ones could be viewed within the TVB. In the most secure case, all webpages could be viewed via the TVB. The decision on where to place this knob must be based on a tradeoff between security, user experience, and cost. The highest security level also requires the highest cost in terms of resources because each webpage, visited by any user, has to be rendered in a dedicated Virtual Browser, and thus a dedicated VM. However, the architecture allows for an intermediate security level by requiring only a specific set of URLs (e.g. suspicious and unknown) to be redirected and rendered by the TVB.

Fig. 2. Security Knob Concept - Based on the enterprise’s decision, the policy engine can be progressively set to various levels of security from left to right. For example, the proxy can be configured to route only suspicious or unknown URLs to the TVB. In the most secure case, all web requests could be rendered using the TVB.
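The security knob described above can be illustrated with a minimal sketch. The level names, the `route` function, and the host sets are hypothetical stand-ins for the blacklists, categorization services, and popularity data that a real Policy Engine would consult; the paper does not prescribe this interface.

```python
from enum import IntEnum
from urllib.parse import urlsplit


class KnobLevel(IntEnum):
    """Hypothetical knob positions, from least secure (left) to most secure (right)."""
    NATIVE_ONLY = 0         # everything renders in the native browser
    SUSPICIOUS = 1          # known-suspicious URLs go to the TVB
    SUSPICIOUS_UNKNOWN = 2  # ... plus URLs that have never been seen before
    ALL_BUT_POPULAR = 3     # everything except very popular sites
    ALL = 4                 # every external page goes to the TVB


def route(url, level, suspicious, known, popular, internal):
    """Return "native" or "tvb" for a requested URL.

    The four host sets are placeholders for external data sources.
    Internal sites always render natively, as in the proposed architecture.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host in internal or level == KnobLevel.NATIVE_ONLY:
        return "native"
    if level == KnobLevel.ALL or host in suspicious:
        return "tvb"
    if level == KnobLevel.ALL_BUT_POPULAR:
        return "native" if host in popular else "tvb"
    if level == KnobLevel.SUSPICIOUS_UNKNOWN and host not in known | popular:
        return "tvb"
    return "native"
```

Turning the knob then amounts to changing a single `level` value in the proxy configuration, while the data sources stay the same.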

If the Policy Engine decides that the request should be rendered within the user’s native browser, everything occurs as normal. If not, the request gets forwarded to a Virtual Browser. The implementation of a Virtual Browser could be fairly different across TVB implementations. In general, the Virtual Browser could be an instance of a browser running in its own isolated temporary VM. Another possibility is to have a powerful server host multiple browser instances and use process and user id based isolation to isolate different virtual browsing sessions. The request will be forwarded to a Virtual Browser cluster that will select an available Virtual Browser to render the webpage. The Virtual Browser will then access the Internet to download and render the webpage as normal. The rendered webpage canvas (e.g. the viewable area of the browser not including the control bars) is streamed to the user’s native browser over an RDP-like protocol, allowing the user to browse the page as normal. In this way, all code that is executed in order to render the webpage runs in the Virtual Browser. Therefore, any malware that may have been embedded in the webpage will be downloaded and installed on the Virtual Browser, completely separate from the user’s system.

Since the Virtual Browser has no access to the enterprise network or the user’s system, any attack that attempts to steal or destroy enterprise data will be rendered useless. Once the user closes their native browser window (or tab) displaying the webpage, the Virtual Browser will be destroyed and a new Virtual Browser will spawn from a fresh state back into the pool of available Virtual Browsers. When using a separate VM per Virtual Browser, malware will not persist across reboots. If the Virtual Browser does get infected by malware, the effectiveness of any possible piece of malware will be limited in both space (the resources a single Virtual Browser environment has) and time (the duration of the user’s particular browsing session). Additionally, by using the proposed architecture one might be able to more efficiently monitor the behavior of previously unknown websites from a well controlled environment. Moreover, if the Virtual Browser environment can identify malicious webpages via various detection algorithms, the TVB can provide feedback to URL filtering services, blacklists, and the Policy Engine. For our initial design and prototype, we are not focused on providing such functionality and we leave this as a future enhancement of our system.

III. IMPLEMENTATION

Fig. 3. Implementation - We instantiate a pool of Browser VMs that stream webpages back to a user’s native browser using a Guacamole HTML5 clientless RDP protocol. The proxy redirects uncategorized URL requests to a VM Manager, which selects the next available VM for the user.

In order to test the concept of the Transparent Virtual Browser, we developed a prototype in a corporate lab environment as shown in Figure 3. In our initial prototype, we assume that employees are using devices connected to the corporate network. In most enterprises these are typically Windows laptops, but the prototype architecture also supports, and has been tested against, a VDI environment and other popular desktop or laptop operating systems, such as Linux and Mac OSX. Though we focus on the enterprise use case, the concept can be abstracted to other types of devices and networks as well.

Each device using our prototype TVB is configured to route its traffic through a proxy server. In an actual deployment, a transparent network proxy can be utilized so that all traffic is automatically routed through the proxy and no configuration is needed on the end device. This protects against potential user configuration errors or malicious setting changes to avoid the TVB. In order to redirect users to the TVB, we utilize a popular third party commercial proxy server. The proxy server is typically used for web filtering and utilizes the third party’s web analysis platform and database to categorize every URL seen throughout the day. For example, based on the enterprise’s policy, the proxy can be configured to block access to websites that the proxy provider has categorized as Social Networking or Phishing. Because there are billions of websites across the Internet, with new ones being created every day, the proxy does not have an understanding of every URL on the Internet. Such URLs are unknown to the proxy and are therefore considered uncategorized. How the proxy categorizes websites is out of the scope of this paper, and categorization should only be thought of as one possible method to determine when to utilize the TVB.

In our prototype, we redirect URL requests to our system when the proxy determines that the URL is uncategorized. Though we understand that this particular implementation does not protect the corporate network from all web-based malware attacks, this decision point on the tunable knob in Figure 2 represents the most likely path to deployment in our production enterprise network. Once deployed and as adoption of the TVB increases, we can plan to increase the number of websites redirected to the TVB with the hope that eventually all external web browsing occurs through the TVB.

When a user browses to an uncategorized URL, the proxy redirects this request to a VM Manager. The VM Manager consists of a self-developed Python daemon (310 lines of code) and a firewall; it runs on a virtual machine and acts as a gateway into our system. The VM Manager manages a pool of Browser VMs and redirects browsing requests to the next available VM in the pool. We define next available as a VM that has previously been booted from a fresh state and has not been used yet to browse a URL. In our initial prototype, each instance of a browser is running in its own separate VM running above a VMware ESXi hypervisor. We currently support 20 concurrent VMs, but this can be easily scaled as hardware allows. Each Browser VM is booted from a pristine disk image in non-persistent mode such that no changes are ever saved to disk. The Browser VM itself runs an Ubuntu 12.04 Desktop operating system with 2 GB of RAM and 2 CPU cores. We note that in an actual deployment, we would use a more lightweight operating system in order to minimize the amount of resources required for each TVB instance, thereby minimizing the cost of deployment.
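The pool bookkeeping described above can be sketched as follows. This is not the prototype’s 310-line daemon; the class name, method names, and the `reboot` callback are hypothetical, and no hypervisor calls are made.

```python
from collections import deque


class BrowserVMPool:
    """Toy model of the VM Manager's pool bookkeeping (no real hypervisor calls)."""

    def __init__(self, vm_ids):
        self.available = deque(vm_ids)  # booted from a fresh state, never used
        self.in_use = {}                # session id -> VM id

    def acquire(self, session_id):
        """Hand out the next available VM, or None if the pool is exhausted."""
        if not self.available:
            return None
        vm = self.available.popleft()
        self.in_use[session_id] = vm
        return vm

    def release(self, session_id, reboot):
        """End a session: reboot the VM from its pristine image, then re-pool it."""
        vm = self.in_use.pop(session_id)
        reboot(vm)                      # placeholder for the hypervisor-specific step
        self.available.append(vm)
```

A VM re-enters `available` only after the reboot step, which mirrors the definition of "next available" as a VM that was booted from a fresh state and has not yet browsed a URL.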

Each Browser VM hosts its own HTML5 client-less RDP server called Guacamole [18]. Guacamole allows a client machine to connect to an XRDP session running on the Browser VM via its native browser. When the VM Manager receives a request from the proxy, it selects the next available Browser VM, as mentioned above, and redirects the user’s browser to Guacamole. At this point, the user’s browser receives an HTML5 XRDP session into the Browser VM. The uncategorized URL that the user attempted to browse natively is passed from the VM Manager to a Python daemon running on the Browser VM. The daemon on the Browser VM is 150 lines of code and provides interaction between the VM Manager and the Browser VM. Once the user’s native browser logs into the XRDP session, the Python daemon spawns a Chromium browser and renders the uncategorized URL. In order to provide a seamless user experience, we hide the additional address bar and control buttons by spawning Chromium in full screen kiosk mode as shown in Figure 4. The XRDP session is custom configured without a desktop and disables many keyboard shortcuts so that the user cannot close the browser, start other programs, create new tabs, etc.

Fig. 4. A sample webpage displayed using the TVB system in a native Firefox browser.
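As an illustration of the Browser VM daemon’s final step, the sketch below builds the command line for a kiosk-mode browser. The function name and the `chromium-browser` binary name are assumptions; the scheme check is an added precaution that the paper does not describe.

```python
from urllib.parse import urlsplit


def kiosk_command(url, binary="chromium-browser"):
    """Build the argv for a full-screen kiosk browser showing `url`.

    Only http/https URLs are accepted, so a crafted request cannot make the
    daemon open local files or other schemes. The binary name is an assumption.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"refusing to open {url!r}")
    # Returning a list (rather than a shell string) avoids shell injection
    # when the result is later passed to subprocess.Popen.
    return [binary, "--kiosk", url]
```

A daemon could then start the browser with `subprocess.Popen(kiosk_command(url))` once the XRDP session has been established.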

In order to abide by company policy, web traffic from the Browser VM still traverses the proxy so that content filtering is still enabled. Once the user enters a virtual browsing session, they must close the tab or browser to close the session. When the virtual browsing session has ended, the Browser VM is rebooted from a pristine image and added back into the available Browser VM Pool for future use.

IV. EVALUATION

In this section we present the evaluation of our proposed TVB system. We start with the security evaluation where we present our experiments of using the prototype TVB to protect against real world web attacks. Then, we present the experiments conducted to assess the user perceived latency. Next, based on current web traffic of a large enterprise, we evaluate the infrastructure overhead incurred by our system for a time period and, based on this, we make recommendations about the necessary resources needed to deploy the TVB in an enterprise. Finally, we present the results of a small user evaluation.

A. Security Evaluation

In order to evaluate the security of our architecture, we tested the TVB against various real world web-attacks and compared the effect on the system with that of native browsing. We utilize an in-house sacrificial VM environment, which is typically used for manual malware and URL analysis. This environment records statistics of a Windows 7 system, such as file system events, unique network connections, and registry modifications. The implementation details of this environment are out of the scope of this paper, but can be assumed to be similar to other malware analysis platforms [19], [20], [21], [22].

Our evaluation is as follows. We utilize the Malware Domain List [23] in order to obtain live URLs that host some type of web attack. Because most URLs only stay alive for a limited amount of time, we also created a VM hosting the Metasploit framework [24] to simulate live attacks not found on the Malware Domain List. As seen in Table I, we were able to evaluate different web attacks such as Java drive-by downloads and trojan downloads. Though we attempted to test against other attacks such as exploit kits and ransomware, none were triggered within our environment. We measure the effects on the system in two scenarios: browsing the URL using the native Mozilla Firefox 3.6.3 browser in the sacrificial VM environment, and browsing via our TVB architecture using the Firefox browser within the sacrificial VM environment as the native browser. In each case, we measure the number of files accessed, processes created, network connections, and registry keys changed while loading a single webpage.

In general, across all types of web attacks we tested against, we see that the TVB does indeed isolate the system from potential compromise. In all cases, the number of files accessed and the number of network connections are significantly higher in the native browsing case than when browsing within the TVB. When using the TVB, the files accessed are typically files related to Firefox caching or background system events. In the native browsing case, we see additional file activity due to payload drops and malicious activity once the malware has begun executing. With regard to network connections, we measure the number of unique destination IP and port pairs that the sacrificial VM environment connects to. We see that when using the TVB, the sacrificial VM environment only connects to 2-3 destinations, 2 of which are correlated with our TVB architecture. In cases where a malicious payload actually makes changes to the system, we see many registry changes when natively browsing. The most obvious result is with regard to process creation, where we see many processes being created during a web attack and those processes being isolated from the system when utilizing the TVB.
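The connection metric above (unique destination IP and port pairs) can be sketched as follows. The function name and the input format are assumptions, and the addresses in any usage are illustrative only; the in-house analysis environment itself is not described in the paper.

```python
def unique_destinations(connections, infrastructure=frozenset()):
    """Count unique (ip, port) destinations, split into TVB infrastructure and other.

    `connections` is an iterable of (dst_ip, dst_port) tuples, e.g. parsed from
    a connection log; `infrastructure` holds the pairs that belong to the TVB.
    """
    seen = set(connections)                 # repeated connections count once
    infra = seen & set(infrastructure)
    return {"total": len(seen), "tvb": len(infra), "other": len(seen - infra)}
```

Separating infrastructure destinations from the rest makes it easy to see how many connections in a TVB session are attributable to the TVB itself.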

To provide more insight into the results presented in Table I, we will explain the Java drive-by download scenario in more detail. To replicate a Java drive-by download attack we use the java_signed_applet attack module within the Metasploit framework on a VM hosted in the cloud. When a user browses to our attack server, the user is prompted to run a Java applet. Here we assume the user selects OK (as commonly occurs due to social engineering methods) and the Java applet is executed. Through the Java applet, the attack server pushes down a reverse shell executable which has been obfuscated to bypass typical anti-virus signature checks. The executable creates a reverse shell back to the attack server and we obtain a shell into the sacrificial VM environment.

We can see from Table I that 5 processes were created when using the native browser. From analyzing our results, we see that two instances of java.exe, two instances of conhost.exe, and one instance of OjYVpKPF.exe were created. OjYVpKPF.exe is a temporary file pushed down via the Java applet and runs in order to generate a reverse shell. conhost.exe manages access to instances of the Windows console, as explained in [25], and is used during the initial payload drop and the reverse shell. When using the TVB, no processes are created on the host OS since the Java applet is executed in the remote VM. Once the Java applet is executed, the Metasploit module redirects to a popular website, resulting in about 25 unique network connections in the native browsing case. When using the TVB, the number of network connections is significantly lower because the user’s machine only requires connections to the TVB.

Attack Type              Files Accessed    Processes       Connections     Registry Keys
                         Native   TVB      Native   TVB    Native   TVB    Native   TVB
No Attack                42±3     24±6     0        0      35±5     3±1    24±3     0
Java Drive-by Download   34±5     17±5     5        0      25±2     2±1    0        0
Trojan Download          90±9     21±9     3        0      15±4     3±1    33±3     0

TABLE I. RESULTS COMPARING NATIVE BROWSING VERSUS TVB BROWSING.

B. Latency of Initial Web Request

In this section, we determine the user-perceived latency when utilizing the TVB to browse a webpage. In order to identify the potential overhead incurred by our system, we consider the steps shown in Figure 3. We define the time required to render a webpage from the initial request as $T_{NB} = t_1 + t_d + t_r$ for the native browser, and $T_{TVB} = t_1 + t_2 + t_3 + t_4 + t_1 + t_d + t_r$ for the TVB, where $t_1$ is the time to send the URL request from the user device to the proxy, $t_2$ is the time needed to redirect the HTTP request from the proxy to the VM Manager, $t_3$ is the time to pick the next available VM and redirect the HTTP request to it, $t_4$ is the time to establish an HTML5 client-less RDP session, $t_d$ is the time to download the webpage code, and $t_r$ is the time to render the webpage in a browser. Note that $t_1$ is the same for both cases and, for simplicity, we assume that $t_d$ and $t_r$ have equal values for both NB and TVB because one can use the same browser to download and render the webpage in both cases. Then we can define the user-perceived latency when using the TVB as $L = T_{TVB} - T_{NB} = t_1 + t_2 + t_3 + t_4$. Because $t_1$, $t_2$ and $t_3$ represent quick, simple, constant-time redirect operations, we believe that the dominant delay is generated by the time it takes for the HTML5 client-less RDP session to be established. Therefore the approximate latency of an initial web request, $L$, is on the same order as $t_4$.
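Written out step by step, the subtraction that yields the latency looks as follows. The grouping of terms is ours; reading the second $t_1$ in the TVB expression as the Browser VM’s own request to the proxy is an assumption consistent with Section III, where Browser VM traffic still traverses the proxy.

```latex
\begin{align*}
T_{NB}  &= t_1 + t_d + t_r \\
T_{TVB} &= (t_1 + t_2 + t_3 + t_4) + (t_1 + t_d + t_r) \\
L       &= T_{TVB} - T_{NB} \\
        &= (t_1 + t_2 + t_3 + t_4) + (t_1 + t_d + t_r) - (t_1 + t_d + t_r) \\
        &= t_1 + t_2 + t_3 + t_4
\end{align*}
```

The download and render terms cancel, so $L$ does not depend on the page being visited.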

In order to measure $L$, the initial delay of our system, we add logging mechanisms at as many steps of our architecture as possible. Specifically, we record a timestamp at $t_1$, $t_3$, and $t_4$. Our timestamp at $t_4$ is taken directly before the web browser on the Browser VM is spawned. In other words, $L$ is the amount of time between when the user attempts to browse to a page and when the web browser on the Browser VM is spawned. Since these timestamps are generated on three different systems (the user’s client, the VM Manager, and the Browser VM), we synchronize all of our systems using NTP [26]. Figure 5 shows our results for the initial delay $L$. We test on three popular websites across multiple operating system and browser configurations. For each website and OS/browser configuration, we perform the experiment 10 times and report the average. Across all cases, the delay incurred due to the TVB is 2.5 to 3 seconds. It is important to note that once this initial delay has passed, the user’s browsing experience is seamless and comparable to a normal browsing session. In general, we see that for the same platform and browser, the initial delay is consistent across all webpages. This is because $L$ does not depend on the time required to download and render the webpage. Across different OS and browser versions we do see some slight variations in time. More research is needed to understand why this variation occurs, though we believe it is due to differences in how operating systems handle network traffic, process creation, and scheduling while establishing the RDP session.

Fig. 5. Initial latency when browsing to popular websites using different OS/browser combinations.
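The measurement procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the logging code used in the prototype: the `Run` record, its field names, and the sample values are assumptions introduced here, and the sample values are placeholders, not measurements from the trial.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass
class Run:
    """NTP-synchronized timestamps (in seconds) logged for one page request."""
    client_request: float  # logged on the user's client when the URL is requested
    vm_assigned: float     # logged on the VM Manager when a Browser VM is picked
    browser_spawn: float   # logged on the Browser VM just before the browser starts


def initial_delay(run: Run) -> float:
    """Return L for one run: time from the request to the browser spawn on the VM."""
    if not (run.client_request <= run.vm_assigned <= run.browser_spawn):
        # Out-of-order timestamps suggest the three clocks are not synchronized.
        raise ValueError("timestamps out of order; check NTP synchronization")
    return run.browser_spawn - run.client_request


def average_delay(runs: Iterable[Run]) -> float:
    """Average L over repeated runs of one website and OS/browser configuration."""
    return mean(initial_delay(r) for r in runs)


# Placeholder values for illustration only (hypothetical, not trial data).
sample_runs = [Run(0.0, 1.0, 2.0), Run(10.0, 11.0, 13.0), Run(20.0, 22.0, 24.0)]
print(average_delay(sample_runs))
```

Because each timestamp comes from a different machine, the ordering check is a cheap way to catch clock drift before a run is included in an average.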

C. User Evaluation

We asked 15 users to use the TVB during a multi-week trial period. Our goal was to evaluate the user experience of the TVB compared to normal web browsing. We configured the network proxy to utilize the TVB on various popular URLs, such as http://cnn.com, http://amazon.com, and http://espn.go.com, as well as uncategorized URLs. Each user browsed to these pages and gave us qualitative feedback on their experience. We also asked the users to use our prototype implementation during their normal work day and report back if they encountered the TVB when attempting to browse to an uncategorized URL.

Feedback was fairly positive, though most trial participants noted that some normal browser features are missing while inside the TVB, such as the back button, seeing the URL when hovering over links, seeing the current URL in the native browser’s URL bar, and the use of multiple tabs. Because of this, most reported that they are comfortable using the TVB for new links they have never visited before, but for popular websites that they visit every day, they would still prefer their native browser.

We understand that these features are expected and plan to extend Guacamole and the Browser VM so that these features are available via the native browser. For example, in order to show the actual link the user is browsing, we plan to modify Guacamole’s URL (seen in the native browser) such that it is extended with the current virtual browser URL using HTML5’s history.pushState method. We also plan to use HTML5’s history API to translate back and forward button presses on the native browser to the browser within the VM. Though these are important implementation details for making the user experience as good as possible, the underlying concept and architecture remain the same. Our end goal is to make the TVB experience seamless enough that the user does not notice the difference.
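One way to picture the planned URL extension is as an encoding problem: history.pushState only accepts URLs on the same origin as the current page, so the virtual browser URL would have to be carried inside the Guacamole session URL rather than replace it. The Python sketch below shows only that encoding step, under stated assumptions: the session URL, the `url=` fragment key, and both function names are hypothetical, and the browser-side call to history.pushState itself is not shown.

```python
from typing import Optional
from urllib.parse import quote, unquote, urlsplit, urlunsplit

FRAGMENT_KEY = "url="  # hypothetical key marking the embedded virtual URL


def embed_virtual_url(session_url: str, virtual_url: str) -> str:
    """Return session_url with virtual_url percent-encoded into its fragment."""
    parts = urlsplit(session_url)
    # safe="" encodes every reserved character, including "/", ":" and "#".
    fragment = FRAGMENT_KEY + quote(virtual_url, safe="")
    return urlunsplit(parts._replace(fragment=fragment))


def extract_virtual_url(address: str) -> Optional[str]:
    """Recover the virtual browser URL from an address built by embed_virtual_url."""
    fragment = urlsplit(address).fragment
    if not fragment.startswith(FRAGMENT_KEY):
        return None
    return unquote(fragment[len(FRAGMENT_KEY):])


# Hypothetical session URL; the real Guacamole client URL may look different.
address = embed_virtual_url("https://tvb.example.com/guacamole/client", "http://cnn.com/")
print(address)
print(extract_virtual_url(address))
```

Keeping the virtual URL in the fragment means the address stays on the session's own origin, and the same decoding step could be reused when a back or forward press needs to be mapped to a page inside the VM.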

Throughout this small trial, users used a variety of native browsers and versions, such as Firefox, Chrome, Safari, and Internet Explorer. They also used different operating systems, such as Windows 7, Linux, and Mac OS X. The benefit of using Guacamole is that any browser that supports the common features of HTML5 will work as expected. In our testing we found every popular browser to work other than Internet Explorer 8, because it does not support HTML5. Interestingly, users with older browsers, such as an old version of Firefox, experienced slower scrolling than users with newer versions. Though we have no quantitative results comparing browsers, we believe the speed and usability of the system depend heavily on Guacamole’s translation protocol from XRDP to HTML5 as well as the browser’s ability to render this HTML5 quickly. Thus, different browsers may provide a different user experience due to the speed of their HTML5 rendering engines. Our hope is that as each technology improves over time, the user experience and latency of the TVB will also improve.

D. Infrastructure Overhead

The main cost of the TVB architecture is directly proportional to the number of concurrent virtual browsing sessions needed at one time. If all Internet access were forced through the TVB, then each employee would require their own dedicated virtual browsing session. As mentioned above, our implementation only initiates the TVB on uncategorized URLs for deployability reasons. In this case, we do not have to support a virtual browsing session for every user at the same time, making deployment more cost effective and feasible in the initial stages of adoption. As we increase the number of websites that utilize the TVB, the number of concurrent sessions that must be supported will increase, and the hardware needed can be purchased incrementally over time as budget allows.

Since our implementation uses one VM per secure browsing session, we wanted to estimate the number of VMs needed for a large enterprise. The current proxy architecture deployed in our production enterprise network supports most employees and can count how many users have browsed to uncategorized URLs over time each day. For example, in a single day the peak number of concurrent users browsing to an uncategorized URL was approximately 5500 in one hour. Within this hour, there was a peak of 150 users per minute. On average, most people stayed within a web session for about 3 minutes, up to a maximum of about 15 minutes.

Taking our one-day example above, if we assume an average 3 minute secure browsing session per user, we can approximate that each VM will be rebooted and placed back into the pool after approximately 5 minutes. We currently keep the web session open for some additional time in order to ensure the user has actually disconnected from the VM. Taking our peak traffic of 150 users per minute and assuming a uniform distribution across a single hour, each VM is occupied for about 5 minutes, so we would need 150 × 5 = 750 VMs to handle 150 users per minute. Since each VMware host is limited to 512 VMs, we would require two physical blades. Because the distribution will never be uniform, we leave to future work the development of algorithms that can determine how many concurrent web sessions are needed based on our proxy data. These algorithms could also be used to expand resources as needed. For instance, one may decide to increase the number of available VMs when 90% are in use. We can also model how many VMs would be needed as we increase the number of URLs that get redirected through the TVB. In the worst case scenario, the organization will need one VM per employee. As memory sharing and rapid VM cloning techniques [27], [28], [29] become available in commercial hypervisors, the number of supported VMs per host should increase drastically (since the VMs are all based on the same image), reducing cost and making this type of deployment even more feasible.
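The sizing arithmetic in this paragraph can be reproduced with a short Python sketch. It assumes, as the text does, a steady arrival rate and a fixed time during which each VM is unavailable; the function names and the integer-percent threshold parameter are introduced here for illustration and are not part of the prototype.

```python
import math


def vms_needed(arrivals_per_minute: int, vm_hold_minutes: int) -> int:
    """VMs busy at once when sessions start at a steady rate and each one
    holds its VM for a fixed time (session plus reboot back into the pool)."""
    return arrivals_per_minute * vm_hold_minutes


def hosts_needed(vm_count: int, vms_per_host: int) -> int:
    """Physical hosts required, rounded up to whole hosts."""
    return math.ceil(vm_count / vms_per_host)


def should_expand(vms_in_use: int, pool_size: int, threshold_percent: int = 90) -> bool:
    """True once utilization reaches the threshold (integer math avoids rounding)."""
    return vms_in_use * 100 >= threshold_percent * pool_size


pool = vms_needed(arrivals_per_minute=150, vm_hold_minutes=5)
print(pool)                      # 750
print(hosts_needed(pool, 512))   # 2
print(should_expand(675, pool))  # True: 675 of 750 is exactly 90%
```

The product of arrival rate and hold time is the usual steady-state estimate of concurrent occupancy; a bursty arrival pattern would need more headroom, which is the motivation for the threshold-based expansion check.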

V. DISCUSSION

In this section we discuss several cases, challenges, and possible extensions that need special attention and consideration for the completeness of our proposed system.

A. Redirecting all Internet-based Web Browsing Through the TVB

As mentioned previously, in order to protect against all possible malware infections due to web attacks, all Internet-based web browsing must occur through the TVB. Forcing all Internet access through the TVB will require several additions to fix the usability issues explained in Section IV-C, such as maintaining consistency in the URL bar and ensuring that the native forward and back buttons carry over to the virtual browser. We believe there are also several other concerns that must be addressed before all web browsing can be redirected through the TVB. The goal is to eventually restrict all outbound IP connections at the network’s edge in order to prevent malicious C&C channels, data exfiltration, and other unwanted outgoing or incoming network traffic.

When users browse the web, they frequently need to download documents, images, or other files for use in their daily work. Though in most cases viewing these files within the Browser VM over the Guacamole RDP session may be good enough, in some cases users may need to actually download the file for use in their working environment. For this reason, we have explored the possibility of adding a file download mechanism to the TVB architecture. One possible implementation would be to mount the Download folder of the Browser VM as read-only in the user’s environment when connected to the Browser VM. This would allow users to copy any file downloaded in the Browser VM to their native machine. Other possibilities also exist, such as analyzing files in a sandbox in search of malicious behavior before the download is allowed on the user’s device. However, we note that such a solution may limit the effectiveness of the TVB, as this download mechanism may provide a vector for malware infection. We believe downloads from the TVB should only be allowed when all outbound traffic from the internal network is disabled, so that even if a user’s machine is infected, C&C channels are blocked, preventing the attack and the attacker’s ability to remotely control their malware. If not, then a malicious file could be downloaded through this mechanism and give attackers an entry point into the enterprise. We also believe an upload mechanism may be possible, though this would need to go through some type of rigorous vetting process to ensure that corporate data is not being wrongfully exfiltrated.

In some cases the biggest concern from a deployment perspective is the fear of breaking internal applications and services that require access to the Internet for reasons other than normal web browsing. There are many cases where internal applications, internal web services, and servers need outbound access to the Internet for features such as updates, access to 3rd party web services, etc. Due to complexity issues in network management and firewall policy construction, it is hard to restrict access to only the specific Internet resources each internal service requires. We believe that these complexity issues must be addressed with further research so that outbound access policies can be easily constructed and implemented. A server within an enterprise network with uncontrolled outbound access to the Internet could eventually end up being a data exfiltration and C&C gateway for an advanced attacker. We plan to further analyze enterprise architectures to understand and develop solutions that make such restricted outbound access a reality.

B. Using TVB with a VDI solution

Another challenge to the effectiveness of our solution is the fact that users may not be connected to the enterprise network at all times, in which case there will be no clear way to redirect web requests to the TVB. In such a case, a user might connect to a public WiFi access point, bypass the enterprise proxy, browse to a malicious Internet website and get their device infected. Then, when the infected device is connected back to the enterprise network, the malware might get access to sensitive data and internal enterprise resources.

Because employees would like to access corporate resources from any device, and given the significant push towards BYOD, enterprises are moving towards deploying Virtual Desktop Infrastructure (VDI) solutions. This results in users accessing corporate data and resources securely from any device, any location, and using any network connection, even public WiFi hotspots. In such a case, the user will access the corporate environment from a VDI client on their device. The benefit of VDI is that, because it uses an RDP-like protocol, only images of the virtual desktop are streamed back, leaving no proprietary data behind on the potentially untrustworthy end device. From the virtual desktop, the user can access corporate resources and work as normal.

Though VDI provides a feasible solution for accessing corporate resources securely, it does not solve the problem that our TVB architecture addresses. VDI infrastructure is typically placed within the enterprise environment and given the same outbound Internet access as any other normal system within the network. This means users are subject to the same types of web attacks and malware threats as before. Though VDI can be a step forward in some cases due to consistent reboots from a fresh image, application whitelisting, frequent updates, and a single point of control, virtual desktops can still provide an attacker an entry point into the enterprise network and are prone to Internet-based malware infection.

Because virtual desktops always reside within the enterprise network, they are a prime target for adoption of our TVB solution. Moreover, by leveraging the fact that VDI is always connected to the enterprise network, we can guarantee that all Internet-bound requests go through the network proxy regardless of the network that the native device is currently connected to. In this scenario, even if the user’s native device is compromised with malware, only images are streamed via the VDI client, severely limiting the possibility of data exfiltration. We leave to future discussion a target architecture which mandates that all employees access corporate resources through VDI, with all Internet access restricted to the TVB.

C. Malware Detection and URL Categorization

The TVB can also be used as a feedback mechanism for the network proxy and URL filtering/categorization services. For example, in our prototype implementation, the TVB was used for URLs that the proxy service had never seen before and were therefore uncategorized. In cases where the URL actually directs the user to a malicious website, the Browser VM can be used as a malware analysis platform in order to detect web attacks and drive-by downloads. This information could be used to feed into other security services within the enterprise such as IDS systems, application firewalls, and other third party appliances.

In some cases, users could potentially be asked to provide feedback on what type of website they are browsing. This user feedback could be used to help improve the proxy service’s URL categorization algorithm. Because our proposed method is not intended to detect URLs that host phishing websites, one can imagine a feedback mechanism that lets users report such URLs to the categorization engine. Once a reported phishing URL is verified, it could be immediately added to a blacklist so that all subsequent requests to the phishing URL are blocked.
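The routing and feedback loop described in this section can be sketched as a small policy object. This is an illustrative model, not the proxy's actual implementation: the class, method, and action names are hypothetical, URLs are matched as exact strings for simplicity, and the verification step is represented by a boolean supplied by whoever reviews the report.

```python
from enum import Enum
from typing import Iterable


class Action(Enum):
    NORMAL_POLICY = "normal_policy"  # categorized URL: apply the proxy's existing rules
    TVB = "tvb"                      # uncategorized URL: redirect to the virtual browser
    BLOCK = "block"                  # verified phishing URL: refuse the request


class ProxyPolicy:
    def __init__(self, categorized_urls: Iterable[str]):
        self.categorized = set(categorized_urls)
        self.blacklist = set()
        self.pending_reports = set()

    def decide(self, url: str) -> Action:
        """Choose how the proxy handles a request; the blacklist is checked first."""
        if url in self.blacklist:
            return Action.BLOCK
        if url in self.categorized:
            return Action.NORMAL_POLICY
        return Action.TVB

    def report_phishing(self, url: str) -> None:
        """Record a user report; the URL is not blocked until it is verified."""
        self.pending_reports.add(url)

    def verify_report(self, url: str, is_phishing: bool) -> None:
        """Resolve a report and blacklist the URL if it was confirmed as phishing."""
        self.pending_reports.discard(url)
        if is_phishing:
            self.blacklist.add(url)


policy = ProxyPolicy(categorized_urls=["http://cnn.com"])
print(policy.decide("http://cnn.com"))            # Action.NORMAL_POLICY
print(policy.decide("http://unknown.example"))    # Action.TVB
policy.report_phishing("http://unknown.example")
policy.verify_report("http://unknown.example", is_phishing=True)
print(policy.decide("http://unknown.example"))    # Action.BLOCK
```

Separating the report from its verification keeps a single mistaken or malicious user report from blocking a legitimate site.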

VI. RELATED WORK

The Advanced Persistent Threat problem has been defined and presented in many security professional forums and by many organizations in recent years [6], [30], [14], [5], [3], [31]. Solutions currently proposed, such as FireEye and SpyProxy, rely on detecting attacks by executing any Internet-based program in a sandbox VM [11], [12]. Other solutions rely on actively detecting malicious activity by analyzing outbound traffic in search of anomalous activity [13], or by employing large-scale data processing using Big Data tools [4], data mining, and machine learning [14]. On one hand, it has been shown that well-crafted malware is able to avoid sandbox detection by delaying the execution of malicious code until the sandbox times out, rendering the malware detection method useless. On the other hand, by the nature of these detection solutions, there is still a risk of failing to detect highly targeted, low-and-slow attacks, while at the same time having to mitigate the inherent false positives. Moreover, even though these solutions might be efficient in detecting APTs, the delay in reaction time still allows the attackers to do some damage to the targeted organization, which then bears the cost of the potential data breach. In comparison, our solution has the potential to prevent any web-based APT, thus eliminating the need to handle false positives and the time of exposure to the attack.

Similar in spirit to the TVB, Zavou et al. [32] use a split browser to protect against web attacks and data leakage. Split browsers, such as Amazon Silk [33], render a web session within the cloud and stream an embedded version of the web page back to the user’s mobile device. Other split browsers, such as MarioNet [34] and Opera Mini, render a web page on a remote server and stream back only a bitmap image of the web page. In their work, Zavou et al. use the fact that pages are rendered within the cloud to apply costly protection mechanisms, such as data flow tracking, to the rendering engine and protect against common web attacks such as SQL injection and XSS exploitation. In comparison, our method does not rely on a malware detection mechanism to protect the user. Instead, we prevent any Internet-based malware from executing in the user’s native browser, thus protecting the user from any possible malicious program without the need to identify it.

Closest to our approach is a new startup company called Light Point Security [35]. Light Point Web is a plugin for Firefox and Internet Explorer which allows the user to offload web browsing to a single-use virtual machine. They use a proprietary protocol to stream the session back to the user’s native browser. Though the end goal is the same, our TVB solution differs in the following ways. By using RDP and HTML5 to view a virtual browsing session, we do not rely on a plugin and can support any HTML5-compatible browser. We also utilize a network proxy to redirect sessions to the TVB, removing the decision point from the client and allowing the enterprise to manage TVB use via the network. Light Point is more limited, as it forces the user to browse everything via the plugin other than an enterprise-specific predefined whitelist. Because it relies on a client-side plugin, malware infections via other vectors, such as email, could disable the protection mechanism.

Previous work presented in [20], [36], [37], [38], [39], [40], [41] belongs to a long line of research that detects or prevents web attacks through JavaScript emulation, machine learning, static analysis, and dynamic analysis. Utilizing these detection mechanisms within a network proxy could be an alternative to the TVB. For example, CUJO [37] is a system which embeds itself inside a web proxy and blocks the delivery of malicious JavaScript code using static and dynamic analysis techniques. The overhead of CUJO is around 500ms, but related work has examined the attacker’s ability to use novel approaches to bypass the detection algorithm or the JavaScript emulation itself [20]. For a slightly higher delay (which should improve significantly over time), we provide the guarantee that even if attackers use a new method which happens to bypass one of the algorithms above, the attack is still isolated within the TVB.

VII. CONCLUSION

In this paper we have presented a novel system, called the Transparent Virtual Browser, that allows a user to safely browse the Internet by rendering webpages in a remote virtual machine and streaming them back both seamlessly and securely. Redirection to the TVB is decided by a transparent network proxy that is not vulnerable to any malware or configuration changes on the user’s end device. By using the transparent network proxy and the HTML5 functionality of current popular browsers, our system enables a seamless user experience while providing complete protection against drive-by download attacks in the background.

We tested our solution in a small user trial with several popular web browsers running on various operating systems. The testing shows that our prototype implementation incurs only 2.5 to 3 seconds of initial latency when browsing to a webpage while maintaining a seamless browsing experience for the rest of the browsing session. Moreover, the initial delay is not significantly influenced by the browser/operating system combination or by the webpage visited. Additionally, we tested the solution against real-world web attacks and showed that the attacks have no effect on the user’s system and that the TVB provides complete isolation. Through future work we plan to add more functionality to the TVB system so that it can be adopted for all Internet-based web browsing, protecting enterprises against a wide range of potential threats.

REFERENCES

[1] M. Christodorescu, “Behavior-based Malware Detection,” in University of Wisconsin-Madison, August 2007.

[2] K. W. Hamlen, V. Mohan, M. M. Masud, L. Khan, and B. Thuraisingham, “Exploiting an Antivirus Interface,” Computer Standards and Interfaces, November 2009.

[3] Dmitri Alperovitch, “Revealed: Operation Shady RAT,” http://bit.ly/r555RE, 2011.

[4] P. Giura and W. Wang, “Using Large Scale Distributed Computing to Unveil Advanced Persistent Threats,” Academy of Science and Engineering Science Journal, vol. 1, no. 3, pp. 93–105, December 2012.

[5] McAfee Labs and McAfee Foundstone Professional Services, “Protecting Your Critical Assets: Lessons Learned from ‘Operation Aurora’,” http://bit.ly/x5DUXE, July 2010.

[6] B. Krekel, G. Bakos, and C. Barnett, “Capability of the People’s Republic of China to conduct cyber warfare and computer network exploitation,” The US–China Economic and Security Review Commission, Washington, DC, Research Report, 2009.

[7] Mandiant, “Apt1: Exposing one of china’s cyber espionage units,” Mandiant Intelligence Center Report, February 2013.

[8] Kaspersky Lab, “Unveiling ‘careto’ - the masked apt,” http://www.securelist.com/en/downloads/vlpdfs/unveilingthemask_v1.0.pdf, February 2014.

[9] “Bromium,” http://www.bromium.com/.

[10] “Invincea,” http://www.invincea.com/.

[11] “Fireeye,” http://www.fireeye.com/.

[12] A. Moshchuk, T. Bragin, D. Deville, S. D. Gribble, and H. M. Levy, “Spyproxy: Execution-based detection of malicious web content,” in Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, ser. SS’07. Berkeley, CA, USA: USENIX Association, 2007, pp. 3:1–3:16.

[13] SANS Technology Institute, “Assessing Outbound Traffic to Uncover Advanced Persistent Threat,” http://bit.ly/mnBuTc, May 2011.

[14] Damballa, “The Command Structure of the Aurora Botnet,” http://www.damballa.com/research/aurora/, March 2010.

[15] R. Sommer and V. Paxson, “Outside the closed world: On using machine learning for network intrusion detection,” in Security and Privacy (SP), 2010 IEEE Symposium on, May 2010, pp. 305–316.

[16] Microsoft, “Microsoft Remote Desktop Protocol (RDP),” http://bit.ly/UGGZCy, 2011.

[17] “Cve - common vulnerabilities and exposures,” http://cve.mitre.org/.

[18] “Guacamole - html5 clientless remote desktop,” http://guac-dev.org/.

[19] N. Provos, D. McNamee, P. Mavrommatis, K. Wang, N. Modadugu et al., “The ghost in the browser: analysis of web-based malware,” in Proceedings of the first conference on First Workshop on Hot Topics in Understanding Botnets, 2007.

[20] M. Cova, C. Kruegel, and G. Vigna, “Detection and analysis of drive-by-download attacks and malicious javascript code,” in Proceedings of the 19th international conference on World wide web, 2010.

[21] C. Willems, T. Holz, and F. Freiling, “Toward automated dynamic malware analysis using cwsandbox,” IEEE Security and Privacy, 2007.

[22] “Cuckoo sandbox,” http://www.cuckoosandbox.org/.

[23] “Malware domain list,” http://www.malwaredomainlist.com/.

[24] Rapid7, “Penetration testing software - metasploit,” http://www.metasploit.com/.

[25] “Windows 7 / windows server 2008 r2: Console host,” http://goo.gl/X9cbm2.

[26] Network Time Protocol (NTP), 2014.

[27] D. Gupta, S. Lee, M. Vrable, S. Savage, A. C. Snoeren, G. Varghese, G. M. Voelker, and A. Vahdat, “Difference engine: Harnessing memory redundancy in virtual machines,” vol. 53, no. 10, 2010.

[28] T. Wood, G. Tarasuk-Levin, P. Shenoy, P. Desnoyers, E. Cecchet, and M. D. Corner, “Memory buddies: exploiting page sharing for smart colocation in virtualized data centers,” in Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, 2009.

[29] H. A. Lagar-Cavilla, J. A. Whitney, A. M. Scannell, P. Patchin, S. M. Rumble, E. De Lara, M. Brudno, and M. Satyanarayanan, “Snowflock: rapid virtual machine cloning for cloud computing,” in Proceedings of the 4th ACM European conference on Computer systems, 2009.

[30] Verizon, “2010 Data Breach Investigations Report,” http://vz.to/cGCuf0, July 2010.

[31] Richard Bejtlich, “Understanding the advanced persistent threat,” http://bit.ly/TLW41a, 2011.

[32] A. Zavou, E. Athanasopoulos, G. Portokalidis, and A. D. Keromytis, “Exploiting split browsers for efficiently protecting user data,” in Proceedings of the 2012 ACM Workshop on Cloud computing security workshop, 2012.

[33] “Amazon silk,” http://amazonsilk.wordpress.com/.

[34] “Marionet split web browser,” http://en.wikipedia.org/wiki/MarioNet_split_web_browser.

[35] “Malware protection from light point security,” http://lightpointsecurity.com/.

[36] M. Egele, P. Wurzinger, C. Kruegel, and E. Kirda, “Defending browsers against drive-by downloads: Mitigating heap-spraying code injection attacks,” in Detection of Intrusions and Malware, and Vulnerability Assessment, 2009.

[37] K. Rieck, T. Krueger, and A. Dewald, “Cujo: efficient detection and prevention of drive-by-download attacks,” in Proceedings of the 26th Annual Computer Security Applications Conference, 2010.

[38] C. Curtsinger, B. Livshits, B. G. Zorn, and C. Seifert, “Zozzle: Fast and precise in-browser javascript malware detection,” in USENIX Security Symposium, 2011.

[39] P. Ratanaworabhan, V. B. Livshits, and B. G. Zorn, “Nozzle: A defense against heap-spraying code injection attacks,” in USENIX Security Symposium, 2009.

[40] K. Borgolte, C. Kruegel, and G. Vigna, “Delta: automatic identification of unknown web-based infection campaigns,” in Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, 2013.

[41] A. Kapravelos, Y. Shoshitaishvili, M. Cova, C. Kruegel, and G. Vigna, “Revolver: An automated approach to the detection of evasive web-based malware,” in USENIX Security Symposium, 2013.