Client Measurement Accuracy - Cloud Middlebox Placement

8.5 Cloud Middlebox Placement

8.6.2 Client Measurement Accuracy

The “last mile” connection between a residential user and the user’s ISP is often the throughput bottleneck. When this is the case, the choice of a cloud-based middlebox should have relatively little impact on throughput. However, when the bottleneck lies elsewhere, such as at a congested peering point, selecting a middlebox may be significantly more complicated. This argues for using client-side measurements to determine which cloud middlebox offers the best throughput when proxying. Unfortunately, client throughput measurements may not be reliable in some cases.
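The bottleneck argument can be illustrated with a simple model in which the throughput of a proxied path is the minimum of its segment capacities. This is a simplifying assumption for illustration, not the measurement method used in the study; the function names, middlebox names, and capacities below are all hypothetical.

```python
def path_throughput(last_mile_mbps, client_to_middlebox_mbps, middlebox_to_dest_mbps):
    """Throughput of a proxied path under a simple min-capacity model (assumption)."""
    return min(last_mile_mbps, client_to_middlebox_mbps, middlebox_to_dest_mbps)


def throughput_by_middlebox(last_mile_mbps, middleboxes):
    """Map each middlebox name to the modeled end-to-end throughput."""
    return {
        name: path_throughput(last_mile_mbps, to_mb, to_dest)
        for name, (to_mb, to_dest) in middleboxes.items()
    }


# Hypothetical wide-area capacities in Mbps:
# (client ISP to middlebox, middlebox to destination).
middleboxes = {"mb_a": (400.0, 300.0), "mb_b": (250.0, 500.0)}

# Last mile is the bottleneck: every middlebox yields the same throughput.
slow_last_mile = throughput_by_middlebox(20.0, middleboxes)

# Last mile is not the bottleneck: the middlebox choice now matters.
fast_last_mile = throughput_by_middlebox(1000.0, middleboxes)
```

Under this model, a 20 Mbps last mile hides any difference between the two middleboxes, while a fast last mile exposes the slower wide-area segment of each path.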

In our study, we used JavaScript in web browsers to determine which cloud server offered the highest bandwidth. For most clients, the servers were very close in bandwidth, making measurement precision important for selecting the best server. When we compared the real-time selection made by the client’s JavaScript with our post hoc analysis of the packet captures at our VMs, we noticed that clients occasionally selected a slightly suboptimal server as their “best” server. This had ramifications for proxying, since the bottleneck between the client and the suboptimal server ensured that the results with other servers, such as the actual optimal server, would necessarily degrade. In production, such measurements would need to be performed with high precision, such as with low-level packet analysis, to ensure an optimal selection.
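One way to see why precision matters when servers are close in bandwidth is to compare the gap between the two best servers against the spread of repeated measurements. The following is a minimal sketch under stated assumptions, not the selection logic used in the study: the function name `select_server`, the two-standard-deviation margin, and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def select_server(samples_mbps, min_margin_stdevs=2.0):
    """Pick the server with the highest mean throughput.

    samples_mbps maps a server name to a list of at least two repeated
    throughput measurements, and at least two servers must be given.
    Returns (best_server, is_confident), where is_confident is False when
    the lead over the runner-up is within the measurement noise
    (hypothetical threshold).
    """
    stats = {name: (mean(v), stdev(v)) for name, v in samples_mbps.items()}
    ranked = sorted(stats, key=lambda name: stats[name][0], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    gap = stats[best][0] - stats[runner_up][0]
    noise = max(stats[best][1], stats[runner_up][1])
    return best, gap > min_margin_stdevs * noise


# Hypothetical samples: servers "a" and "b" are too close to distinguish.
close = select_server({
    "a": [50.0, 52.0, 51.0],
    "b": [50.5, 49.5, 51.5],
    "c": [30.0, 31.0, 29.0],
})

# Hypothetical samples: server "a" is clearly faster.
clear = select_server({
    "a": [80.0, 82.0, 81.0],
    "b": [50.0, 51.0, 49.0],
})
```

In the first case the selection is flagged as not confident, which is the situation in which a coarse in-browser measurement could plausibly pick a slightly suboptimal server.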

During our experiments, we also did not allow any two servers to reside in the same physical cloud location. In a large-scale deployment, users may avoid degradation in scenarios in which a middlebox server is deployed in the same cloud data center as the destination server. We expect this co-location deployment to be feasible for a large number of destinations in the near term given the continuing cloud outsourcing employed by enterprises [1].

8.7 Conclusion

In this work, we characterize residential network connections to cloud infrastructure. Using Amazon’s Mechanical Turk, we recruit 270 participants across the United States and use in-browser instrumentation to direct participants to connect to various cloud instances hosted by 4 major providers in different geographical locations. We characterize the connections using JavaScript measurements reported by the client and packet captures on the servers we control. With this data, we examine the OpenFlow controller placement problem for residential SDNs and find that 90% of users are within 50 ms of a cloud instance. While this latency is most likely to affect the web browsing experience, due to its interdependent objects and connection characteristics, our subsequent analysis shows that this impact primarily slows only advertising and analytics connections. We then examine how best to place middleboxes in cloud environments and find that well-placed middleboxes have the potential to improve end-to-end connections. With these results, we conclude that residential SDN and middleboxes are feasible for roughly 90% of US users even when limited to publicly available cloud VMs.

Chapter 9 Towards a ReSDN Testbed

9.1 Introduction

In Chapter 5, we introduced our ReSDN infrastructure, and in Chapters 6 and 7 we showed novel applications of our approach. However, that work has been limited to small-scale service deployments, often consisting of a single participant. We explore our vision of a large-scale deployment by incrementally deploying our ReSDN infrastructure to participants using an IRB-approved study. By deploying a larger testbed, we can address the following limitations of a single ReSDN deployment:

• Homogeneous workloads: A single deployment provides homogeneous workloads. Typically, our application testing, experimentation, and verification drive the network traffic being generated. While targeted traffic generation simplifies testing and debugging, it may also subtly influence experimental results. Since workloads are only being generated to verify a particular application, the tests are not longitudinal. Accordingly, the long-term effects are not well studied. Having a larger testbed allows us to deploy and test a wider array of applications with traffic generated naturally by participants.

• Homogeneous devices: A single deployment testbed reduces the number of devices our controller infrastructure sees. In 2015, the average number of devices on residential networks grew to 5.7 [19]. Device heterogeneity leads to real-world security concerns, particularly with respect to IoT devices [61]. We will miss or be unable to address these concerns with a small testbed. A larger testbed allows us to leverage device heterogeneity for discovering, building, and testing new security solutions.

• Unrealistic privacy expectations: The flow-based middlebox infrastructure that we built in the cloud has assumed a holistic view into the residential network’s packets. Under this assumption, we were able to freely build applications that consumed the payload of packets such as DNS. In an actual deployment where a third party is responsible for the control and management of the network, users may not be immediately willing to allow this. Indeed, Feamster’s [91] position paper notes that privacy is a major concern that needs to be addressed in outsourcing network security.

In document Software-defined Networking: Improving Security for Enterprise and Home Networks (Page 98-101)