
Figure 3-3: Request and knowledge propagation in the spec-KPs. A request from agent m and a piece of knowledge from agent n meet at agent x. (Legend: region, agent, request propagation, knowledge propagation, links between regions.)

of an agent, the cost model, and even the diameter of a specialized KP, etc. In Chapter 6 and Chapter 7, I present different propagation mechanisms based on the functionality and constraints of the spec-KPs.

Chapter 4

Cross-Region Organization

In this chapter, I focus on a network-topology-based distributed hash table for cross-region organization among the regional leaders, and present a hybrid neighbor selection mechanism using Autonomous System information. We find that, by combining the distributed hash table and network topology knowledge effectively, we can have a scalable, efficient, robust and non-intrusive organization among regions.

4.1 Design Rationale

Cross-region organization is a core issue in the NetKP, as agents in different regions need to collaborate with one another in order to address the problems that span the regions.

Unlike the organization within a region where each region may choose its own organizing structure, we need a unified approach among regions.

As mentioned in Section 3.2, four properties are important to the agent organization in the NetKP: efficiency, scalability, robustness, and non-intrusiveness. Section 3.3.2 briefly discusses two options for the organization among the regional leaders, and we elaborate on them here. The first is a network-topology-based structure. In this structure, regions connect to each other following the network topology. Specifically, a region connects to other regions that are in the same AS and the neighboring ASes. The neighboring ASes include its providers, customers and peers, or ASes that are closest in any of the three kinds. The advantage of this approach is that this structure follows the network topology naturally, and thus is efficient for aggregating information and suppressing redundant requests. However, there are several problems with this structure. First, this approach may lead to an unbalanced structure. We know that the Internet topology at the Autonomous System level can be described efficiently with power laws, where some ASes have many neighbors while many other ASes have only one or a few neighbors [116, 123]. As a result, some regions may have to connect to a large number of regions, while others can only find a few. For example, a region in a top-tier AS may have hundreds of neighboring regions, while a region in a small bottom-tier AS may only connect to its provider. Second, it is hard to maintain this structure under churn. If a region disappears due to agent departures, region merging or network partitions, its neighboring regions will need to discover other regions to maintain connectivity. Third, when the regions are sparse in the Internet, the connectivity of this structure is not well defined. For example, if there are no regions in the provider's AS, a region needs to find other regions two or more AS hops away. All these problems can be fixed, but a more robust and clean design is needed.

The second choice is a distributed hash table (DHT). Distributed hash tables, such as CAN [104], Chord [122], Pastry [109], and Tapestry [133], provide a scalable and robust organization, in which any information is guaranteed to be located in a limited number of steps (usually O(log n)). These systems provide a robust self-organizing infrastructure that is expected to become the fundamental component of large-scale distributed systems.

However, a pure DHT structure is not enough for our purpose.

DHTs provide scalability and robustness, but they often rely on active probing to achieve efficiency. Most topology-aware DHT lookup algorithms proposed so far, such as proximity neighbor selection and proximity routing [54], require each peer to probe other peers directly to discover proximity. Such probing generates a considerable amount of network traffic. Similarly, many network applications and services, such as end-system multicast [58], DHT-based DNS [103], and content delivery networks [12], require efficient organization among the participants, and several previous research efforts, such as RON [16], focus on constructing network overlays in order to route traffic optimally. They all rely on active probing using ping or traceroute to measure path quality and to detect anomalies [130, 39]. As a result, 1 GB of ping traffic was observed daily on PlanetLab in 2003, which is equivalent to one ping per second per node [82]. As another example, RON periodically measures all inter-node paths to discover link failures; due to its probing traffic, a 50-node RON requires 33 Kbps of bandwidth per node, which prevents it from scaling to a large size. As overlay networks grow in popularity and size, this may lead to a significant increase in network traffic and contention for network resources, which may cause network instability.
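The scaling concern can be made concrete with a small count of probes per measurement round. The sketch below is only illustrative: it assumes that measuring all inter-node paths means every node probes every other node once per round, and it compares this with a hypothetical scheme in which each node probes at most a fixed number of peers. The function names and the `budget` parameter are assumptions introduced here, not part of any system described above.

```python
def all_pairs_probes(n: int) -> int:
    """Probes per round when every node probes every other node (assumed model)."""
    return n * (n - 1)


def budgeted_probes(n: int, budget: int) -> int:
    """Probes per round when each node probes at most `budget` peers (hypothetical)."""
    return n * min(budget, n - 1)


if __name__ == "__main__":
    for n in (50, 500, 5000):
        print(n, all_pairs_probes(n), budgeted_probes(n, budget=5))
```

Under these assumptions, the all-pairs count grows quadratically with the number of nodes, while the budgeted count grows linearly.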

We choose a hybrid structure that combines network topology and distributed hash tables. DHTs provide scalability and robustness, and we need a design for efficiency and non-intrusiveness. In this chapter, we demonstrate the use of network topology knowledge to improve the efficiency of the DHT-based cross-region organization, while maintaining low overhead. Specifically, we design a hybrid proximity neighbor selection algorithm that uses Autonomous System information to estimate network latency. Proximity neighbor selection (PNS) in DHTs addresses the following question: given a number of candidate neighboring leaders, which ones should a node choose as the neighbors in its DHT routing table? This is an important issue, as it determines the efficiency of the region-level organization among leaders. We use the AS-path length as a proxy for network latency to filter out unlikely candidates without probing them during the selection process, and only the small number of leaders that pass the filtering are probed. Compared with approaches based on active probing, our algorithm can significantly reduce the amount of probing traffic without greatly undermining DHT lookup performance, and our savings on probing traffic increase with network size, as demonstrated in the experiments. Note that in this organization, each regional leader also maintains a list of its topologically neighboring regional leaders, since it is convenient to resolve requests that follow the network topology whenever possible; this list is optional.
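The filter-then-probe idea can be sketched as follows. This is a minimal illustration, not the algorithm of Section 4.4: it assumes the filter keeps the `probe_budget` candidates with the shortest AS paths, and that `probe` is a caller-supplied function returning a measured latency. The names `select_neighbors`, `as_path_len`, `probe`, `probe_budget`, and `num_neighbors` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def select_neighbors(
    candidates: List[str],
    as_path_len: Dict[str, int],
    probe: Callable[[str], float],
    probe_budget: int,
    num_neighbors: int,
) -> Tuple[List[str], int]:
    """Return the chosen neighbors and the number of probes actually sent."""
    # Step 1: rank candidates by AS-path length; this uses only local
    # topology knowledge and sends no traffic to the candidates.
    ranked = sorted(candidates, key=lambda c: as_path_len[c])

    # Step 2 (assumed filter rule): keep only the closest `probe_budget`.
    shortlist = ranked[:probe_budget]

    # Step 3: actively probe the shortlist only.
    latency = {c: probe(c) for c in shortlist}

    # Step 4: pick the lowest-latency candidates among those probed.
    chosen = sorted(shortlist, key=lambda c: latency[c])[:num_neighbors]
    return chosen, len(shortlist)


if __name__ == "__main__":
    hops = {"a": 1, "b": 4, "c": 2, "d": 1}
    rtt_ms = {"a": 30.0, "b": 5.0, "c": 10.0, "d": 20.0}
    print(select_neighbors(list(hops), hops, rtt_ms.__getitem__, 3, 2))
```

In the usage example, candidate `b` has the lowest latency but the longest AS path, so it is filtered out without being probed. This is the trade-off the text describes: fewer probes in exchange for occasionally missing a good neighbor when AS-path length is a poor proxy for latency.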

The rest of the chapter is organized as follows. Sections 4.2 and 4.3 present an overview of distributed hash tables and the Internet topology, respectively. Section 4.4 presents a hybrid PNS algorithm using network topology information. Section 4.5 evaluates the performance of our approach. Section 4.6 discusses the policy implications of our approach. Section 4.7 reviews related work. We conclude in Section 4.8.