5.5 Research Summary
5.5.2 Peer-to-peer Methods
This section addresses publish-subscribe solutions that do not use any managed brokers but instead rely solely on the peers of the system. Peers can be used to create a publish-subscribe solution on top of the application infrastructure by building an application-level network in which the peers act as brokers. This can be used very effectively to create large-scale information dissemination that is reliable and cheap, as there are no components other than the clients.
Pastry
Pastry is an object location and routing substrate for large-scale distributed peer-to-peer applications. Pastry makes use of peer-to-peer communication to create an application-level topology over the network that can be used to implement a wide range of functionality such as global data storage, group communication and naming. It is used in a notification infrastructure, called Scribe, to provide a peer-to-peer publish-subscribe solution.
Pastry uses a 128-bit value to identify each node in its overlay topology. These node ids are assigned randomly to nodes, and it is assumed that the generated ids are uniformly distributed in the 128-bit space. An id could be generated by applying a hash function to a node’s public key or IP address, which makes it highly probable that neighbouring nodes, those with adjacent ids, are diverse in, for example, geography and network attachment.
Each message contains a key K, which is the node id of the recipient. Node ids and message keys are treated as sequences of digits with base $2^b$. Pastry uses this sequence to route a message to the node whose node id is numerically closest to the message key.
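The id scheme described above can be illustrated with a minimal Python sketch. It assumes a truncated SHA-256 hash as the id generator and assumes that b divides 128; neither choice is prescribed by the text. The function names node_id_from and digits are hypothetical, and shl is named after the helper used in Algorithm 3.

```python
import hashlib

ID_BITS = 128  # Pastry node ids and message keys are 128-bit values


def node_id_from(data: bytes) -> int:
    """Derive a 128-bit id by hashing, e.g. an IP address or a public key.

    Truncated SHA-256 is an assumption made for this sketch only.
    """
    return int.from_bytes(hashlib.sha256(data).digest()[: ID_BITS // 8], "big")


def digits(value: int, b: int) -> list[int]:
    """View a 128-bit value as base-2^b digits, most significant digit first."""
    if ID_BITS % b != 0:
        raise ValueError("this sketch assumes that b divides 128")
    count = ID_BITS // b
    mask = (1 << b) - 1
    return [(value >> (b * (count - 1 - i))) & mask for i in range(count)]


def shl(a: int, k: int, b: int) -> int:
    """Length, in digits, of the prefix shared by the ids a and k."""
    shared = 0
    for x, y in zip(digits(a, b), digits(k, b)):
        if x != y:
            break
        shared += 1
    return shared
```

With b = 4 every id becomes a sequence of 32 hexadecimal digits, which is the view the routing table and the shared-prefix comparisons operate on.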
A Pastry node maintains a leaf set and a routing table that it uses to route messages in the system. The leaf set L contains the |L|/2 nodes with the numerically closest smaller node ids and the |L|/2 nodes with the numerically closest larger node ids relative to the node. |L| is typically $2^b$ or $2 \times 2^b$. When a node receives a message, it first checks whether the message key is within the range of the leaf set and, if so, routes it to the numerically closest node in the leaf set. The routing table consists of $\lfloor \log_{2^b} N \rfloor$ rows with $2^b - 1$ entries in each row, where each entry in row n shares the first n digits of the node’s id but does not share the digit at position n + 1. Each entry in the routing table contains the IP address of one of the nodes that fits this prefix. See Table 5.1 for an example of a routing table for a node i with id $K_i = 31102951$. When a message M with key K is received at a node A and the key is not within the range of the leaf set, the node checks the routing table and forwards the message to a node whose id shares a prefix with the key that is at least one digit longer than the prefix the key shares with A. Pseudocode for the core routing algorithm in Pastry is shown in Algorithm 3.
Table 5.1: Routing table for a node i in a Pastry overlay network.
Algorithm 3: Routing algorithm for a node in a Pastry overlay network.
Data: A message M with key K that has arrived at the node with node id A.
Data: The entry $R_l^i$ at column $i$, $0 \le i < 2^b$, and row $l$, $0 \le l < \lfloor 128/b \rfloor$, in the routing table R.
Data: The i-th closest node id $L_i$ in the leaf set L, $-\lfloor |L|/2 \rfloor \le i \le \lfloor |L|/2 \rfloor$.
Data: The value of the digit $K_l$ at position l in the key K.
Data: shl(A, B): the length of the prefix shared among A and B, in digits.
if $L_{-\lfloor |L|/2 \rfloor} \le K \le L_{\lfloor |L|/2 \rfloor}$ then
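The routing decision described in the text (leaf set first, then the routing table) can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm as printed: b = 4 is an assumed value, the class PastryNode and its fields are hypothetical names, wrap-around of the circular id space is ignored, and the final fallback branch is not described in the text above and is included only as an assumed way to handle an empty routing table entry.

```python
from dataclasses import dataclass, field

B = 4                       # bits per digit (assumed), so digits are base 2**B
ID_BITS = 128
NUM_DIGITS = ID_BITS // B


def digit(value: int, pos: int) -> int:
    """Digit of value at position pos, counted from the most significant."""
    return (value >> (B * (NUM_DIGITS - 1 - pos))) & ((1 << B) - 1)


def shl(a: int, k: int) -> int:
    """Length, in digits, of the prefix shared by a and k."""
    n = 0
    while n < NUM_DIGITS and digit(a, n) == digit(k, n):
        n += 1
    return n


@dataclass
class PastryNode:
    node_id: int
    leaf_set: list = field(default_factory=list)  # ids numerically near node_id
    # routing_table[l][d] is the id of a known node that shares l digits with
    # node_id and has digit d at position l; missing entries are unknown.
    routing_table: dict = field(default_factory=dict)

    def next_hop(self, key: int) -> int:
        """Return the id to forward to; node_id itself means deliver locally."""
        # Step 1: key within the leaf set range -> numerically closest node.
        if self.leaf_set and min(self.leaf_set) <= key <= max(self.leaf_set):
            candidates = self.leaf_set + [self.node_id]
            return min(candidates, key=lambda n: abs(n - key))
        # Step 2: routing table entry that extends the shared prefix by one digit.
        l = shl(key, self.node_id)
        if l == NUM_DIGITS:
            return self.node_id
        entry = self.routing_table.get(l, {}).get(digit(key, l))
        if entry is not None:
            return entry
        # Step 3 (assumption): any known node with an equally long shared
        # prefix that is numerically closer to the key than this node.
        known = set(self.leaf_set)
        for row in self.routing_table.values():
            known.update(row.values())
        better = [t for t in known
                  if shl(t, key) >= l and abs(t - key) < abs(self.node_id - key)]
        return min(better, key=lambda n: abs(n - key)) if better else self.node_id
```

Each call either resolves the key within the leaf set or hands the message to a node that matches the key in at least one more digit, which is what bounds the number of hops.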
Scribe
Scribe[39] is a peer-to-peer, topic-based publish-subscribe solution that builds on Pastry to create a fully decentralised application-level overlay network. It sets a rendezvous point for each topic and uses it to build a multicast tree by joining the Pastry routes from each subscriber up to the rendezvous point. Scribe consists of a Pastry network of peers in which all peers have equal responsibilities. Scribe adds two methods to each node, namely forward and deliver. The deliver method is invoked when a message arrives at the node whose node id is numerically closest to the key of the message, or when a message was sent to the node with Pastry’s send operation. If a received message should be forwarded rather than delivered, the node invokes the forward method. These methods carry out a specific task depending on the message type, which is one of CREATE, SUBSCRIBE, PUBLISH, and UNSUBSCRIBE. The forward and the deliver methods are described in Algorithm 4 and Algorithm 5 respectively.
Algorithm 4: Forwarding algorithm for a Scribe node.
forward(msg, key, nextId):
    switch msg.type do
        case SUBSCRIBE
            if msg.topic ∉ topics then
                add msg.topic to topics
                msg.source = thisNodeId
                route(msg, msg.topic)
            end
            add msg.source to topics[msg.topic].children
            nextId = null
        endsw
    endsw
Algorithm 5: Deliver algorithm for a Scribe node.
deliver(msg, key):
    switch msg.type do
        case CREATE
            add msg.topic to topics
        endsw
        case SUBSCRIBE
            add msg.source to topics[msg.topic].children
        endsw
        case PUBLISH
            for every node in topics[msg.topic].children do
                send(msg, node)
            end
            if subscribedTo(msg.topic) then
                invokeEventHandler(msg.topic, msg)
            end
        endsw
        case UNSUBSCRIBE
            remove msg.source from topics[msg.topic].children
            if |topics[msg.topic].children| == 0 then
                invokeEventHandler(msg.topic, msg)
                msg.source = thisNodeId
                send(msg, topics[msg.topic].parent)
            end
        endsw
    endsw
Each topic in Scribe has a unique topic id in the same format as a node id and a message key. The Scribe node that is numerically closest to the topic id acts as the rendezvous point for the topic and forms the root of the topic’s multicast tree. To create a topic, a Scribe node uses Pastry to route a message with message type CREATE and the topic id as the message key. The numerically closest node then uses the deliver method to add the topic to the list of topics it already knows about. The id of a topic is hashed so that topic ids, and consequently rendezvous points, are uniformly distributed over the nodes of the Pastry network.
The multicast tree of a topic is built by joining the Pastry routes from the subscribers to the rendezvous point; the forward method is invoked at each node along the way. If a forwarding node is not already a member of the topic’s multicast tree, it registers itself as a forwarder for the tree and routes the message onward to a node closer to the rendezvous point. The forwarder then adds the sender to its children for that topic.
When a publisher publishes a notification, it first checks whether it knows the IP address of the rendezvous point, in which case the publisher sends the PUBLISH message directly to it. If the publisher does not know the IP address, it uses Pastry to route a message to the rendezvous point, asking for its IP address. When a rendezvous point receives a PUBLISH message, it disseminates the notification using the multicast tree constructed for that topic.
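The tree construction and dissemination described above can be sketched as a small in-memory simulation in Python. Everything here is a simplification for illustration: the classes ScribeNode and Overlay are hypothetical, Pastry routing is replaced by a fixed next_hop map supplied by the caller, and message types are modelled as method calls rather than routed messages. The sketch records the previous hop as the child of each forwarder.

```python
from dataclasses import dataclass, field


@dataclass
class ScribeNode:
    node_id: str
    # topic -> id of the next node on the route to the rendezvous point.
    # Hypothetical stand-in for Pastry routing; the root has no entry.
    next_hop: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)     # topic -> set of child ids
    subscriptions: set = field(default_factory=set)  # topics this node listens to
    received: list = field(default_factory=list)     # notifications delivered locally


class Overlay:
    def __init__(self):
        self.nodes = {}

    def add(self, node_id, next_hop=None):
        self.nodes[node_id] = ScribeNode(node_id, dict(next_hop or {}))

    def subscribe(self, node_id, topic):
        """Join the topic's tree by walking the route towards the root."""
        node = self.nodes[node_id]
        node.subscriptions.add(topic)
        if topic in node.children:
            return  # already part of the multicast tree as a forwarder
        child, current = node_id, node.next_hop.get(topic)
        while current is not None:
            hop = self.nodes[current]
            in_tree = topic in hop.children
            hop.children.setdefault(topic, set()).add(child)
            if in_tree:
                return  # joined an existing branch, so routing stops here
            child, current = current, hop.next_hop.get(topic)

    def publish(self, root_id, topic, payload):
        """Disseminate from the rendezvous point down the multicast tree."""
        pending = [root_id]
        while pending:
            node = self.nodes[pending.pop()]
            if topic in node.subscriptions:
                node.received.append(payload)
            pending.extend(node.children.get(topic, ()))
```

With a root R, a forwarder F and two subscribers routed through F, the second subscription stops at F because F is already in the tree, so R keeps a single child for the topic regardless of how many subscribers sit behind F.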
Authentication
The multicast trees that are built when providing publish-subscribe functionality in an application-level overlay network such as Pastry could provide further security by using a Merkle tree[64]. A Merkle tree is a binary tree composed of cryptographic hash values, where the leaves contain cryptographic hash values of data blocks, the internal nodes contain the hash of the concatenation of their children’s values, and the root contains the content public key.
Publishers create a set of private keys and generate a hash tree of these keys, where the paths to the top hash, called the public key or root key, are used as authentication paths. Leaves contain the hash values of the private keys, and the nodes between the leaves and the root contain the hash of the concatenation of their two children. When a publisher wants to publish a notification, it first chooses one of the private keys and signs the notification with it. It then calculates the authentication path, which is the list of hash values needed to reach the top hash, that is, the public key. A subscriber can later use this path to verify the notification.
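The hash tree and the authentication path can be illustrated with a short Python sketch. It assumes SHA-256 as the hash function and a power-of-two number of keys, and it omits the signing of the notification itself; the function names build_tree, auth_path and verify are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function used throughout the tree (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def build_tree(private_keys):
    """Return the tree levels, leaves first; the last level holds the root hash."""
    n = len(private_keys)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two number of keys")
    levels = [[h(k) for k in private_keys]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each parent is the hash of the concatenation of its two children.
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def auth_path(levels, index):
    """Sibling hashes needed to recompute the root from the leaf at index."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling shares the same parent
        index //= 2
    return path


def verify(leaf_hash, index, path, root):
    """Recompute the root from a leaf hash and its authentication path."""
    node = leaf_hash
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

A subscriber that knows only the root hash can check a notification by recomputing the root from the leaf hash and the path, which for a tree with n leaves contains log2(n) hashes.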
This type of security model makes use of a separate authentication service to which publishers and subscribers of a topic must first authenticate themselves. This service stores the information about the topic that publishers and subscribers need in order to send and receive notifications securely. For each topic it stores the following:
1. Spread function. A mathematical function that lists the sequence of data block identifiers forming the content. It is used to avoid storing all data block IDs in the authentication service.
2. Root hash. The root hash of the Merkle tree used to authenticate the content.
3. Public Key. The public key of the content provider.
4. Signature. The signature of the content.
Results
The documentation of Pastry includes an evaluation of an implemented version of the solution. It was written in Java and uses a network emulation environment so that it can be tested with up to 100000 Pastry nodes. Each Pastry node was randomly assigned a location on a plane in the emulated environment before the Pastry system was tested for different performance aspects such as routing and locating a close node.
In this documentation, the overall routing performance was measured as the number of routing hops between two random Pastry nodes, using networks of 1000 to 100000 nodes with b = 4 and |L| = 16. The 200000 trials showed that the maximum number of hops required to route in a network of N nodes was, as expected, $\lceil \log_{2^b} N \rceil$. They further showed that the number of routing hops scales with the size of the network as predicted.
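To make the bound concrete, the following Python sketch evaluates $\lceil \log_{2^b} N \rceil$ for the network sizes and the value b = 4 mentioned above. It uses integer arithmetic instead of floating-point logarithms to avoid rounding issues; the function name max_hops is hypothetical.

```python
def max_hops(n_nodes: int, b: int) -> int:
    """Smallest h with (2**b)**h >= n_nodes, i.e. ceil(log_{2^b} N)."""
    base = 1 << b
    hops, reach = 0, 1
    while reach < n_nodes:
        reach *= base
        hops += 1
    return hops


for n in (1_000, 10_000, 100_000):
    print(n, max_hops(n, b=4))
# prints:
# 1000 3
# 10000 4
# 100000 5
```

A hundredfold increase in network size therefore adds only two hops to the bound when b = 4.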
Another performance aspect that was evaluated was Pastry’s ability to locate one of the 5 closest nodes near the client. It was tested in an environment of 10000 nodes with b = 3 and |L| = 8, where a randomly selected Pastry node sent a message to a randomly selected key. The test recorded the first 5 numerically closest nodes to the key that were reached along the route. The results showed that Pastry is able to locate the closest node 68% of the time and one of the top two nodes 87% of the time.
An experimental evaluation of Scribe[7] presented results and conclusions about the performance of an implementation of Scribe. Three metrics were used to measure the performance of Scribe, namely the delay to deliver notifications to group members, the stress on each node and the stress on each physical network link. These were tested using a simulation of a network with 5050 routers and 100000 end nodes that were randomly assigned to the routers. Multiple test runs were used with a varying number of groups and group sizes.
The delay when disseminating notifications to a group using Pastry was tested and compared against the delay of regular IP multicast. The relative delay penalty (RDP) for Scribe against IP multicast showed a mean value of 1.81, and more than 80% had an RDP less than 2.25. The stress on a node was measured by counting the number of groups that had non-empty children tables and the number of entries in the children tables in each node. Using 1500 groups and 100000 nodes showed a mean number of non-empty children tables per node of 2.4 and a mean number of entries in all the children tables of any node of 6.2.
Discussion
Pastry shows a very cheap and effective way of creating an application-level overlay network for large peer-to-peer solutions. It scales well with the size of the network without significantly increasing latency, as it does not require more than $\lceil \log_{2^b} N \rceil$ routing hops in a network with N nodes. This would be a very viable solution when an application cannot use a regular client-server model with an intermediate layer of managed components. Scribe shows that Pastry is a powerful tool that can be used to create highly scalable communication solutions.
Scribe uses Pastry to maintain groups and group membership and creates a very effective and scalable publish-subscribe solution that relies only on the peers of the system. Scribe can concurrently support many different types of applications, as it can efficiently handle large numbers of nodes, groups and group sizes. Scribe could be used to complement a client-server model where clients may want to set up a communication channel between themselves that does not need much supervision or participation by the back end.
A publish-subscribe system using Pastry could be successfully used with an authentication solution that does not require nodes to have different responsibilities and that keeps the nodes equal. Authenticating the communication channel of a topic ensures that the content delivered in the group is introduced according to the rules set by a creator or a manager of the group and not by just anyone.