5.2 MEDINA Design
5.2.2 Hash-based coordinated packet selection
During their forwarding operation, MEDINA nodes compute an N-bit hash H(k) on a subset k of each packet, which we call the hash key. A packet is captured only if the computed hash value falls within a specific range determined by the traffic assignment algorithm. The hash range depends on the route the packet follows within the MEDINA domain: generally speaking, the larger the number of MEDINA nodes a flow of packets travels through, the smaller the number of packets each node needs to capture. Moreover, the hash range in each node may be affected by policies and available resources (in the node itself and in the other nodes on the path of the packets). Specifically, a MEDINA node determines its hash range as follows:
1. All possible Ingress-Egress (IE) pairs³ in the MEDINA domain are determined starting from the available topological information. If B is the number of edge nodes in a directed graph, the number of IE-pairs is B(B − 1).
2. For each IE-pair, the node computes the forwarding path from the ingress node to the egress node. It is worth noting that, because of the operating principle of the shortest path algorithm deployed by routing protocols, a packet that is forwarded to its destination through an IE-pair follows the same route as packets originating at the ingress node and addressed to the egress node. Using Dijkstra's algorithm, the single-source shortest paths can be computed with worst-case performance O(E + V log V) [83], where E and V are the number of links and nodes, respectively. Since one single-source computation is needed per ingress node, the computation for all the IE-pairs can be done in O(B(E + V log V)).
3. If the node is associated with an IE-pair (i.e., it is included in the IE-pair's forwarding path), it computes the range for packets on that path by executing a traffic assignment function that, in addition to the number of nodes on the path, takes into account a set of constraints, such as policies and resource availability. Given that the number of IE-pairs containing the node is at most B(B − 1) and the number of nodes in a path is at most V, this operation has complexity O(B²V).

² Details of the operations of MEDINA nodes and an analysis of the trade-offs among different options are beyond the scope of this chapter and are left as future work.
³ While [81] and [76] denote a path as an Origin-Destination pair, we prefer the name Ingress-Egress pair to highlight the difference between the original source (destination) of the packet and the ingress (egress) node in the MEDINA overlay.
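The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MEDINA implementation: the names `compute_manifest`, `shortest_paths_from`, `path_to` and `equal_split` are hypothetical, the topology is a toy example, and `equal_split` is a stand-in that ignores policies and resources. Note also that the binary heap used here gives O((E + V) log V) per Dijkstra run; the O(E + V log V) bound quoted in step 2 assumes a Fibonacci heap.

```python
import heapq

def shortest_paths_from(graph, source):
    """Dijkstra from `source`; returns a predecessor map {node: previous hop}."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return prev

def path_to(prev, source, target):
    """Rebuild the path source -> target, or None if target is unreachable."""
    path = [target]
    while path[-1] != source:
        if path[-1] not in prev:
            return None
        path.append(prev[path[-1]])
    return path[::-1]

def equal_split(path, hash_bits):
    """Stand-in assignment function (assumption): equal half-open ranges."""
    space = 1 << hash_bits
    n = len(path)
    return {node: (i * space // n, (i + 1) * space // n)
            for i, node in enumerate(path)}

def compute_manifest(graph, edge_nodes, me, hash_bits=16, assign=equal_split):
    """Return {(ingress, egress): (lo, hi)} for every IE-pair whose path contains `me`."""
    manifest = {}
    for ingress in edge_nodes:                      # step 1: all IE-pairs
        prev = shortest_paths_from(graph, ingress)  # step 2: one run per ingress
        for egress in edge_nodes:
            if egress == ingress:
                continue
            path = path_to(prev, ingress, egress)
            if path is not None and me in path:     # step 3: node is on the path
                manifest[(ingress, egress)] = assign(path, hash_bits)[me]
    return manifest

# Toy topology (hypothetical): edge nodes A, B, C; internal nodes R1, R2.
graph = {
    "A": {"R1": 1},
    "R1": {"A": 1, "R2": 1, "C": 1},
    "R2": {"R1": 1, "B": 1},
    "B": {"R2": 1},
    "C": {"R1": 1},
}
edge_nodes = ["A", "B", "C"]
manifest = compute_manifest(graph, edge_nodes, me="R2")
```

In this toy topology node R2 lies on four of the six IE-pair paths, so its manifest has four entries; for the pair (A, B), whose path is A, R1, R2, B, it is assigned the third quarter of the 16-bit hash space.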
The algorithm for the computation of the hash ranges is summarized in Figure 5.2. The traffic assignment function divides the hash space among all the MEDINA nodes in a path. One possible solution is to split the hash space into ranges whose size is proportional to the hardware resources of each node. However, it is possible to devise more complex functions that consider both the hardware capabilities and the expected traffic forwarded by each node. In Section 5.3 we show that even a simple heuristic that considers these two parameters results in a fair load distribution. The outcome of the assignment is a manifest: a table that assigns to each IE-pair associated with the node a hash range, which identifies the packets forwarded on that path that the node is responsible for capturing. Coherence among the manifests in all MEDINA nodes associated with an IE-pair (namely, assignments that neither capture a packet redundantly nor miss any packet) is ensured by the fact that all nodes on an IE-pair run the assignment function with identical input parameters (other than the specific position of the node in the path). The relatively high complexity of the manifest computation, O(BE + BV log V + B²V) (which is O(V³) in the worst case, since B ≤ V and E ≤ V(V − 1)), is acceptable given that this operation is performed in full only once; subsequent topology updates require only a partial re-computation that involves only the affected paths. If a certain degree of redundancy n must be supported in the capture of packets transiting through an IE-pair, the assignment function ensures that any hash H(k)
[Fig. 5.2 Offline manifest computation. Flowchart: Start; compute all IE-pairs; while there is a new IE-pair, get the next IE-pair and compute the shortest path from ingress to egress; if this node is in the shortest path, run the traffic assignment function and add the IE-pair and its hash range to the manifest; End.]
[Fig. 5.3 Inline packet processing. Flowchart elements: new packet; is the packet from outside the MEDINA domain?; mark the packet with this node ID; is the packet marked?; extract the ingress ID (IID); compute the reverse path; is the current node the egress node of the reverse path?; select as IID the egress node of the reverse path; compute the shortest path from the current node to the destination; select the egress node (EID); extract the hash key from the packet; apply the hash function; extract the hash range from the manifest; is the hash value within the range?; capture the packet; End.]
offers a valuable contribution when n is much smaller than the number of nodes associated to an IE-pair, which is the case in common deployment scenarios.
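As an illustration of the simplest assignment function mentioned above, the following sketch splits the hash space into contiguous ranges proportional to a per-node resource weight. The function name `proportional_ranges` and the `capacity` weights are hypothetical, ranges are taken to be half-open intervals [lo, hi), and neither policies, expected traffic, nor the redundancy degree n are modeled: each hash value is owned by exactly one node.

```python
def proportional_ranges(path, capacity, hash_bits=16):
    """Split [0, 2**hash_bits) among the nodes of `path`, proportionally to capacity.

    Every node on the path can run this with the same inputs and then look up
    its own entry, so the resulting ranges are coherent by construction.
    """
    space = 1 << hash_bits
    total = sum(capacity[node] for node in path)
    if total <= 0:
        raise ValueError("total capacity must be positive")
    ranges = {}
    cumulative = 0
    lo = 0
    for node in path:
        cumulative += capacity[node]
        # Integer cumulative boundary: the last boundary is exactly `space`,
        # so the ranges tile the hash space with no gap and no overlap.
        hi = space * cumulative // total
        ranges[node] = (lo, hi)
        lo = hi
    return ranges

# Hypothetical path and resource weights.
path = ["A", "R1", "R2", "B"]
capacity = {"A": 1, "R1": 2, "R2": 4, "B": 1}
ranges = proportional_ranges(path, capacity)
```

With these weights, R2 holds half of the total capacity and is therefore assigned half of the 16-bit hash space, the range [24576, 57344).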
Because the computed hash must be the same across all the nodes associated with an IE-pair, the portion k of the packet used as hash key must not change along the path. Moreover, in order to achieve a balanced distribution of packets across the hash ranges (i.e., across network nodes), the key must have high entropy. Thus, the hash key shall include the high-entropy, path-invariant fields of the L3/L4 headers and all the bytes of the packet payload, while fields that might be modified on a hop-by-hop basis (e.g., IP TTL, MPLS label) and fields that frequently have the same value in different packets (e.g., IP version) shall be excluded.
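The selection rule can be made concrete with a short sketch. Everything here is an assumption made for illustration: the `Packet` type, the exact set of fields placed in the key (addresses, protocol, ports, payload), and the use of truncated SHA-256 as the N-bit hash; the text does not prescribe a specific hash function, and a faster non-cryptographic hash may well be preferable on a forwarding path.

```python
import hashlib
import ipaddress
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    """Hypothetical parsed packet; only the fields needed for the sketch."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int
    ttl: int          # changes hop by hop: excluded from the key
    payload: bytes

def hash_key(pkt):
    """Concatenate path-invariant fields; the TTL is deliberately left out."""
    return b"".join([
        ipaddress.ip_address(pkt.src_ip).packed,
        ipaddress.ip_address(pkt.dst_ip).packed,
        bytes([pkt.protocol]),
        pkt.src_port.to_bytes(2, "big"),
        pkt.dst_port.to_bytes(2, "big"),
        pkt.payload,
    ])

def packet_hash(pkt, hash_bits=16):
    """N-bit hash H(k): top `hash_bits` bits of SHA-256 (requires hash_bits <= 64)."""
    digest = hashlib.sha256(hash_key(pkt)).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - hash_bits)

def should_capture(pkt, hash_range, hash_bits=16):
    """Capture iff H(k) falls in this node's half-open range [lo, hi)."""
    lo, hi = hash_range
    return lo <= packet_hash(pkt, hash_bits) < hi

# The same packet seen at two consecutive hops differs only in its TTL.
pkt = Packet("10.0.0.1", "10.0.1.9", 6, 40000, 443, ttl=64, payload=b"hello")
next_hop_pkt = replace(pkt, ttl=63)
same_hash = packet_hash(pkt) == packet_hash(next_hop_pkt)

# Four nodes on a path, each owning one quarter of the 16-bit hash space.
node_ranges = [(0, 16384), (16384, 32768), (32768, 49152), (49152, 65536)]
captures = sum(should_capture(pkt, r) for r in node_ranges)
```

Because the TTL is not part of the key, both hops compute the same hash value, and because the ranges tile the hash space, exactly one of the four nodes captures the packet.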
Fragmentation by routers internal to the MEDINA domain is not supported, since the hash function applied to the whole packet would return a different value from those computed on its fragments, which would result in packets being captured by multiple nodes or not captured at all. This limitation is compatible with IPv6 networks (where fragmentation is performed only by the source) and is practically irrelevant in IPv4 networks (where fragmentation most commonly happens at the network edge [84]). Moreover, fragmentation can be avoided through a careful configuration of the MTU in the nodes of the MEDINA network.