The goal of the PPM algorithm is to obtain a constructed graph that contains the attack graph. Here, the attack graph is the set of paths traversed by the attack packets, and a constructed graph is the graph returned by the PPM algorithm.
3.2.1 Global network and attack graph
We illustrate the meaning of an attack graph with the example in Figure 3.1. The figure shows a simple network to which both legitimate users and attackers are attached. We call this network the global network. Because the attack graph consists only of the paths traversed by the attack packets, it does not cover the whole global network; it contains only the affected routers and edges, as depicted in Figure 3.2.
[Figure 3.1 diagram: routers R1 to R8, the victim ν, legitimate users, and attackers. Legend: Ri = router; ν = victim; group of clients; traffic flow.]
Figure 3.1: A typical case of a DDoS attack toward the victim V.
Nevertheless, it is always hard to decide whether a packet is legitimate. Consequently, the attack graph that is obtained may contain more nodes and more edges than the actual attack graph. As depicted in Figure 3.1, the legitimate traffic is mixed with the attack traffic (at router R1). Because it is not easy to decide quickly and accurately whether a packet is legitimate (its source address may be spoofed), an attack graph that includes routers and edges not traversed by the attack packets is also accepted, and we call such a graph the relaxed attack graph.
3.2.2 Constructed graph
To obtain the attack graph, [30] proposed encoding the edges of the attack graph into the attack packets through cooperation between the routers and the victim site. After collecting enough encoded packets, the victim builds a constructed graph from the encoded information. Thus, a constructed graph is the result returned by the PPM algorithm.
[Figure 3.2 diagram: (a) nodes R1, R2, R4, R5, R7, R8 and ν; (b) the same nodes plus R3.]
Figure 3.2: The illustration of an attack graph: (a) an attack graph is not the entire network; it consists of the paths traversed by the attack packets; (b) the attack graph may become larger than the actual one because the legitimacy of the packets cannot be determined reliably.
The constructed graph must contain the attack graph as its sub-graph. When the PPM algorithm stops and returns such a graph, it returns a correct result. Otherwise, the constructed graph is incorrect. We formally define the correctness of the constructed graph in Definition 3.1.
Definition 3.1 A constructed graph returned by the PPM algorithm is correct if and only if the constructed graph contains the attack graph as a sub-graph.
Note that Definition 3.1 covers both the case in which the constructed graph is the same as the attack graph and the case in which the constructed graph is a relaxed attack graph.
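The containment test in Definition 3.1 can be illustrated with a short Python sketch. It assumes that each graph is represented by its set of directed edges (upstream router, downstream node), so that sub-graph containment reduces to a subset test on edges. The function name `is_correct` and the example edges are hypothetical; the edges are not read from Figure 3.2.

```python
def is_correct(constructed_edges, attack_edges):
    """Definition 3.1 on edge sets: the constructed graph is correct
    if and only if every attack-graph edge also appears in it."""
    return set(attack_edges) <= set(constructed_edges)


# Hypothetical edges (upstream router, downstream node); illustration only.
attack = {("R2", "R1"), ("R1", "v")}
relaxed = attack | {("R3", "R1")}   # one extra edge that carried legitimate traffic
partial = {("R1", "v")}             # one attack edge is missing

print(is_correct(attack, attack))   # constructed graph equals the attack graph
print(is_correct(relaxed, attack))  # constructed graph is a relaxed attack graph
print(is_correct(partial, attack))  # constructed graph misses an attack edge
```

Under this representation the first two calls return True and the third returns False, matching the two accepted cases and the incorrect case described above.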
3.2.3 Structure of the PPM algorithm
The PPM algorithm is made up of two separate procedures: the packet marking procedure, which is executed on the router side, and the path reconstruction procedure, which is executed on the victim side.
The packet marking procedure is designed to randomly encode edge information on the packets arriving at the routers. Using this information, the victim then executes the path reconstruction procedure to construct the attack graph. We first briefly review the packet marking procedure so that readers become familiar with how a router marks information on the packets.
A brief review of the packet marking procedure
The packet marking procedure aims to encode every edge of the attack graph. The routers encode this information in three marking fields of an attack packet: the start, the end, and the distance fields (the design of the encoding of these fields is discussed in [30]). In the following, we describe how a packet stores the information about an edge of the attack graph; the pseudocode of the procedure from [30] is given in Figure 3.3 for reference.
When a packet arrives at a router, the router determines how to process the packet based on a random number x (line #1 in the pseudocode). If x is smaller than the pre-defined marking probability pm, the router chooses to start encoding an edge. In other words, the probability that the router starts encoding an edge is pm. The router sets the start field of the incoming packet to the router’s address, and resets the distance field of that packet to zero. Then, the router forwards the packet to the next router.
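The marking rule of Figure 3.3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation from [30]: the names `Packet`, `mark`, and `send`, the default field values of an unmarked packet, and the use of router names in place of addresses are all hypothetical. It covers both the marking branch described above and the branch a router takes when it decides not to start a new edge.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Packet:
    # The three marking fields named in the text. The defaults are an
    # assumption of this sketch: an unmarked packet carries no addresses
    # and a non-zero distance, so no router mistakes it for a fresh mark.
    start: Optional[str] = None
    end: Optional[str] = None
    distance: int = 1


def mark(packet: Packet, router: str, pm: float, rng) -> None:
    """Apply the marking rule of Figure 3.3 at one router, in place."""
    x = rng.random()                  # line 1: random number in [0, 1)
    if x < pm:
        # Lines 2-3: start encoding a new edge; an earlier mark is overwritten.
        packet.start = router
        packet.distance = 0
    else:
        if packet.distance == 0:
            # Lines 5-6: the previous router started an edge; this one ends it.
            packet.end = router
        packet.distance += 1          # line 8


def send(path: Sequence[str], pm: float, rng) -> Packet:
    """Forward one packet through the routers in `path`, attacker side first."""
    packet = Packet()
    for router in path:
        mark(packet, router, pm, rng)
    return packet


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical path of three routers toward the victim.
    print(send(["R3", "R2", "R1"], pm=0.5, rng=rng))
```

With pm = 1.0 every router overwrites the previous mark, so the packet always arrives carrying only the last router on the path; with pm = 0.0 no router ever marks and the start field stays empty.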
When the packet arrives at the next router, the router again decides whether to start encoding another edge. Suppose that, this time, the router chooses not to start encoding a new edge. The router then finds that the previous router has started marking an edge, because the distance field of the packet is zero. Consequently, the router sets the end field of the packet to its own address. The router then increments the distance field of the packet by one to indicate the end of the encoding. Now, the start and the end fields together encode an edge of the attack graph. For this encoded edge to be received by the victim, successive routers should choose not to start encoding a new edge, since a packet can hold only one edge. Further, every successive router increments the distance field by one so that the victim knows the distance of the encoded edge.
Packet Marking Procedure(Packet w)
1. Let x be a random number in [0, 1)
2. If x < pm, then
3.     write router’s address into w.start and 0 into w.distance
4. else
5.     If w.distance = 0 then
6.         write router’s address into w.end
7.     end If
8.     increment w.distance by one
9. end If
Figure 3.3: The pseudocode of the packet marking procedure of the PPM algorithm.
Path reconstruction procedure
The path reconstruction procedure is the final step in building the constructed graph. The procedure works with the encoded packets and extracts the edge information from every packet. Note that, to guard against attackers who spoof the marking fields, the victim has to know the global network (not the attack graph), and the procedure eliminates the abnormal edge information (line #8 in Figure 3.4). A subtle point is that, as its name suggests, this procedure works only with paths. However, this does not stop the procedure from handling multiple paths.
Path reconstruction procedure
1. Let G be a tree with root v, where v is the victim.
2. Let every edge in G be (start, end, distance).
3. For each packet w from attacker
4.     if w.distance == 0 then
5.         insert edge (w.start, v, 0) into G;
6.     else
7.         insert edge (w.start, w.end, w.distance) into G;
8. remove any edge (x, y, d) with d ≠ distance from x to v in G;
9. extract path (Ri ... Rj) by enumerating acyclic paths in G;
Figure 3.4: The pseudocode of the path reconstruction procedure of the PPM algorithm.
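The steps of Figure 3.4 can be sketched in Python as shown below. This is one possible reading, given under stated assumptions rather than as the procedure of [30]: packets are represented as (start, end, distance) tuples, line 8 is interpreted as accepting an edge with distance d only if its end node was accepted at distance d - 1 (with the victim as the end of every distance-0 edge), and the optional `known_links` argument is a hypothetical stand-in for the victim's knowledge of the global network. The function name `reconstruct` is also hypothetical.

```python
from collections import defaultdict


def reconstruct(packets, victim, known_links=None):
    """Return the paths (attacker side first, victim last) built from marks."""
    # Lines 3-7: collect one edge per marked packet.
    edges = set()
    for start, end, distance in packets:
        if start is None:
            continue                              # packet was never marked
        edges.add((start, victim if distance == 0 else end, distance))

    # Assumption: drop edges that are not links of the known global network.
    if known_links is not None:
        edges = {e for e in edges if (e[0], e[1]) in known_links}

    # Line 8 (as interpreted here): grow the tree level by level and keep
    # (x, y, d) only if y was accepted at the previous level.
    by_distance = defaultdict(set)
    for x, y, d in edges:
        by_distance[d].add((x, y))
    children = defaultdict(set)                   # (node, level) -> children
    frontier, d = {victim}, 0
    while frontier:
        accepted = set()
        for x, y in by_distance.get(d, ()):
            if y in frontier:
                children[(y, d - 1)].add((x, d))
                accepted.add(x)
        frontier, d = accepted, d + 1

    # Line 9: enumerate the paths from each leaf down to the victim.
    def walk(node, trail):
        kids = sorted(children.get(node, ()))
        if not kids:
            return [trail[::-1]]
        found = []
        for kid in kids:
            found.extend(walk(kid, trail + [kid[0]]))
        return found

    root = (victim, -1)
    return walk(root, [victim]) if root in children else []


if __name__ == "__main__":
    marks = [("R1", None, 0), ("R2", "R1", 1), ("R3", "R2", 2),
             ("R5", "R1", 1), ("R9", "R7", 3)]    # last mark is inconsistent
    for path in reconstruct(marks, "v"):
        print(" -> ".join(path))
```

Because the level of every accepted edge is strictly larger than that of its parent, the enumeration never revisits a node on the same path, which is how this sketch keeps the extracted paths acyclic. Several attack paths simply appear as several branches of the same tree.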