CORRELATION 4.1 Introduction
4.8 Graph reduction
To reduce the complexity of the resulting graph, redundant data should be eliminated. The graph consists of nodes representing aggregated alerts and edges representing the causal relationships between them. Reduction leaves the number of nodes unchanged and minimises the number of edges without affecting reachability. Hence, the target is to find a minimal DAG that has the fewest arcs and is equivalent to the original DAG. Consider the case shown in Figure 4.18, with four alerts: a, b, c, and d. If Alert a causes Alert b and b causes c, there is no need for the transitive edge between a and c; the same applies to the edges a-d and b-d. Removing these optional transitive edges has no effect on the connectivity between the original nodes.
[Figure 4.18: graph of the four alerts a, b, c, and d.]
This is based on the assumption that the relationships between nodes can propagate and the removed edges are considered optional.
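The four-alert case can be illustrated with a minimal Python sketch. It assumes that the figure also contains the edge c→d, which is what makes a-d and b-d transitive; the function names `reachable` and `transitive_reduction` are hypothetical and are not part of the proposed system. An edge is dropped whenever its end node can still be reached from its start node through the remaining edges.

```python
def reachable(edges, src, dst):
    """Return True if dst can be reached from src along directed edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and v not in seen:
                if v == dst:
                    return True
                seen.add(v)
                stack.append(v)
    return False


def transitive_reduction(edges):
    """Drop every edge (u, v) for which another path from u to v exists."""
    kept = set(edges)
    for edge in sorted(edges):
        remaining = kept - {edge}
        if reachable(remaining, *edge):
            kept = remaining
    return kept


# Assumed edge set for the four alerts: the chain a->b->c->d plus
# the three transitive edges a->c, a->d and b->d.
edges = {("a", "b"), ("b", "c"), ("c", "d"),
         ("a", "c"), ("a", "d"), ("b", "d")}
reduced = transitive_reduction(edges)
print(sorted(reduced))  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

Only the chain a→b→c→d remains, and every alert that was reachable before the reduction is still reachable afterwards.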
Definition 4.6: Given a DAG G=(V,E), where V=X is the vertex set and E=R is the set of arcs of the graph, let n=#V and V={1,…,n}. The reduced graph G′(V,E′) is a DAG with the following properties:
(1) The vertex set V of G′(V,E′) is equal to the vertex set V of G(V,E).
(2) The directed paths between vertices are preserved: a directed path from u to v exists in G′(V,E′) if and only if one exists in G(V,E).
(3) G′(V,E′) has the smallest number of edges E′=R′ that preserves this connectivity, so #R′ ≤ #R.
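The three properties of Definition 4.6 can be checked mechanically on small graphs. The following Python sketch is an illustration only; the names `closure` and `is_reduction` are hypothetical. It relies on the fact that, for a DAG, an equivalent edge set from which no single edge can be removed is also the smallest one.

```python
def closure(vertices, edges):
    """Return all ordered pairs (u, v) such that a directed path leads from u to v."""
    successors = {v: set() for v in vertices}
    for u, v in edges:
        successors[u].add(v)
    pairs = set()
    for start in vertices:
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if (start, node) not in pairs:
                pairs.add((start, node))
                stack.extend(successors[node])
    return pairs


def is_reduction(vertices, edges, reduced_vertices, reduced_edges):
    """Check properties (1) to (3) of the definition for a candidate reduced graph."""
    reduced_edges = set(reduced_edges)
    # (1) the vertex sets are equal
    if set(vertices) != set(reduced_vertices):
        return False
    # (2) the same pairs of vertices are connected by directed paths
    target = closure(vertices, edges)
    if closure(reduced_vertices, reduced_edges) != target:
        return False
    # (3) no remaining edge can be removed without losing a path
    return all(
        closure(reduced_vertices, reduced_edges - {edge}) != target
        for edge in reduced_edges
    )


V = {1, 2, 3}
E = {(1, 2), (2, 3), (1, 3)}
print(is_reduction(V, E, V, {(1, 2), (2, 3)}))  # True
print(is_reduction(V, E, V, E))                 # False: 1->3 is still redundant
```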
Two algorithms have been developed: online graph reduction, which deletes edges on the left side of the graph, and offline graph reduction, which deletes edges on the right side of the graph. The online algorithm removes transitive edges in real time, as each node joins the graph. This procedure is performed at the first stage of correlation, before alert aggregation, in order to minimise the system's processing time. The offline algorithm runs after the graph has been built and removes any redundant connections that remain, working from the leaf nodes to the root nodes. To clarify the idea, consider the alerts correlated by the system in the initial stage, shown in Figure 4.19 below:
Figure 4.19 Example of graph reduction.
[Figure 4.19 panels: (a) and (b), each showing nodes 1 to 5.]
There are five nodes and eight edges connecting these nodes to represent the causal relationships. In Figure 4.19 (a), the number of nodes, n=#V=5, is smaller than the number of connecting arcs, #E=8. The edges 1→5 and 2→5 can be deleted because they are redundant; the description of the intrusion sequence is not affected. In the proposed reduction algorithm, each node has two lists, one of children and one of parents, and the aim is to remove the duplicates in these lists, as shown in the two algorithms displayed in Figures 4.20-4.23.
Algorithm: OnlineGraphReducer
Input: Correlated Graph
Output: Reduced Correlated Graph
Declaration:
GraphNode: <id, value>
Parents, Childs, Roots: List of GraphNodes
Ancestors: Dictionary of GraphNodes <GraphNode, List of GraphNode>
NodeSet: Dictionary of GraphNodes <int, GraphNode>
node1, node2: GraphNode
Methods:
// perform for each edge; nodes holds two entries per edge, so n entries describe n/2 edges
for i←0 to length[nodes]/2
do
    // the node on the left side, causing alert
    node1id←nodes[2i]
    // the node on the right, caused alert
    node2id←nodes[2i+1]
    node1←null
    // nodeSet is the resulting set after reduction
    if nodeSet contains node1id
    then
        // if it is already added to the nodeSet
        node1←nodeSet[node1id]
    else
        // otherwise create a new GraphNode and ancestors list for node1
        node1←new GraphNode(node1id)
        nodeSet[node1id]←node1
        ancestors[node1]←new List of GraphNode
    // node2 is processed similarly
    node2←null
    if nodeSet contains node2id
    then
        // if it is already added to the nodeSet
        node2←nodeSet[node2id]
    else
        // otherwise create a new GraphNode and ancestors list for node2
        node2←new GraphNode(node2id)
        nodeSet[node2id]←node2
        ancestors[node2]←new List of GraphNode
    // check all parents of node2; if one is an ancestor of node1, remove it
    // from node2's parents to avoid duplicates (transitive relationships)
    k←0
    while k<length[node2.parents]
    do
        if ancestors[node1] contains node2.parents[k]
        then DELETE (node2, node2.parents[k])
        else k←k+1
    // add node2 as a child of node1 and node1 as a parent of node2
    INSERT (node1.child, node2)
    INSERT (node2.parents, node1)
    // add node1 to node2's ancestors if it is not already present
    if NOT (ancestors[node2] contains node1)
    then
        INSERT (ancestors[node2], node1)
    // add all ancestors of node1 to the ancestors of node2
    for j←0 to length[ancestors[node1]]
    do
        n: GraphNode
        n←ancestors[node1][j]
        if NOT (ancestors[node2] contains n)
        then INSERT (ancestors[node2], n)
    // if node2 is a root node, remove it from roots, because it is no longer
    // a root after becoming a child of another node
    if node2 ∈ roots
    then DELETE (roots, node2)
    // if node1 has no parents and is not already in roots, add it to roots
    if length[node1.parents]=0 AND NOT (node1 ∈ roots)
    then
        INSERT (roots, node1)
Figure 4.21 Online reduction algorithm (continued).
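The online procedure can also be sketched in Python. This is an illustrative reading of the pseudocode, not the system's implementation: the function name `online_reduce` is hypothetical, the input is assumed to be a flat list holding two node ids per edge, and ancestors are kept as sets of ids rather than lists of nodes.

```python
class GraphNode:
    """A node with explicit parent and child lists, as in the pseudocode."""

    def __init__(self, node_id):
        self.id = node_id
        self.parents = []
        self.children = []


def online_reduce(nodes):
    """Build the graph edge by edge from [src0, dst0, src1, dst1, ...]."""
    node_set, ancestors, roots = {}, {}, []
    for i in range(0, len(nodes) - 1, 2):
        pair = []
        for node_id in (nodes[i], nodes[i + 1]):
            if node_id not in node_set:
                node_set[node_id] = GraphNode(node_id)
                ancestors[node_id] = set()
            pair.append(node_set[node_id])
        node1, node2 = pair
        # A parent of node2 that is already an ancestor of node1 becomes
        # transitive once node1 -> node2 is added, so that edge is removed.
        for parent in list(node2.parents):
            if parent.id in ancestors[node1.id]:
                node2.parents.remove(parent)
                parent.children.remove(node2)
        if node1 not in node2.parents:
            node1.children.append(node2)
            node2.parents.append(node1)
        # node2 inherits node1 and all ancestors of node1.
        ancestors[node2.id] |= {node1.id} | ancestors[node1.id]
        if node2 in roots:
            roots.remove(node2)
        if not node1.parents and node1 not in roots:
            roots.append(node1)
    return node_set, roots


# The transitive edge 1->3 arrives first and is removed when 2->3 joins.
node_set, roots = online_reduce([1, 3, 1, 2, 2, 3])
edges = sorted((n.id, c.id) for n in node_set.values() for c in n.children)
print(edges)                  # [(1, 2), (2, 3)]
print([r.id for r in roots])  # [1]
```

In this sketch a transitive edge that arrives after the path it duplicates (for example 1→2, 2→3, then 1→3) is not detected, which is the kind of leftover connection the offline pass is meant to remove.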
Algorithm: OfflineGraphReducer
Input: Correlated Graph
Output: Reduced Correlated Graph
Declaration:
GraphNode: <id, value>
Parents, Childs, Roots, n, grandson: List of GraphNodes
indirectedOffSprings: Dictionary of GraphNodes <GraphNode, List of GraphNode>
Methods:
for i←0 to length[roots]
do
    n←roots[i]
    // if the root node is already existent in the sons group, return the group
    if n ∈ indirectedOffSprings
    then return indirectedOffSprings.n
    // if the root node does not have any child, create a new list
    if length[n.child]=0
    then
        return new list of GraphNode
    for m←0 to length[n.child]
    do
        // check the sons of the sons of each node in roots
        for j←0 to length[n.child[m].child]
        do
            if grandson ∈ n.child[m].child
            then
                // if this son is not a member of the sons group, add it
                if NOT grandson ∈ indirectedOffSprings
                    INSERT (currentIndirectedOffSprings, grandson)
    indirectedOffSprings.n=currentIndirectedOffSprings
    for k←0 to length[indirectedOffSprings]
    do
        if n ∈ indirectedOffSprings
        then
            // remove duplicates in sons of the nodes on the same sequence
Figure 4.22 Offline reduction algorithm.
            l←0
            while l<length[n.child]
            do
                if n.child[l] ∈ indirectedOffSprings.n
                then DELETE (n.child, n.child[l])
                else l←l+1
Figure 4.23 Offline reduction algorithm (continued).
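A compact Python sketch of the offline pass is given below. It is an approximation under stated assumptions rather than the exact procedure of Figures 4.22-4.23: the name `offline_reduce` is hypothetical, the graph is assumed to be a dictionary from node id to a list of child ids, and full descendant sets are memoised instead of grandchildren only, so that transitive edges spanning more than two hops are also caught. The example graph is hypothetical as well; it is not the exact edge set of Figure 4.19, but it contains the redundant edges 1→5 and 2→5.

```python
def offline_reduce(children, roots):
    """Remove every child that is also reachable through another child.

    children maps a node id to its list of child ids and is updated in place.
    """
    descendants = {}  # memo: node id -> set of all descendant ids

    def all_descendants(n):
        if n not in descendants:
            found = set()
            for c in children.get(n, []):
                found.add(c)
                found |= all_descendants(c)
            descendants[n] = found
        return descendants[n]

    for root in roots:
        all_descendants(root)

    for n in descendants:
        # Indirect offspring: nodes reachable from n through one of its children.
        indirect = set()
        for c in children.get(n, []):
            indirect |= descendants[c]
        children[n] = [c for c in children.get(n, []) if c not in indirect]
    return children


# Hypothetical five-node graph with the transitive edges 1->5 and 2->5.
graph = {1: [2, 5], 2: [3, 4, 5], 3: [5], 4: [5]}
reduced_graph = offline_reduce(graph, roots=[1])
print(reduced_graph[1], reduced_graph[2])  # [2] [3, 4]
```

After the pass, node 5 is still reachable from nodes 1 and 2 through nodes 3 and 4, so the intrusion sequence described by the graph is unchanged.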