4.4 Malicious Packet in Classification and Filtering
4.4.5 Switch Module Design
Having identified the intrinsic characteristics of an attack, we build a “profile” for each port on the switch which monitors and records the behavior of the port using the three (3) attributes described above. For each potential flow request, we extract the necessary attributes as described below.
Number of flow requests: From the analysis of both malicious and legitimate traffic, we found that legitimate sources generate a few tens of flows while malicious sources, in order to overwhelm the controller or Flow Table, must generate a large number of flows (at least several hundred flows in most cases). Having analysed the legitimate traffic on a 60 second scale, we keep track of the number of flows generated by a source per 60 seconds. We classify a “source” as a host connected to a switch port.
Flow Request Window: The best option for tracking the number of flow requests originating from a given port is a sliding window. This is challenging for active classification performed on live network traffic, as it would involve storing each flow request from each port with a timestamp in order to count the flow requests which occurred 60 seconds or less prior to the current request. In the event of an attack in which thousands of flow requests are generated, this can quickly exhaust both memory and processing capacity. By contrast, using a static window reduces the amount of information that needs to be stored but also reduces the accuracy of the recorded number of requests for each port. With a static window, the count restarts at each window boundary, so the number of requests appears small at the start of each window, as if the traffic were legitimate.
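To make the storage cost of the exact approach concrete, a sliding-window counter could be sketched in Python as follows. This is only an illustration: the class name SlidingWindowCounter and its methods are hypothetical and do not come from our switch module. It keeps one timestamp per flow request, so the state held for a port grows with that port's request rate.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60.0  # window length taken from the text


class SlidingWindowCounter:
    """Exact per-port count of flow requests in the last WINDOW_SECONDS.

    Hypothetical illustration: one timestamp is stored per request.
    """

    def __init__(self):
        # One queue of timestamps per switch port.
        self.timestamps = defaultdict(deque)

    def record(self, port, now):
        """Store a request seen at time `now` and return the current count."""
        queue = self.timestamps[port]
        queue.append(now)
        # Discard requests that occurred more than 60 seconds ago.
        while queue and now - queue[0] > WINDOW_SECONDS:
            queue.popleft()
        return len(queue)

    def stored_entries(self, port):
        """Number of timestamps currently held in memory for `port`."""
        return len(self.timestamps[port])
```

A port that sends 5,000 requests within one second forces this counter to hold 5,000 timestamps for that port alone, which is the cost the static window avoids.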
We instead merge the static and sliding windows into a hybrid. We use a static window of 60 seconds for the number of flow requests. To avoid the case where the first requests in a window are counted independently of the requests that preceded them, we carry the number of requests from the latter half of the previous window into the first half of the current window. With this idea, we only need to record the number of flow requests in the latter half of a window instead of each flow request and its associated port and timestamp. We then add this value to the number of flow requests in the first half of the next window when calculating the number of requests.
Algorithm 3 explains the procedure:
Algorithm 3: Calculation of the Number of Requests in 60 seconds
    time_window_start = 0
    time_window_end = 60
    time_window_half = ((time_window_end - time_window_start) / 2) + time_window_start
    prev_half_window_val = 0
    current_window_value = 0
    foreach FlowRequest(f) do
        if (f.time > time_window_start && f.time < time_window_end) then
            if (f.time < time_window_half) then
                num_of_requests = (current_window_value++) + prev_half_window_val
            else
                num_of_requests = current_window_value++
                prev_half_window_val++
        else
            while (time_window_end ≤ f.time) do
                time_window_start = time_window_end
                time_window_end = time_window_start + 60
                time_window_half = ((time_window_end - time_window_start) / 2) + time_window_start
                current_window_value = prev_half_window_val
                prev_half_window_val = 0
            num_of_requests = current_window_value++
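The hybrid window can also be sketched in runnable form. The Python below follows the prose description (a 60 second static window whose first half is topped up with the count from the latter half of the previous window) rather than reproducing Algorithm 3 line by line. The class HybridWindowCounter and its field names are hypothetical, and, unlike the post-increment in the pseudocode, the returned count is assumed to include the current request. One such counter would be kept per switch port.

```python
class HybridWindowCounter:
    """Static window that carries over the previous window's latter half.

    Hypothetical sketch: only three integers are stored per port,
    regardless of how many flow requests arrive.
    """

    def __init__(self, window=60.0):
        self.window = window
        self.start = 0.0      # start time of the current window
        self.first_half = 0   # requests in the first half of this window
        self.second_half = 0  # requests in the latter half of this window
        self.carry = 0        # latter-half count of the previous window

    def _roll(self, now):
        # Advance window by window until `now` falls inside the current one.
        # After skipping an entire idle window the carried value becomes 0.
        while now >= self.start + self.window:
            self.carry = self.second_half
            self.first_half = 0
            self.second_half = 0
            self.start += self.window

    def record(self, now):
        """Register a flow request at time `now` and return the count."""
        self._roll(now)
        half = self.start + self.window / 2
        if now < half:
            self.first_half += 1
            # First half: add the carried latter half of the previous window.
            return self.first_half + self.carry
        self.second_half += 1
        # Latter half: the current window already spans at least 30 seconds.
        return self.first_half + self.second_half
```

For example, with requests at 10 s, 40 s and 50 s, a request at 65 s is reported as the third recent request (itself plus the two carried over from 40 s and 50 s) rather than as the first request of a fresh window.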
Entropy of Flow Requests: The rules placed by our controller match on the source and destination IP addresses, so we consider these particular attributes of the packet headers. An attacker must therefore alter the source and destination IP addresses in their packets to create new flow requests. Both the training and classification data can, however, be configured to consider the entropy of other attributes such as protocol type and MAC address.
To monitor the entropy of all the flow requests within a given window (which we choose to be 1 second in our implementation), we would be forced to record all the requests within the window and calculate the entropy. Given the processing time and storage constraints within the switch, storing several thousand requests for a port (as are received under attack) is impractical. We therefore draw on the behavioral characteristics of the legitimate traffic. We noted that more than 98% of legitimate traffic generated fewer than 25 flows per minute. With this in mind, we record the last 25 flow requests of each port, storing for each request only the time it was generated and integer representations of the source and destination IP addresses.
These values are stored in a linked list, with each node representing a single flow request, and each monitored switch port is given its own linked list. For legitimate traffic, we can easily calculate the entropy for the last second using these 25 flow requests (since only one or two of them would fall within that window). If the host in question creates more than 25 flow requests within the time window (as malicious traffic does), we only consider the entropy of the last 25 flow requests (which will have a noticeably different entropy).
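The bounded history and the entropy calculation can be sketched as follows. Several points are assumptions made for illustration only: the text does not fix the exact entropy definition, so the sketch uses the Shannon entropy of the (source, destination) address pairs seen in the last second; a fixed-size deque stands in for the linked list; and the names PortHistory and shannon_entropy are hypothetical.

```python
import math
from collections import Counter, deque
from ipaddress import IPv4Address

HISTORY = 25          # flow requests kept per port (from the text)
ENTROPY_WINDOW = 1.0  # entropy window in seconds (from the text)


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


class PortHistory:
    """Last HISTORY flow requests of one port (deque in place of a linked list)."""

    def __init__(self):
        # Each entry is (time, source IP as int, destination IP as int);
        # the oldest entry is dropped automatically once HISTORY is exceeded.
        self.requests = deque(maxlen=HISTORY)

    def record(self, now, src_ip, dst_ip):
        self.requests.append(
            (now, int(IPv4Address(src_ip)), int(IPv4Address(dst_ip)))
        )

    def entropy(self, now):
        """Entropy of the (source, destination) pairs seen in the last second."""
        recent = [(s, d) for t, s, d in self.requests
                  if now - t <= ENTROPY_WINDOW]
        return shannon_entropy(recent)
```

Under this definition, a port that repeats the same address pair has an entropy of 0 bits, whereas a port sending 25 or more requests with distinct spoofed sources within one second reaches the maximum for 25 stored entries, log2(25), roughly 4.64 bits.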
Time taken to generate Flow Requests: As our final attribute, we consider the time taken to generate 25 flow requests. We use the same 25 flow requests stored for the previous attribute and simply calculate the time between the first and last flow request of the 25. Malicious flow requests will show a very small time gap between their first and last requests, whereas legitimate flow requests should show larger gaps. For sources which have not yet generated 25 flow requests, this value is extrapolated from the number of requests they have generated and the time taken for those requests.
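A minimal sketch of this attribute is given below. The text does not specify how the extrapolation is performed, so the sketch assumes a linear one: the observed span covers n - 1 gaps between n requests and is scaled up to the 24 gaps between 25 requests. The function name time_for_history and the choice to return None when fewer than two requests have been seen are likewise assumptions.

```python
HISTORY = 25  # number of flow requests considered (from the text)


def time_for_history(timestamps, history=HISTORY):
    """Estimate the seconds a source needs to generate `history` requests.

    `timestamps` holds request times in ascending order. Linear
    extrapolation is an assumption of this sketch, not taken from the text.
    """
    recent = timestamps[-history:]  # only the last `history` requests count
    n = len(recent)
    if n < 2:
        return None  # no gap observed yet, nothing to extrapolate from
    span = recent[-1] - recent[0]
    if n == history:
        return span  # measured directly from first to last request
    # Scale the observed (n - 1) gaps up to the (history - 1) gaps needed.
    return span * (history - 1) / (n - 1)
```

For instance, a source with three requests spaced 10 seconds apart is extrapolated to 240 seconds for 25 requests, while an attacker sending one request per millisecond yields about 0.024 seconds.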
All of the above attributes are calculated in the switch for each incoming flow request. We shy away from using other factors such as distribution of IP addresses (which is a common attribute selected for classification techniques) since many other potential attributes are easily spoofed by the attacker and can be used to subvert the classification process as previously discussed. We also keep the list of attributes small in an attempt to minimize the number of trees needed. By minimizing the number of trees used to classify a flow request, we reduce the processing time necessary for classification which is essential in the system.