Application of Machine Learning Techniques to Distributed Denial of
Service (DDoS) Attack Detection: A Systematic Literature Review.
Sergio Armando Guti´errez, John Willian Branch
Grupo GIDIA, Departamento de Ciencias de la Computaci´on y la Decisi´on
Universidad Nacional de Colombia
Medell´ın
Email:
{
saguti—jwbranch
}
@unal.edu.co
Keywords: Network Security, DDoS attack detection, SLR, Systematic Literature Review.
Abstract
This paper presents an approximation to state of art in applica-tion of some machine learning techniques to the field of Dis-tributed Denial of Service (DDoS) attack detection. From a structured mapping study, relevant papers were selected, and later they were reviewed to detect the type of attacks they ad-dress, the techniques providing best performances and the tech-niques which present evidence to be conceived and used in dis-tributed environments.
1
Introduction
Denial of Service (DoS) Attacks are not a new concern on the field of Computer Security. In mid nineties decade, they be-came a very effective attack method. Later, with the wide adop-tion of Internet, threats such as viruses, trojans and worms ap-peared, taking DoS to a higher scale, producing the so-called Distributed Denial of Service (DDoS) Attacks.
A DDoS attack is a type of Denial of Service where multi-ple coordinated devices are used to lauch an attack towards one or more targets. The ultimate goal of DDoS is to exhaust pro-cessing and connectivity resources of the targets to avoid that legitimate users access the services provided by computer plat-forms, thus causing a partial or total unavailability. Because of the many-to-one dimension that poses this type of attacks, they are generally very powerful and devastating. Scientific com-munity forecasts that the disruptive power of DDoS attacks, their sofistication and damage capacity tends to increase in a very high rate, becoming a very critical threat for modern and emergent internet services [1, 2, 3, 4, 5]. Development of ad-equate solutions to detect and prevent this devastating attacks, and to minimize the damage they can provocate becomes an important task for scientific community.
In this paper, an approximation to state-of art from a Sys-tematic Literature Review in application of Machine Learning to DDoS Attack detection is presented. Machine Learning can be defined as the ability which can be exhibited by a computer program to learn and improve its future execution from the
re-sults of previous execution. In other words, the ability to im-prove its performance over the time.
Machine learning aims to answer many of the same ques-tions as statistics or data mining. However, unlike statistical approaches which tend to focus on understanding the process that generated the data, machine learning techniques focus on building a system that improves its performance based on pre-vious results. In other words systems that are based on the machine learning paradigm have the ability to change their ex-ecution strategy on the basis of newly acquired information [6] This paper is organized as follows. Section 2 describes the methodology used to perform the Literature Review, the search parameters used, the research questions and the inclusion and exclusion criteria used to produce the literature mapping and identify the papers to be reviewed. Section 3 presents the de-tailed Literature Review on the selected paper, and the answers to proposed research questions. Finally, Section 4 presents the conclusion of this Literature Review and future work that can be derived from it.
2
Methodology
For this paper, the methodology of Systematic Literature Re-view introduced by Kitchenham et al [7] was used. A previous mapping was performed to identify the most relevant Machine Learning techniques used for DDoS attack detection to focus the review.
For paper retrieval, SciVerse SCOPUS (http://www.scopus.com) was used. This is because SCOPUS has a wide coverage, with access to several of the most important scientific databases as ACM, IEEE, Elsevier, among other, simplifying in that way the search process. The main interest in the searching was DDoS attack detection, so the search string used was (“DDoS Attack Detection”) OR (“Distributed Denial of Service Attack Detection”). The search was configured to retrieve works between 2008 and 2012, on the field of Computer Science, and the types of publications retrieved were Journal Articles, Review Articles, and Conference articles. No other filters were applied.
2.1 Research questions
For the guidance of literature review, three research questions were defined:
• RQ1: What types of DDoS Attacks have been addressed with detection based on Machine Learning techniques?
• RQ2: What of the Machine Learning techniques in-cluded in the review present the best performance, re-garding the reduction of false detections (False possitives and false negatives)?
• RQ3: What of the techniques included in the review are reported to be used in distributed environments, or in-cluded in collaborative detection implementations? There are two main interests in chosing these research ques-tions. One is the interest in detecting what types of attacks are being studied with more detail. Literature reports that detec-tion of attacks related to transport and network protocols has had more interest, but attacks targetted to application protocols are not still a widely explored research topic [8, 9, 10, 11].
The other interest is identifying how much attention is put in attack detection for distributed computational environments, and in development of collaborative approaches for attack de-tection. Because of the nature of DDoS attacks, distributed and large scale, and because of the emergence of computing paradigms such as Cloud Computing, distributed solutions and collaborative approaches appear to be promising solutions to face this kind of attacks [12, 13, 14, 15].
2.2 Inclusion and exclusion criteria
Some rules were defined to select the relevant papers from the whole universe retrieved at search time. These criteria were ap-plied to select among the downloaded works, to decide which to review or which to discard.
For the mapping previously performed to identify the Ma-chine Learning techniques to be reviewed in detail, papers whose main topic would be Distributed Denial of Service At-tack Detection were included, and papers written in english. On the other side, the works not having DDoS detection as main topic, written in other languages different than english and the ones which could not be downloaded or gotten after re-questing them to the authors were discarded and not included for further study.
2.3 Preliminary Results
After performing the search with the keywords and parameters previously mentioned, SCOPUS retrieved 405 papers.
From these 405 papers, 345 were downloaded or obtained from direct request to the authors. The other 60 papers were discarded directly by one or several of the exclusion criteria defined.
Then, after applying the inclusion and exclusion criteria on the 345 remaining papers, 141 papers were selected, and clas-sified for further analysis.
From this universe of 141 papers, after a mapping study, 54 papers were identified to present approaches for DDoS Attack Detection, using mainly Machine Learning techniques.
Further classification was performed on these 54 papers, and the most used techniques were detected; these were Cumu-lative Sum - CUSUM Algorithm (11 papers), Support Vector Machines SVM (5 papers), Principal Components Analysis -PCA (4 papers) and Neural Classifiers (4 papers)
In the next section, detailed revision of these works is pre-sented, focused in answering the proposed research questions.
3
Detailed Literature Review
In this section, the detailed literature review is presented. In the first part, the review of the selected papers is discused; the papers are grouped by the technique they use. In the second part, the proposed research questions are answered.
3.1 Literature Review
3.1.1 CUmulative SUM (CUSUM) Algorithm
Wei et al [16] present a mechanism for detection of DDoS attacks aimed to TCP protocol on Kernel based Virtual Ma-chines (KVM) [17]. They focus in studying the relationship that should be kept between starting and ending packets asso-ciated to TCP connections. Their approach shows high perfor-mance regarding detection time, and refers to have almost 0% of false detections. It presents an scalability issue because in-formation of the connection is stored in hash tables, what might induce performance degradation if tables have high number of records.
Nosrati et al [18] address the detection of DDoS attacks in Internet Multimedia Subsystem, specifically in relation to REGISTER operation (the action a device performs to authen-ticate and inform itself its location within the network to a con-trol entity) of Session Initiation Protocol (SIP). They present a variant of CUSUM named adaptive z-score CUSUM; this adaptive variant has the capacity to adapt itself to changes which could occur in the traffic (In this case, quantity of REG-ISTER messages), for example, for connection restablishment. Their approach presents a low detection time, and a False Rate Alarm near to 6%. It is a simple approach, which could be adapted to be used in other application cases.
Yi et al [19] present a mechanism to analyze per IP behav-ior. A profile of the sending and receiving traffic of every IP address in the network is created. Then, that profile is evalu-ated to decide whether it meets normal behavior, or whether it indicates an anomaly. The presented approach analyzes met-rics of messages related to TCP protocol, and also ICMP and DNS messages. It is conceived to be deployed at so-called leaf routers, that is to say, routers which interconnect particular net-work segments. The presented results show times in the order of minutes for detection, which for some application scenarios could be inconvenient. Also, although it is thought to be used at leaf routers to protect smaller network segments, it might have scalability issues for high number of users, and it is not clear
whether present limitations in evironments using dynamic IP assignments.
Du and Nakao [20] present Mantlet, a complete solution for DDoS attacks, covering Anti-Spoofing (False source IP Ad-dresses), detection and mitigation. The detection component of Mantlet uses CUSUM to detect anomalies in the ratio between sent and received packets. A very important feature of the pre-sented solution is that it generalyzes the detection, covering besides TCP packets also UDP packets. Thus, it covers a wider spectrum of Transport Layer. Mantlet shows a very low rate of false alarms, and the capacity to reduce detection times as the attack intensity increases; it implies that for stealth attacks (low rate attacks), performance might drop.
Salem et al [21] propose an approach for Flooding DDoS attacks using a variation of CUSUM named Multi Channel Non Parametric (MNP). This solution covers a wide spectrum of network protocols, being able to detect TCP, UDP and ICMP based attacks. The justification in using a non parametric ver-sion is because of the high varaibility of network traffic and the lack of consensus on relevant features to monitor on it. This work does not present detailed results on detection time or false detections rate.
Leu and Li [22] present an Intrusion Prevention System with detection features based on CUSUM. Their approach presents detection time around 1.3 secs, and detection accu-racy of 98.4%; low rates of false positives (3.2%) and false negatives (0%).
Zhou et al [23, 24] present an approach for attack detec-tion starting from a P2P network built for detecdetec-tion. Devices organize themselves in a P2P network, detection is performed locally at end host via CUSUM algorithm, and at occurrence of abnormalities, they are broadcasted to other nodes in the net-work. With this information, similarity of abnormalities is de-tected. Detection rate for this approach is as high as 96%, and false positives as 9.3%.
He et al [25] present a scheme for bandwidth estimation for detection of DDoS. The main idea of this solution is that during a DDoS attack, available bandwidth is depleted. This approach presents a 100% detection rate and low detection times.
Finally Berral et al [26] present and adaptive approach for detecting flooding attacks. They combine CUSUM for anomaly detection with other Machine Learning techniques such as Naive Bayes for traffic classification. This approach presents 95% of accuracy, no false positives, and only false negatives.
3.1.2 Support Vector Machines
Choi et al [27] present an approach based in the concept of Timeslot Monitoring Model. This technique operates monitor-ing and producmonitor-ing profiles from user activity in predefined in-tervals, to detect whether a given profile corresponds to normal or attack activity, when that activity is executed by an automatic tool. This approach fits to detection of application layer proto-cols although it was tested specifically with HTTP protocol; It shows 100% of accuracy when 2 intervals for monitoring are used.
Cheng et al [28, 29] present an approach based in the notion of flow interaction. It starts from the idea that for normal com-munication flows, entities present interaction, whereas during attacks, entities might not present interaction (mainly because spoofing appears, false source IP addresses). SVM is used to classify whether flows correspond to attack or normal traffic according to the notion of interaction. This approach presents a high detection rate, around 100% and very low false alarm rates, even with multiple network flows, with the ability to even perceive low rate DDoS attacks mixed with high volume of le-gitimate traffic.
Bao [30] presents a mechanism for attack detection using data collected via Simple Network Based Protocol (SNMP), instead of using raw communication packets. Via an ensemble of SVM, traffic classification is performed to detect initially, whether an attack is ocurring and the specific attack type. This approach presents a performance of 99.27% of attack detection rate, with 1.9% false positives and 0.73% false negatives.
Yu et al [4] present a very similar approach to Bao’s, with results slightly better.
3.1.3 Principal Component Analysis
Bharati et al [31] present a framework based on PCA for detec-tion of applicadetec-tion layer DDoS attacks, specifically, those based on HTTP traffic. It works by building profiles from browsing activity of users. This framework presents very low attack de-tection time, in comparison to other existing methods; no in-formation on false positives is provided.
Liu et al [32, 33] present a solution combined of PCA and Sketches for detection in a distributed way. It is able to perform detection of anomalies both at the user side and at a central point, usually the Network Operation Center. Although results are not presented in detail, from graphics provided by authors is possible to note that the proposed approach has very low false detections rate, and it has low detection times.
Finally, Li et al [34] present a solution using PCA based on the concept of spatial correlation; this is, relations which can be established from studying similarities which present the traffic flows associated to attack, when directed towards the tar-get. This solution shows more than 80% accurate detections, with only 0.1% of false positives. It is important remark the distributed nature of this solution, understanding the very own distributed nature of the attacks under study.
3.1.4 Neural classifiers
Raj Kumar et al [35] present an approach based on an ensemble of neural classifiers, on the idea that by combining classifiers, it is possible to reduce the overall error. Applied to detection of TCP, UDP, ICMP and HTTP based attacks, it presented a 98% of accurate detections, and around 3% of false positives.
Li et al [36] present an approach based on LVQ Neural Net-works; they present results showing 99.7% of accurate detec-tions, with around 0.3% of false positives; results with this kind of neural network are shown to be better than the ones obtained with back propagation neural network.
Wu et al [37] present the usage of Neural Networks in de-tection of DDoS attacks against DNS servers; the overall per-formance obtained in their work is 96.5%, with 0% false posi-tives.
Finally, [38] present detection of DDoS attacks on IP flows using neural network classification, specifically back propaga-tion network; although details are not discused in detail, graph-ics show low false positives and high good classification rates.
3.2 Answering research questions
In this part of the section, answers to the proposed research questions are provided.
Regarding RQ1, most of the reviewed works show to be focused in low level DDoS attacks; IP, TCP and UDP based attacks is the most commonly analyzed attack type. Anyway, some of the authors address the problem of application based attacks, which are more complex to detect because of the high variability of attack patterns that can appear on traffic associated to them. Specifically the works referenced in [18] (SIP -IMS), [19] (DNS), [27] (HTTP), [31] (HTTP), [35] (HTTP), [37] (DNS) present approaches for application layer attack de-tection. Particularly, in the case of HTTP, is usual to model the attack as variations on normal user activites profiles.
In relation to RQ2, the review of the selected works allows to see that the best performances are provided with implemen-tations of SVM( [30]) and LVQ Neural networks ([36]). It is important to clarify that there were some papers which did not present explicitly their results, so it might be possible to have approaches which would have better results to these referenced. Finally, regarding RQ3, most of the works present centric solutions, to be used near to victim device. There are references of distributed approaches in [23, 24] and [34].
4
Conclusions and Future Work
An approach to state of art in application of Machine Learn-ing to DDoS Attack detection was presented via Systematic Literature Review. From a structured procedure for reviewing papers, it was detected that an important topic to be researched is detection of application layer based protocols. Reviewed au-thors who have worked on this topic usually mention the wide quantity of literature which can be found regarding low level (Transport and network layers) DDoS attacks in contrast to lit-erature on Application Layer attacks.
Another important detected issue is that, although there are techniques providing good performances, increasing accuracy, reducing detectiontimes and avoiding false detections is al-ways a goal to achieve in detection techniques. Besides, an important challenge arises in relation to application of detec-tion techniques on high speed networks, which are everyday more common.
Collaborative and distributed detection is a topic deserving more attention. Because of the emergent trend on cloud and grid computing environments, platforms and services tend to be highly distributed, so, because of this fact and because of the very own nature of DDoS attacks, it is possible that approaches
with capabilities of collaboration, distribution and even mobil-ity combined with machine learning and other techniques could provide better performance.
As future work from this review, exploring more recent pa-pers (up to date) and deeper analysis of experimental results of reviewed papers is proposed.
References
[1] C.-M. Chen, Y.-H. Ou, and Y.-C. Tsai, “Web botnet de-tection based on flow information,” inICS 2010 - Inter-national Computer Symposium, pp. 381–384, 2010. [2] D. Kim, B. Kim, I. Kim, J. Oh, J. Jang, and H. Cho,
“Au-tomatic control method of DDoS defense policy through the monitoring of system resource,” inRecent Researches in Applied Informatics - Proceedings of the 2nd Interna-tional Conference on Applied Informatics and Computing Theory, AICT’11, pp. 140–145, 2011.
[3] H. Liu, Y. Sun, and M. Kim, “Fine-grained DDoS detec-tion scheme based on bidirecdetec-tional count sketch,” in Pro-ceedings - International Conference on Computer Com-munications and Networks, ICCCN, 2011.
[4] J. Yu, H. Lee, M.-S. Kim, and D. Park, “Traffic flooding attack detection with SNMP MIB using SVM,”Computer Communications, vol. 31, no. 17, pp. 4212–4219, 2008. [5] R. Yogesh Patil and L. Ragha, “A rate limiting
mecha-nism for defending against flooding based distributed de-nial of service attack,” inProceedings of the 2011 World Congress on Information and Communication Technolo-gies, WICT 2011, pp. 182–186, 2011.
[6] A. Patcha and J.-M. Park, “An overview of anomaly de-tection techniques: Existing solutions and latest tech-nological trends,” Computer Networks, vol. 51, no. 12, pp. 3448–3470, 2007.
[7] B. A. Kitchenham and S. Charters, “Guidelines for per-forming systematic literature reviews in software engi-neering,” 2007.
[8] S. Ranjan, R. Swaminathan, M. Uysal, A. Nucci, and E. Knightly, “DDoS-shield: DDoS-resilient scheduling to counter application layer attacks,”IEEE/ACM Trans-actions on Networking, vol. 17, no. 1, pp. 26–39, 2009. [9] Y. Xie and S.-Z. Yu, “A large-scale hidden semi-markov
model for anomaly detection on user browsing behav-iors,” IEEE/ACM Transactions on Networking, vol. 17, no. 1, pp. 54–65, 2009.
[10] C. Ye and K. Zheng, “Detection of application layer dis-tributed denial of service,” inProceedings of 2011 Inter-national Conference on Computer Science and Network Technology, ICCSNT 2011, vol. 1, pp. 310–314, 2011.
[11] Y. Xie and S. Tang, “Online anomaly detection based on web usage mining,” inProceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Sympo-sium Workshops, IPDPSW 2012, pp. 1177–1182, 2012. [12] T. Gamer, “Distributed detection of large-scale attacks in
the internet,” inProceedings of 2008 ACM CoNEXT Con-ference - 4th International ConCon-ference on Emerging Net-working EXperiments and Technologies, CoNEXT ’08, 2008.
[13] T. Gamer and C. Mayer, “Simulative evaluation of dis-tributed attack detection in large-scale realistic environ-ments,”Simulation, vol. 87, no. 7, pp. 630–647, 2011. [14] S. Shalinie, M. Kumar, M. Karthikeyan, J. Sajani,
V. Nachammai, K. Sundarakantham, and K. Mallikar-junan, “CoDe - an collaborative detection algorithm for DDoS attacks,” in International Conference on Re-cent Trends in Information Technology, ICRTIT 2011, pp. 113–118, 2011.
[15] T. Gamer, “Collaborative anomaly-based detection of large-scale internet attacks,”Computer Networks, vol. 56, no. 1, pp. 169–185, 2012.
[16] Z. Wei, G. Xiaolin, H. Wei, and Y. Si, “TCP DDOS attack detection on the host in the KVM virtual machine envi-ronment,” inProceedings - 2012 IEEE/ACIS 11th Inter-national Conference on Computer and Information Sci-ence, ICIS 2012, pp. 62–67, 2012.
[17] S. Zeng and Q. Hao, “Network I/O path analysis in the kernel-based virtual machine environment through trac-ing,” in Proceedings of the 2009 First IEEE Interna-tional Conference on Information Science and Engineer-ing, ICISE ’09, (Washington, DC, USA), pp. 2658–2661, IEEE Computer Society, 2009.
[18] E. Nosrati, A. Kashi, Y. Darabian, and S. Tonekaboni, “Register flooding attacks detection in IP multimedia sub-systems by using adaptive z-score CUSUM algorithm,” in
2011 International Conference on Information Technol-ogy and Multimedia: ”Ubiquitous ICT for Sustainable and Green Living”, ICIM 2011, 2011.
[19] Z. Yi, L. Qiang, and Z. Guofeng, “A real-time DDoS at-tack detection and prevention system based on per-IP traf-fic behavioral analysis,” inProceedings - 2010 3rd IEEE International Conference on Computer Science and In-formation Technology, ICCSIT 2010, vol. 2, pp. 163–167, 2010.
[20] P. Du and A. Nakao, “Mantlet trilogy: DDoS defense de-ployable with innovative anti-spoofing, attack detection and mitigation,” in Proceedings - International Confer-ence on Computer Communications and Networks, IC-CCN, 2010.
[21] O. Salem, A. Mehaoua, S. Vaton, and A. Gravey, “Flood-ing attacks detection and victim identification over high speed networks,” in2009 Global Information Infrastruc-ture Symposium, GIIS ’09, 2009.
[22] F.-Y. Leu and Z.-Y. Li, “Detecting DoS and DDoS attacks by using an intrusion detection and remote prevention system,” in5th International Conference on Information Assurance and Security, IAS 2009, vol. 2, pp. 251–254, 2009.
[23] Z. Zhou, D. Xie, and W. Xiong, “A novel distributed detection scheme against DDoS attack,”Journal of Net-works, vol. 4, no. 9, pp. 921–928, 2009.
[24] Z. Zhou, D. Xie, and W. Xiong, “A P2P-based distributed detection scheme against DDoS attack,” inProceedings of the 1st International Workshop on Education Technology and Computer Science, ETCS 2009, vol. 2, pp. 304–309, 2009.
[25] L. He, B. Tang, and S. Yu, “Available bandwidth estima-tion and its applicaestima-tion in detecestima-tion of DDoS attacks,” in2008 11th IEEE Singapore International Conference on Communication Systems, ICCS 2008, pp. 1187–1191, 2008.
[26] J. Berral, N. Poggi, J. Alonso, R. Gavald ˜A , J. Torres, and M. Parashar, “Adaptive distributed mechanism against flooding network attacks based on machine learning,” in
Proceedings of the ACM Conference on Computer and Communications Security, pp. 43–49, 2008.
[27] Y. Choi, J. Oh, J. Jang, and I. Kim, “Timeslot monitor-ing model for application layer DDoS attack detection,” inProceedings - 6th International Conference on Com-puter Sciences and Convergence Information Technology, ICCIT 2011, pp. 677–679, 2011.
[28] J. Cheng, J. Yin, Y. Liu, Z. Cai, and C. Wu, “DDoS attack detection using IP address feature interaction,” in Inter-national Conference on Intelligent Networking and Col-laborative Systems, INCoS 2009, pp. 113–118, 2009. [29] J. Cheng, J. Yin, Y. Liu, Z. Cai, and M. Li,DDoS attack
detection algorithm using ip address features, vol. 5598 LNCS. 2009.
[30] C.-M. Bao, “Intrusion detection based on one-class SVM and SNMP MIB data,” in 5th International Conference on Information Assurance and Security, IAS 2009, vol. 2, pp. 346–349, 2009.
[31] R. Bharathi and R. Sukanesh, “A PCA based framework for detection of application layer DDoS attacks,”WSEAS Transactions on Information Science and Applications, vol. 9, no. 12, pp. 389–398, 2012.
[32] Y. Liu, L. Zhang, and Y. Guan, “A distributed data stream-ing algorithm for network-wide traffic anomaly detec-tion,” inPerformance Evaluation Review, vol. 37, pp. 81– 82, 2009.
[33] Y. Liu, L. Zhang, and Y. Guan, “Sketch-based streaming PCA algorithm for network-wide traffic anomaly detec-tion,” inProceedings - International Conference on Dis-tributed Computing Systems, pp. 807–816, 2010.
[34] Z. Li, G. Hu, and X. Yao, “Spatial correlation detection of DDoS attack,” in 2009 International Conference on Communications, Circuits and Systems, ICCCAS 2009, pp. 304–308, 2009.
[35] P. Raj Kumar and S. Selvakumar, “Distributed denial of service attack detection using an ensemble of neural classifier,” Computer Communications, vol. 34, no. 11, pp. 1328–1341, 2011.
[36] J. Li, Y. Liu, and L. Gu, “DDoS attack detection based on neural network,” in 2010 2nd International Symposium on Aware Computing, ISAC 2010 - Symposium Guide, pp. 196–199, 2010.
[37] J. Wu, X. Wang, X. Lee, and B. Yan, Detecting DDoS attack towards DNS server using a neural network clas-sifier, vol. 6354 LNCS. 2010.
[38] D. Wang, G. Chang, X. Feng, and R. Guo,Research on the detection of distributed denial of service attacks based on the characteristics of IP flow, vol. 5245 LNCS. 2008.