Volume 2, Issue 3, 2015
Available online at www.ijiere.com
International Journal of Innovative and Emerging
Research in Engineering
e-ISSN: 2394 - 3343 p-ISSN: 2394 - 5494
Survey of Intrusion Detection System and its Versatile
Techniques
Nishant Bhandari 1, Dr. Darshak Thakore 2 and Prof. Bhavesh Tanawala 3
1 ME (Computer Engineering), B.V.M Engineering College, Vallabh Vidyanagar, Gujarat, India
2 Computer Department Head, B.V.M Engineering College, Vallabh Vidyanagar, Gujarat, India
3 Assistant Professor, B.V.M Engineering College, Vallabh Vidyanagar, Gujarat, India
ABSTRACT:
In the modern information age, information is highly vulnerable and difficult to protect. As the Internet grows, network security is increasingly under threat. Hence, we need a system that not only protects the system or secures the network but also raises an alarm whenever a threat is found. An efficient Intrusion Detection System (IDS) built on data mining techniques can address this issue. Many researchers have used the mining approach for IDS in the past. This paper presents different IDS techniques, including data mining techniques, for better detection of known and unknown intrusions.
Keywords: Intrusion detection system, Anomaly detection, KDD Cup’99, Data mining, Classification, Clustering, SVM, Neural network, Fuzzy logic.
I. INTRODUCTION
An Intrusion Detection System (IDS) is basically of two types: host-based and network-based [1]. It serves three basic purposes: to monitor, to detect and to respond. IDSs can be classified in two ways. The first is by analysis method, which distinguishes misuse detection from anomaly detection. The second is by source data, which distinguishes host-based from network-based security concerns. Hence, an IDS is a prevention mechanism against malicious or unauthorized use of a system or attacks on a network. An IDS can be an important countermeasure against threats to system integrity. It raises an alarm to indicate that an intrusion into the system has occurred.
An IDS works on massive amounts of live data, so the analysis has to be automated. Such volumes of data can be analysed using data mining, which is an efficient way to extract particular information from a large data set. Thus, data mining is used for the detection of intrusions. Moreover, data mining is used to convert the data into a readable form. It continuously integrates large amounts of data and converts them into a format that the mining algorithm can use, a step known as data pre-processing. Data mining aims to ensure that no intrusion is missed while checking real network data. Pattern-matching misuse detection can be done efficiently with this technique and can achieve a high detection rate. IDS analysis methods fall into two categories [1]:
A: Misuse Detection
It identifies known attacks on the system with high accuracy. Its main disadvantage is that it fails to identify novel attacks.
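A minimal sketch of misuse detection, assuming that each known attack is described by a simple predicate over a connection record. The field names and the two signatures are hypothetical illustrations, not rules taken from any real IDS.

```python
# Hypothetical signatures: each maps an attack name to a predicate over a record.
SIGNATURES = {
    # Land: the packet claims the same address and port as source and destination.
    "land": lambda r: r["src_ip"] == r["dst_ip"] and r["src_port"] == r["dst_port"],
    # Ping of death: an ICMP packet larger than the maximum IP packet size.
    "oversized_ping": lambda r: r["protocol"] == "icmp" and r["bytes"] > 65535,
}

def match_signatures(record):
    """Return the names of all known-attack signatures that the record matches."""
    return [name for name, predicate in SIGNATURES.items() if predicate(record)]

normal = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "src_port": 40000,
          "dst_port": 80, "protocol": "tcp", "bytes": 512}
land = dict(normal, dst_ip="10.0.0.5", dst_port=40000)

print(match_signatures(normal))  # no signature matches
print(match_signatures(land))    # the "land" signature matches
```

An attack with no entry in the signature table returns an empty list, which is exactly the weakness noted above for novel attacks.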
B: Anomaly Detection
It overcomes this limitation of misuse detection by detecting novel attacks. It builds a profile of the normal traffic of the network, and any deviation from the expected behaviour is considered an attack. However, not all deviations are attacks, so the normal profile is difficult to establish and the false alarm rate is high.
The KDD Cup 1999 data set is the most widely used data set in IDS research. It consists of 490,000 connection instances with 41 features each [8]. This paper presents a survey of IDS using data mining and describes the data mining approaches that are used to detect intrusions in a network.
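The profile-and-deviation idea can be sketched as follows, assuming the profile is just the mean and standard deviation of one traffic measure and that a deviation of more than three standard deviations counts as an anomaly. The measure, the sample values and the threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def build_profile(normal_values):
    """Summarize normal traffic by its mean and sample standard deviation."""
    return mean(normal_values), stdev(normal_values)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma

# Hypothetical connections-per-second counts observed during normal operation.
normal_rates = [48, 52, 50, 47, 53, 49, 51, 50]
profile = build_profile(normal_rates)

print(is_anomalous(55, profile))   # within the normal band
print(is_anomalous(400, profile))  # far outside the normal band
```

A legitimate but unusual burst (for example 57 connections per second here) is also flagged, which illustrates why the false alarm rate of anomaly detection tends to be high.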
II. IDS AND ITS TECHNIQUES
There are several IDS techniques. The first is data mining.
A. Why data mining in IDS?
Data mining maximizes the effectiveness of attack identification, thereby helping the user to build a complete, concrete intrusion detection system [1]. An IDS using mining addresses issues such as the classification of data and the need for a high level of human interaction. Several data mining techniques, such as association, clustering and classification, are used to find intrusions by observing network data. Data mining examines the known relations in the data set and reports the result. It can also detect patterns in large amounts of data. Normally, IDS input data come from many sources, such as hosts and the network. Such complexity is very difficult to handle, and the mining approach addresses this issue. It helps to distinguish malicious data from normal data in a large data set and to identify the attacks.
B. Basic Data mining model for IDS
Figure 1 depicts a simple model of an IDS, which consists of several components based on mining:
Sniffer: collects the data from the network or other source.
Data Pre-processing: converts the original data into the formats used by the data mining algorithm.
Event Database: stores the pre-processed data.
Misuse Detection: performs misuse pattern-matching detection on the data.
Response Units: process and respond to misuse detection alarms.
Data Mining: continuously integrates the data.
Anomaly Detection: distinguishes abnormal behaviour from normal behaviour.
Figure 1. Simple model of IDS using mining [4]
Figure 1 describes how intrusion detection can be carried out using mining. The information is retrieved from the Internet, checked and protected by the IDS, and then sent on to the corresponding network. There are different mining methods, which we discuss in the next section. Each method serves a different purpose. They are: classification correlation algorithm, correlation analysis algorithm and series analysis algorithm [7]. Data mining is known to be an efficient method because of the versatility of its different algorithms.
Data mining techniques in intrusion detection:
Data analysis is the core of the data mining approach. Classification and clustering are among the most used methods in data mining. This part describes various mining techniques in intrusion detection.
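The flow through the components of Figure 1 can be sketched as below. This is only an illustration under stated assumptions: the four-field record layout, the protocol encoding, the failed-login rule and the byte limit are hypothetical and are not taken from [4].

```python
PROTOCOLS = {"tcp": 0, "udp": 1, "icmp": 2}  # assumed categorical encoding

def preprocess(raw_line):
    """Data pre-processing: turn 'duration,protocol,src_bytes,failed_logins' into numbers."""
    duration, protocol, src_bytes, failed_logins = raw_line.strip().split(",")
    return [float(duration), PROTOCOLS[protocol], float(src_bytes), int(failed_logins)]

def misuse_detection(event):
    """Match a known pattern: repeated failed logins (hypothetical rule)."""
    return event[3] >= 5

def anomaly_detection(event, max_normal_bytes=10_000.0):
    """Flag a deviation from an assumed normal profile of bytes sent."""
    return event[2] > max_normal_bytes

def run_ids(sniffed_lines):
    """Pass sniffer output through pre-processing, storage and both detectors."""
    event_db, alarms = [], []
    for line in sniffed_lines:
        event = preprocess(line)
        event_db.append(event)                 # event database
        if misuse_detection(event):
            alarms.append(("misuse", event))   # a response unit would act on this
        elif anomaly_detection(event):
            alarms.append(("anomaly", event))
    return event_db, alarms

sniffed = ["2,tcp,300,0", "1,tcp,120,6", "0,udp,50000,0"]
event_db, alarms = run_ids(sniffed)
print([kind for kind, _ in alarms])
```

Misuse detection is tried first in this sketch because a signature match identifies the attack by name, while the anomaly check only reports that a record looks unusual.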
Classification: It assigns a label to each data item according to its characteristics. Classification is a supervised technique that can detect known attacks. The main objective of classification is to differentiate between malicious and normal data. Classification is less efficient than clustering because it requires a massive amount of labelled data to be collected. The authors of [7] describe data classification for intrusion detection that can be carried out in the following steps:
To learn classification models of normal and abnormal series of system calls, the classifier must be supplied with training data sets that contain pre-labelled normal or abnormal series. Different techniques, such as decision trees, linear discrimination or rule-based models, are used to check the network traces. Next, build a collection of the distinct series of system calls and name it the normal list.
Then check all of the intrusion traces: look up each sequence of system calls in the normal list. If an exact match is found, label the sequence as normal; otherwise, label it as abnormal.
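The normal-list lookup described above can be sketched as follows. The fixed window length k and the example traces are assumptions made for illustration; [7] is not quoted here as prescribing them.

```python
def windows(trace, k):
    """Return every run of k consecutive system calls in the trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def build_normal_list(normal_traces, k):
    """Collect the distinct length-k sequences seen in pre-labelled normal traces."""
    normal = set()
    for trace in normal_traces:
        normal.update(windows(trace, k))
    return normal

def label_trace(trace, normal_list, k):
    """Label each sequence 'normal' if it is in the normal list, else 'abnormal'."""
    return [(w, "normal" if w in normal_list else "abnormal") for w in windows(trace, k)]

# Hypothetical training traces and one suspect trace.
normal_traces = [["open", "read", "mmap", "close"],
                 ["open", "read", "write", "close"]]
normal_list = build_normal_list(normal_traces, k=3)

suspect = ["open", "read", "execve", "close"]
labels = [label for _, label in label_trace(suspect, normal_list, k=3)]
print(labels)
```

Both windows of the suspect trace contain the unseen call `execve`, so neither has an exact match in the normal list.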
Clustering: Clustering gathers similar data into groups, so that like data are kept inside the same cluster. It divides the data into groups of similar objects: members of the same class remain in one cluster and the rest of the data remain in different clusters. Clustering is unsupervised, which is useful since human labelling is time-consuming and tedious. These methods are effective at filtering out data that look dissimilar to the other data in the group.
The authors of [7] give the fundamental steps involved in identifying malicious intrusions as below:
First, find the largest cluster, i.e. the one containing the maximum number of instances, and label it as normal.
Next, sort the remaining clusters in ascending order of their distance to the largest cluster.
Select the first clusters so that the number of data instances in them sums to ηN, and label them as normal, where η is the fraction of normal instances and N is the total number of instances.
Label all remaining clusters as malicious.
After clustering, heuristics automatically label each cluster as normal or malicious. The self-labelled clusters are then used to detect attacks in a separate testing data set.
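The cluster-labelling heuristic can be sketched as below, assuming each cluster is summarized by a centroid and a size, that distance is Euclidean, and that the fraction of instances presumed normal is passed in as a parameter (called `eta` here). Clusters are labelled normal until their sizes reach that fraction of all instances; this stopping rule is one reading of the steps above, not a quotation of [7].

```python
import math

def label_clusters(clusters, eta):
    """clusters: list of (centroid, size). Returns one label per cluster, in input order."""
    total = sum(size for _, size in clusters)
    largest = max(range(len(clusters)), key=lambda i: clusters[i][1])

    # Step 1: the largest cluster is labelled normal.
    labels = ["malicious"] * len(clusters)
    labels[largest] = "normal"
    covered = clusters[largest][1]

    # Step 2: sort the remaining clusters by distance to the largest cluster.
    remaining = sorted(
        (i for i in range(len(clusters)) if i != largest),
        key=lambda i: math.dist(clusters[i][0], clusters[largest][0]),
    )

    # Steps 3 and 4: label the nearest clusters normal until eta * total instances
    # are covered; everything left keeps the label "malicious".
    for i in remaining:
        if covered >= eta * total:
            break
        labels[i] = "normal"
        covered += clusters[i][1]
    return labels

# Hypothetical clusters: (centroid, number of instances).
clusters = [((0.0, 0.0), 600), ((1.0, 0.0), 250), ((8.0, 8.0), 100), ((9.0, 1.0), 50)]
print(label_clusters(clusters, eta=0.8))
```

With these made-up numbers the two clusters near the origin cover 850 of 1000 instances, so the two distant clusters are labelled malicious.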
The table below lists the four main attack classes and the 22 attack types they contain:
FOUR MAIN ATTACK CLASSES | 22 ATTACK TYPES
Denial of service | back, land, neptune, pod, smurf, teardrop
Remote to user | ftp_write, guess_passwd, imap, multihop, phf, spy, warezclient, warezmaster
User to root | buffer_overflow, loadmodule, perl, rootkit
Probe | ipsweep, nmap, portsweep, satan
Figure 2. Attack classes in the KDD’99 data set [8]
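When working with the data set, the grouping in Figure 2 is typically applied as a lookup from a record's label to its main class. A minimal sketch follows; the short class keys (`dos`, `r2l`, `u2r`, `probe`) and the function name are naming choices made here, and the trailing dot reflects how labels appear in the raw KDD Cup 1999 files.

```python
ATTACK_CLASSES = {
    "dos": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "r2l": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "u2r": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
    "probe": ["ipsweep", "nmap", "portsweep", "satan"],
}

# Invert the table: attack name -> main class.
LABEL_TO_CLASS = {attack: cls
                  for cls, attacks in ATTACK_CLASSES.items()
                  for attack in attacks}

def attack_class(label):
    """Map a record label such as 'smurf.' to its main class, 'normal' or 'unknown'."""
    label = label.rstrip(".")
    if label == "normal":
        return "normal"
    return LABEL_TO_CLASS.get(label, "unknown")

print(attack_class("smurf."))
print(attack_class("normal."))
```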
Denial of Service Attack
A denial of service attack is a class of attacks in which an attacker makes some computing or memory resource too busy or too full to handle legitimate requests, or denies legitimate users access to a machine. Examples are Apache2, Back, Land, Mailbomb, SYN Flood, Ping of death, Process table, Smurf, Syslogd, Teardrop, Udpstorm.
User to Root Attacks
User to root exploits are a class of attacks in which an attacker starts out with access to a normal user account on the system and is able to exploit a vulnerability to gain root access to the system. Examples are Eject, Ffbconfig, Fdformat, Loadmodule, Perl, Ps, Xterm.
Remote to User Attack
A remote to user attack is a class of attacks in which an attacker who does not have an account on a machine sends packets to that machine over a network and exploits some vulnerability to gain local access as a user of that machine. Examples are Dictionary, Ftp-write, Guest, Imap, Named, Phf, Sendmail, Xlock, Xsnoop.
Probing
Probing is a class of attacks in which an attacker scans a network of computers to gather information or find known vulnerabilities. An attacker with a map of machines and services that are available on a network can use this information to look for exploits. Examples are Ipsweep, Mscan, Nmap, Saint, Satan.
C. Support vector machine (SVM) in IDS
... during the training process, which selects support vectors along the surface of this function. This capability allows SVMs to classify a broader range of problems. SVMs are used for binary classification and regression, and their primary advantage is that they provide a classifier with a minimal VC-dimension [10], which implies a low expected probability of generalization errors. In our case all intrusions are classified as +1, and normal data are classified as -1. All the SVM experiments described below use the freeware package SVMlight [11]. Figure 3 describes the SVM in IDS.
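To make the +1 / -1 labelling concrete, the sketch below trains a linear SVM by plain subgradient descent on the regularized hinge loss. It is a stand-in written in pure Python for illustration and is not SVMlight; the two features, the six training records and the hyperparameters are all assumptions.

```python
def decision(w, b, x):
    """Signed score of x relative to the separating hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:   # inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 (intrusion) or -1 (normal)."""
    return 1 if decision(w, b, x) >= 0 else -1

# Hypothetical, already-scaled features: [failed-login rate, error rate].
X = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7],    # intrusions
     [0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]    # normal traffic
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```

A linear kernel is used here only because it keeps the example short; the kernel functions mentioned above (polynomial, sigmoid) would need a dual or kernelized solver such as the one in SVMlight.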
Figure 3. SVM model of IDS [19]
D. Neural Network in IDS
Intrusion Detection Systems (IDS) are now mainly used to secure company networks. Ideally, an IDS has the capacity to detect all (attempted) intrusions in real time and to act to stop the attack (for example, by modifying firewall rules). Alongside the development of commercial and research IDS tools, the neural network approach offers a new way to reduce false alarms. This approach is still in development; nevertheless, it seems very promising for the future. Neural network techniques can be applied to both misuse detection and anomaly detection models. There are mainly two types of neural network training algorithms: 1. supervised and 2. unsupervised. In supervised training, the network learns the desired output for a given input. An example is the Multi-Layer Perceptron (MLP), which is employed for pattern recognition problems. Unsupervised training, in contrast, learns without a specified desired output.
A neural network gives misuse detection a predictive capability: it can estimate the probability that a particular event, or series of events, is indicative of an attack. It can also learn to detect the point at which a series of events is about to become an attack. Such information could then be used to derive the sequence of events that should occur if this is in fact an intrusion attempt. By tracking the recurrence of events that are likely to be an attack, the system becomes capable of following the same types of intrusion events and can possibly take defensive measures to prevent the attack.
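Supervised training can be illustrated with the simplest possible case: a single perceptron that is shown inputs together with their desired outputs and corrects its weights after each mistake. This is a deliberately reduced stand-in for the multi-layer network discussed above, and the two features and four training records are hypothetical.

```python
def output(w, b, x):
    """Perceptron output: 1 (attack) if the weighted sum is positive, else 0 (normal)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(samples, targets, epochs=50, lr=0.1):
    """Perceptron learning rule: adjust weights only when the output is wrong."""
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(samples, targets):
            delta = t - output(w, b, x)      # desired output minus actual output
            if delta != 0:
                errors += 1
                w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
                b += lr * delta
        if errors == 0:                      # every sample is classified correctly
            break
    return w, b

# Hypothetical scaled features: [packet rate, fraction of half-open connections].
samples = [[0.9, 0.8], [0.8, 0.95], [0.2, 0.1], [0.1, 0.3]]
targets = [1, 1, 0, 0]                       # desired outputs supplied by a supervisor

w, b = train_perceptron(samples, targets)
print([output(w, b, x) for x in samples])
```

A single perceptron can only separate classes with a straight line; the hidden layers of an MLP remove that restriction at the cost of a more involved training procedure (backpropagation), which is omitted here.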
E. Fuzzy logic in IDS
A fuzzy intrusion recognition engine uses a fuzzy system to recognize malicious activity against computer networks and to identify novel attacks that other techniques may fail to detect. Work on the fuzzy system pursues four main goals: 1) to determine how fuzzy systems can be used to detect intrusions; 2) to identify which data are legitimate inputs to the fuzzy intrusion detection system; 3) to evaluate the most accurate method for representing network input data; 4) to show how the fuzzy system can be scaled to distributed intrusion detection involving multiple hosts and networks.
III. CONCLUSIONS
In this paper we present techniques for intrusion detection systems, including SVM, neural network, fuzzy logic and data mining approaches. The paper presents two types of intrusion detection: misuse and anomaly. Misuse detection alone is not sufficient to prevent intrusion. Anomaly detection, in contrast, can detect a wide range of novel attacks. The paper also describes two prominent data mining techniques: classification and clustering. Of these, clustering is the more effective for the analysis of data. Most research in this area aims to minimize the false alarm rate, which moves such systems towards a secure network.
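How a fuzzy system represents network input data can be sketched with one membership function and one rule. The input measures, the breakpoints and the rule itself are assumptions made for illustration; the only standard ingredient is the use of the minimum as the fuzzy AND.

```python
def rising(x, low, high):
    """Shoulder membership function: 0 below `low`, 1 above `high`, linear in between."""
    return min(1.0, max(0.0, (x - low) / (high - low)))

def intrusion_degree(syn_rate, distinct_ports):
    """Degree (0..1) to which the hypothetical port-scan rule fires."""
    high_syn = rising(syn_rate, 100, 500)        # "SYN rate is HIGH"
    many_ports = rising(distinct_ports, 10, 50)  # "number of distinct ports is MANY"
    # Rule: IF SYN rate is HIGH AND distinct ports is MANY THEN intrusion.
    return min(high_syn, many_ports)

print(intrusion_degree(50, 5))     # clearly normal traffic
print(intrusion_degree(300, 30))   # partially matches the rule
print(intrusion_degree(600, 60))   # fully matches the rule
```

Unlike a crisp threshold, the rule reports a degree of suspicion, so traffic that falls just short of a fixed limit still produces a graded response.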
REFERENCES
[1] Chunyu Miao, Wei Chen, “A study of Intrusion Detection System Based on Data Mining”, IEEE, 2010.
[2] Online: http://www.cs.columbia.edu/~wenke/papers/usenix/usenix.html
[3] Kapil Wankhede, Sadia Patka, Ravindra Thool, “An Overview of Intrusion Detection Based on Data Mining Techniques”, IEEE, 2013.
[4] Gui Haixia, “Research of Intrusion Detection Based On data mining”, International Conference on e-Education, Entertainment and e-Management, IEEE, 2011.
[6] K. A. Abdul Nazeer, M. P. Sebastian, “Improving the Accuracy and Efficiency of the K-Means Clustering Algorithm”, IEEE, 2009.
[7] Deepthy K Denatious, Anita John, “Survey on Data Mining Techniques to Enhance Intrusion Detection”, IEEE, 2012.
[8] Shahrzad Zargari, Dave Voorhis, “Feature Selection in the Corrected KDD-dataset”, Third International Conference on Emerging Intelligent Data and Web Technologies, 2012.
[9] Srinivas Mukkamala, Guadalupe Janoski, Andrew Sung, “Intrusion Detection Using Neural Networks and Support Vector Machines”.
[10] Vapnik VN (1995) The Nature of Statistical Learning Theory. Springer, Berlin Heidelberg New York.
[11] Joachims T (2000) SVMlight is an implementation of Support Vector Machines (SVMs) in C. http://ais.gmd.de/-thorstedsvm-light/. University of Dortmund. Collaborative Research Center on ‘Complexity Reduction in Multivariate Data’ (SFB475).
[12] Simon Haykin, Neural Network: A Comprehensive Foundation, Macmillan College Publishing Company, 1994.
[13] R. M. Dillon, C. N. Manikopoulos, “Neural Net Nonlinear Prediction for Speech Data”, IEEE Electronics Letters, Vol. 27, Issue 10, May 1991, pp. 824-826.
[14] G.A. Carpenter, et al, “Fuzzy ARTMAP: An adaptive resonance architecture for incremental learning of analog maps”, International Joint Conference on Neural Networks, June 1992.
[15] NeuralWare Inc., Neural Computing: A Technology Handbook for NeuralWorks Professional II/PLUS and NeuralWorks Explorer, NeuralWare Inc., 1998.
[16] Orchard, R. 1995. FuzzyCLIPS version 6.04 user’s guide. Knowledge System Laboratory, National Research Council Canada.