e-ISSN: 2348-6848, p-ISSN: 2348-795X, Volume 2, Issue 12, December 2015
International Journal of Research (IJR)
Available at http://internationaljournalofresearch.org
Real Time Fault Detection System for Cloud Computing Using Unsupervised Outlier Detection Method
Akshay Badak
KJCOEMR, Pune, Pune University, India
Sushant Chvan
KJCOEMR, Pune, Pune University, India
Pratik Phule
KJCOEMR, Pune, Pune University, India
Niraj Raskar
KJCOEMR, Pune, Pune University, India
Abstract-
Outlier detection has recently become an active research area in data mining. The existing system proposes DenOD, an efficient unsupervised outlier detection method for intrusion detection in cloud computing environments [1]. Unsupervised outlier detection plays an important role in several application domains, including intrusion detection, fault detection and fraud detection. One main reason for the popularity of unsupervised methods is that they do not require any training data set.
In this project we use an unsupervised outlier detection method for fault detection in a cloud computing environment. Because a cloud runs several machines on the server side, each hosting many services for many users, error-free service cannot be guaranteed at all times. Our proposed approach is therefore intended to detect faults in the machines and their services in order to offer guaranteed service to consumers.
I. INTRODUCTION
An outlier is a data pattern whose behavior differs significantly from that of the remaining data. Hawkins defines the concept as follows: “An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism.” Outlier detection has become a very important technique in the fields of computer security and data analysis.
Fig 1 Outlier
[...] security, network security, credit card fraud detection, insurance fraud, fault detection in real-time safety-critical systems and defence activities. With the rapid proliferation of computer-based systems and Internet technologies, problems of data privacy, information security and network security have become very prominent.
Cloud-based resources are tightly coupled with each other and need reliable bandwidth for communication. The main purpose of cloud computing is to provide all resources to the client as a service, at low cost and with security. Cisco defines cloud computing as follows: IT resources and services that are abstracted from the underlying infrastructure and provided “on-demand” and “at scale” in a multitenant environment [2].
1. On-demand means that services can be easily provisioned as needed and released when no longer needed. The user is billed only for what is used.
2. At-scale means that the service appears to provide unlimited resources on demand. This is a kind of illusion: the consumer always sees seemingly unlimited resources.
3. Multitenant environment means that the resources are accessed by many consumers through a single point of interface, which saves other infrastructure costs.
Outlier detection has been gaining steady attention recently. A fault detection system has to handle the log information of various machines and services, and building and updating an effective fault detection system is a large and complex project because of the large number of resources. With so many resources, it is not reasonable to assume that all of them are well configured and fault-free. To detect a faulty service, we first need to collect a database of status and log files. Fault detection is usually done by monitoring, collecting and analyzing performance counters [3]. An outlier detection technique helps to improve the performance of fault detection and to decrease the false alarm rate.
II. RELATED WORK
Chan et al. [4] proposed an unsupervised modified k-means algorithm for identifying outliers. According to the authors, outliers can be detected easily in a data set using a clustering method, but the clustering accuracy needs to be improved. They also evaluated, under different conditions, whether an identified outlier arose by chance. The identified outliers are then removed from the data set to enhance clustering accuracy. They validated their approach by comparing it with existing approaches, and their results on benchmark data sets show that the proposed technique is more efficient than other techniques on different measures.
Dina Said et al. [5] addressed the problem of real-time analysis in intrusion detection and suggested a solution. According to the authors, most Intrusion Detection Systems (IDSs) are used in real-time applications, so an IDS should be simple yet efficient enough to detect intrusions quickly. Distance-based outlier detection is an unsupervised approach that removes the need for training data sets with known intrusions. However, IDS data sets are high-dimensional, and dimensionality remains an issue for this approach. In addition, the intrusion data sets must be normalized before distances between observations are computed.
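The need for normalization before distance computation can be illustrated with a minimal Python sketch. The min-max scaling, the Euclidean metric, the helper names and the two-feature records below are assumptions chosen for illustration; the source does not specify which normalization or distance is used.

```python
import math


def min_max_normalize(rows):
    """Scale every column of `rows` to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical records: (connection duration in seconds, bytes transferred).
records = [(1.0, 20000.0), (2.0, 21000.0), (30.0, 20500.0)]
scaled = min_max_normalize(records)

# On raw values the byte count dominates the distance; after scaling,
# both features contribute on a comparable [0, 1] range.
raw_01 = euclidean(records[0], records[1])
raw_02 = euclidean(records[0], records[2])
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])
```

In this toy data the record with a 30-second duration looks closer to the first record than the 2-second one does when raw values are used, simply because bytes are measured on a much larger scale. After scaling, the ordering of the two distances is reversed.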
[...] and behavior relative to other clusters. So they defined the notion of a strong outlier, which is effective at both local and global levels. Their algorithm rests on one more basic assumption, namely that a reasonable metric distinguishes malicious activities from safe points. The time complexity of CLAD is O(kN), where k is the number of clusters and N is the number of data points.
Peng Yang et al. proposed a modified density-based outlier mining algorithm that addresses the time-consuming computation of conventional density-based methods. For each object in the data set, the algorithm does not need to check whether there are core objects within its ε-neighborhood. The authors also introduce the module information of each data object, which avoids unnecessary computation when finding outliers in data sets. They applied the algorithm to an intrusion data set, and their experimental results indicate that it performs efficiently in outlier mining and also improves intrusion detection performance.
III. EXISTING SYSTEM
The existing system targets cloud computing (a kind of distributed environment) and detects intrusions in it. A cloud is a large collection of resources that provides various services at low cost. In a cloud environment, the basic concern is how to monitor all resources with respect to security. Because of the large number of resources and services, it is hard to manage, monitor and analyze the resources and their condition against intrusions and faults. To maintain reliability and performance, we use this framework and algorithm. The existing system proposes density-based outlier detection for intrusion detection in cloud computing. The technique is unsupervised: it relies on abnormal behaviour rather than on training data sets.
Drawbacks:
1. Most traditional systems are based on supervised outlier detection.
2. Supervised and semi-supervised systems need full or partial training for outlier detection.
3. The existing unsupervised system is proposed only for intrusion detection over the cloud.
4. The existing system does not consider fault detection for the different machines on the cloud and their services.
IV. PROPOSED WORK
Objectives:
1. To use the unsupervised outlier detection system for the cloud.
2. To use the proposed system for fault detection over the cloud.
3. To monitor the working of machines over the cloud.
4. To achieve guaranteed service for the consumers.
Framework:
The framework targets cloud computing (a kind of distributed environment) and detects faults in it. A cloud is a large collection of resources that provides all kinds of services at low cost. In a cloud environment, the basic concerns are how to manage these resources, security and [...] fault or not with respect to its reliability and performance.
Here we propose an unsupervised, outlier-detection-based fault detection algorithm for cloud computing. The technique is unsupervised, so no training data set of any kind is required. It is intended to handle all kinds of faults at each site of the cloud and also to detect some intrusion activities that affect the resources and services at a high level.
Fig 2. Architecture
This framework has three phases: the first is the actual cloud computing environment, the second is the Fault Detection System (FDS) and the third is the End User. In the cloud computing environment, there are N nodes connected to each other. Each node has N machines that provide services. Each service has a database of status and log files. These database files contain all activities of each machine and service.
Algorithm:
Input: Logs and status of virtual machines on the cloud, VM = {VM1, VM2, ..., VMN}, and services S = {S1, S2, ..., Sn}
Output: FVM = {set of faulty nodes}, FS = {set of faulty services}
Step 1. For each VM in VM1, VM2, ..., VMN:
Step 2. For each database Di in D1, D2, ..., Dn of services S1, S2, ..., Sn:
Step 3. Map(Di) -> Cell_Index.
Step 4. Density[] = Get_Density().
Step 5. X[] = Get_Ordered_Density().
Step 6. Apply IQR on X[].
Step 7. If (XL - 1.5 × IQR) < X < (XU + 1.5 × IQR), go to step 3 for the next database; else the service is faulty and is added to FS[].
Step 8. Return FS[].
Step 9. FVM[] = Get_Fvm(FS[]).
Step 10. Return FS[] and FVM[].
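The test in step 7 reads most naturally as the usual interquartile-range fence rule. The formulation below assumes that XL and XU denote the lower and upper quartiles of the ordered densities X[]; the source does not define these symbols, so this reading is an assumption.

```latex
\[
\mathrm{IQR} = X_U - X_L, \qquad
X \text{ is normal} \iff X_L - 1.5\,\mathrm{IQR} < X < X_U + 1.5\,\mathrm{IQR}.
\]
% Illustrative (made-up) ordered densities: 2, 2, 2, 2, 3, 3, 3, 3, 3, 40.
% Taking X_L = 2 and X_U = 3 gives IQR = 1, so the fences are
% 2 - 1.5 = 0.5 and 3 + 1.5 = 4.5. The value 40 lies outside the fences
% and would mark the service as faulty; every other value is normal.
```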
In the FDS phase, there is an individual connection for each node, which is responsible for mapping the log database of each service and machine to the cell index in that node. On the basis of the cell data values, it generates the density function and calculates the density. It then returns the densities as ordered numeric values and applies the inter-quartile range (IQR) outlier detection method to them.
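The steps above can be sketched in Python under several stated assumptions, none of which come from the source: each service log is reduced to a list of event timestamps, a "cell" is a fixed time window, the density of a cell is its event count, and a service is flagged when any cell density falls outside the IQR fences. The function names, the `window` parameter and the sample logs are hypothetical.

```python
from collections import Counter
from statistics import quantiles


def cell_densities(timestamps, window):
    """Steps 3-4 (assumed reading): map each timestamp to a time-window
    cell and count the events per cell."""
    counts = Counter(int(t // window) for t in timestamps)
    first, last = min(counts), max(counts)
    # Include empty windows so a silent period shows up as density 0.
    return [counts[i] for i in range(first, last + 1)]


def iqr_outliers(values):
    """Steps 5-7: order the densities and return those outside the fences.
    Bounds are inclusive here (a deviation from the strict test in step 7)
    so that a perfectly uniform log with IQR = 0 is not flagged."""
    if len(values) < 2:
        return []
    ordered = sorted(values)
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in ordered if not (lower <= x <= upper)]


def detect_faults(logs, window=60):
    """logs: {vm_name: {service_name: [event timestamps in seconds]}}.
    Returns (faulty_services, faulty_vms), mirroring FS and FVM."""
    faulty_services = []
    for vm, services in logs.items():                  # step 1
        for service, timestamps in services.items():   # step 2
            if not timestamps:
                continue
            if iqr_outliers(cell_densities(timestamps, window)):
                faulty_services.append((vm, service))
    faulty_vms = sorted({vm for vm, _ in faulty_services})  # step 9
    return faulty_services, faulty_vms


def make_log(counts_per_minute):
    """Hypothetical helper: place `c` evenly spaced events in minute `m`."""
    return [m * 60 + k * 60 / c
            for m, c in enumerate(counts_per_minute)
            for k in range(c)]


# Made-up sample: the "db" service on vm2 has a burst of 40 events in one minute.
logs = {
    "vm1": {"web": make_log([2, 3] * 5)},
    "vm2": {"web": make_log([2, 3] * 5),
            "db": make_log([2, 3, 2, 3, 40, 3, 2, 3, 2, 3])},
}
faulty_services, faulty_vms = detect_faults(logs)
```

Using time windows as cells is only one possible reading of the cell index. A grid over several performance counters would follow the same pattern, with the window index replaced by a tuple of per-counter bin indices.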
References:
[1] Manoj Kumar, Robin Mathur, “Unsupervised Outlier Detection Technique for Intrusion Detection in Cloud Computing”, International Conference for Convergence of Technology - 2014.
[2] “Cisco Cloud Computing - Data Center Strategy, Architecture, and Solutions”, Point of View White Paper for U.S. Public Sector, ed. 1, 2009, pp. 4-16.
[4] Muhammad H. Arshad and Philip K. Chan, “Identifying Outliers via Clustering for Anomaly Detection”, Department of Computer Sciences, Florida Institute of Technology, Florida, 2003, TR CS-2003-19.