
e-ISSN: 2348-6848, p-ISSN: 2348-795X, Volume 2, Issue 12, December 2015

International Journal of Research (IJR)

Available at http://internationaljournalofresearch.org

Real Time Fault Detection System for Cloud Computing

Using Unsupervised Outlier Detection Method

Akshay Badak

KJCOEMR, Pune, Pune University, India [email protected]

Sushant Chvan

KJCOEMR, Pune, Pune University, India [email protected]

Pratik Phule

KJCOEMR, Pune, Pune University, India [email protected]

Niraj Raskar

KJCOEMR, Pune, Pune University, India [email protected]

Abstract-

Outlier detection is an active area of research in data mining. An existing system proposes DenOD, an efficient outlier detection concept based on an unsupervised method, for intrusion detection in a cloud computing environment [1]. Unsupervised outlier detection plays an important role in different application domains; it can be used in intrusion detection, fault detection and fraud detection. One main reason behind the popularity of the unsupervised method is that it does not require any training data set.

In this project work we use an unsupervised outlier detection method for fault detection in a cloud computing environment. In cloud computing, several machines run at the server side and provide many services to many users, so error-free service cannot be guaranteed all the time. Our proposed approach is therefore intended to detect faults in the machines and their services in order to offer guaranteed service to consumers.

I. INTRODUCTION

An outlier is a data pattern whose behavior is significantly different from the behavior of the remaining data. Hawkins defines the concept as follows: “An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism.” Outlier detection techniques have become very important in the fields of computer security and data analysis.

Fig 1 Outlier


security, network security, credit card fraud detection, insurance fraud, fault detection in real-time safety-critical systems and defence activities. With the rapid proliferation of computer-based systems and Internet technologies, problems in data privacy, information security and network security have become very prominent.

Cloud-based resources are tightly coupled with each other and need reliable bandwidth for communication. The main purpose of cloud computing is to provide all resources to the client as a service, at low cost and with security. Cisco defines cloud computing as follows: IT resources and services that are abstracted from the underlying infrastructure and provided “on-demand” and “at scale” in a multitenant environment [2].

1. On-demand means that services can be easily provisioned as needed and released when no longer needed. The user is billed only for what is used.

2. At-scale means that the service appears to provide unlimited resources on demand. This is an illusion of sorts: the consumer is always shown resources as if they were unlimited.

3. Multitenant environment means that the resources are accessed by many consumers through a single point of interface, which saves other infrastructure costs.

Recently, outlier detection has been gaining continuous attention. A fault detection system has to handle the log information of various machines and services, and building and updating an effective one is a large and complex project because of the large number of resources. With so many resources, it is not reasonable to assume that all of them are well configured and non-faulty. To detect a faulty service, we first need to collect a database of status and log files. Fault detection is usually done by monitoring, collecting and analyzing performance counters [3]. An outlier detection technique helps to improve the performance of fault detection and decreases the false alarm rate.

II. RELATED WORK

Chan et al. [4] proposed an unsupervised modified k-means algorithm for identifying outliers. According to the authors, outliers can easily be detected in a data set using a clustering method, but the clustering accuracy needs to be improved. They evaluated the algorithm in different situations to determine whether an identified outlier arises by chance or not. The identified outliers are then removed from the data set to enhance clustering accuracy. They validated their approach by comparing it with existing approaches; results on benchmark data sets show that the proposed technique is more efficient than other techniques on different measures.

Dina Said et al. [5] addressed the problem of real-time analysis in intrusion detection and suggested a solution. According to the authors, most Intrusion Detection Systems (IDSs) are used in real-time applications, so an IDS should be simple yet efficient enough to detect intrusions quickly. Distance-based outlier detection is an unsupervised approach that overcomes the limitation of needing training data sets with known intrusions. Although the authors applied it to high-dimensional IDS data sets, high dimensionality remains an issue for this approach. In addition, the intrusion data sets must be normalized before the distances between observations are computed.
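The role of normalization in a distance-based method can be illustrated with a minimal Python sketch. It is not the algorithm of [5]: the k-th nearest neighbour distance is used here as one common distance-based outlier score, and the function names, the value of k and the sample records are all assumptions made for illustration.

```python
import math

def min_max_normalize(rows):
    """Scale every feature column to [0, 1] so that no single feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def knn_distance_scores(rows, k=2):
    """Outlier score of each row: Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, a in enumerate(rows):
        dists = sorted(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical connection records: (duration in seconds, bytes transferred).
records = [(1.0, 200.0), (1.2, 220.0), (0.9, 210.0), (1.1, 190.0), (9.0, 50000.0)]
scores = knn_distance_scores(min_max_normalize(records), k=2)

# Index of the record with the largest score, i.e. the most suspicious one.
suspect = max(range(len(records)), key=scores.__getitem__)
```

Without the normalization step, the byte counts would dominate the Euclidean distance and the duration feature would have almost no influence on the score.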


and behavior relative to other clusters. They therefore defined the notion of a strong outlier, which is effective at both local and global levels. Their algorithm makes one more basic assumption: that a reasonable metric distinguishes malicious activities from safe points. The time complexity of CLAD is O(kN), where k is the number of clusters and N is the number of data points.

Peng Yang et al. proposed a modified density-based outlier mining algorithm that addresses the time-consuming computation of the conventional density-based method. For each object in the data set, the algorithm does not need to determine whether there are core objects within its ε-neighborhood. The authors also introduced the module information of a data object, which avoids unnecessary computation when finding outliers in data sets. They applied the algorithm to an intrusion data set and reported experimental results indicating that it mines outliers efficiently and also improves intrusion detection performance.
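For context, the conventional neighborhood check whose cost the modified algorithm is said to reduce can be sketched as follows. This is a plain illustration of counting neighbors within a radius, not Peng Yang et al.'s algorithm; the names `eps` and `min_pts` and the sample points are assumptions.

```python
import math

def neighbourhood_sizes(points, eps):
    """Count, for each point, how many other points lie within distance eps of it."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= eps)
        for i, p in enumerate(points)
    ]

def density_outliers(points, eps, min_pts):
    """Return the indices of points whose eps-neighbourhood holds fewer than min_pts other points."""
    sizes = neighbourhood_sizes(points, eps)
    return [i for i, n in enumerate(sizes) if n < min_pts]

# Hypothetical two-dimensional observations: a tight group and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
outliers = density_outliers(points, eps=0.5, min_pts=2)
```

Every point is compared with every other point, so the check costs on the order of N² distance computations for N objects, which is the kind of expense that motivates a modified method.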

III. EXISTING SYSTEM

The existing system targets cloud computing (a kind of distributed environment) and detects intrusions in it. A cloud is a collection of a large number of resources that provides various services at low cost. In a cloud environment, a basic question is how to monitor all resources with respect to security. Because of the large number of resources and their services, it is difficult to manage, monitor and analyze the resources and their condition against intrusions and faults. To maintain reliability and performance, this framework and algorithm are used. The existing system proposes density-based outlier detection for intrusion detection in cloud computing. The technique is unsupervised: it relies on abnormal behaviour rather than on training data sets.

Drawbacks:

1. Most traditional systems are based on supervised outlier detection.

2. Supervised and semi-supervised systems need full or partial training for outlier detection.

3. The existing unsupervised system is proposed for intrusion detection over the cloud.

4. The existing system does not consider fault detection for the different machines on the cloud and their services.

IV. PROPOSED WORK

Objectives:

1. To use the unsupervised outlier detection system for the cloud.

2. To use the proposed system for fault detection over the cloud.

3. To monitor the working of machines over the cloud.

4. To achieve guaranteed service for the consumers.

Framework:

The framework targets cloud computing (a kind of distributed environment) and detects faults in it. A cloud is a collection of a large number of resources and provides all kinds of services at low cost. In a cloud environment, a basic question is how to manage these resources, security and


fault or not with respect to its reliability and performance.

Here we propose an unsupervised outlier detection based fault detection algorithm for cloud computing. The technique is unsupervised, so no training data set of any kind is required. It is intended to handle all kinds of faults at each site of the cloud and also to detect some intrusion activities that affect resources and services at a high level.

Fig 2. Architecture

This framework has three phases: the first is the actual cloud computing environment, the second is the Fault Detection System (FDS) and the third is the End User. In the cloud computing environment, a number of nodes are connected to each other. Each node has a number of machines that provide services. Each service has a database of status and log files. These database files contain all activities of each machine and service.

Algorithm:

Input: Logs and status of virtual machines on the cloud, VM = {VM1, VM2, …, VMN}, and services S = {S1, S2, …, Sn}

Output: FVM = {set of faulty nodes}, FS = {set of faulty services}

step 1. For each VM1, VM2, …, VMN:

step 2. For each DB D1, D2, …, Dn of S1, S2, …, Sn:

step 3. Map(Di) -> Cell_Index.

step 4. Density[] = Get_Density();

step 5. X[] = Get_Ordered_Density();

step 6. Apply IQR on X[].

step 7. If (XL - 1.5(IQR)) < X < (XU + 1.5(IQR)), go to step 3; else the service is faulty.

step 8. Return FS[].

step 9. FVM[] = Get_Fvm(FS[]).

step 10. Return FS[] and FVM[].

In the FDS phase, there is an individual connection to each node, which is responsible for mapping the log database of each service and machine to a cell index in that node. On the basis of the cell data values, it generates a density function and calculates the density. It then returns the density values in numeric order and applies the inter-quartile range (IQR) outlier detection method to them. Assuming XL and XU denote the lower and upper quartiles of the ordered densities and IQR = XU - XL, a density outside the interval (XL - 1.5(IQR), XU + 1.5(IQR)) marks the corresponding service as faulty.
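The steps of the algorithm can be sketched in Python under several assumptions that the paper does not specify: each log record is reduced to one numeric reading, a cell index is the reading divided by a fixed cell width and rounded down, and the density of a service is the average number of readings per occupied cell. The function names, the cell width and the sample logs are hypothetical.

```python
import math
from collections import defaultdict
from statistics import quantiles

def service_density(readings, cell_width):
    """Assumed density: average number of log readings per occupied cell."""
    cells = defaultdict(int)
    for value in readings:
        cells[math.floor(value / cell_width)] += 1    # step 3: Map(Di) -> Cell_Index
    return len(readings) / len(cells)                  # step 4: density of this service

def iqr_outliers(density):
    """Return the keys whose density lies outside the IQR fences (steps 5 to 7)."""
    ordered = sorted(density.values())                 # step 5: ordered densities
    x_l, _, x_u = quantiles(ordered, n=4)              # lower and upper quartiles
    iqr = x_u - x_l                                    # step 6
    low, high = x_l - 1.5 * iqr, x_u + 1.5 * iqr
    return {key for key, x in density.items() if not (low < x < high)}

def detect_faults(logs, cell_width=10.0):
    """logs maps (vm, service) to a list of numeric readings; returns (FS, FVM)."""
    density = {key: service_density(r, cell_width) for key, r in logs.items()}
    faulty_services = iqr_outliers(density)            # FS
    faulty_vms = {vm for vm, _ in faulty_services}     # step 9: Get_Fvm(FS)
    return faulty_services, faulty_vms

# Hypothetical readings per (VM, service); only vm3/db is scattered over many cells.
logs = {
    ("vm1", "web"): list(range(11, 19)),
    ("vm1", "db"): list(range(21, 28)),
    ("vm1", "cache"): list(range(31, 40)),
    ("vm2", "web"): list(range(11, 20)),
    ("vm2", "db"): list(range(21, 29)),
    ("vm2", "cache"): list(range(31, 39)),
    ("vm3", "web"): list(range(11, 18)),
    ("vm3", "db"): [5, 48, 93, 140, 260, 410, 777, 905],
}
faulty_services, faulty_vms = detect_faults(logs)
```

The strict comparison follows step 7 as written. If all densities are identical the IQR is zero and the strict test flags every service, so a practical implementation would probably need to treat that case separately.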

References:

[1] Manoj Kumar, Robin Mathur, “Unsupervised Outlier Detection Technique for Intrusion Detection in Cloud Computing”, International Conference for Convergence of Technology, 2014.

[2] “Cisco Cloud Computing-Data Center Strategy, Architecture, and Solutions”. Point of View White Paper for U.S. Public Sector, ed. 1, 2009, pp. 4-16.


[4] Muhammad H. Arshad and Philip K. Chan, “Identifying Outliers via Clustering for Anomaly Detection”, Technical Report CS-2003-19, Department of Computer Sciences, Florida Institute of Technology, Florida, 2003.
