FUZZY LOGIC CLASSIFICATION OVER SEMANTICALLY SECURED ENCRYPTED DATA


V. Mohanapriyanka¹, N. Suguna²

¹PG Scholar, ²Professor

Department of Computer Science and Engineering, Akshaya College of Engineering and Technology, Coimbatore.

[email protected], [email protected]

ABSTRACT

Data mining is the analysis step of "Knowledge Discovery in Databases". It is an interdisciplinary subfield of computer science concerned with the computational process of discovering patterns in large data sets. Classification is a data mining technique used to predict group membership for data instances. In the proposed method the data is encrypted using the Knapsack cryptosystem, whereas the previous method used the Paillier cryptosystem. The encrypted data is then classified using a fuzzy logic algorithm, whereas the previous method used a KNN classifier, whose efficiency is low compared with fuzzy logic. Several secure protocols are used to implement the classification algorithm. Improving the efficiency of the SMIN (Secure Minimum) protocol is an important first step towards improving the whole classifier. In KNN the SMIN protocol is less efficient, so the KNN classifier is less efficient than the fuzzy logic classifier. The efficiency is lower because of the protocols used in the existing method, which relies on the Paillier cryptosystem, where the problem of computing n-th residue classes is believed to be computationally difficult. Confidentiality is important in data mining, and for high confidentiality we use the fuzzy classifier. The cost and efficiency of the fuzzy classifier are high.

Index terms: Data mining, Fuzzy logic, SMIN protocol, KNN.

1. INTRODUCTION

1.1 OVERVIEW OF DATA MINING

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses.

Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time consuming to resolve.

Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high performance client/server or parallel processing computers, data mining tools can analyze massive databases [4].

2. LITERATURE SURVEY

Ahmad Basheer Hassanat, Mohammad Ali Abbadi and Ghada Awad Altarawneh, in “Solving the Problem of the K Parameter in the KNN Classifier Using an Ensemble Learning Approach”, have proposed a new solution for choosing the K parameter in the k-nearest neighbor (KNN) algorithm. The solution is based on the idea of ensemble learning, in which a weak KNN classifier is used each time with a different K, starting from one and going up to the square root of the size of the training set. The results of the weak classifiers are combined using the weighted sum rule. The proposed solution was tested and compared to other solutions using a group of experiments on real-life problems. The experimental results show that the proposed classifier outperforms the traditional KNN classifier that uses a different number of neighbors, is competitive with other classifiers, and is a promising classifier with strong potential for a wide range of applications [1].

G. Liu, K. Sim, J. Li and L. Wong, in “An Adaptive Nearest Neighbour Classification Algorithm for Data Streams”, have applied a feature selection method to a classification problem in molecular biology involving only 72 data points in a 7130-dimensional space. This approach is a hybrid of filter and wrapper approaches to feature selection. It makes use of a sequence of simple filters, culminating in Koller and Sahami’s (1996) Markov Blanket filter, to decide on particular feature subsets for each subset cardinality [3].

Biagio Ciuffo and Vincenzo Punzo, in “‘No Free Lunch’ Theorems Applied to the Calibration of Traffic Simulation Models”, have proposed a general method for verifying the robustness of a calibration procedure (suitable, in general, for any simulation optimization), based on a test with synthetic data. The main obstacle to this methodology is the significant computation time required by all the necessary simulations. For this reason, a Kriging approximation of the simulation model is proposed instead. The methodology is tested on a specific case study, where the effect on the optimization problem of different combinations of parameters, optimization algorithms, measures of goodness of fit, and levels of noise in the data is also investigated. Results show the clear dependence between the performance of a calibration procedure and the case study under analysis, and ascertain the need for global solutions in simulation optimization with traffic models [4].

Neila Mezghani, Sabine Husse, Karine Boivin, Katia Turcot, Rachid Aissaoui, Nicola Hagemeister, and Jacques A. de Guise, in “Automatic Classification of Asymptomatic and Osteoarthritis Knee Gait Patterns Using Kinematic Data Features and the Nearest Neighbor Classifier”, have developed an automatic computer method to distinguish between asymptomatic (AS) and osteoarthritis (OA) knee gait patterns using 3-D ground reaction force (GRF) measurements. GRF features are first extracted from the force vector variations as a function of time and then classified by the nearest neighbor rule. The authors investigated two different features: the coefficients of a polynomial expansion and the coefficients of a wavelet decomposition. They also analyzed the impact of each GRF component (vertical, anteroposterior, and medial lateral) on classification. The best discrimination rate (91%) was achieved with the wavelet decomposition using the anteroposterior and the medial lateral components. These results demonstrate the validity of the representation and the classifier for automatic classification of AS and OA knee gait patterns. They also highlight the relevance of the anteroposterior and medial lateral force components in gait pattern classification [5].

Vicente Cerverón and Francesc J. Ferri, in “Another Move Toward the Minimum Consistent Subset: A Tabu Search Approach to the Condensed Nearest Neighbor Rule”, have proposed a method that uses tabu search in the space of all possible subsets. Comparative experiments have been carried out using both synthetic and real data, in which the algorithm has demonstrated its superiority over alternative approaches. The results obtained suggest that the tabu search condensing algorithm offers a very good tradeoff between computational burden and the optimality of the prototypes selected [6].

In “k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data”, the focus is on solving the classification problem over encrypted data, in particular with a secure k-NN classifier over encrypted data in the cloud. The proposed protocol protects the confidentiality of the data and the privacy of the user’s input query, and hides the data access patterns. The work develops a secure k-NN classifier over encrypted data under the semi-honest model. The efficiency of the proposed protocol is also analyzed empirically using a real-world dataset under different parameter settings [7].

3. METHODOLOGY

3.1 EXISTING METHOD

Data mining is the analysis step of "Knowledge Discovery in Databases". It is an interdisciplinary subfield of computer science concerned with the computational process of discovering patterns in large data sets. Classification is a data mining technique used to predict group membership for data instances. In the existing method the data is encrypted using the Paillier cryptosystem, and the encrypted data is classified using the KNN classifier. Several secure protocols are used to implement the classification algorithm, and improving the efficiency of the Secure Minimum protocol is important for improving the whole classifier. KNN is a classification algorithm used to classify the data required by the user. Initially the whole dataset has to be trained. The test data is then compared with all the training data using the Euclidean distance, and KNN classifies the user-required data according to that distance. This algorithm keeps the data confidential for multi-class users.
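The distance-based classification described above can be sketched as follows. This is a plaintext illustration only, with no encryption involved; the function names `euclidean` and `knn_classify`, the value k = 3 and the toy training records are assumptions made for the example and do not come from the existing method itself.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3):
    """Return the majority label among the k training records nearest to query.

    train is a list of (feature_vector, label) pairs (hypothetical layout).
    """
    nearest = sorted(train, key=lambda rec: euclidean(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training data, invented for illustration.
train = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((5.0, 5.0), "high"),
    ((5.5, 4.5), "high"),
    ((4.8, 5.2), "high"),
]

print(knn_classify(train, (1.1, 0.9)))
print(knn_classify(train, (5.1, 4.9)))
```

In the secure setting every distance and comparison would have to be computed on ciphertexts, which is why the cost of the underlying protocols dominates the running time.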

3.1.1 PAILLIER CRYPTOSYSTEM:

The Paillier cryptosystem, invented by and named after Pascal Paillier in 1999, is a probabilistic asymmetric algorithm for public key cryptography. The problem of computing n-th residue classes is believed to be computationally difficult. The decisional composite residuosity assumption is the intractability hypothesis upon which this cryptosystem is based. The scheme is an additive homomorphic cryptosystem; this means that, given only the public key and the encryptions of $m_1$ and $m_2$, one can compute the encryption of $m_1 + m_2$.

For any two given plaintexts $a, b \in \mathbb{Z}_N$, the Paillier encryption scheme exhibits the following properties:

1) Homomorphic addition: $D_{sk}(E_{pk}(a + b)) = D_{sk}(E_{pk}(a) \cdot E_{pk}(b) \bmod N^2)$;

2) Homomorphic multiplication: $D_{sk}(E_{pk}(a \cdot b)) = D_{sk}(E_{pk}(a)^{b} \bmod N^2)$;

3) Semantic security: the encryption scheme is semantically secure. Briefly, given a set of ciphertexts, an adversary cannot deduce any additional information about the plaintext(s).

For succinctness, we drop the $\bmod N^2$ term during homomorphic operations [7].

3.1.2 KEY GENERATION

1. Choose two large prime numbers $p$ and $q$ randomly and independently of each other such that $\gcd(pq, (p-1)(q-1)) = 1$. This property is assured if both primes are of equal length.

2. Compute $n = pq$ and $\lambda = \mathrm{lcm}(p-1, q-1)$.

3. Select a random integer $g$ where $g \in \mathbb{Z}^{*}_{n^2}$.

4. Ensure $n$ divides the order of $g$ by checking the existence of the following modular multiplicative inverse: $\mu = (L(g^{\lambda} \bmod n^2))^{-1} \bmod n$, where the function $L$ is defined as $L(x) = \frac{x-1}{n}$.

Note that the notation $\frac{a}{b}$ does not denote the modular multiplication of $a$ times the modular multiplicative inverse of $b$ but rather the quotient of $a$ divided by $b$, i.e., the largest integer value $v \ge 0$ to satisfy the relation $a \ge vb$.

• The public (encryption) key is $(n, g)$.

• The private (decryption) key is $(\lambda, \mu)$.

If using $p, q$ of equivalent length, a simpler variant of the above key generation steps would be to set $g = n + 1$, $\lambda = \varphi(n)$ and $\mu = \varphi(n)^{-1} \bmod n$, where $\varphi(n) = (p-1)(q-1)$.

3.1.3 ENCRYPTION

1. Let $m$ be a message to be encrypted, where $0 \le m < n$.

2. Select a random $r$ where $0 < r < n$ and $\gcd(r, n) = 1$.

3. Compute the ciphertext as: $c = g^{m} \cdot r^{n} \bmod n^2$.

3.1.4 DECRYPTION

1. Let $c$ be the ciphertext to decrypt, where $c \in \mathbb{Z}^{*}_{n^2}$.

2. Compute the plaintext message as: $m = L(c^{\lambda} \bmod n^2) \cdot \mu \bmod n$.

3.1.5 Homomorphic properties

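A notable feature of the Paillier cryptosystem is its homomorphic properties. As the encryption function is additively homomorphic, the following identities can be described:

$D(E(m_1, r_1) \cdot E(m_2, r_2) \bmod n^2) = m_1 + m_2 \bmod n$

$D(E(m_1, r_1)^{k} \bmod n^2) = k \, m_1 \bmod n$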
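The key generation, encryption, decryption and homomorphic identities of the Paillier scheme can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the existing method: the primes 17 and 19 are toy values chosen for readability (real keys use primes of a thousand bits or more), $g = n + 1$ is one valid choice of generator, and the names `keygen`, `encrypt`, `decrypt` and `L` are assumptions introduced here. The modular inverse via `pow(x, -1, n)` needs Python 3.8 or later.

```python
import math
import random


def L(x, n):
    """L(x) = (x - 1) / n, taken as an integer quotient."""
    return (x - 1) // n


def keygen(p, q):
    """Return ((n, g), (lam, mu)) for primes p and q (toy sizes only)."""
    n = p * q
    if math.gcd(n, (p - 1) * (q - 1)) != 1:
        raise ValueError("gcd(pq, (p-1)(q-1)) must be 1")
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1  # assumption: simple generator choice, always valid
    mu = pow(L(pow(g, lam, n * n), n), -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """c = g^m * r^n mod n^2 with a fresh random r coprime to n."""
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    """m = L(c^lam mod n^2) * mu mod n."""
    n, _ = pub
    lam, mu = priv
    return (L(pow(c, lam, n * n), n) * mu) % n


pub, priv = keygen(17, 19)
n2 = pub[0] ** 2
ca, cb = encrypt(pub, 20), encrypt(pub, 35)
print(decrypt(pub, priv, (ca * cb) % n2))   # homomorphic addition of 20 and 35
print(decrypt(pub, priv, pow(ca, 3, n2)))   # homomorphic scaling of 20 by 3
```

Because a fresh random `r` is drawn on every call, encrypting the same plaintext twice gives different ciphertexts, which is the probabilistic behaviour that semantic security relies on.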

DRAWBACKS OF EXISTING METHOD:

• Data perturbation techniques are not applicable to semantically secure encrypted data.

• Data perturbation does not produce accurate data mining results, because statistical noise is added to the data.

• Such methods do not address the access pattern issue, which is a crucial privacy requirement from the user’s perspective.

• The Paillier cryptosystem relies on the problem of computing n-th residue classes, which is believed to be computationally difficult.

• Improving the efficiency of the SMIN protocol is an important first step towards improving the whole classifier. In KNN the SMIN protocol is less efficient, so KNN is less efficient than the fuzzy logic classifier.

3.5 PROPOSED SYSTEM:

Fuzzy logic is a form of many-valued logic in which the truth value of variables may be any real number between 0 and 1. By contrast, in Boolean logic, the truth values of variables may only be 0 or 1. Fuzzy logic has been extended to handle the concept of partial truth, where the truth value may range between completely true and completely false. Furthermore, when linguistic variables are used, these degrees may be managed by specific functions.

It is known that any Boolean logic function can be represented using a truth table mapping each set of variable values into the set of values $\{0, 1\}$. The task of synthesizing a Boolean logic function given in tabular form is one of the basic tasks in traditional logic, and it is solved via the disjunctive (conjunctive) perfect normal form.

Fuzzy logic and probability address different forms of uncertainty. While both fuzzy logic and probability theory can represent degrees of certain kinds of subjective belief, fuzzy set theory uses the concept of fuzzy set membership, i.e., how much a variable is in a set (there is not necessarily any uncertainty about this degree), and probability theory uses the concept of subjective probability, i.e., how probable it is that a variable is in a set (it either entirely is or entirely is not in the set in reality, but there is uncertainty around whether it is or is not). The technical consequence of this distinction is that fuzzy set theory relaxes the axioms of classical probability, which are themselves derived from adding uncertainty, but not degree, to the crisp true/false distinctions of classical Aristotelian logic.

Security Protocols:

Secure Minimum protocols:

Consider P1 with private input $(u', v')$ and P2 with $sk$, where $u' = ([u], E_{pk}(s_u))$ and $v' = ([v], E_{pk}(s_v))$. Here $s_u$ and $s_v$ denote the secrets corresponding to $u$ and $v$, respectively. The main goal of SMIN is to securely compute the encryptions of the individual bits of $\min(u, v)$, denoted by $[\min(u, v)]$. Here $[u] = \langle E_{pk}(u_1), \ldots, E_{pk}(u_l) \rangle$ and $[v] = \langle E_{pk}(v_1), \ldots, E_{pk}(v_l) \rangle$, where $u_1$ (resp., $v_1$) and $u_l$ (resp., $v_l$) are the most and least significant bits of $u$ (resp., $v$), respectively. In addition, they compute $E_{pk}(s_{\min(u,v)})$, the encryption of the secret corresponding to the minimum value between $u$ and $v$. At the end of SMIN, the output $([\min(u, v)], E_{pk}(s_{\min(u,v)}))$ is known only to P1. We assume that $0 \le u, v < 2^{l}$ and propose a novel SMIN protocol. Our solution to SMIN is mainly motivated by the work of [24].

In addition, depending on the functionality $F$, which P1 chooses at random, P1 computes the encryption of the randomized difference between $s_u$ and $s_v$ and stores it in $\delta$. Specifically, if $F : u > v$, then $\delta = E_{pk}(s_v - s_u + \bar{r})$. Otherwise, $\delta = E_{pk}(s_u - s_v + \bar{r})$, where $\bar{r} \in_R \mathbb{Z}_N$. After this, P1 permutes the encrypted vectors $\Gamma$ and $L$ using two random permutation functions $\pi_1$ and $\pi_2$. Specifically, P1 computes $\Gamma' = \pi_1(\Gamma)$ and $L' = \pi_2(L)$, and sends them along with $\delta$ to P2.

Upon receiving them, P2 decrypts $L'$ component-wise to get $M_i = D_{sk}(L'_i)$, for $1 \le i \le l$, and checks for an index $j$ such that $M_j = 1$. If such an index exists, P2 sets $\alpha$ to 1; otherwise it sets $\alpha$ to 0. In addition, P2 computes a new encrypted vector $M'$ depending on the value of $\alpha$. Precisely, if $\alpha = 0$, then $M'_i = E_{pk}(0)$, for $1 \le i \le l$. Here $E_{pk}(0)$ is different for each $i$. On the other hand, when $\alpha = 1$, P2 sets $M'_i$ to the re-randomized value of $\Gamma'_i$. That is, $M'_i = \Gamma'_i \cdot r^{N}$, where the term $r^{N}$ comes from re-randomization and $r \in_R \mathbb{Z}_N$ should be different for each $i$. Similar conclusions can be drawn for $s_{\min(u,v)}$.

Communications security is the discipline of preventing unauthorized interceptors from accessing telecommunications in an intelligible form, while still delivering content to the intended recipients. In the United States Department of Defense culture, it is often referred to by the abbreviation COMSEC. Protocol minimality can be expressed in several dimensions: the number of messages, the size of each individual message, and the number of cryptographic operations. Thus, a minimum of four distinct fields (or quantities) must be communicated. In addition, the protocol must also communicate the new key from B to A. However, since we couple the new key with B's challenge $N_{ba}$, the protocol still makes do with four fields. Since it is not possible to reduce the size of either message 1 or message 3 (both messages containing only a single data block), the issue of minimality concerns only the size of message 2. However, the underlying protocol cannot simultaneously guarantee the integrity of the new key. In other words, key distribution is possible while authenticated key distribution is not.

The authentication token is meaningless without strong cryptographic features, and the same is true of the key distribution token. In other words, a reduction in the number of cryptographic operations leads either to compromised authentication or to compromised key distribution (key disclosure). To illustrate the last point, we consider two protocols similar to 2PKDP, where the second flows are:

Example A (2nd flow): $B \Rightarrow A:\ \overbrace{AUTH_{K_{ab}}(N_{ab}, K_{ba}, B)}^{N_{ba}},\ \overbrace{K_{ab} \oplus K_{ba}}^{MASK}$

Example B (2nd flow): $B \Rightarrow A:\ \overbrace{AUTH_{K_{ab}}(N_{ab}, K_{ba}, B)}^{N_{ba}},\ \overbrace{MAC(N_{a}) \oplus K_{ba}}^{MASK}$

In the first example, the MASK expression requires no extra cryptographic computations.

Similarly, $\Gamma'_i$ and $L'_i$ are computationally indistinguishable from $s'_{1,i}$ and $s'_{3,i}$, respectively. Also, as $\bar{r}$ and $\hat{r}_i$ are randomly generated from $\mathbb{Z}_N$, $s + \bar{r} \bmod N$ and $\mu_i + \hat{r}_i \bmod N$ are computationally indistinguishable from $r^{*}$ and $s'_{2,i}$, respectively. Furthermore, because the functionality is randomly chosen by P1 (at step 1(a) of Algorithm 1), $\alpha$ is either 0 or 1 with equal probability. Thus, $\alpha$ is computationally indistinguishable from $\alpha'$. Combining all these results together, we can conclude that $\Pi_{P_2}(\mathrm{SMIN})$ is computationally indistinguishable from $\Pi^{S}_{P_2}(\mathrm{SMIN})$ based on Definition 1.

Secure Frequency

P1, holding the private input $(\langle E_{pk}(c_1), \ldots, E_{pk}(c_w) \rangle, \langle E_{pk}(c'_1), \ldots, E_{pk}(c'_k) \rangle)$, and P2 securely compute the encryption of the frequency of $c_j$, denoted by $f(c_j)$, in the list $\langle c'_1, \ldots, c'_k \rangle$, for $1 \le j \le w$. Here we explicitly assume that the $c_j$'s are unique and $c'_i \in \{c_1, \ldots, c_w\}$, for $1 \le i \le k$. The output $\langle E_{pk}(f(c_1)), \ldots, E_{pk}(f(c_w)) \rangle$ will be known only to P1. During the SF protocol, no information regarding $c'_i$, $c_j$, and $f(c_j)$ is revealed to P1 and P2, for $1 \le i \le k$ and $1 \le j \le w$.
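To make the functionality of the Secure Frequency protocol concrete, the sketch below computes $f(c_j)$ for each class label in plaintext. It only shows what the protocol outputs, not how it is computed securely: in SF every value shown here would be handled in encrypted form. The function name `class_frequencies` and the sample labels are assumptions made for this example.

```python
def class_frequencies(classes, labels):
    """Return [f(c_1), ..., f(c_w)]: how often each class occurs in labels.

    classes must be unique, and every label must be one of the classes,
    mirroring the assumptions stated for the SF protocol.
    """
    if len(set(classes)) != len(classes):
        raise ValueError("class labels must be unique")
    counts = {c: 0 for c in classes}
    for label in labels:
        if label not in counts:
            raise ValueError("label %r is not a known class" % (label,))
        counts[label] += 1
    return [counts[c] for c in classes]


# w = 4 classes and k = 5 labels (hypothetical values).
classes = ["c1", "c2", "c3", "c4"]
labels = ["c2", "c1", "c2", "c3", "c2"]
print(class_frequencies(classes, labels))
```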

Secure Minimum protocol for n numbers:

Consider P1 with private input $([d_1], \ldots, [d_n])$ along with their encrypted secrets and P2 with $sk$, where $0 \le d_i < 2^{l}$ and $[d_i] = \langle E_{pk}(d_{i,1}), \ldots, E_{pk}(d_{i,l}) \rangle$, for $1 \le i \le n$. Here the secret of $d_i$ is denoted by $E_{pk}(s_{d_i})$, for $1 \le i \le n$. The main goal of the SMINn protocol is to compute $[\min(d_1, \ldots, d_n)] = [d_{\min}]$ without revealing any information about the $d_i$'s to P1 and P2. In addition, they compute the encryption of the secret corresponding to the global minimum, denoted by $E_{pk}(s_{d_{\min}})$. Here we construct a new SMINn protocol by utilizing SMIN as the building block.

The proposed SMINn protocol is an iterative approach and it computes the desired output in a hierarchical fashion. In each iteration, the minimum between a pair of values and the secret corresponding to the minimum value are computed (in encrypted form) and fed as input to the next iteration, thus generating a binary execution tree in a bottom-up fashion. At the end, only P1 knows the final result $[d_{\min}]$ and $E_{pk}(s_{d_{\min}})$. The overall steps involved in the proposed SMINn protocol are highlighted in Algorithm 2.

Initially, P1 assigns $[d_i]$ and $E_{pk}(s_{d_i})$ to a temporary vector $[d'_i]$ and variable $s'_i$, for $1 \le i \le n$, respectively. Also, P1 creates a global variable $num$ and initializes it to $n$, where $num$ represents the number of (non-zero) vectors involved in each iteration. Since the SMINn protocol executes in a binary tree hierarchy (bottom-up fashion), we have $\lceil \log_2 n \rceil$ iterations, and in each iteration the number of vectors involved varies.

In the first iteration (i.e., $i = 1$), P1 with private input $(([d'_{2j-1}], s'_{2j-1}), ([d'_{2j}], s'_{2j}))$ and P2 with $sk$ engage in the SMIN protocol, for $1 \le j \le \lfloor num/2 \rfloor$. At the end of the first iteration, only P1 knows $[\min(d'_{2j-1}, d'_{2j})]$ and $s'_{\min(d'_{2j-1}, d'_{2j})}$, and nothing is revealed to P2, for $1 \le j \le \lfloor num/2 \rfloor$. Also, P1 stores the result $[\min(d'_{2j-1}, d'_{2j})]$ and $s'_{\min(d'_{2j-1}, d'_{2j})}$ in $[d'_{2j-1}]$ and $s'_{2j-1}$, respectively. In addition, P1 updates the values of $[d'_{2j}]$, $s'_{2j}$ to 0 and $num$ to $\lceil num/2 \rceil$, respectively.

During the $i$-th iteration, only the non-zero vectors (along with the corresponding encrypted secrets) are involved in SMIN, for $2 \le i \le \lceil \log_2 n \rceil$. For example, during the second iteration (i.e., $i = 2$), only $([d'_1], s'_1)$, $([d'_3], s'_3)$, and so on are involved. Note that in each iteration, the output is revealed only to P1 and $num$ is updated to $\lceil num/2 \rceil$. At the end of SMINn, P1 assigns the final encrypted binary vector of the global minimum value, i.e., $[\min(d_1, \ldots, d_n)]$, which is stored in $[d'_1]$, to $[d_{\min}]$. Also, P1 assigns $s'_1$ to $E_{pk}(s_{d_{\min}})$.
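The bottom-up binary tree that SMINn follows can be sketched on plaintext values as below. This is only an illustration of the pairing schedule and the $\lceil \log_2 n \rceil$ iteration count: the built-in `min` stands in for one run of the two-party SMIN protocol, no encryption or secrets are modelled, and the function name `smin_n_plain` and the stride-based indexing are assumptions made here rather than the notation of Algorithm 2.

```python
def smin_n_plain(values):
    """Return (minimum, iterations) using a pairwise tournament.

    In each iteration the surviving entries are compared in pairs and the
    smaller one is stored in the left slot, as P1 does with [d'_{2j-1}].
    """
    d = list(values)
    n = len(d)
    if n == 0:
        raise ValueError("need at least one value")
    iterations = 0
    stride = 1  # distance between the two entries compared in this iteration
    while stride < n:
        for left in range(0, n - stride, 2 * stride):
            # Stand-in for one SMIN invocation between P1 and P2.
            d[left] = min(d[left], d[left + stride])
        stride *= 2
        iterations += 1
    return d[0], iterations


print(smin_n_plain([7, 3, 9, 1, 5]))
```

With five inputs the loop runs three times, matching $\lceil \log_2 5 \rceil = 3$; an unpaired entry in an odd-sized round is simply carried forward to the next iteration.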

FUZZY-LOGIC STEPS

Fuzzification: determines an input's membership in overlapping sets.

Rules: determine outputs based on inputs and rules.

Combination/Defuzzification: combines all fuzzy actions into a single fuzzy action for executable system output.

STEP 1: (FUZZIFICATION)

The first step is to convert the input data into fuzzy values. The first input value is the level of project staffing; the second is the level of project funding.

Each input is converted into a fuzzy set of the form

Fuzzy set = [T F I];

where T = logical true

F = logical false

I = intermediate value

STEP 2: (RULES)

The IF-THEN rules are applied based on the input.

Rule 1: Rules containing disjunctions (OR) are evaluated using the UNION operator. An alternative way of computing the disjunction is via the algebraic sum.

Rule 2: Conjunctions in fuzzy rules are evaluated using the INTERSECTION operator. Alternatively, the same rule can be evaluated using multiplication.

Rule 3: This rule is evaluated using both union and intersection.

STEP3: (DEFUZZIFICATION)

Defuzzification can be performed in several different ways. The most popular method is the centroid method.

Centroid method

Calculates the center of gravity of the area under the curve.

Mean of maximum

Assuming there is a plateau at the maximum value of the final function, takes the mean of the values it spans.

Smallest value of maximum

Assuming there is a plateau at the maximum value of the final function, takes the smallest of the values it spans.

Largest value of maximum

Assuming there is a plateau at the maximum value of the final function, takes the largest of the values it spans.
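The four defuzzification methods can be compared on a sampled output membership function. The sketch below assumes the final function is given as discrete samples (`xs` for the output values, `mu` for their membership degrees), so the centroid is a discrete approximation of the center of gravity; the function names and the sample data are assumptions made for this example.

```python
def centroid(xs, mu):
    """Discrete center of gravity: sum(x * mu) / sum(mu)."""
    total = sum(mu)
    if total == 0:
        raise ValueError("membership function is zero everywhere")
    return sum(x * m for x, m in zip(xs, mu)) / total


def _plateau(xs, mu):
    """Output values at which the membership degree reaches its maximum."""
    peak = max(mu)
    return [x for x, m in zip(xs, mu) if m == peak]


def mean_of_maximum(xs, mu):
    plateau = _plateau(xs, mu)
    return sum(plateau) / len(plateau)


def smallest_of_maximum(xs, mu):
    return min(_plateau(xs, mu))


def largest_of_maximum(xs, mu):
    return max(_plateau(xs, mu))


# Hypothetical aggregated output with a plateau over x = 2 and x = 3.
xs = [0, 1, 2, 3, 4]
mu = [0.0, 0.5, 1.0, 1.0, 0.5]
print(centroid(xs, mu), mean_of_maximum(xs, mu))
print(smallest_of_maximum(xs, mu), largest_of_maximum(xs, mu))
```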

FLC Steps

In general, fuzzy logic control involves three main steps:

1) Fuzzification, which converts the quantitative inputs into natural language variables,

2) Rule evaluation, which implements the control heuristics, and

3) Defuzzification, which maps the qualitative rule outcomes to a numerical output.

FUZZIFICATION

The first step in the FLC is fuzzification, which preprocesses the inputs to the controller. Fuzzification translates each numerical input into a set of fuzzy classes, also known as linguistic variables.

For the local occupancy and local speed, the fuzzy classes used are very small (VS), small (S), medium (M), big (B), and very big (VB). The degree of activation indicates how true that class is on a scale of 0 to 1. The fuzzy class for advance queue occupancy looks identical to that shown for queue occupancy. For each input at each location, the dynamic range, distribution, and shape of these fuzzy classes can be tuned. In other words, one way of modifying the behavior of the controller is to redefine the linguistic variables.
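A minimal sketch of fuzzification into the five classes VS, S, M, B and VB is given below. The text only says that the range, distribution and shape of the classes can be tuned, so the triangular shape, the evenly spaced peaks on a normalized 0 to 1 input, and the names `triangular`, `CLASSES` and `fuzzify` are all assumptions chosen for illustration.

```python
def triangular(x, left, peak, right):
    """Degree of activation (0 to 1) of a triangular fuzzy class at x."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed class definitions as (left foot, peak, right foot).
CLASSES = {
    "VS": (-0.25, 0.00, 0.25),
    "S": (0.00, 0.25, 0.50),
    "M": (0.25, 0.50, 0.75),
    "B": (0.50, 0.75, 1.00),
    "VB": (0.75, 1.00, 1.25),
}


def fuzzify(x):
    """Map one normalized numerical input to its degree in every class."""
    return {name: triangular(x, *params) for name, params in CLASSES.items()}


print(fuzzify(0.375))
```

An input of 0.375 lies halfway between the peaks of S and M, so both classes are partly active while the other three are not; shifting the peaks in `CLASSES` is the kind of retuning the paragraph above refers to.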

RULE EVALUATION

After fuzzification, the rule base is evaluated. The rules are a set of if-then statements similar to the heuristics an operator would use to control the system. For a given premise, a fuzzy class of metering rate is specified, either VS, S, M, B, or VB.

Rules 1 through 3 specify a fuzzy metering class given the local mainline occupancy. These rules are similar to the heuristics of the Local Metering Algorithm. While all other rules have a minimum rule weight of zero, these local rules have a minimum rule weight of 0.1 (the software checks for this). The reason for barring zero local rule weights is to prevent the possibility of no rules activating within the rule base. The result of no active rules would be undefined, so the controller is not permitted to operate in that input space. The intersection of these two conditions is implemented as the minimum of the membership degrees, which becomes the rule outcome.

DEFUZZIFICATION

The last step in the FLC is to produce a numerical metering rate given all of the rule outcomes. This reverse process, from a fuzzy state to a crisp or quantitative state, is known as defuzzification.

ADVANTAGES OF FUZZY CLASSIFIER

• Users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks to the cloud, where the data is held in encrypted form.

• The encrypted data is more confidential with this classification.

• This algorithm can be used for multi-class users.

• The efficiency of the fuzzy classifier is higher.

• The proposed protocol protects the confidentiality of the data and the privacy of the user’s input query, and hides the data access patterns.

MODULES FOR FUZZY CLASSIFIER

Module 1: Encryption using Knapsack algorithm

1. This encryption algorithm encrypts the whole dataset.

2. The algorithm generates the secret key according to the dataset size.

3. The confidentiality of the dataset is ensured by this algorithm.

4. The ciphertext is generated according to the length of the message.
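The module does not say which knapsack scheme is used, so the sketch below assumes the classic Merkle-Hellman construction and toy parameters, purely to illustrate how a knapsack cryptosystem encrypts a block of bits. The superincreasing sequence `w`, modulus `q`, multiplier `r` and all function names are assumptions introduced for this example; it is not presented as the encryption module of the proposed system. The modular inverse via `pow(r, -1, q)` needs Python 3.8 or later.

```python
import math


def keygen(w, q, r):
    """Derive the public key from a superincreasing private sequence w."""
    if any(w[i] <= sum(w[:i]) for i in range(len(w))):
        raise ValueError("w must be superincreasing")
    if q <= sum(w) or math.gcd(r, q) != 1:
        raise ValueError("need q > sum(w) and gcd(r, q) = 1")
    public = [(r * wi) % q for wi in w]
    return public, (w, q, r)


def encrypt_bits(public, bits):
    """Ciphertext is the sum of the public weights selected by the 1 bits."""
    return sum(b * k for b, k in zip(bits, public))


def decrypt_bits(private, c):
    """Undo the modular multiplication, then solve the easy knapsack greedily."""
    w, q, r = private
    s = (c * pow(r, -1, q)) % q
    bits = []
    for wi in reversed(w):
        if s >= wi:
            bits.append(1)
            s -= wi
        else:
            bits.append(0)
    return bits[::-1]


# Toy parameters (assumed): an 8-element key encrypts one byte at a time.
public, private = keygen([2, 7, 11, 21, 42, 89, 180, 354], q=881, r=588)
message_bits = [0, 1, 1, 0, 0, 0, 0, 1]
cipher = encrypt_bits(public, message_bits)
print(decrypt_bits(private, cipher) == message_bits)
```

Each block of the message yields one integer, so the length of the ciphertext grows with the length of the message, in line with item 4 above.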

Module 2: Security protocols

1. The main goal of the security protocol is to securely compute the encryptions of the individual bits of the minimum value.

2. It also computes the encryption of the secret corresponding to the minimum value between the original data and the encrypted data [9].

3. The key is known only to the sender.

4. The encrypted data is kept secure by the secure protocol.

Module 3: Classify required data using the fuzzy classifier

• The fuzzy classifier is used to classify the data required by the user, including large amounts of data.

• The fuzzy classifier steps are applied as follows:

  • Fuzzification: determines an input's membership in overlapping sets.

  • Rules: determine outputs based on inputs and rules.

  • Combination/Defuzzification: combines all fuzzy actions into a single fuzzy action for executable system output.

• The first step is fuzzification, which involves logical 0, logical 1 and intermediate values.

• The second step is the fuzzy rules, which assign a performance level such as high, medium or low to the data sets.

• The final step is defuzzification. The secured data is classified according to the rules of the fuzzy logic.

Module 4: Decryption algorithm

1. The encrypted data is decrypted using the secret key.

2. After the user-required data has been classified, the encrypted data can be decrypted only by a person who knows the secret key.

5. RESULT ANALYSIS

5.1 WORKING ENVIRONMENT

MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or FORTRAN.

MATLAB is a high-level language and interactive environment that enables users to perform computationally intensive tasks faster than with traditional programming languages such as C, C++ and Fortran [3].

It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation.

Typical uses include:

• Math and computation

• Algorithm development

• Modeling, simulation, and prototyping

• Data analysis, exploration, and visualization

• Scientific and engineering graphics

• Application development, including graphical user interface building

MATLAB has several advantages over other methods or languages:

• Its basic data element is the matrix. A simple integer is considered a matrix of one row and one column. Several mathematical operations that work on arrays or matrices are built in to the MATLAB environment, for example cross-products, dot-products, determinants, and inverse matrices.

• Vectorized operations. Adding two arrays together needs only one command, instead of a for or while loop.

• The graphical output is optimized for interaction. You can plot your data very easily, and then change colors, sizes, scales, etc., by using the graphical interactive tools.

• MATLAB’s functionality can be greatly expanded by the addition of toolboxes. These are sets of specific functions that provide more specialized functionality. Example: Excel Link allows data to be written in a format recognized by Excel, and the Statistics Toolbox allows more specialized statistical manipulation of data (ANOVA, basic fits, etc.) [6].

5.2 DATA SETS

Dataset and Experimental Setup

For our experiments, we used the Car Evaluation dataset from the UCI KDD archive [34]. It consists of 1,728 records (i.e., n = 1,728) and six attributes (i.e., m = 6). Also, there is a separate class attribute and the dataset is categorized into four different classes (i.e., w = 4).

6. CONCLUSION AND SCOPE FOR FUTURE WORK

To protect user privacy, various classification techniques have been proposed over the past decade. The existing techniques are not applicable to outsourced database environments where the data resides in encrypted form on a third-party server. This paper proposed a novel fuzzy classification protocol over encrypted data in the cloud. Our protocol protects the confidentiality of the data and of the user’s input query, and hides the data access patterns. We also evaluated the performance of our protocol under different parameter settings. Since improving the efficiency is an important first step for improving the performance of our fuzzy protocol, the time complexity of the proposed method could be reduced using KD-trees or other hashing techniques. Such efforts are left for future work.

REFERENCES

[1] [AB, 14], “Solving the Problem of the K Parameter in the KNN Classifier Using an Ensemble Learning Approach”, 2014.

[2] [MH, 14], “Efficient Interactive Brain Tumor Segmentation As Within-Brain Knn Classification”, 2014.

[3] [EM+, 02], “Feature Selection for High-Dimensional Genomic Microarray Data”, 2002.

[4] [BC, 14], “‘No Free Lunch’ Theorems Applied to the Calibration of Traffic Simulation Models”, 2014.

[5] [NM+, 08], “Automatic Classification of Asymptomatic and Osteoarthritis Knee Gait Patterns Using Kinematic Data Features and the Nearest Neighbor Classifier”, 2008.

[6] [VC, 01], “Another Move Toward the Minimum Consistent Subset: A Tabu Search Approach to the Condensed Nearest Neighbor Rule”, 2001.

[7] [KS, 15], “k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data”, 2015.

[8] T. M. Cover and P. E. Hart, “Nearest Neighbor Pattern Classification”.

[9] Y. Hamamoto, S. Uchimura, and S. Tomita, “A Bootstrap Technique for Nearest Neighbor Classifier Design”.

[10] E. Alpaydin, “Voting Over Multiple Condensed Nearest Neighbors”.

[11] K. Q. Weinberger and L. K. Saul, “Distance Metric Learning for Large Margin Nearest Neighbor Classification”.

[14] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, “KNN Model-Based Approach in Classification”.