International Journal of Computer Applications (0975 – 8887) Volume 77– No.4, September 2013
Using Fuzzy Center Mean (in general any) Clustering
Methods to Construct Fuzzy Classifier Tuned to do
Classification
G.K.Vikram
Department of Electrical Engg. Indian Institute of Technology,
Roorkee, India
ABSTRACT
This paper presents about using
FuzzyCentreMean[1][4][5][7][8] (in general any clustering method) to construct fuzzy classifier tuned to do classification. Clustering methods, in general try to form clusters of data in such a way that a huge chunk of data is reduced to its representative elements(sets).The different clustering methods are like different points of view of the same data[1][4][5][7][8]. For Fuzzy classifier decision making/logic is imparted by ‘Rules of inferences’ framed by expert human pertaining to data considered. This paper provides a way to construct the rules of inference without the need of humanly intervention but by interpreting data centers of the clusters[1][3][6][11] .The above case is mainly important in almost lossless data compression, in reconstruction of an entire image/video from the damaged copy and in areas of classification of data points into appropriate classes(similar to classes designed by humans manually after interpreting the data).The case study here, is on fisher Iris data set[9].
General Terms
Fuzzy classifier algorithms, Data classification, Data compression, Data reconstruction, Data cluster.
Keywords
fuzzy, fuzzy clasifier,cluster,rules of inference, data point, data center,membership,mamdani type, fuzzy center mean, Fuzzy classifier algorithms
1.
INTRODUCTION
The driving philosophy of bringing out relationship between clustering techniques and design of fuzzy classifier, rather “Rules of inference” in specific is just in support for the way “Great Neural Network”(Humans) work.
People learn things by “ Association”(Associative learning); and, so they learn classifying similar with dissimilar.[2]
Also, one important point to be noted is that when same information is given to many individuals each individual may develop different interpretation and hence different view-point.[2] The different clustering methods are like different points of view for the same data[1][2][4][5]. So, choice of clustering methods to meet particular design ends of controller also matters greatly. And, writing about “Association”, there is a classic theorem called “Sampling theorem” which says that exact reconstruction of continuous signal from discrete signal points sampled in time is possible .But for this to happen the instants of sampling should be right(as defined by the conditions of the above said theorem)[5]. According, to my interpretation, these sampled time instants can be
considered as “Associating-links” between data points such that we are able to construct(learn) the entire function(of continuous kind).So, if data compression is considered equivalent to sampling and (exact) data reconstruction is considered equivalent to interpolation. Then knowledge about distribution of data point around each data center(found from clustering methods) or I can call it noise around data centers. These data centers are like “associative-links” similar to sampling time instants, which might help in associating each data center and by “fuzzying” around them we might reconstruct entire data without any loss.( i.e., equivalent to interpolation or finding values in between sampled signals). I have devised my own two algorithms—one on formulating “rules” based on (primitive/in general, may use advanced) statistical properties of data (like mean, cluster center, relative distance).And the second algorithm focuses on forming what I call “group(class)-feature matrix” based on the results of clustering methods which is made use in devising “Rules of inferences” for Fuzzy controllers/classifiers(most critical/difficult part in design of this controllers). An algorithmic approach is followed to tackle this task of tuning of fuzzy controller/classifier, also in situations where I found scope for generalizing, I took the liberty to do so. And in other situations I would choose to limit my interpretation rather study to “fisheriris” data set[4] .
2.
Fuzzy Classifier
This classifier has its foundations in fuzzy logic and mamdani fuzzy logic controller.[1][3][11]
2.1
Fuzzy Logic and Mamdani fuzzy logic
controller
International Journal of Computer Applications (0975 – 8887) Volume 77– No.4, September 2013
3. Begin:
1)Arrange data set in a matrix of this form as shown below—
A=
No. Feature1 Feature n Output1 Output k
(Datapoint)1
Dimensions of this matrix=(no. Of datapoints * (no. Of features +no. Of outputs));
If outputs are absent then –
A=
Dimensions of this matrix=(no. Of datapoints * no. Of features );
2)Every classification problem specifies no. And type of groups present.
Form a matrix such that each row of which should stand/labelled for group/class and each column should stand/labelled as features involved.
Each cell of the matrix should contain a (statistical) quantity which represent each group’s each feature’s value.
In this case,for initialising this matrix(G) ,I took for each cell “mean” for group say (indexed) “i” and across datavalues for a feature say, “j”.So, this matrix contains mean values derived from training set.
G=
Group/class no.
Feature1 Feature j Feature n
G1
Gi
Gm
where,indexes m=no. Of clusters/groups/classes as specified in classification problem
n=no. Of features as specified in classification problem.
3)Apply FCM method( self-organisng maps may be made use of or other clustering methods it outputs information about all “centers”,membership values of each datapoint with respect to
C=
Centre Feature 1 Feature n Output 1 Output k
C1
Cz
If “A-Matrix” has no output columns then-
C=
Centre Feature 1 Feature 2 Feature 3 Feature n
C1
Cz
Where,C=contains centers of clusters and each exist in real number space of dimensionality=(no. Of centers * no. Of features);
4) Replace each cell value of group-feature matrix (G), by the values in “center” matrix such that the most matching value in each column of “center” matrix with respect to each cell value in G.
:For each cell of “G-matrix”
Let me denote cell value in G for ith row jth column as G(i,j)
Now, search in column jth of “C-matrix” the most nearest matching/approximate
Value and replace the value in “G-matrix” for ” G(i,j)” by value from “C-matrix” found by doing as above.
Repeat
3.1 Algorithm to update “G-matrix” and
“Ou-matrix” from values in “C-“Ou-matrix”.
For each column/feature of “G-matrix”:indexed as “i”
For each row/groups/cluster/class of “G- matrix”:indexed as “j”
For each feature of “C-matrix”:indexed as “m”
a(m)= absolutevalue(C(m,i)-G(j,i));
end;
minf=min(a(1:m));
gp_index=findindex(a(1:m)==minf);
G(j,i)=C(gp_index,i); No. Feature1 Feature 2 Feature 3 Feature n
International Journal of Computer Applications (0975 – 8887) Volume 77– No.4, September 2013
comment—“Ou –matrix” is the matrix which holds the values for ouputs corresponding to each set of features for particular group as given for “G-matrix”.
Result--Each cell of “G-matrix” is updated with the co-ordinate values of appropriate “centers” in such a way that value at each cell of “G-matrix” will represent a value describing the “center of attraction” for datapoint for particular group(say i) with respect to particular feature(say, j).
4.“Algorithmic way to devise Rules of
Inference for Fuzzy-classifier based on
calculated statistical property value,which
is indicated from the co-ordinates of
“centers” in space Rno.of feature “
4.1 Design of Mamdani type fuzzy classifier
->
Take each feature of data and set it as inputs to the fuzzy classifier.
->Set type of fuzzy classifier as “Mamdani type”.
->Set Universe of for each feature
--Calculate Min and Max values from the training set for each feature.
--For each feature
Set range as [Min Max]
Or set range as [Min-somenumber between(0,1) *(Max-Min) ,Max + somenumber between(0,1) *(Max-Min)]; %so as be cautionary for future test data set and hence not to get out of bounds error.
->For each feature take as many membership functions as there are no. Of groups.
->For each feature’s membership function set the type of MF’s as either trimf,gaussf,gauss2f (depending on which gives better result under empirical study over training data set).
->set mean-value of each Membership Function’s with that corresponding value in “G-matrix” for feature and for output MF’s use values from “Ou-matrix”.
->For each output/input
Calculate distribution information of data in space for group by either adopting other neural techniques or by askng others so, as to know rough estimate on
randomly selected large training set. (say,in case of fisheriris dataset each group has 33.3% share of data as given in data folder of this data set in “UCI” database)
Take gaussmf(gauss2mf/trimf depending on satisfactory empirical results on training sets) to be in proportion to each of groups rough estimate of share of data with respect to range of ‘Universe of discourse’[1][11] of each feature/output.
[ consider, for a given dataset has associated 3-classes(A,B,C)
Class A contains “a”no. of datapoints;class B contains “b” no. ofdatapoints and Class C contains “C” no. Of datapoints . Let range of any feature/output be say,”M”
Then,spread for each class say,
Sigma(A) for a particular feature/output=
a/m *(range of Universe of discourse);
];
Set no. Of Membership functions of each output also same as no. Of group
4.2 Devise Rules of Inference
---no. Of rules =no. Of clusterers=no. Of groups;
->Take structure of “G-matrix” and “Ou-matrix” as reference.
(consider now only for single ouput case;you can generalise on your own time)
->For each row/group of “G-matrix”
Construct Rule in the format as shown below Consider “x “ be the vector of input to the fuzzy controller.
Let vector x=[x(f1),x(f2)...x(f no.of features)];
Rule 1)--If x(f1) is gp1 and x(f2) is gp1 and x(f3) is gp1... then o is gp1;
Rule i)--If x(f1) is gpi and x(f2) is gpi and x(f3) is gpi... then o is gpi;
Now, fuzzy controller is ready to be used for classification.
International Journal of Computer Applications (0975 – 8887) Volume 77– No.4, September 2013
Below are the screenshots of FIS GUI editor window(Fuzzy classifier) (MATLAB software) created after implementation of above discussed algorithms algorithms in MATLAB. In Rules editor GUI window where input values can be given at
[image:5.595.54.282.148.554.2]the lower left side and the result is seen over top-right side of this window.The result matches very closely to the values corresponding to each input.
[image:5.595.319.544.149.338.2]Fig 2: Fuzzy classifier(FIS Editor)
Fig 3: Membership functions plots of first input
Fig 4: Membership functions plots of second input
[image:5.595.321.543.381.569.2]International Journal of Computer Applications (0975 – 8887) Volume 77– No.4, September 2013
Fig 6: Membership functions plots of fourth input
.
5.
Conclusion
Tuning fuzzy classifier by above studied approach has lot of potential in classifying data points based on their inherent associative pattern. In addition, I strongly believe that using clustering methods with the above implementation of two algorithms, could help in finding distributive nature of signals and hence will enable estimation of proper time instants for sampling data signal and finally reconstruct exact continuous signals. Also,using Fuzzy controllers tuned with such clustering methods prove an helping hand in further providing knowledge of experience similar to that gained by ”Great neural Network”(expert human). Hence, this above studied approach might open scope of using regression problems(say,reconstruction of entire continuos signal from discrete samples at discrete times) using fuzzy classifier is ample
6.
ACKNOWLEDGMENTS
I would like to thank my father Dr.G.L.Kakhandki,H.O.D of botony K.C.P Science College ,Bijapur,Karnataka for teaching me biological aspects of associative learning and some of different ways a data can be interpreted/misinterpreted through practical psychological viewpoint.I would like to thank my mother Mrs.Kumudvati for teaching me technique of brainstorming to produce fresh and out-of-box ideas/algorithms and also all thanks to my sister Ms.G.K.Shwetha who taught me that in business and in scientific enquiry practicability of ideas is vital for progress and most importantly helped regain faith in myself during
rehabilitation process of my father who is presently recovering from half-body paralysis .Also,I would like to thank Dr. Gopinath Pillai, Department. Of Electrical Engineering, Indian Institute of Technology, Roorkee, for encouraging me to contiue on my project. Moreover I would like to thank Institute Computer Cente, Indian Institute of Technology, Roorkee for providing me the software MATLAB without which the project would not have been possible .
7.
REFERENCES
[1] Timothy J.Ross 2010 Fuzzy Logic With Engineering Applications, John Wiley & Sons,Wiley India edition(3rd),90 p.-390p.
[2] G.L.Kakhandki,2011,biological & psychological lessons/concepts on associative learning[personal interaction]
[3] Robert Babuska and Ebrahim Mamdani 2008 “Fuzzy control”,ScholarPedia,[Online],Available:
http://www.scholarpedia.org/article/Fuzzy_control
[4] Wikipedia,(2013,Mar),”FuzzyClustering”,[Online],Avail able: http://en.wikipedia.org/wiki/Fuzzy_clustering
[5] Miyamoto, Sadaaki and Ichihashi,Hidetomo.” Algorithms for Fuzzy Clustering: Methods in c-Means Clustering with Applications”. Springer; 2008 edition
[6] P.Prandoni,M.Veterli(2008).Signal processing for Communications,[Online].pp263-266,Available: http://www.sp4comm.org/docs/sp4comm.pdf.
[7] Havens,T.C. and Bezdek,J.C.” Fuzzy c-Means Algorithms for Very Large Data”.IEEE Press,November 2012
[8] Parker, J.K.; Hall, L.O.; Bezdek, J.C. "Comparison of scalable fuzzy clustering methods", Fuzzy Systems (FUZZ-IEEE), 2012 IEEE International Conference on, On page(s): 1 - 9
[9] MATLAB (2011),” ’Fuzzy Logic Toolbox’,Help-documents’[Online].Available from: http://radio.feld.cvut.cz/matlab/toolbox/fuzzy/fuzzyint.ht
ml to
http://radio.feld.cvut.cz/matlab/toolbox/fuzzy/fuzzyt44.ht ml
[10] R.A.Fisher(1936).”Iris data set.”[Online].Available: http://archive.ics.uci.edu/ml/datasets/Iris