
Building Java Intelligent Applications

Data Mining for Java Type-2 Fuzzy Inference Systems

Manuel Castañón-Puga¹, Josué Miguel Flores-Parra¹, Juan Ramón Castro¹, Carelia Gaxiola-Pacheco¹, and Luis Enrique Palafox-Maestre¹

¹ Autonomous University of Baja California, Tijuana, Baja California, México

{puga, mflores31, jrcastror, cgaxiola, lepalafox}@uabc.edu.mx

Abstract

This paper introduces JT2FISClustering, a data mining extension for JT2FIS. JT2FIS is a Java class library for building intelligent applications. The extension extracts information from a data set and transforms it into an Interval Type-2 Fuzzy Inference System for use in Java applications. Mamdani and Takagi-Sugeno Fuzzy Inference Systems can be generated using the fuzzy c-means or subtractive data mining methods. We compare the outputs and performance of Matlab® with those of Java in order to validate the proposed extension.

Keywords: Data mining, Interval Type-2 Fuzzy Inference System, Java Intelligent Applications

1 Introduction

In recent years, new information technologies have helped in handling large amounts of data. Data mining is a mature technology for extracting and representing the knowledge implicit in stored data.

Additionally, data mining supports the tactical and strategic decisions of power users, because it lets them measure actions and results. It generates descriptive models to explore and understand the data, and it identifies the patterns, relationships, and dependencies that affect the final results. It can also create predictive models [6] that reveal previously undiscovered relationships and help explore possible business rules [10].

JT2FIS¹ is a class library developed for Java applications [4]. Its main purpose is to give Java developers an object-oriented library for building interval Type-2 fuzzy inference systems (FIS).

Most FISs in use to date are based on a Type-1 model [13], but more recently a Type-2 model has been developed and applications are being extended with it [14]. This line of work led to the General Type-2 Fuzzy Inference Model [7], developed as a next step toward Fuzzy Inference Systems with greater capability to model real-world phenomena [5].

¹ http://kiliwa.tij.uabc.mx/projects/jt2fis

Volume 51, 2015, Pages 2719–2728. ICCS 2015 International Conference On Computational Science. Selection and peer-review under responsibility of the Scientific Programme Committee of ICCS 2015. © The Authors. Published by Elsevier B.V.

The purpose of this paper is to introduce JT2FISClustering, a Java data mining extension for JT2FIS. The JT2FIS architecture, design, implementation, and examples for Java programmers are described in detail in [4].

This paper uses the JT2FIS class library as a core Type-2 Fuzzy Inference System for the implementation of JT2FISClustering data mining methods.

1.1 Mamdani and Takagi-Sugeno Fuzzy Logic System

Mamdani and Takagi-Sugeno Fuzzy Logic Systems are popular FIS approaches. Both are formed by IF-THEN rules with the same antecedent structure; the difference between them lies in the consequent. The consequent of a Mamdani rule is a fuzzy set, while the consequent of a Takagi-Sugeno rule is a function, so Takagi-Sugeno uses fewer fuzzy rules than Mamdani to represent a real system.

The Takagi-Sugeno Fuzzy Inference System was proposed in an effort to develop a systematic approach for generating fuzzy rules from a given input-output data set. The model consists of rules with fuzzy antecedents and a mathematical function in the consequent part. The antecedents divide the input space into a set of fuzzy regions, while the consequents describe the behavior of the system in those regions [9].
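To make the role of a functional consequent concrete, the following is a minimal sketch of first-order Takagi-Sugeno inference for a single input. It is a Type-1 simplification for illustration only; the class name SugenoSketch, the rule encoding {mean, sigma, a, b}, and the use of Gaussian antecedents are assumptions and are not part of JT2FIS.

```java
/** Illustrative Type-1, single-input Takagi-Sugeno inference (not JT2FIS code). */
class SugenoSketch {

    /** Gaussian membership grade of x for the given mean and standard deviation. */
    static double gauss(double x, double mean, double sigma) {
        double z = (x - mean) / sigma;
        return Math.exp(-0.5 * z * z);
    }

    /**
     * Each rule is {mean, sigma, a, b} and reads:
     * IF x is Gauss(mean, sigma) THEN y = a * x + b.
     * The crisp output is the firing-strength-weighted average of the rule outputs.
     */
    static double infer(double x, double[][] rules) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (double[] r : rules) {
            double w = gauss(x, r[0], r[1]);       // firing strength of the antecedent
            weightedSum += w * (r[2] * x + r[3]);  // linear consequent evaluated at x
            weightTotal += w;
        }
        return weightedSum / weightTotal;
    }
}
```

Each rule contributes a local linear model, and the antecedent grades blend those models across the input space; a Mamdani rule would instead contribute a fuzzy set that still has to be defuzzified.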

The Mamdani and Takagi-Sugeno process is divided into four parts: fuzzifier, rule base, fuzzy inference engine, and output processor. In type-2, a type reducer is needed in the output processor to derive a type-1 set from the type-2 set [3]. Figure 1 shows a block diagram of the classic structure of a Mamdani and Takagi-Sugeno Fuzzy Logic System.

Figure 1: Type-2 fuzzy inference system block diagram.

1.2 Fuzzy C-Means clustering algorithm

Fuzzy approaches hold a significant place in data mining because they provide intelligible results. One of these approaches is the Fuzzy C-Means clustering algorithm (FCM) [1, 2], which iteratively applies a membership function and a centroid computation to find the best centroids.

FCM is one of the most popular clustering algorithms, and its effectiveness relies on the distance measure. FCM combines the c-means approach with the handling of fuzzy data. This combination is effective because it takes into account the uncertainty present in the data, which avoids incorrect results and yields correct crisp partitions [2]. Additionally, FCM is used to obtain adequate values for the clustering parameters [12].
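The iteration described above can be sketched as follows. This is a minimal, self-contained illustration of the standard FCM update rules with Euclidean distance, not the JT2FISClustering implementation; the class name FcmSketch, the fuzzifier parameter m, the random initialisation with java.util.Random, and the convergence tolerance are all assumptions.

```java
import java.util.Arrays;
import java.util.Random;

/** Minimal fuzzy c-means sketch (illustrative; not the JT2FISClustering code). */
class FcmSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double sqDist(double[] a, double[] b) {
        double s = 0.0;
        for (int k = 0; k < a.length; k++) {
            double d = a[k] - b[k];
            s += d * d;
        }
        return s;
    }

    /**
     * Runs FCM and returns the c cluster centres.
     * data: n points; m: fuzzifier (> 1); maxIter: iteration cap; seed: RNG seed.
     */
    static double[][] fit(double[][] data, int c, double m, int maxIter, long seed) {
        int n = data.length;
        int dim = data[0].length;
        Random rng = new Random(seed);
        double[][] u = new double[c][n];
        // Random initial memberships, normalised so each point's grades sum to 1.
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int i = 0; i < c; i++) { u[i][j] = rng.nextDouble() + 1e-9; sum += u[i][j]; }
            for (int i = 0; i < c; i++) u[i][j] /= sum;
        }
        double[][] centres = new double[c][dim];
        for (int it = 0; it < maxIter; it++) {
            // Centre update: mean of the data weighted by u^m.
            for (int i = 0; i < c; i++) {
                Arrays.fill(centres[i], 0.0);
                double wSum = 0.0;
                for (int j = 0; j < n; j++) {
                    double w = Math.pow(u[i][j], m);
                    wSum += w;
                    for (int k = 0; k < dim; k++) centres[i][k] += w * data[j][k];
                }
                for (int k = 0; k < dim; k++) centres[i][k] /= wSum;
            }
            // Membership update: u_ij = 1 / sum_l (d_ij / d_lj)^(2/(m-1)).
            double change = 0.0;
            for (int j = 0; j < n; j++) {
                for (int i = 0; i < c; i++) {
                    double dij = Math.max(sqDist(data[j], centres[i]), 1e-12);
                    double denom = 0.0;
                    for (int l = 0; l < c; l++) {
                        double dlj = Math.max(sqDist(data[j], centres[l]), 1e-12);
                        // Distances are squared, so the exponent is 1/(m-1).
                        denom += Math.pow(dij / dlj, 1.0 / (m - 1.0));
                    }
                    double updated = 1.0 / denom;
                    change = Math.max(change, Math.abs(updated - u[i][j]));
                    u[i][j] = updated;
                }
            }
            if (change < 1e-6) break; // memberships have stabilised
        }
        return centres;
    }
}
```

Because the distance appears in both updates, changing the distance measure changes the partition, which is why the effectiveness of the method depends on it.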

1.3 Subtractive clustering algorithm

Subtractive Clustering operates by finding the optimal data point to define a cluster centre based on the density of surrounding data points. It reduces the computational complexities and gives a better distribution of cluster centres in comparison with other clustering algorithms [9].

This method considers each data point as a potential centre and, based on mathematical approximations, calculates the best choice of centre. Each cluster centre can be seen as a fuzzy rule of the system, and the identified group represents the antecedent of this rule. The potential of a data point is estimated from its distance to all other data points [11].

Identifying a Takagi-Sugeno model by clustering involves forming clusters in the data space and translating these clusters into Takagi-Sugeno rules such that the resulting model is close to the system to be identified.
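The potential measure and centre selection described above can be sketched as follows. This is a simplified illustration in the spirit of the classic subtractive clustering formulation, not the JT2FISClustering or Matlab implementation: the class name SubtractiveSketch, the factor 1.5 for the subtraction neighbourhood, the single stopRatio stopping rule, and the requirement that the data is already scaled to [0, 1] per dimension are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified subtractive clustering sketch (illustrative only). */
class SubtractiveSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double sqDist(double[] a, double[] b) {
        double s = 0.0;
        for (int k = 0; k < a.length; k++) {
            double d = a[k] - b[k];
            s += d * d;
        }
        return s;
    }

    /**
     * data: points assumed to be scaled to [0, 1] in every dimension.
     * radius: range of influence of a cluster centre.
     * stopRatio: stop once the best remaining potential drops below
     *            stopRatio times the potential of the first centre.
     */
    static List<double[]> findCentres(double[][] data, double radius, double stopRatio) {
        int n = data.length;
        double alpha = 4.0 / (radius * radius);
        double rb = 1.5 * radius;            // neighbourhood used when subtracting potential
        double beta = 4.0 / (rb * rb);

        // Potential of each point: a density estimate from all pairwise distances.
        double[] potential = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                potential[i] += Math.exp(-alpha * sqDist(data[i], data[j]));
            }
        }

        List<double[]> centres = new ArrayList<>();
        double firstPotential = -1.0;
        while (centres.size() < n) {
            int best = 0;
            for (int i = 1; i < n; i++) {
                if (potential[i] > potential[best]) best = i;
            }
            double p = potential[best];
            if (firstPotential < 0.0) {
                firstPotential = p;
            } else if (p < stopRatio * firstPotential) {
                break;                       // remaining points are not dense enough
            }
            centres.add(data[best].clone());
            // Reduce the potential of points near the new centre.
            for (int i = 0; i < n; i++) {
                potential[i] -= p * Math.exp(-beta * sqDist(data[i], data[best]));
            }
        }
        return centres;
    }
}
```

Unlike fuzzy c-means, the number of clusters is not given in advance: it follows from the radius and the stopping rule, and each returned centre would correspond to one rule of the generated system.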

2 JT2FISCLUSTERING DATA MINING EXTENSION FOR JT2FIS

JT2FISClustering is a class library developed in Java. Its main purpose is to mine large data sets in order to set up a Type-2 Fuzzy Inference System automatically, applying the object-oriented programming paradigm and relying on the JT2FIS [4] class library.

JT2FISClustering is structured in packages, each containing a collection of classes. The content and organization of these packages are shown in Table 1.

Table 1: Classes contained in the JT2FISClustering library.

Package JT2FISClustering
  Package Clustering: Class Cluster, Class FuzzyCMeans, Class Subtractive
  Package Generate: Class GenerateSugenoFis, Class GenerateMamdaniFis
  Package util: Class MersenneTwister, Class Utilities

Figure 2 shows the UML package diagram of the JT2FISClustering class library. The classes are organized in code packages that encapsulate their behavior.

Figure 2: UML package diagram of JT2FISClustering.

One advantage of this library is that it inherits the capabilities of the object-oriented programming paradigm, which makes it possible to integrate new clustering methods. Figure 3 shows the abstract class Clustering, which allows us to extend the library with new clustering methods.

Figure 3: Types of Clustering Methods.
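The extension mechanism can be illustrated with a small sketch. The paper does not list the members of the abstract class Clustering, so the types below are hypothetical stand-ins: ClusteringMethodSketch, findCentres, countRules, and MeanCentreSketch are invented for illustration and are not the JT2FISClustering API.

```java
/** Hypothetical base type for clustering methods; NOT the JT2FISClustering Clustering class. */
abstract class ClusteringMethodSketch {

    /** Each concrete method decides how cluster centres are found. */
    abstract double[][] findCentres(double[][] data);

    /** Shared behaviour: one rule per cluster centre. */
    int countRules(double[][] data) {
        return findCentres(data).length;
    }
}

/** Toy clustering method that returns a single centre at the mean of the data. */
class MeanCentreSketch extends ClusteringMethodSketch {
    @Override
    double[][] findCentres(double[][] data) {
        int dim = data[0].length;
        double[] mean = new double[dim];
        for (double[] row : data) {
            for (int k = 0; k < dim; k++) {
                mean[k] += row[k] / data.length;
            }
        }
        return new double[][] { mean };
    }
}
```

With a design of this kind, code that generates a FIS depends only on the abstract type, so a new clustering method can be added as a subclass without changing the generator.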

The current version of JT2FISClustering has three different membership functions (see Table 2). Fuzzy c-Means clustering is the default method.

Table 2: Membership Functions of JT2FISClustering.

Type of Clustering           Type-2 Membership Function                          Params
Fuzzy c-Means, Subtractive   GaussCutMemberFunction                              [inputs outputs uncertainty]
Fuzzy c-Means                GaussUncertaintyMeanMemberFunction                  [inputs outputs]
Fuzzy c-Means, Subtractive   GaussUncertaintyMeanMemberFunction                  [inputs outputs uncertainty]
Fuzzy c-Means                GaussUncertaintyStandardDesviationMemberFunction    [inputs outputs]
Fuzzy c-Means, Subtractive   GaussUncertaintyStandardDesviationMemberFunction    [inputs outputs uncertainty]
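As background for the membership functions in Table 2, the following sketch shows the usual textbook form of an interval Type-2 Gaussian membership function with an uncertain mean, where the mean lies in [m1, m2] and the standard deviation is fixed. It is not the JT2FIS class of similar name; the class name GaussUncertainMeanSketch is hypothetical, and how the library's uncertainty parameter maps onto m1 and m2 is not stated in the paper.

```java
/** Interval Type-2 Gaussian membership function with uncertain mean (illustrative only). */
class GaussUncertainMeanSketch {
    final double sigma; // fixed standard deviation
    final double m1;    // lower bound of the mean
    final double m2;    // upper bound of the mean

    GaussUncertainMeanSketch(double sigma, double m1, double m2) {
        if (sigma <= 0.0 || m1 > m2) {
            throw new IllegalArgumentException("require sigma > 0 and m1 <= m2");
        }
        this.sigma = sigma;
        this.m1 = m1;
        this.m2 = m2;
    }

    /** Plain Gaussian grade of x for the given mean and standard deviation. */
    static double gauss(double x, double mean, double sigma) {
        double z = (x - mean) / sigma;
        return Math.exp(-0.5 * z * z);
    }

    /** Upper membership grade: flat at 1 between the two means. */
    double upper(double x) {
        if (x < m1) return gauss(x, m1, sigma);
        if (x > m2) return gauss(x, m2, sigma);
        return 1.0;
    }

    /** Lower membership grade: the smaller of the two shifted Gaussians. */
    double lower(double x) {
        return x <= (m1 + m2) / 2.0 ? gauss(x, m2, sigma) : gauss(x, m1, sigma);
    }
}
```

The region between lower(x) and upper(x) is the footprint of uncertainty; when m1 equals m2 the two grades coincide and the function reduces to an ordinary Type-1 Gaussian.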


2.1 JT2FIS Fuzzy C-Means Data Mining Operation

To show how to use JT2FISClustering to create a FIS with the Fuzzy C-Means method, we follow these steps:

1. Create a new instance of the class FuzzyCMeans.
2. Create a new instance of the class GenerateMamdaniFis.
3. Create a list with the data set for inputs and outputs.
4. Generate the FIS, selecting the required membership function.

2.1.1 Creating a new instance of the class FuzzyCMeans.

First, we create a new instance of the class FuzzyCMeans and define the number of clusters to generate. Listing 1 shows a Java code example.

Listing 1: Creating a New Instance FuzzyCMeans

// Number of clusters required.
int numberCluster = 3;
FuzzyCMeans fcm = new FuzzyCMeans(numberCluster);

2.1.2 Creating a new instance of the class GenerateMamdaniFis

To build a FIS, we must create a new instance of the class GenerateMamdaniFis. This instance receives an object as a parameter that represents the type of clustering method you want to use. Listing 2 shows a Java code example.

Listing 2: Creating a New Instance GenerateMamdaniFis

// fcm is an instance of the class FuzzyCMeans.
GenerateMamdaniFis gMamFis = new GenerateMamdaniFis(fcm);

2.1.3 Data sets for inputs and outputs

A data set (ArrayList) that contains all data for each of the inputs and outputs must be used to build the FIS. For this particular example, we use a data set with two inputs and two outputs.
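The paper does not state the element type of the ArrayList that the library expects, so the following is only a sketch of one way to build such lists. The List<double[]> layout (one row per sample), the class name DatasetSketch, and the helper randomSamples are assumptions; java.util.Random is used here instead of the library's MersenneTwister class so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Builds illustrative random data sets; the list layout is an assumption. */
class DatasetSketch {

    /** Returns `samples` rows with `variables` values each, uniform in [0, 1). */
    static List<double[]> randomSamples(int samples, int variables, long seed) {
        Random rng = new Random(seed);
        List<double[]> list = new ArrayList<>(samples);
        for (int i = 0; i < samples; i++) {
            double[] row = new double[variables];
            for (int k = 0; k < variables; k++) {
                row[k] = rng.nextDouble();
            }
            list.add(row);
        }
        return list;
    }
}
```

For the two-input, two-output example, lists could be created as `List<double[]> inputList = DatasetSketch.randomSamples(100, 2, 1L);` and `List<double[]> outputList = DatasetSketch.randomSamples(100, 2, 2L);`, then adapted to whatever layout the library actually requires.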

2.1.4 Generate FIS using Fuzzy C-Means

Once the input and output data sets have been created, the FIS can be generated. Listing 3 shows how to generate a FIS using the “GaussUncertaintyMeanMemberFunction” membership function.

Listing 3: Generating a Mamdani-Type FIS

// Variable to establish uncertainty
double uncertainty = 0.8;
// Call the method that generates a Mamdani FIS with the
// "GaussUncertaintyMeanMemberFunction" membership function, passing the chosen uncertainty.
Mamdani fis = gMamFis.generateMamdaniFisGaussUncertaintyMeanMemberFunction(inputList, outputList, uncertainty);
System.out.println(fis.toString());

You can use different membership function options to obtain the desired FIS. The generated FIS object provides the full functionality of a Mamdani or Takagi-Sugeno Fuzzy Logic System and can be used in Java code to build JT2FIS intelligent applications.

2.2 JT2FIS Subtractive Data Mining Operation

To show how to use JT2FISClustering to create a FIS with the Subtractive method, we follow these steps:

1. Create a new instance of the class Subtractive.
2. Create a new instance of the class GenerateSugenoFis.
3. Create a list with the data set for inputs and outputs.
4. Generate the FIS, selecting the required membership function.

2.2.1 Creating a new instance of the class Subtractive.

First, we create a new instance of the class Subtractive and define the range of influence of the clusters. Listing 4 shows a Java code example.

The radii parameter specifies the range of influence of the cluster centre for each input and output dimension.

Listing 4: Creating a New Instance Subtractive

// Range of influence of the clusters
double radii = 0.5;
Subtractive sbt = new Subtractive(radii);

2.2.2 Creating a new instance of the class GenerateSugenoFis

To build a FIS, we must create a new instance of the class GenerateSugenoFis. This instance receives an object as a parameter that represents the type of clustering method you want to use. Listing 5 shows a Java code example.

Listing 5: Creating a New Instance GenerateSugenoFis

// sbt is an instance of the class Subtractive.
GenerateSugenoFis gSugenoFis = new GenerateSugenoFis(sbt);

2.2.3 Data sets for inputs and outputs

A data set (ArrayList) that contains all data for each of the inputs and outputs must be used to build the FIS. For this particular example, we use a data set with two inputs and two outputs.

2.2.4 Generate FIS using Subtractive

Once the input and output data sets have been created, the FIS can be generated. Listing 6 shows how to generate a FIS using the “GaussUncertaintyMeanMemberFunction” membership function.

Listing 6: Generating a Sugeno-Type FIS

// Variable to establish uncertainty
double uncertainty = 0.9;
// Call the method that generates a Sugeno FIS with the
// "GaussUncertaintyMeanMemberFunction" membership function, passing the chosen uncertainty.
Sugeno fis = gSugenoFis.generateSugenoFisGaussUncertaintyMeanMemberFunction(inputList, outputList, uncertainty);
System.out.println(fis.toString());

You can use different membership function options to obtain the desired FIS. The generated FIS object provides the full functionality of a Mamdani or Takagi-Sugeno Fuzzy Logic System and can be used in Java code to build JT2FIS intelligent applications.

3 VALIDATING JT2FISCLUSTERING

The motivation behind JT2FIS and JT2FISClustering is not to substitute or displace the Matlab® libraries for analyzing data, but to provide a Java library for building object-oriented Java applications.

To validate the proposed library, we compared JT2FISClustering against the methods provided by Matlab®, genfis3 and genfis2. We expected JT2FISClustering to generate the same outputs as the Matlab® libraries, which would validate its accuracy, and we also compared performance. First, we generated a random data set using the Mersenne Twister pseudo-random number generation method [8] and used it as inputs and outputs in both JT2FISClustering and Matlab®.

3.1 JT2FIS Fuzzy C-Means Data Mining Test

For this first test case, we used two inputs, two outputs, and three rules (the number of clusters) for data sets of 100, 1,000, and 10,000 records. We selected the Gaussian membership function with uncertain mean (“GaussUncertaintyMeanMemberFunction”) for inputs and outputs.

Table 3 shows the time that each method took to generate the inference system. In every case, the JT2FISClustering library was at least as fast as the genfis3 method provided by Matlab®.

3.2 JT2FIS Subtractive Data Mining Test

For this second test case, we used two inputs and two outputs for data sets of 100, 1,000, and 10,000 records. We selected the Gaussian membership function with uncertain mean (“GaussUncertaintyMeanMemberFunction”) for inputs and outputs, and the cluster influence range was 0.9.

Table 4 shows the time that each method took to generate the inference system. In every case, the JT2FISClustering library was at least as fast as the genfis2 method provided by Matlab®.

Table 3: Comparing times (ms) of FuzzyCMeans in JT2FISClustering to genfis3 in Matlab.

Number of data points   JT2FISClustering FuzzyCMeans time (ms)   Matlab genfis3 time (ms)
100                     14.5                                     16.18
1000                    94.86                                    172.81
10000                   197.21                                   277.84

Table 4: Comparing times (ms) of Subtractive in JT2FISClustering to genfis2 in Matlab.

Number of data points   JT2FISClustering Subtractive time (ms)   Matlab genfis2 time (ms)
100                     81                                       90
1000                    270                                      280
10000                   4012                                     6492
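The paper does not describe how the times in Tables 3 and 4 were measured. As a hedged illustration of how such timings could be collected on the Java side, the sketch below runs a task several times after a warm-up phase and reports the median wall-clock time in milliseconds. The class name TimingSketch, the warm-up and run counts, and the use of the median are assumptions, not the authors' procedure.

```java
import java.util.Arrays;

/** Simple wall-clock timing helper (illustrative; not the authors' benchmark code). */
class TimingSketch {

    /**
     * Runs the task `warmup` times untimed, then `runs` times timed,
     * and returns the median elapsed time in milliseconds.
     */
    static double medianMillis(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run(); // lets the JIT compile the hot path before timing
        }
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1e6;
        }
        Arrays.sort(millis);
        return runs % 2 == 1
                ? millis[runs / 2]
                : (millis[runs / 2 - 1] + millis[runs / 2]) / 2.0;
    }
}
```

The task passed in would wrap the call that generates the FIS for one data set size. Warm-up matters on the JVM because the first executions include class loading and interpretation, which can dominate measurements in the range of tens of milliseconds.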

3.3 JT2FISClustering Versus Matlab®

In this case study, we used the cancer data set provided by Matlab®. This data set contains 9 inputs and 2 outputs. The inputs are a 9x699 matrix defining nine attributes of 699 biopsies, and the outputs are a 2x699 matrix where each column indicates the correct category with a one in either element 1 or element 2.

We compared the genfis2 and genfis3 methods of Matlab® with the JT2FISClustering methods using an uncertainty of 0. For genfis2 the radius was 0.3; for genfis3 the selected type was Mamdani and the number of clusters was 3. The results were equal in this case: the same number of rules, with the same parameter values for each membership function of the FIS. This is shown in Table 5.

Table 5: Comparing results of the JT2FISClustering methods to the Matlab methods.

Method          JT2FISClustering                          Matlab
Subtractive     2 rules                                   2 rules
                2 membership functions for each input     2 membership functions for each input
                2 membership functions for each output    2 membership functions for each output
Fuzzy C-Means   3 rules                                   3 rules
                3 membership functions for each input     3 membership functions for each input
                3 membership functions for each output    3 membership functions for each output

Figures 4 and 5 show the values of one input for each method in Matlab® and JT2FISClustering.

4 Conclusion and future work

In this paper, we introduced JT2FISClustering, a data mining extension for the JT2FIS class library. The proposed extension includes the basic fuzzy c-means and subtractive data mining methods to extract information from a data set and transform it into an Interval Type-2 Fuzzy Inference System. The methods help to configure Mamdani and Takagi-Sugeno FLSs in JT2FIS Java applications. First, we presented a brief description of the library. UML diagrams were included to show its components from a software point of view. Then, code examples were listed to help Java programmers understand how to use it. Finally, a set of performance tests was run to compare the data mining methods with Matlab.

Figure 4: Comparing an input of Subtractive in JT2FISClustering to genfis2 in Matlab.

Figure 5: Comparing an input of Fuzzy C-Means in JT2FISClustering to genfis3 in Matlab.

The tests compared the execution times (ms) of FuzzyCMeans in JT2FISClustering with genfis3 in Matlab, and of Subtractive with genfis2; in both cases JT2FISClustering proved to be as fast as Matlab.

As future work, we will continue improving performance and accuracy by applying more tests, and we plan to keep refactoring the library and adding more data mining methods for Fuzzy Logic Systems.

JT2FIS and JT2FISClustering are available for academic purposes at http://kiliwa.tij.uabc.mx

Acknowledgments

This work was supported in part by the Internal Fund for Research Projects (Grant No. 300.6.C.135.17) of the Autonomous University of Baja California, México.

References

[1] J. Bezdek. FCM: The fuzzy c-means clustering algorithm. Computers and Geosciences, 10(2–3):191–203, 1984.

[2] S. Bozkir and E. Sezer. FUAT - A fuzzy clustering analysis tool. Expert Systems with Applications, FUZZYSS11: 2nd International Fuzzy Systems Symposium, 40(3):842–849, 2013.

[3] Alvin Cai, Chai Quek, and Douglas L. Maskell. Type-2 GA-TSK fuzzy neural network. In IEEE, 2007.

[4] Manuel Castañón-Puga, Juan Ramón Castro, Josué Miguel Flores-Parra, Carelia Guadalupe Gaxiola-Pacheco, Luis Guillermo Martínez-Méndez, and Luis Enrique Palafox-Maestre. JT2FIS: A Java Type-2 Fuzzy Inference Systems Class Library for Building Object-Oriented Intelligent Applications. In Advances in Soft Computing and Its Applications, volume 8266 of Lecture Notes in Computer Science, pages 204–215, Berlin Heidelberg, 2013. Springer.

[5] Oscar Castillo, Patricia Melin, and Juan R. Castro. Computational intelligence software for interval type-2 fuzzy logic. Computer Applications in Engineering Education, 2010.

[6] J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2nd edition, 2006.

[7] L. A. Lucas. General Type-2 Fuzzy Inference Systems: Analysis, Design and Computational Aspects. Fuzzy Systems . . ., (5), 2007.

[8] Makoto Matsumoto and Takuji Nishimura. Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul., 8(1):3–30, January 1998.

[9] Qun Ren, Luc Baron, and Marek Balazinski. Type-2 Takagi-Sugeno-Kang fuzzy logic modeling using subtractive clustering. In IEEE, 2006.

[10] Two-Crows. Introduction to Data Mining and Knowledge Discovery. Two Crows Corporation, 1999.

[11] V. Vaidehi, S. Monica, Sheik Saafer S. Mohamed, M. Deepika, and S. Saangeetha. A prediction system based on fuzzy logic. In Proceedings of the World Congress on Engineering and Computer Science, WCECS 2008, 2008.

[12] X. Yin, L. Khoo, and Y. Chong. A fuzzy c-means based hybrid evolutionary approach to the clustering of supply chain. Computers and Industrial Engineering, 66(4):768–780, 2013.

[13] L. A. Zadeh. Fuzzy Sets. Information and Control, 8:338–353, 1965.

[14] L. A. Zadeh. Outline of a new approach to the analysis of complex systems and decision processes.
