A New Filled Function method for Smooth Clustering

Qing Wu

School of Automation, Xi’an Institute of Posts and Telecommunications, Xi’an, Shaanxi, 710121 P. R. China

Email: [email protected]

Lixing Yuan

School of Automation, Xi’an Institute of Posts and Telecommunications, Xi’an, Shaanxi, 710121 P. R. China

Email: [email protected]

Abstract—The mathematical modeling of the clustering centers problem leads to a min-sum-min formulation which has the significant characteristic of being strongly nondifferentiable. To overcome this difficulty, a new filled function method based on an entropy technique is proposed to find the centers of clusters. A completely differentiable non-convex optimization model for the clustering center problem is constructed. A parameter free filled function method is adopted to search for a global optimal solution of the optimization model. To illustrate both the reliability and the efficiency of the method, a set of computational experiments was performed. Numerical results illustrate that the proposed algorithm can effectively find the centers of clusters and, in particular, improves the accuracy of the clustering even with a relatively small entropy factor.

Index Terms—nondifferentiable, entropy function, cluster centers, global minimizer, filled function method

I. INTRODUCTION

Clustering techniques have received attention in many areas [1-3], such as engineering, medicine, biology, data mining, information retrieval and document extraction. Cluster analysis deals with the classification of a set of patterns or observations, in general represented as points in a multidimensional space, into clusters, following two basic and simultaneous objectives: patterns in the same cluster must be similar to one another (homogeneity objective) and different from patterns of other clusters [4]. The goal of clustering is to reduce the amount of data by categorizing or grouping similar data items together.

Clustering methods can be divided into two basic types: hierarchical and partition clustering. Within each of the types there exists a wealth of subtypes and different algorithms for finding the clusters.

Hierarchical clustering proceeds successively by either merging smaller clusters into larger ones, or by splitting larger clusters. The clustering methods differ in the rule by which it is decided which two small clusters are merged or which large cluster is split. The end result of the algorithm is a tree of clusters called a dendrogram, which shows how the clusters are related. By cutting the dendrogram at a desired level a clustering of the data items into disjoint groups is obtained.

Partition clustering, on the other hand, attempts to directly decompose the data set into a set of disjoint clusters. The criterion function that the clustering algorithm tries to minimize may emphasize the local structure of the data, as by assigning clusters to peaks in the probability density function, or the global structure. Typically the global criteria involve minimizing some measure of dissimilarity in the samples within each cluster, while maximizing the dissimilarity of different clusters.

The mathematical model of clustering is a global optimization problem. Therefore different algorithms of mathematical programming can be applied to solve this problem. A review of these algorithms can be found in [5]; they include dynamic programming, branch and bound, cutting planes and k-means algorithms.

As mentioned in [6], branch and bound algorithms are effective when the database contains only hundreds of records and the number of clusters is not large (less than 5) [5]. Different heuristics can be used for solving large clustering problems, and k-means is one such algorithm. Different versions of this algorithm have been studied by many authors [7]. It is a very fast algorithm and is suitable for solving clustering problems in large data sets. k-means gives good results when there are few clusters but deteriorates when there are many [5]. The algorithm achieves a local minimum of the problem; however, results of numerical experiments presented in [8] show that the best clustering found with k-means may be more than 50% worse than the best known one. Much better results have been obtained with metaheuristics, such as simulated annealing, tabu search and genetic algorithms [9]. Simulated annealing approaches to clustering have been studied in [10]. The application of tabu search methods to clustering problems has also been studied.

Manuscript received January 1, 2011; revised June 1, 2011; accepted July 1, 2011.

An approach to cluster analysis problems based on bilinear programming techniques has been described in [11]. The paper [12] describes a global optimization approach to clustering and demonstrates how the supervised data classification problem can be solved via clustering. The objective function in this problem is both nonsmooth and nonconvex, and it has a large number of local minimizers. Problems of this type are quite challenging for general-purpose global optimization techniques: due to the large number of variables and the complexity of the objective function, these techniques, as a rule, fail to solve such problems, and they are highly time-consuming on many clustering problems. It is therefore very important to develop clustering algorithms based on optimization techniques that compute “deep” local minimizers of the objective function. The clustering algorithm proposed and studied in this paper is of this type and is based on nonsmooth optimization techniques. The algorithm calculates clusters step by step, gradually increasing the number of data clusters until termination conditions are met; that is, it allows one to calculate as many clusters as a data set contains with respect to some tolerance.

A cluster can be identified by its center (or centroid), so the core of a clustering problem is a center problem. The clustering centers problem is a non-smooth non-convex problem. A smooth function [2] with one parameter has been used to approximate the non-smooth term of the clustering problem. However, further analysis shows that this method has the following deficiency [13,14]: when the parameter $p$ increases, numerical overflow occurs. To overcome this drawback, an adjustable entropy function is proposed in this paper to approximate the non-smooth term of the clustering problem. The clustering centers problem is thereby transformed into a smooth non-convex problem. By using the adjustable entropy function, we can find an optimal solution with a relatively small parameter $p$, which avoids the numerical overflow of the traditional maximum entropy function method.

Filled function methods [15-19] are reported to converge rapidly and can often find a solution with high precision. In this paper, we propose a new parameter free filled function method to search for a global optimal solution of the smooth non-convex cluster centers problem. The filled function includes neither exponential terms nor logarithmic terms, which the authors regard as an advantage over the traditional ones. Because it is parameter free, the method also avoids the difficulty of choosing parameters.

The rest of this paper is organized as follows: In Section 2, the mathematical models of the smooth clustering center problem are constructed. Section 3 proposes a new filled function without parameters and analyzes the properties of the filled function. An algorithm is given for solving the clustering center problem in Section 4. In the last section, the authors present numerical experiments and results.

II. THE SMOOTH CLUSTERING CENTERS PROBLEM

In cluster analysis, suppose we are given a set $x = (x^1, x^2, \ldots, x^q)$, where $x^i$ is the n-dimensional center of the i-th cluster with $x^i \in R^n$, $i = 1, \ldots, q$. Then the clustering centers problem can be described as follows

$$\min f(x^1, x^2, \ldots, x^q) = \frac{1}{m} \sum_{i=1}^{m} \min_{j=1,\ldots,q} \|x^j - a_i\|^2 \quad \text{s.t.} \quad x = (x^1, x^2, \ldots, x^q) \in R^{n \times q}, \qquad (1)$$

where $a_i \in R^n$, $i = 1, \ldots, m$, are the data points and $f(x)$ is the cluster function. The optimization problem (1) is non-convex and non-smooth. It is difficult to find the global minimum with traditional gradient-based optimization methods.

A smooth function [2] with one parameter has been used to approximate the non-smooth term of the clustering problem. However, as $p$ increases, the overflow problem occurs.

Now we consider the minimax problem $\min_{x \in R^k} \max_{1 \le i \le l} f_i(x)$, where $f_i : R^k \to R^1$ is a continuously differentiable function, $l$ is a positive integer and $l \ge 2$. The maximum entropy function of $f_i(x)$, $i = 1, \ldots, l$, is defined as follows

$$F_p(x) = \frac{1}{p} \ln \sum_{i=1}^{l} \exp(p f_i(x)), \qquad (2)$$

where the positive parameter $p$ is a real number. The maximum entropy function $F_p(x)$ can be employed to approximate the maximum function $F(x) = \max_{1 \le i \le l} f_i(x)$. The solution to $\min_{x \in R^k} F_p(x)$ cannot approximate the optimal solution to the minimax problem $\min_{x \in R^k} \max_{1 \le i \le l} f_i(x)$ until the entropy factor $p$ is sufficiently large. However, as $p$ increases, the overflow problem arises in the same way.

To overcome the above drawback, an adjustable entropy function with a smooth penalization technique is proposed in this paper to approximate the optimal solution to the minimax problem $\min_{x \in R^k} \max_{1 \le i \le l} f_i(x)$.
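The overflow mentioned above can be reproduced in a few lines. The following is a minimal sketch, assuming hypothetical values $f_i(x)$ at a fixed $x$; the function names are illustrative. The "shifted" variant uses the standard log-sum-exp rearrangement, which is a different remedy from the adjustable entropy function proposed in this paper and is shown only for comparison.

```python
import math

def max_entropy_naive(values, p):
    """F_p(x) = (1/p) * ln(sum_i exp(p * f_i(x))), evaluated literally."""
    return math.log(sum(math.exp(p * v) for v in values)) / p

def max_entropy_shifted(values, p):
    """Same quantity with the largest value factored out before exponentiating."""
    top = max(values)
    return top + math.log(sum(math.exp(p * (v - top)) for v in values)) / p

values = [1.0, 2.5, 3.0]  # hypothetical f_i(x) at one fixed x

# Gap F_p - max_i f_i for growing p; it shrinks roughly like ln(l) / p.
gaps = {p: max_entropy_naive(values, p) - max(values) for p in (1.0, 10.0, 100.0)}

# With a large entropy factor the literal formula leaves the double range.
try:
    max_entropy_naive(values, 1000.0)  # needs exp(3000)
    overflowed = False
except OverflowError:
    overflowed = True
```

The gap values illustrate why a large $p$ is needed for accuracy with formula (2), and the caught exception shows why a large $p$ is numerically dangerous when the formula is evaluated directly.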

Suppose $h_i : R^k \to R^1$ is a continuously differentiable function, $l$ is a positive integer and $l \ge 2$. Let $H(x) = \max\{h_1(x), h_2(x), \ldots, h_l(x)\}$, then $H(x)$ is a non-smooth function on $R^k$.

Let $\Lambda = \{\mu \in R^l \mid \mu_i \ge 0,\ i = 1, \ldots, l,\ \sum_{i=1}^{l} \mu_i = 1\}$. The function

$$F_{(p,\mu)}(x) = \frac{1}{p} \ln \sum_{i=1}^{l} \mu_i \exp(p h_i(x)) \qquad (3)$$

is called the adjustable entropy function of $h_i(x)$, where $p > 0$, $\mu \in \Lambda$. Let

$$I(x) = \{ i \mid h_i(x) = H(x),\ i = 1, 2, \ldots, l \}, \qquad \Lambda_I = \{ \mu \in \Lambda \mid \mu_i = 0,\ i \notin I(x) \}.$$

Theorem 2.1. For $\mu \in \Lambda$ and $p > 0$, the inequality

$$\frac{\ln \mu_{\max}}{p} \le F_{(p,\mu)}(x) - H(x) \le 0$$

holds, where $\mu_{\max} = \max\{\mu_i,\ i \in I(x)\}$, and if $p \to \infty$ then the function $F_{(p,\mu)}(x)$ is uniformly convergent to $H(x)$.

Proof. It is not difficult to obtain

$$F_{(p,\mu)}(x) \le \frac{1}{p} \ln \sum_{i=1}^{l} \mu_i \exp(p H(x)) = H(x).$$

For any $i \in I(x)$, we have $h_i(x) = H(x)$, then

$$F_{(p,\mu)}(x) \ge \frac{1}{p} \ln\big(\mu_i \exp(p H(x))\big) = H(x) + \frac{1}{p} \ln \mu_i.$$

Thus, we have

$$F_{(p,\mu)}(x) \ge H(x) + \frac{\ln \max_{i \in I(x)} \mu_i}{p}.$$

As the entropy factor $p$ goes to infinity, $F_{(p,\mu)}(x)$ is uniformly convergent to $H(x)$.

Theorem 2.2. For $x \in R^k$ and $p > 0$, we have $F_{(p,\mu)}(x) = H(x)$ if and only if $\mu \in \Lambda_I$.

The above two theorems show that $F_{(p,\mu)}(x)$ can approach $H(x)$ quickly if we adjust the parameters $p$ and $\mu$ at the same time. This technique can overcome the overflow problem.
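The two theorems can be checked numerically. The sketch below assumes hypothetical values $h_i(x)$ at a fixed $x$ and hypothetical weight vectors; it evaluates the adjustable entropy function with the largest $h_i$ factored out (an implementation convenience, not part of the paper) and compares the result with the bound of Theorem 2.1 and the equality case of Theorem 2.2.

```python
import math

def adjustable_entropy(h, mu, p):
    """F_(p,mu)(x) = (1/p) ln sum_i mu_i exp(p h_i(x)) for fixed values h_i(x)."""
    top = max(h)
    s = sum(m * math.exp(p * (v - top)) for m, v in zip(mu, h))
    return top + math.log(s) / p

h = [0.2, 1.0, 1.0, -0.5]           # hypothetical h_i(x); H(x) = 1.0
H = max(h)
active = [i for i, v in enumerate(h) if v == H]   # index set I(x)

# Theorem 2.1: ln(mu_max)/p <= F - H <= 0 for a generic weight vector.
mu = [0.1, 0.6, 0.2, 0.1]
mu_max = max(mu[i] for i in active)
gaps = {p: adjustable_entropy(h, mu, p) - H for p in (0.5, 2.0, 20.0)}

# Theorem 2.2: weights supported on I(x) give F = H even for a small p.
mu_active = [0.0, 0.75, 0.25, 0.0]
exact = adjustable_entropy(h, mu_active, 0.5)
```

The second case is the point of adjusting $\mu$ together with $p$: once the weights concentrate on the active indices, the approximation is exact without a large entropy factor.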

Obviously, by the above definition of the adjustable entropy function, and since $\min_j f_j(x) = -\max_j(-f_j(x))$, we can approximate (1) with the following unconstrained differentiable optimization problem

$$\min f(x^1, x^2, \ldots, x^q) = -\frac{1}{m} \sum_{i=1}^{m} \frac{1}{p} \ln \sum_{j=1}^{q} \mu_j \exp(-p f_j(x)) \quad \text{s.t.} \quad x = (x^1, x^2, \ldots, x^q) \in R^{n \times q}, \qquad (4)$$

where $f_j(x) = \|x^j - a_i\|^2$. $f(x)$ is non-convex and infinitely differentiable with respect to $x$. With more and more relevant research, the filled function method has become a promising approach to global optimization.
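To make the relation between the min-sum-min objective and its smooth surrogate concrete, here is a minimal sketch. It assumes the surrogate takes the form $-\frac{1}{p}\ln\sum_j \mu_j \exp(-p\|x^j - a_i\|^2)$ averaged over the data, uses a tiny hypothetical data set, and factors out the smallest distance before exponentiating for numerical convenience.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def cluster_function(centers, data):
    """Nonsmooth objective: mean squared distance to the nearest center."""
    return sum(min(sq_dist(c, a) for c in centers) for a in data) / len(data)

def smooth_cluster_function(centers, data, p, mu):
    """Smooth surrogate: mean of -(1/p) ln sum_j mu_j exp(-p * d_j)."""
    total = 0.0
    for a in data:
        d = [sq_dist(c, a) for c in centers]
        low = min(d)
        s = sum(m * math.exp(-p * (dj - low)) for m, dj in zip(mu, d))
        total += low - math.log(s) / p
    return total / len(data)

data = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0), (5.0, 4.0)]   # hypothetical points
centers = [(0.0, 0.5), (4.5, 4.0)]                         # hypothetical centers
uniform = [0.5, 0.5]

exact = cluster_function(centers, data)
approx = {p: smooth_cluster_function(centers, data, p, uniform) for p in (0.5, 5.0)}
```

With uniform weights the surrogate lies above the nonsmooth value by at most $\ln(q)/p$, which follows from Theorem 2.1 applied to $h_j = -f_j$.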

III. NEW FILLED FUNCTION METHOD

A number of filled functions [15-18] have been described recently, most of which have one or two adjustable parameters. However, there is no efficient criterion for choosing these parameters. In this paper, a filled function without parameters is proposed.

A. Preliminaries

Consider the following global optimization problem

$$\min f(x) \quad \text{s.t.} \quad x \in X, \qquad (5)$$

where $f(x) : R^n \to R$, $X \subset R^n$, and $X$ is a closed bounded domain containing all global minimizers of $f(x)$ in its interior.

The main idea of the filled function method is to construct an auxiliary function, i.e., the filled function $P(x, x_1^*)$, with a maximizer $x_1^*$ which is a known minimizer of the objective function $f(x)$. The function $P(x, x_1^*)$ does not have minimizers or saddle points in any basin of a higher minimizer of $f(x)$, but it has a minimizer or a saddle point $x'$ in a basin of a lower minimizer of $f(x)$. Then, one can use $x'$ as the initial point to minimize $f(x)$ again, thus finding a lower minimizer $x_2^*$ of $f(x)$, i.e., $f(x_2^*) < f(x_1^*)$. Using $x_2^*$ in place of $x_1^*$, one can construct a new filled function and then find a lower minimizer of $f(x)$ in the same way. Repeating the above process, one can finally find the global minimizer $x^*$ of $f(x)$.

Suppose $f(x)$ is a continuous function on $R^n$.

Definition 3.1. Let $x_1, x_2 \in R^n$, $x_1 \ne x_2$. The segment $\overline{x_1 x_2}$ is called a descent (ascent) segment of $f(x)$ if the function $\varphi(t) = f(x_1 + t(x_2 - x_1))$ is monotonically decreasing (increasing) for $t \in [0, 1]$.

Definition 3.2. The basin $B_1^*$ of $f(x)$ at an isolated minimizer $x_1^*$ is a connected domain which contains $x_1^*$ and in which, starting from any point, the steepest descent trajectory of $f(x)$ converges to $x_1^*$, but outside which the steepest descent trajectory of $f(x)$ does not converge to $x_1^*$. Suppose $x_1^*$ is a maximizer of $f(x)$; the hill of $f(x)$ at $x_1^*$ is the basin of $-f(x)$ at its minimizer $x_1^*$.

Definition 3.3. Let $x_1^*$ and $x_2^*$ be two distinct minimizers of $f(x)$. If $f(x_1^*) > f(x_2^*)$, we say that the basin $B_2^*$ at $x_2^*$ is lower than the basin $B_1^*$ at $x_1^*$, or the basin $B_1^*$ is higher than the basin $B_2^*$.

It is clear that if $B_1^*$ is the basin of $f(x)$ at an isolated minimizer $x_1^*$, then $f(x) > f(x_1^*)$ holds for any point $x \in B_1^*$ with $x \ne x_1^*$.

Definition 3.4. The simple basin $S_1^*$ at a local minimizer $x_1^*$ of $f(x)$ is a connected domain in which the inequality

$$\nabla f(x)^T (x - x_1^*) > 0, \quad x \in S_1^*,\ x \ne x_1^*,$$

holds. In addition, if $x_1^*$ is an isolated minimizer of $f(x)$, then $D = \min\{\|x - x_1^*\| \mid x \notin S_1^*\} > 0$, that is, the minimal radius of the simple basin $S_1^*$ is not zero.

Definition 3.5. $P(x, x^*)$ is called a filled function of $f(x)$ at a local minimizer $x^*$ if $P(x, x^*)$ has the following properties:

(1) $x^*$ is a strict local maximizer of $P(x, x^*)$;

(2) $P(x, x^*)$ has no stationary point or minimizer in the set $S_1$, where $S_1 = \{x \mid f(x) \ge f(x^*),\ x \in X \setminus \{x^*\}\}$;

(3) If $x^*$ is not a global minimizer of $f(x)$ and $x_1^*$ is one of the nearest local minimizers of $f(x)$ such that $f(x_1^*) < f(x^*)$, that is, $S_2 = \{x \mid f(x) < f(x^*),\ x \in X\} \cap \operatorname{int} X \ne \emptyset$, then there exists a point $x' \in S_2$ that minimizes $P(x, x^*)$ on the line from $x^*$ to $\tilde{x}_1^*$, where $\tilde{x}_1^*$ is in a neighborhood of $x_1^*$;

(4) For any $x_1, x_2 \in X$ satisfying $f(x_1) \ge f(x^*)$ and $f(x_2) \ge f(x^*)$, $\|x_2 - x^*\| > (<)\ \|x_1 - x^*\|$ if and only if $P(x_2, x^*) < (>)\ P(x_1, x^*)$.

Lemma 3.1. If the function $f(x)$ attains a local minimum at $x_1^*$, then $\nabla f(x_1^*) = 0$.

B. A new filled function without parameters

We assume that the following conditions are satisfied throughout this paper.

Assumption 3.1. It is assumed that $f(x)$ has only a finite number of local minimizers on $X$, and that $f(x) : R^n \to R$ is coercive, i.e., $f(x) \to +\infty$ as $\|x\| \to +\infty$.

Assumption 3.2. $f(x)$ is continuously differentiable on $R^n$.

Consider a new filled function without parameters

$$P(x, x^*) = -(f(x) - f(x^*))^3 \|x - x^*\|^2, \qquad (6)$$

where $x^*$ is a current local minimizer of $f(x)$. Because the new filled function is parameter free, it avoids the difficulty of choosing parameters. In form, it is also simpler than the one proposed in [19].
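The sign pattern of the filled function (6) can be seen on a small example. The sketch below assumes a hypothetical one-dimensional objective with a local minimizer at $x = 1$ (value 1) and a global minimizer at $x = 0$ (value 0); neither the function nor the sample points come from the paper.

```python
import math

def f(x):
    # Hypothetical 1-D test function: local minimizer at x = 1 (f = 1),
    # global minimizer at x = 0 (f = 0).
    return 1.0 - math.cos(2.0 * math.pi * x) + 0.5 * (1.0 - math.cos(math.pi * x))

def filled(x, x_star):
    """P(x, x*) = -(f(x) - f(x*))^3 * |x - x*|^2 for scalar x."""
    return -((f(x) - f(x_star)) ** 3) * (x - x_star) ** 2

x_star = 1.0
# Inside the basin of x*, f(x) > f(x*), so P is negative: x* sits on top of a hill of P.
near = [filled(x_star + t, x_star) for t in (-0.1, -0.01, 0.01, 0.1)]
# In the lower basin around x = 0, f(x) < f(x*), so P becomes positive.
lower_basin = filled(0.1, x_star)
```

For $x \ne x^*$ the sign of $P(x, x^*)$ is the opposite of the sign of $f(x) - f(x^*)$, so a positive value of $P$ signals that a point with a lower objective value has been reached.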

Theorem 3.1. $P(x, x^*)$ is continuously differentiable on $R^n$.

Theorem 3.2. Suppose that $x^*$ is a local minimizer of $f(x)$; then $x^*$ is a strict local maximizer of $P(x, x^*)$.

Proof. Since $x^*$ is a known local minimizer of $f(x)$, which is isolated by Assumption 3.1, there exists a neighborhood $O(x^*, \delta)$ with $\delta > 0$ such that $f(x) > f(x^*)$ for all $x \in O(x^*, \delta)$, $x \ne x^*$. Therefore

$$P(x, x^*) = -(f(x) - f(x^*))^3 \|x - x^*\|^2 < 0 = P(x^*, x^*).$$

Thus $x^*$ is a strict local maximizer of $P(x, x^*)$.

From the above theorem, it is easy to see that for any $x \in B^*$, where $B^*$ is the basin of $f(x)$ at $x^*$, but $x \ne x^*$, it holds that $f(x) > f(x^*)$ and $P(x, x^*) < P(x^*, x^*)$. Thus, the whole basin of $f(x)$ becomes a part of a hill of $P(x, x^*)$.

Theorem 3.3. Suppose $f(x) > f(x^*)$ for $x \ne x^*$. Then it holds that $\nabla P(x, x^*) \ne 0$.

Proof. By the definition of the filled function $P(x, x^*)$, we have

$$\nabla P(x, x^*) = -3 (f(x) - f(x^*))^2 \|x - x^*\|^2 \nabla f(x) - 2 (f(x) - f(x^*))^3 (x - x^*).$$

If $f(x) > f(x^*)$ and $x \ne x^*$, then the inequalities $(f(x) - f(x^*))^2 > 0$ and $(f(x) - f(x^*))^3 > 0$ hold. By Taylor's formula, we have

$$f(x^*) = f(x) + (x^* - x)^T \nabla f(x) + o(\|x^* - x\|).$$

From the condition $f(x) > f(x^*)$, we have $(x^* - x)^T \nabla f(x) \le 0$. That is, $(x - x^*)^T \nabla f(x) \ge 0$. Hence

$$(x - x^*)^T \nabla P(x, x^*) = -3 (f(x) - f(x^*))^2 \|x - x^*\|^2 (x - x^*)^T \nabla f(x) - 2 (f(x) - f(x^*))^3 \|x - x^*\|^2 < 0.$$

Therefore, for $x \ne x^*$, the inequality $\nabla P(x, x^*) \ne 0$ holds.

In the filled function method, we minimize a filled function in order to find a point $\hat{x}$ such that $f(\hat{x}) < f(x^*)$, or even a next better minimizer of $f(x)$. The minimization does not stop while $f(\hat{x})$ is no less than $f(x^*)$. The above theorem indicates that, when $f(x) > f(x^*)$, $P(x, x^*)$ has no stationary point. Furthermore, notice that along the direction $d(x) = x - x^*$, we have

$$d^T \nabla P(x, x^*) < 0.$$

It demonstrates that the filled function decreases monotonically along $d$, which is a downhill direction.

Theorem 3.4. Assume that $x^*$ is not a global minimizer of $f(x)$ and $x_0^*$ is another nearest local minimizer of $f(x)$ such that $f(x_0^*) < f(x^*)$. Then there is a point $x' \in B_0^*$ that minimizes $P(x, x^*)$ on the line through $x^*$ and $\tilde{x}_0^*$, for every $\tilde{x}_0^*$ in a certain neighborhood of $x_0^*$.

Proof. Since $x^*$ is a local minimizer of $f(x)$ with its basin $B^*$, there exists a neighborhood $O(x^*, \delta)$ with $\delta > 0$ such that $f(x) \ge f(x^*)$ for all $x \in O(x^*, \delta)$. Therefore

$$P(x, x^*) = -(f(x) - f(x^*))^3 \|x - x^*\|^2 \le 0.$$

Let $x_0^*$ be another nearest local minimizer of $f(x)$. There exists a certain neighborhood $O(x_0^*, \delta_0)$ of $x_0^*$ such that $f(x^*) > f(\tilde{x}_0^*) \ge f(x_0^*)$ for every $\tilde{x}_0^* \in O(x_0^*, \delta_0)$. It holds that

$$P(\tilde{x}_0^*, x^*) = -(f(\tilde{x}_0^*) - f(x^*))^3 \|\tilde{x}_0^* - x^*\|^2 > 0.$$

At first, $P(x^*, x^*) = 0$. When $x$ starts to move away from $x^*$, we have $P(x, x^*) < 0$. As $x$ approaches $\tilde{x}_0^*$, $P(x, x^*) > 0$ holds. Thus, $P(x, x^*)$ first decreases when $f(x) > f(x^*)$ and then increases when $f(x) < f(x^*)$.

The filled function proposed in this paper indeed has neither exponential terms nor parameters. So it is superior to the traditional ones. Based on the theoretical results above, a filled function algorithm is described as follows.

Algorithm 3.1.

I. Initialization:

1. Set the step size $\delta = 0.1$ and $k = 1$.

2. Choose directions $e_i$, $i = 1, \ldots, m$, with integer $m \ge 2n$, where $n$ is the number of variables.

3. Choose an initial point $x_1^0 \in X$.

II. Main Program:

Step 1. Obtain a local minimizer $x_k^*$ of $f(x)$ by implementing a local downhill search procedure starting from $x_k^0$. Set $i = 1$.

Step 2. If $i > m$ then terminate the iteration, otherwise go to Step 3.

Step 3. Construct the filled function $P(x, x_k^*) = -(f(x) - f(x_k^*))^3 \|x - x_k^*\|^2$.

Step 4. Set $x = x + \delta e_i$. If $P(x, x_k^*) > 0$ or $f(x) < f(x_k^*)$, then let $k = k + 1$, $x_k^0 = x$ and go to Step 1. If $x$ goes out of $X$, then let $i = i + 1$ and go to Step 2. Otherwise repeat Step 4.

In this algorithm, we do not need to compute gradients of either the original objective function or the filled function while minimizing the filled function.
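The steps of Algorithm 3.1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the domain $X$ is taken to be a box, the directions are the $2n$ signed coordinate directions, each line scan starts from the current minimizer, and the local downhill search is a hypothetical derivative-free compass search. The test for $P(x, x^*) > 0$ is written as $f(x) < f(x^*)$, which is equivalent for $x \ne x^*$.

```python
import math

def local_search(f, x0, lower, upper, step=0.25, tol=1e-8):
    """Hypothetical derivative-free downhill search (compass search) in a box."""
    x, fx = list(x0), f(x0)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for s in (step, -step):
                y = list(x)
                y[i] = min(max(y[i] + s, lower[i]), upper[i])
                fy = f(y)
                if fy < fx:
                    x, fx, improved = y, fy, True
        if not improved:
            step *= 0.5
    return x, fx

def filled_function_search(f, x0, lower, upper, delta=0.1):
    n = len(x0)
    directions = []                       # +e_1, -e_1, ..., +e_n, -e_n
    for i in range(n):
        for s in (1.0, -1.0):
            e = [0.0] * n
            e[i] = s
            directions.append(e)
    x_star, f_star = local_search(f, x0, lower, upper)        # Step 1
    i = 0
    while i < len(directions):                                 # Step 2
        e = directions[i]
        x = list(x_star)                  # assumption: scan starts at x*
        escaped = False
        while True:                                            # Step 4
            x = [xj + delta * ej for xj, ej in zip(x, e)]
            if any(xj < lo or xj > hi for xj, lo, hi in zip(x, lower, upper)):
                break                     # x left X: next direction
            if f(x) < f_star:             # same as P(x, x*) > 0
                escaped = True
                break
        if escaped:
            x_star, f_star = local_search(f, x, lower, upper)
            i = 0
        else:
            i += 1
    return x_star, f_star

def two_well(x):
    # Hypothetical objective: local minimizer at 1 (value 1), global at 0 (value 0).
    t = x[0]
    return 1.0 - math.cos(2.0 * math.pi * t) + 0.5 * (1.0 - math.cos(math.pi * t))

best_x, best_f = filled_function_search(two_well, [1.2], [-0.5], [1.5])
```

Starting at 1.2, the local search settles in the higher basin near $x = 1$; the scan along $-e_1$ then reaches a point with a lower objective value, and the restarted local search ends near $x = 0$.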

IV. ALGORITHM FOR SOLVING A CLUSTERING CENTER PROBLEM

Suppose that the centers of the first $k$ clusters have been calculated; then we need to calculate a center of the next, $(k+1)$-st, cluster. Consider the following optimization problem

$$\min_{x \in R^n} f^k(x) = \frac{1}{m} \sum_{i=1}^{m} \min\{\|x^{1*} - a_i\|^2, \ldots, \|x^{k*} - a_i\|^2, \|x - a_i\|^2\}, \qquad (7)$$

where $x^{j*}$, $j = 1, \ldots, k$, are the given centers and $x$ is the next center to be found. The optimization problem (7) is a non-convex and non-smooth problem. We can use the following unconstrained differentiable optimization problem to approximate it.

$$\min_{x \in R^n} f^k(x) = -\frac{1}{m} \sum_{i=1}^{m} \frac{1}{p} \ln\Big[ \sum_{j=1}^{k} \mu_j \exp(-p \|x^{j*} - a_i\|^2) + \mu_{k+1} \exp(-p \|x - a_i\|^2) \Big]. \qquad (8)$$

The filled function method is adopted to search for a global optimal solution of the approximation problem, based on calculating centers step by step [6]. Presuppose that there are $q_0$ clusters (generally, set $q_0 = 1$), and calculate the centers of the first $q_0$ clusters with the filled function method. Then we calculate a center of the next cluster and refine the centers obtained. This process is repeated until the termination conditions are met.

Algorithm 4.1.

Step 1. (Initialization) Select a tolerance $\varepsilon > 0$ and $q = q_0 > 0$ as the starting number of clusters. Select a starting point $x^{1,0} = (x_1^1, \ldots, x_n^1, \ldots, x_1^{q_0}, \ldots, x_n^{q_0}) \in R^{n \times q_0}$ and solve the minimization problem (4) with Algorithm 3.1. Let $x^{1*} \in R^{n \times q_0}$ be a solution to problem (4) and $f^{1*}$ be the corresponding objective function value. Set $k = 1$.

Step 2. (Computation of the next cluster center) Solve the optimization problem (8) with Algorithm 3.1.

Step 3. (Refinement of all cluster centers) Let $x^{k+1,*}$ be a solution to problem (8). Take $x^{k+1,0} = (x^{1*}, \ldots, x^{k*}, x^{k+1,*})$ as a new starting point and solve the minimization problem (4) with Algorithm 3.1 ($q = k + 1$).

Step 4. (Stopping criterion) Let $x^{k+1,*} \in R^{n \times (k+1)}$ be a solution to problem (4) and $f^{k+1,*}$ be the corresponding value of the objective function. If

$$\frac{f^{k,*} - f^{k+1,*}}{f^{1*}} < \varepsilon, \qquad (9)$$

then stop; otherwise set $k = k + 1$ and go to Step 2.
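The effect of the tolerance in the stopping criterion (9) can be illustrated in isolation. The sketch below assumes the criterion has the form $(f^{k,*} - f^{k+1,*}) / f^{1*} < \varepsilon$ with one starting cluster, and it uses a hypothetical sequence of objective values rather than results from the paper.

```python
def clusters_at_termination(fvals, eps):
    """fvals[k-1] holds f^{k,*}, the objective value with k centers.

    Returns the number of centers computed when the relative decrease
    first falls below eps, or None if that never happens in fvals.
    """
    f_first = fvals[0]
    for k in range(1, len(fvals)):
        if (fvals[k - 1] - fvals[k]) / f_first < eps:
            return k + 1
    return None

fvals = [10.0, 4.0, 1.5, 1.4]          # hypothetical values for 1, 2, 3, 4 centers
loose = clusters_at_termination(fvals, 0.3)
tight = clusters_at_termination(fvals, 0.05)
unreached = clusters_at_termination(fvals, 0.001)
```

A looser tolerance stops earlier with fewer, larger clusters, while a tighter one keeps adding centers.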

One of the important questions when one tries to apply Algorithm 4.1 is the choice of the tolerance $\varepsilon > 0$. Large values of $\varepsilon$ can result in the appearance of large clusters, whereas small values can produce small and artificial clusters. In order to explain this, let us consider one artificial data set on $R^2$ as shown in Figure 1.

Figure 1. Results of three clusters with our method

There are three isolated clusters in this data set, given by the following formulae, respectively:

$$M^1 = \{ m_k \in R^2 : m_k = (m_k^1, m_k^2),\ m_k^1 = \tfrac{1}{2} |\sin(k)|,\ m_k^2 = 2 + |\cos(k)|,\ k = 1, \ldots, 50 \},$$

$$M^2 = \{ m_k \in R^2 : m_k = (m_k^1, m_k^2),\ m_k^1 = \tfrac{1}{2} (1 + |\sin(k)|),\ m_k^2 = |\cos(k)|,\ k = 1, \ldots, 50 \},$$

$$M^3 = \{ m_k \in R^2 : m_k = (m_k^1, m_k^2),\ m_k^1 = 2 + |\cos(k)|,\ m_k^2 = 1.5 + \sin(k),\ k = 1, \ldots, 50 \}.$$

If $\varepsilon = 10^{-1}$ then Algorithm 4.1 exactly calculates these three clusters; if $\varepsilon = 10^{-2}$ then this algorithm divides the third cluster into three clusters and leaves the two other clusters unchanged. When $\varepsilon$ is smaller we have further division of these clusters. So if $\varepsilon$ is small enough we obtain some artificial clusters. The results of numerical experiments show that the best values for $\varepsilon$ are $\varepsilon \in [10^{-2}, 10^{-1}]$.

The center found in the first stage is (1.2579, 1.5901). The centers found in the second stage are (1.6841, 1.0697) and (0.3171, 2.6413). The centers found in the third stage are (0.8336, 0.6358), (0.3176, 2.6363) and (2.6457, 1.4665). It can be seen from Fig. 1 that the cluster centers can be found effectively with the method of this paper.
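As a plausibility check on the reported third-stage centers, the artificial data set can be regenerated and the arithmetic mean of each group compared with them. The set definitions used below are an assumption: the additive constants and signs are reconstructed from the damaged formulae, so the sketch should be read as an illustration rather than as the exact data of the experiment.

```python
import math

def mean_point(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

ks = range(1, 51)
# Assumed forms of the three groups (constants and signs reconstructed).
M1 = [(0.5 * abs(math.sin(k)), 2.0 + abs(math.cos(k))) for k in ks]
M2 = [(0.5 * (1.0 + abs(math.sin(k))), abs(math.cos(k))) for k in ks]
M3 = [(2.0 + abs(math.cos(k)), 1.5 + math.sin(k)) for k in ks]

centroids = [mean_point(M) for M in (M1, M2, M3)]
# Third-stage centers quoted in the text, ordered to match M1, M2, M3.
reported = [(0.3176, 2.6363), (0.8336, 0.6358), (2.6457, 1.4665)]
deviation = [max(abs(c[0] - r[0]), abs(c[1] - r[1])) for c, r in zip(centroids, reported)]
```

Under these assumptions each group mean lies close to one of the three reported centers, which is what one expects from a minimum sum-of-squares criterion on well separated groups.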

V. EXPERIMENTAL RESULTS

In this section, we demonstrate the effectiveness of our smoothing clustering algorithm with parameter free filled function (SCPF3) by comparing it with the smoothing and filled function method for clustering in data mining (SFMC) [2]. All experiments are run on a personal computer with a single Pentium IV processor (3.0 GHz). The parameter design and the stopping criterion of Algorithm 4.1 are similar to those in [2]. In order to compare the effectiveness of the two algorithms SCPF3 and SFMC, we use two datasets from [2]. The first dataset has 150 data points and the points in every cluster obey the normal distribution. The dataset is divided into five groups. Each group has 50 data points and each point has two attributes. The five groups occupy different ranges: -1.8 to -1.6, -1.2 to -1.0, -0.2 to 0.0, 1.2 to 1.4 and 3.0 to 3.2, respectively. The second dataset is Liver Disorder from [20].

The parameters $p$ and $\mu$ are smoothing parameters, and $N_d$ denotes the number of local searches. CPU time denotes the running time of the algorithms in seconds, and fval the value of the clustering function.

TABLE I. NUMERICAL RESULTS OF TWO ALGORITHMS ON ARTIFICIAL DATASETS

| No. | Algorithm | Parameters (p/μ) | Nd | CPU time | fval |
|---|---|---|---|---|---|
| 1 | SCPF3 | (0.1, 16) | 94 | 3397.4 | 7.36e-003 |
| 1 | SFMC | (2000, \) | 94 | 3879.1 | 7.64e-003 |
| 2 | SCPF3 | (0.1, 16) | 92 | 7321.1 | 7.32e-003 |
| 2 | SFMC | (2000, \) | 94 | 7445.7 | 7.92e-003 |
| 3 | SCPF3 | (0.1, 16) | 97 | 6571.0 | 7.99e-003 |
| 3 | SFMC | (2000, \) | 100 | 7008.2 | 8.28e-003 |

The results presented in Table I show an efficient performance of the SCPF3 algorithm on artificial datasets. The experimental results described in Table III show how the results vary with the problem size on the Liver Disorder dataset, where m is the number of data points, n the number of attributes and k the number of clustering centers. SFMC achieves good clustering results with a relatively large parameter $p$. In fact, as $p$ increases, the overflow phenomenon can be observed. By adjusting the parameters $p$ and $\mu$ at the same time, we can find a global optimal solution of the approximation problems using the SCPF3 algorithm with a relatively small parameter $p$. In Table I and Table III, the entropy factor $p$ was chosen to be not more than 20. The clustering time cost of SCPF3 almost equals that of SFMC. The experimental results in Table II show that the SCPF3 algorithm improves the accuracy of the clustering with a relatively small parameter $p$. Numerical results demonstrate the efficiency of the proposed algorithm in finding the centers of clusters.

TABLE II. CLUSTER CENTERS OF ARTIFICIAL DATASET BY TWO ALGORITHMS

| No. | Algorithm | Centers x* |
|---|---|---|
| 1 | SCPF3 | (-0.81,-0.53; -1.17,-1.12; -0.64,-0.52; 1.31,1.30; 3.03,3.16) |
| 1 | SFMC | (-0.12,-0.06; -1.09,-1.08; -1.69,-1.70; 1.30,1.30; 3.11,3.10) |
| 2 | SCPF3 | (-0.74,-0.47; -1.08,-1.19; -0.64,-0.72; 1.30,1.34; 3.15,3.20) |
| 2 | SFMC | (-0.11,-0.14; -1.09,-1.13; -1.64,-1.70; 1.30,1.30; 3.10,3.07) |
| 3 | SCPF3 | (-1.69,-1.62; -1.17,-1.12; -0.15,-0.04; 1.30,1.32; 3.13,3.03) |
| 3 | SFMC | (-1.64,-1.68; -1.11,-1.12; -0.12,-0.06; 1.31,1.28; 3.15,3.08) |

TABLE III. NUMERICAL RESULTS OF TWO ALGORITHMS ON LIVER DISORDER DATASET

| m | n | k | Algorithm | Parameters (p/μ) | Nd | CPU time | fval |
|---|---|---|---|---|---|---|---|
| 345 | 7 | 5 | SCPF3 | (0.36, 10) | 7 | 15772.0 | 5204.7 |
| 345 | 7 | 5 | SFMC | (1000, \) | 7 | 15609.3 | 5321.2 |
| 345 | 7 | 6 | SCPF3 | (0.36, 10) | 9 | 23001.4 | 5094.5 |
| 345 | 7 | 6 | SFMC | (1000, \) | 9 | 24737.5 | 5320.4 |
| 345 | 7 | 7 | SCPF3 | (0.4, 20) | 11 | 27034.5 | 5217.5 |
| 345 | 7 | 7 | SFMC | (1000, \) | 11 | 30672.5 | 5319.8 |
| 345 | 7 | 8 | SCPF3 | (0.56, 20) | 13 | 4098.2 | 5157.5 |
| 345 | 7 | 8 | SFMC | (1000, \) | 14 | 49936.4 | 5305.1 |

VI. CONCLUSIONS

zation problem. Numerical results demonstrate the reliability and efficiency of the proposed algorithm to find centers of clusters.

ACKNOWLEDGEMENT

This work was supported in part by the Nature Science Foundation of China under Grants No. 60974082 and No. 61075055, the Natural Science Foundation of Shaanxi Province (No. 2010JQ8004) and the Xi’an Institute of Posts and Telecommunications Science Foundation for Middle-aged and Young Scientists (No. 110-0402).

REFERENCES

[1] A. E. Xavier, “The hyperbolic smoothing clustering method,” Pattern Recognition, Vol. 43, pp. 731-737, 2010.

[2] L. H. Zhu, X. L. Sun, “Smoothing and filled function method for clustering in data mining,” Communication on Applied Mathematics and Computation (in Chinese), Vol. 21, pp. 10-16, 2007.

[3] C. W. Bong, M. Rajeswari, “Multi-objective nature-inspired clustering and classification techniques for image segmentation,” Applied Soft Computing, Vol. 11, pp. 3271-3282, 2011.

[4] H. Späth, Cluster Analysis Algorithms for Data Reduction and Classification, Ellis Horwood, Chichester, 1980.

[5] P. Hansen, B. Jaumard, “Cluster analysis and mathematical programming,” Mathematical Programming, Vol. 79, pp. 191-215, 1997.

[6] A. M. Bagirov, J. Yearwood, “A new nonsmooth optimization algorithm for clustering problems,” European Journal of Operational Research, Vol. 170, pp. 578-596, 2006.

[7] H. Späth, Cluster Analysis Algorithms, Ellis Horwood Limited, Chichester, 1980.

[8] P. Hansen, E. Ngai, B. K. Cheung, N. Mladenovic, “Analysis of global k-means, an incremental heuristic for minimum sum-of-squares clustering,” Journal of Classification, Vol. 22, pp. 287-231, 2005.

[9] C. R. Reeves (Ed.), Modern Heuristic Techniques for Combinatorial Problems, Blackwell, London, 1993.

[10] L. X. Sun, Y. L. Xie, X. H. Song, J. H. Wang, R. Q. Yu, “Cluster analysis by simulated annealing,” Computers and Chemistry, Vol. 18, pp. 103-108, 1994.

[11] O. L. Mangasarian, “Mathematical programming in data mining,” Data Mining and Knowledge Discovery, Vol. 1, pp. 183-201, 1997.

[12] A. M. Bagirov, A. M. Rubinov and J. Yearwood, “Using global optimization to improve classification for medical diagnosis and prognosis,” Topics in Health Information Management, Vol. 22, pp. 65-74, 2001.

[13] Q. Z. Yang, “Adjustable entropy function method,” Mathematica Numerica Sinica (in Chinese), Vol. 23, pp. 81-86, 2001.

[14] Q. Wu, S. Y. Liu and L. Y. Zhang, “Adjustable entropy function method for support vector machines,” Journal of Systems Engineering and Electronics, Vol. 19, pp. 1029-1034, 2008.

[15] S. X. He, W. L. Chen, H. Wang, “A new filled function algorithm for constrained global optimization problems,” Applied Mathematics and Computation, Vol. 217, pp. 5853-5859, 2011.

[16] C. J. Wang, Y. J. Yang and J. Li, “A new filled function ...,” Journal of Computational and Applied Mathematics, Vol. 225, pp. 68-79, 2009.

[17] Y. J. Yang, Y. M. Liang, “A new discrete filled function algorithm for discrete global optimization,” Journal of Computational and Applied Mathematics, Vol. 202, pp. 280-291, 2007.

[18] Y. J. Lin, Y. J. Yang, “Filled function method for nonlinear equations,” Journal of Computational and Applied Mathematics, Vol. 234, pp. 695-702, 2010.

[19] S. Z. Ma, Y. J. Yang and H. Liu, “A parameter free filled function for unconstrained global optimization,” Applied Mathematics and Computation, Vol. 215, pp. 3610-3619, 2010.

[20] P. M. Murphy, D. M. Aha, UCI repository of machine learning databases, Technical report, Department of Information and Computer Science, University of California, Irvine, 1992. http://www.ics.uci.edu/mlearn/MLRepository.html.

learning and Robots.

Qing Wu was born in 1975. She received the M.E. and Ph.D. degrees in Applied Mathematics from Xidian University, Xi’an, China in 2005 and 2009, respectively. Now she is working at School of Automation, Xi’an Institute of Posts and Telecommunications, Xi’an, China. Her research interests include pattern recognition, data mining and machine learning.