Network Center Selection

Dae-Won Lee and Jaewook Lee Department of Industrial Engineering, Pohang University of Science and Technology,

Pohang, Kyungbuk 790-784, Korea. {woosuhan,jaewookl}@postech.ac.kr

Abstract. In this paper, we propose a new method for selecting RBF centers. The strength of our method is that it determines the number and the locations of the RBF centers automatically, without any a priori assumption about the number of centers. The proposed method consists of three phases. The first phase partitions the input patterns into several subsets according to their output labels. In the second and third phases, the number and the locations of the RBF centers are determined using a bi-section algorithm and weighted mean centering. The second and third phases are repeated iteratively until the goal error is reached. The proposed method is applied to several benchmark data sets. The numerical results show that our method is robust and efficient for determining the number and the locations of centers.

1 Introduction

The radial basis function network (RBFN), owing to the simplicity of its single-hidden-layer structure and its universal approximation property, has been widely used for nonlinear function approximation and pattern classification. Training an RBFN consists of selecting the centers of the hidden neurons and estimating the weights that connect the hidden and output layers. Once the centers have been fixed, the network weights can be estimated directly by the least squares algorithm.

The generalized radial basis function network (GRBFN) involves searching for a suboptimal solution in a lower-dimensional space that approximates the interpolation solution, where the approximated solution $F^*(x)$ can be expressed as follows:

$$F^*(x) = \sum_{i=1}^{m} w_i \varphi(x - t_i) \tag{1}$$

where the set of RBF centers $\{t_i \mid i = 1, \dots, m\}$ is to be determined.
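As a minimal sketch of Eq. (1), the snippet below evaluates the network output for given centers and weights. The Gaussian form of $\varphi$ and the width parameter `sigma` are assumptions made for illustration (the text does not fix the basis function at this point), and the function names are hypothetical.

```python
import numpy as np

def gaussian_design_matrix(X, centers, sigma=1.0):
    """Phi[i, j] = exp(-||x_i - t_j||^2 / (2 sigma^2)); the Gaussian basis is an assumption."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def grbfn_output(X, centers, weights, sigma=1.0):
    """Evaluate F*(x) = sum_i w_i * phi(x - t_i) for every row x of X."""
    return gaussian_design_matrix(X, centers, sigma) @ weights

# Two centers in the plane, evaluated at the first center.
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
weights = np.array([2.0, -1.0])
X = np.array([[0.0, 0.0]])
out = grbfn_output(X, centers, weights)
```

At the query point the first basis function equals 1 and the second equals $e^{-1/2}$, so the output is $2 - e^{-1/2}$.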

One of the key issues in the design of an RBFN is how to determine the number and the locations of the RBF centers. Recent research has proposed a variety of ways to determine the locations of the centers. Previously reported approaches for RBF center selection include random selection from input patterns [1] and selection of centers based on clustering algorithms [2], [3]. These methods have two drawbacks: they cannot determine the number and the locations of centers at the same time, and they do not use the information in the supervised data.

F. Yin, J. Wang, and C. Guo (Eds.): ISNN 2004, LNCS 3173, pp. 350–355, 2004.

In this paper, to overcome these drawbacks, we propose a method with three basic ingredients. First, the input patterns are partitioned into several subsets according to their output labels, and the RBF centers are determined separately for each subset in order to reflect the information in the output class label. These centers are then combined to form a larger superset to be used in the final RBFN. Second, the proposed algorithm determines the number of centers by using a bi-section algorithm. Finally, the proposed algorithm determines the optimal locations of the centers by employing weighted mean centering.

This paper is organized as follows. Section 2 explains the new method for RBF center selection. Section 3 presents an algorithm for the proposed method, and Section 4 presents experimental results on benchmark problems, followed by conclusions in Section 5.

2 The Proposed Method

2.1 Phase I: Partitioning Supervised Data

The proposed method is basically similar to center selection based on clustering algorithms, which is the most widely used approach to center selection. However, such approaches treat the input patterns as unsupervised data during the center selection step, even though supervised data are available. When the output class label is not considered, some of the centers are often located on the boundary between several classes, in which case they fail to act as good feature extractors in the GRBFN.

In Phase I, we partition the training data into $C$ disjoint subsets $\{D_i\}_{i=1}^{C}$ according to class, to reflect the distribution of the input patterns for each output class. Then, within each subset, cluster centers are found using the clustering algorithm explained in Sections 2.2 and 2.3. Finally, we use the $C$ disjoint sets of cluster centers as the GRBFN centers.
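The partitioning step of Phase I can be sketched as follows. This is an illustrative fragment, not the authors' code; the function name `partition_by_class` and the toy data are hypothetical.

```python
import numpy as np

def partition_by_class(X, labels):
    """Return {class label: rows of X with that label}, i.e. the subsets D_i."""
    return {c: X[labels == c] for c in np.unique(labels)}

# Four two-dimensional patterns belonging to two classes.
X = np.array([[0.0, 0.1], [0.9, 1.0], [0.2, 0.0], [1.0, 0.8]])
labels = np.array([0, 1, 0, 1])
subsets = partition_by_class(X, labels)
```

Each subset can then be passed separately to the center selection of Phases II and III, so that no center is fitted to a mixture of classes.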

2.2 Phase II: Determining the Number of Centers by Bi-section Algorithm


Fig. 1. Flow diagram for center selection (Phases II and III), where Error(λ) is the misclassification rate with λ centers.
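Since the flow diagram itself is not reproduced here, the following is only a generic bisection sketch of how a number of centers might be searched for. It assumes that the error is non-increasing in the number of centers, which the text does not state; `error` is a hypothetical callable standing in for Error(λ), and in the proposed method each evaluation would involve Phase III.

```python
def bisect_num_centers(error, n, goal):
    """Smallest k in [1, n] with error(k) <= goal, assuming error is non-increasing in k.

    `error` is a hypothetical callable returning the misclassification rate
    obtained with k centers. Returns None if even k = n misses the goal.
    """
    lo, hi = 1, n
    if error(hi) > goal:
        return None
    while lo < hi:
        k = (lo + hi) // 2      # the first probe is (1 + n) // 2
        if error(k) <= goal:
            hi = k              # goal met: try fewer centers
        else:
            lo = k + 1          # goal missed: more centers are needed
    return lo

# Toy error curve that decays as 1 / k.
k_star = bisect_num_centers(lambda k: 1.0 / k, n=100, goal=0.15)
```

With this toy curve the search needs only a handful of error evaluations instead of trying every candidate count in turn.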

2.3 Phase III: Determining the Locations of Centers Using Weighted Mean Centering

Determining the locations of the RBF centers is based on optimizing the following objective function with respect to $(t, w)$:

$$E = \frac{1}{2} \sum_{i=1}^{N} \left( d_i - \sum_{j=1}^{k_i} w_j \varphi(x_i - t_j) \right)^2 \tag{2}$$

From the necessary optimality condition, the positions of the centers are given by

$$\frac{\partial E}{\partial t_j} = \frac{1}{\sigma^2} w_j \sum_{i=1}^{N} (d - \Phi w)_i \Phi_{ij} (x_i - t_j) = 0 \tag{3}$$

$$t_j = \frac{\sum_{i=1}^{N} (d - \Phi w)_i \Phi_{ij} x_i}{\sum_{i=1}^{N} (d - \Phi w)_i \Phi_{ij}} = \sum_{i=1}^{N} \frac{(d - \Phi w)_i \Phi_{ij}}{\sum_{l=1}^{N} (d - \Phi w)_l \Phi_{lj}} x_i = \sum_{i=1}^{N} \lambda_j(x_i, t, w)\, x_i, \quad \forall j = 1, \dots, k_i \tag{4}$$

where $\sum_{i=1}^{N} \lambda_j(x_i, t, w) = 1$, $\forall j = 1, \dots, k_i$, $\Phi = [\varphi(x_i, t_j)]_{i=1,\dots,n_i,\ j=1,\dots,k_i}$, $d = [d_1, d_2, \dots, d_N]^T$ and $w = [w_1, w_2, \dots, w_{m_1}]^T$. It shows that the estimate […] $t_j$ directly. Instead, we employ a so-called weighted mean centering scheme to implement this, which consists of two steps. In the first step, the center positions $t_j$ of Eq. (4) are approximated as a simple average of those $x_i$ with higher absolute values of the numerator of $\lambda_j(x_i, t, w)$. This process is repeated until no change is made. (See Section 3 for more details.) For a non-convex data set, the centers obtained in the first step are often located in the regions of other classes, even though we have partitioned the data according to class. To avoid this problem, in the second step we move each RBF center to the nearest point within the partitioned subset $D_i$.
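The two steps can be sketched as below, following the best-matching rule and update rule given as Steps 2.2.2.1 and 2.2.2.2 in Section 3. Several details are assumptions made for illustration only: a Gaussian basis of width `sigma`, weights re-estimated by least squares once per sweep, initial centers drawn uniformly from the bounding box of the subset, and a fixed number of sweeps in place of the "until no change" test. All function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_phi(A, B, sigma=1.0):
    """Phi[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2)); the Gaussian form is assumed."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def weighted_mean_centering(X, d, k, sigma=1.0, eta=0.1, sweeps=20, seed=0):
    rng = np.random.default_rng(seed)
    # Random initial centers inside the bounding box of the subset.
    T = rng.uniform(X.min(axis=0), X.max(axis=0), size=(k, X.shape[1]))
    for _ in range(sweeps):                        # stand-in for "until no change"
        Phi = gaussian_phi(X, T, sigma)
        w, *_ = np.linalg.lstsq(Phi, d, rcond=None)  # assumed weight estimate
        for l in range(len(X)):
            phi_l = gaussian_phi(X[l:l + 1], T, sigma)[0]
            r_l = d[l] - phi_l @ w                 # residual (d - Phi w)_l
            j = int(np.argmax(r_l * phi_l))        # best-matching center
            T[j] += eta * (X[l] - T[j])            # update rule of Eq. (5)
    # Second step: move every center to its nearest point in the subset.
    nearest = ((T[:, None, :] - X[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return X[nearest]

# One class subset made of two small groups of points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [2.0, 2.0], [2.1, 2.0], [2.0, 2.1]])
d = np.ones(len(X))
centers = weighted_mean_centering(X, d, k=2)
```

Because of the second step, every returned center is an actual member of the subset, which is what keeps centers out of the regions of other classes when the data are non-convex.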

3 Algorithm

An algorithm of center selection for the proposed method is as follows.

% Phase I: Partitioning supervised data
1. Separate the $N$ training data into $C$ disjoint subsets $D_i$ containing $n_i$ elements, according to class.
   $\{(x_i, d_i)\}_{i=1}^{N} \to D_1 \cup D_2 \cup \dots \cup D_C$
   $D_i = \{(x_1, i), (x_2, i), \dots, (x_{n_i}, i)\}$ for $i = 1, \dots, C$

% Selecting $k_i$ centers for each subset $D_i$
2. for $i = 1$ to $C$ do
   % Phase II: bi-section algorithm
   2.1. Determine the number of centers ($k_i$) using the bi-section algorithm as explained in Fig. 1: initially, $k_i = \frac{1 + n_i}{2}$.
   % Phase III: weighted mean centering
   2.2. Determine the locations of the $k_i$ centers using weighted mean centering.
      2.2.1. Choose random values for the initial centers $T_i = \{t_j\}_{j=1}^{k_i}$ from the input space, where $t_j$ is the $j$th cluster center of subset $D_i$.
      % Adjust the centers
      2.2.2. for $l = 1$ to $n_i$ do
         2.2.2.1. Let $I_i(x_l)$ denote the index of the best-matching center for the input vector $x_l \in D_i$, as follows.
            $I_i(x_l) = \arg\max_j (d - \Phi w)_l \Phi_{lj}, \quad j = 1, 2, \dots, k_i$
         2.2.2.2. Adjust the centers using the update rule
            $t_j \leftarrow t_j + \eta (x_l - t_j), \quad j = I_i(x_l)$ (5)
            where $\eta$ is a learning step size.
      2.2.3. Continue the center adjusting procedure (Step 2.2.2) until no changes are observed in the centers $\{t_j\}_{j=1}^{k_i}$.


Table 1. Benchmark data description

| Data set | Input dimension | Number of classes | Number of patterns |
|---|---|---|---|
| 2-Spirals | 2 | 2 | 388 [194, 194] |
| Sonar | 60 | 2 | 104 [55, 49] |
| Heart | 13 | 2 | 180 [98, 82] |
| Vowel | 10 | 11 | 528 [48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48] |

Bracketed numbers give the number of patterns in each class.

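      % Move each center to the nearest data point (second step of Section 2.3)
      $t_j \leftarrow$ the nearest point $x \in D_i$, for $j = 1, 2, \dots, k_i$
   2.3. Repeat Steps 2.1–2.2 until convergence to the goal error.

% Complete the RBFN training
3. Combine the centers of the $C$ disjoint subsets $\{D_i\}_{i=1}^{C}$ and construct the generalized RBF network by using the pseudo-inverse.
   $T = \{t_j\}_{j=1}^{K} \leftarrow T_1 \cup T_2 \cup \dots \cup T_C$
   $w = (\Phi^T \Phi + \lambda \Phi_0)^{-1} \Phi^T d$
   where $\Phi = [\varphi(x_i, t_j)]_{i=1,\dots,N,\ j=1,\dots,K}$, $\Phi_0 = [\varphi(x_i, t_j)]_{i,j=1,\dots,K}$, and $K = \sum_{i=1}^{C} k_i$.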
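Step 3 can be sketched as follows. The Gaussian basis, the width `sigma`, and the regularization value `lam` are illustrative assumptions. In particular, $\Phi_0$ is taken here as the $K \times K$ matrix of basis values between pairs of centers, which is an assumption about the intended definition; the function names are hypothetical.

```python
import numpy as np

def gaussian_phi(A, B, sigma=1.0):
    """Phi[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2)); the Gaussian form is assumed."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def grbfn_weights(X, d, T, lam=1e-3, sigma=1.0):
    """Solve (Phi^T Phi + lam * Phi0) w = Phi^T d for the output weights."""
    Phi = gaussian_phi(X, T, sigma)      # N x K design matrix
    Phi0 = gaussian_phi(T, T, sigma)     # K x K, assumed center-to-center matrix
    return np.linalg.solve(Phi.T @ Phi + lam * Phi0, Phi.T @ d)

# Combine the per-class center sets T_1 and T_2 into one superset T.
T1 = np.array([[0.0, 0.0]])
T2 = np.array([[2.0, 2.0]])
T = np.vstack([T1, T2])

X = np.array([[0.0, 0.1], [0.1, 0.0], [2.0, 1.9], [1.9, 2.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
w = grbfn_weights(X, d, T)
pred = np.sign(gaussian_phi(X, T) @ w)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly, which is usually the better-conditioned choice.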

4 Simulation Results

The algorithm described in the previous section has been simulated on four kinds of benchmark data sets (2-spiral, sonar, heart, vowel). Description of the benchmark data sets is given in Table 1.

The performance of the proposed method is compared with that of two widely used center selection methods. In Table 2, KM is k-means based center selection without partitioning, and RS is random selection from the training data without partitioning. For these two methods, the number of centers is determined by adding centers one by one until the goal error is achieved. For the comparison we adopted three criteria: the number of centers, the misclassification rate, and the computing time. Simulation results are shown in Table 2. The results show that the proposed method achieves comparable or better accuracy with the same or a slightly smaller number of RBFN centers, while significantly reducing computing time.
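The one-by-one procedure used for the RS baseline can be sketched as below. This is a guess at a reasonable implementation rather than the code used in the experiments: the Gaussian basis, the least-squares weight fit, the $\pm 1$ target coding, and all names are assumptions.

```python
import numpy as np

def gaussian_phi(A, B, sigma=1.0):
    """Phi[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2)); the Gaussian form is assumed."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def error_rate(X, d, T, sigma=1.0):
    """Misclassification rate of a least-squares RBFN with centers T; d is in {-1, +1}."""
    Phi = gaussian_phi(X, T, sigma)
    w, *_ = np.linalg.lstsq(Phi, d, rcond=None)
    return float(np.mean(np.sign(Phi @ w) != d))

def incremental_random_selection(X, d, goal, sigma=1.0, seed=0):
    """Add randomly chosen training points as centers one at a time until the goal is met."""
    order = np.random.default_rng(seed).permutation(len(X))
    for m in range(1, len(X) + 1):
        err = error_rate(X, d, X[order[:m]], sigma)
        if err <= goal:
            break
    return m, err

X = np.array([[0.0, 0.1], [0.1, 0.0], [2.0, 1.9], [1.9, 2.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
m, err = incremental_random_selection(X, d, goal=0.0)
```

Each added center requires a full refit, so the cost grows with the final number of centers, whereas a bisection search over the count needs only a logarithmic number of refits.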

5 Concluding Remarks


Table 2. Simulation results on four benchmark problems

| Data set | KM: m | KM: E | KM: T | RS: m | RS: E | RS: T | Proposed: m | Proposed: E | Proposed: T |
|---|---|---|---|---|---|---|---|---|---|
| 2-Spiral | 84 | 0.069 | 5650 | 88 | 0.080 | 5925 | 80 [35, 45] | 0.064 | 297 |
| Sonar | 37 | 0.096 | 1570 | 37 | 0.096 | 1112 | 34 [17, 17] | 0.096 | 21 |
| Heart | 129 | 0.97 | 11876 | 131 | 0.99 | 9723 | 129 [68, 61] | 0.093 | 125 |
| Vowel | 66 | 0.099 | 3658 | 64 | 0.099 | 2465 | 59 [2, 5, 5, 2, 8, 7, 3, 2, 9, 2, 14] | 0.099 | 188 |

m is the number of centers, and bracketed numbers are the number of centers for each class. E is the misclassification error rate and T is the computing time to construct the GRBFN.

[…] the information of the output class label. Finally, it is robust to data sets with non-convex distributions.

Experimental results show that the proposed method is competitive with previously reported approaches for RBF center selection. Other methods to improve the efficiency of RBF center selection, such as the homotopy method [5], [6], can also be investigated.

Acknowledgement. This work was supported by the Korea Research Foundation under grant number KRF-2003-041-D00608.

References

1. Mao, K.Z.: RBF Neural Network Center Selection Based on Fisher Ratio Class Separability Measure. IEEE Trans. Neural Networks, Vol. 13(5) (2002) 1211-1217

2. Gomm, J.B., Yu, D.L.: Selecting Radial Basis Function Network Centers with Recursive Orthogonal Least Squares Training. IEEE Trans. Neural Networks, Vol. 11(2) (2000) 306-314

3. Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice Hall, New York (1999)

4. Cover, T.M.: Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition. IEEE Trans. Electronic Computers, Vol. EC-14 (1965) 326-334

5. Lee, J., Chiang, H.-D.: Constructive Homotopy Methods for Finding All or Multiple DC Operating Points of Nonlinear Circuits and Systems. IEEE Trans. on Circuits and Systems - Part I, Vol. 48(1) (2001) 35-50