
Radial-Basis Function (RBF) Network

CHAPTER 7 RADIAL-BASIS FUNCTION (RBF) NETWORK

7.1 Radial-Basis Function (RBF) Network

This section explains the background of RBF networks, the mathematical background of RBF, and the incorporation of RBF into the artificial neural network family.

7.1.1 Radial-Basis Function (RBF)

Radial-basis function (RBF) was first established by Powell (1987) to solve inverse problems by interpolation, with the contribution of Micchelli (1986), who proved that the interpolation functions create a non-singular matrix. Powell described the motives of RBF study as: "When it is expensive to calculate an output value from input vectors, one may preserve all calculated function values, and then it may be suitable to use an interpolation procedure to construct an approximation to function f" (Powell, 1987, p. 146).

In the RBF algorithms, learning is viewed as a problem of hypersurface reconstruction from a set of data points that may be sparse (Haykin, 1999, p. 265). As in all learning problems, predictions tend to be unreliable when a data set is large but contains too little information. Tikhonov & Arsenin (1977) defined a problem to be "well-posed" if its solution exists, is unique, and is continuous. If any of these conditions is not satisfied, the problem is defined as "ill-posed".

Regularisation was proposed by Tikhonov to solve ill-posed problems (Haykin, 1999, p. 267). The idea rests on the assumption that the input-output function is "smooth", which means that similar inputs produce similar outputs (Haykin, 1999, p. 267). The regularised solution can be viewed as a superposition of N Gaussian functions (Haykin, 1999, p. 276).

7.1.2 Radial Basis Function (RBF) for Interpolation

Broomhead and Lowe (1988, p. 323) summarised an interpolation problem as to choose a function $s: \mathbb{R}^n \to \mathbb{R}$ which satisfies the interpolation conditions $s(x_i) = f_i$, $i = 1, 2, \ldots, m$, for a given set of $m$ distinct vectors (data points) $\{x_i;\ i = 1, 2, \ldots, m\}$ in $\mathbb{R}^n$ and $m$ real numbers $\{f_i;\ i = 1, 2, \ldots, m\}$; the function $s$ is constrained to go through the known data points.

RBF constructs a linear function space, which depends on the positions of the known data points according to an arbitrary distance measure (ibid.). A set of arbitrary basis functions $\phi(\lVert x - y_j \rVert)$ is introduced, where $x \in \mathbb{R}^n$ and $\lVert \cdot \rVert$ denotes a norm imposed on $\mathbb{R}^n$, here the Euclidean distance between a pair of n-by-1 vectors; $y_j \in \mathbb{R}^n$ are the centers of the RBF functions. In interpolation theory $\phi$ may take other forms (e.g., the multiquadric form, $\phi(r) = \sqrt{c^2 + r^2}$); for the RBF network, $\phi$ takes a Gaussian form (Broomhead and Lowe, 1988, p. 329).

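The interpolating function is (Broomhead and Lowe, 1988, p. 324):

$$s(x) = \sum_{j=1}^{m} \lambda_j \, \phi(\lVert x - y_j \rVert), \qquad x \in \mathbb{R}^n \tag{7.1}$$

where $\lambda_j$ are the coefficients to be determined.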

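After substituting the known data points into the interpolation conditions, the interpolation problem can be expressed as the following linear system (Broomhead and Lowe, 1988, p. 324):

$$\begin{pmatrix} f_1 \\ \vdots \\ f_m \end{pmatrix} = \begin{pmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & \ddots & \vdots \\ A_{m1} & \cdots & A_{mm} \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_m \end{pmatrix} \tag{7.2}$$

where $A_{ij} = \phi(\lVert x_i - y_j \rVert)$, $i, j = 1, 2, \ldots, m$, and $\lambda_j$ are the coefficients of the interpolating function.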

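The coefficients of the unique solution can be obtained by the least-squares method from the following equation (Broomhead and Lowe, 1988, p. 324):

$$\lambda = A^{+} f \tag{7.3}$$

where $f$ is the vector of the known values $f_i$, $\lambda$ is the vector of coefficients, and $A^{+}$ is the pseudo-inverse of the matrix $A$ with elements $A_{ij}$; when $A$ is square and non-singular, $A^{+} = A^{-1}$.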
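The interpolation procedure of this subsection can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: it assumes a unit-width Gaussian, $\phi(r) = e^{-r^2}$, places one center on every data point (so the matrix is square), and solves the linear system by plain Gaussian elimination. The function names and the four sample points are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def gaussian(r):
    """Gaussian basis function; the unit width is an assumption."""
    return math.exp(-r * r)

def solve(A, f):
    """Solve the square system A x = f by Gaussian elimination with pivoting."""
    m = len(f)
    M = [row[:] + [f[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * m
    for i in reversed(range(m)):  # back substitution
        tail = sum(M[i][c] * x[c] for c in range(i + 1, m))
        x[i] = (M[i][m] - tail) / M[i][i]
    return x

def fit_rbf(points, targets, phi=gaussian):
    """Build A_ij = phi(||x_i - y_j||) with y_j = x_j and solve for the coefficients."""
    A = [[phi(euclidean(xi, yj)) for yj in points] for xi in points]
    return solve(A, targets)

def interpolate(x, centers, coeffs, phi=gaussian):
    """Evaluate s(x) as the weighted sum of basis functions around each center."""
    return sum(l * phi(euclidean(x, y)) for l, y in zip(coeffs, centers))

# Hypothetical data: corners of the unit square with XOR-like target values.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 0.0]
coeffs = fit_rbf(points, targets)
fitted = [interpolate(p, points, coeffs) for p in points]
```

Because every data point is also a center, the fitted surface passes through all four target values up to rounding error, which is the interpolation condition stated at the start of this subsection. With fewer centers than data points the system is no longer square and a least-squares solution would be needed instead of the direct solve used here.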

7.1.3 RBF Method and Multi-Layered Neural Networks

Broomhead and Lowe (1988, p. 327) suggested that the interpolation mapping produced by RBF, as summarised in the previous section, has the form of a weighted sum over nonlinear functions, and that this naturally corresponds to a three-layer network, as illustrated in Fig. 7.1.

Figure 7.1 RBF-Equivalent Feed-forward Layered Network, showing the input layer, hidden layer, and output layer (Source: Broomhead & Lowe, 1988)

Regarding the structure, the input layer consists of $n$ nodes taking one n-dimensional vector; the hidden layer consists of $n_0$ nodes, each holding one n-dimensional vector; and the output layer produces $n'$ scalar values. Although the structure of the RBF network is similar to that of a multi-layer network, the computations between the input and hidden layers are fairly different. Firstly, the input layer receives the input signal as one n-dimensional vector, instead of n single scalar values. Secondly, the input to the hidden layer is the Euclidean distance between the input vector and each hidden neuron (or center), instead of a weighted sum. Thirdly, the transfer function of the hidden layer is a Gaussian function, instead of a sigmoid function.
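The contrast between the two kinds of hidden unit can be made concrete with a short Python sketch. It is illustrative only: the function names, the `width` parameter, and the sample numbers are assumptions introduced here, not taken from Broomhead & Lowe (1988).

```python
import math

def sigmoid_unit(x, w, bias):
    """MLP-style hidden unit: sigmoid of a weighted sum of scalar inputs."""
    s = sum(a * b for a, b in zip(x, w)) + bias
    return 1.0 / (1.0 + math.exp(-s))

def rbf_forward(x, centers, weights, width=1.0):
    """Forward pass of a three-layer RBF network.

    x       : input vector of length n, presented whole to every hidden node
    centers : n0 center vectors, each of length n
    weights : n' rows of n0 output weights
    width   : assumed common width of the Gaussian units
    """
    hidden = []
    for c in centers:
        # Hidden input is a distance to the center, not a weighted sum.
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))
        hidden.append(math.exp(-(dist / width) ** 2))
    # Each output is a plain weighted sum of the hidden activations.
    outputs = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
    return hidden, outputs

# Hypothetical network: n = 2 inputs, n0 = 2 centers, n' = 2 outputs.
centers = [(0.0, 0.0), (3.0, 4.0)]
weights = [[2.0, 0.0], [1.0, 1.0]]
hidden, outputs = rbf_forward((0.0, 0.0), centers, weights)
mlp_activation = sigmoid_unit((0.0, 0.0), (0.5, -0.5), 0.0)
```

An input that coincides with a center drives that hidden unit to its maximum of 1, while units with distant centers respond close to 0. The sigmoid unit, by comparison, responds to the position of the input relative to a hyperplane rather than to its distance from a point.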

The transfer function in the hidden layer acts more like a multi-dimensional geometric approximation; the algorithm used in this exercise is adopted from Demuth & Beale (1992), as described below:

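$$\phi(n) = \mathrm{radbas}(n) = e^{-n^2} \tag{7.4}$$

where

$n$ is the input to the radial basis transfer function, $n = b \times \lVert w - x \rVert$, with $w$ the center (weight) vector of the hidden neuron and $x$ the input vector; $\phi(n)$ is the output of the RBF transfer function,

$b$ is the bias of the radial basis function, $b = \sqrt{-\ln(0.5)}/\mathit{spread}$; $\mathit{spread}$ is also known as the width of the radial basis function,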

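$\lVert \cdot \rVert$ is the Euclidean distance, which for a pair of m-by-1 vectors $x_i$ and $x_j$ is defined as (Haykin, 1999, p. 27):

$$\lVert x_i - x_j \rVert = \left[ \sum_{k=1}^{m} (x_{ik} - x_{jk})^2 \right]^{1/2} \tag{7.5}$$

where $x_{ik}$ and $x_{jk}$ are the $k$-th elements of the input vectors $x_i$ and $x_j$.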
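The role of the bias can be checked with a small Python sketch of the radial basis transfer function and the Euclidean distance. The helper names and the sample vectors are hypothetical; the input to the transfer function is taken to be the bias multiplied by the distance between the hidden neuron's center and the input vector, as described in the text.

```python
import math

def radbas(n):
    """Radial basis transfer function, radbas(n) = exp(-n^2)."""
    return math.exp(-n * n)

def bias_from_spread(spread):
    """Bias b = sqrt(-ln(0.5)) / spread."""
    return math.sqrt(-math.log(0.5)) / spread

def euclidean(xi, xj):
    """Euclidean distance: square root of the summed squared element differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def hidden_output(x, center, spread):
    """Output of one hidden neuron for input x (center is an assumed name)."""
    n = bias_from_spread(spread) * euclidean(center, x)
    return radbas(n)

# Hypothetical center at (1, 2) with spread 0.5.
at_center = hidden_output((1.0, 2.0), (1.0, 2.0), spread=0.5)  # distance 0
at_spread = hidden_output((1.5, 2.0), (1.0, 2.0), spread=0.5)  # distance equals spread
```

With this choice of bias, the neuron outputs 1 when the input coincides with its center and 0.5 when the input lies exactly one spread away, since $e^{-(\sqrt{-\ln 0.5})^2} = 0.5$. This is why the spread can be read as the width of the radial basis function.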

In the application of RBF, "generalization is synonymous with interpolation between the data points with the interpolation being along the constrained surface generated by the fitting procedure" (Broomhead & Lowe, 1988, p. 325). The nonlinear mapping in RBF is reduced to a problem in linear algebra, which has a guaranteed learning algorithm (Broomhead & Lowe, 1988, p. 325), and this feature enables RBF to cure the problems caused by the imbalance between the amount of training data and the degrees of freedom in an MLP network.

Haykin described the approach of RBF in multi-dimensional interpolation as: "Learning is equivalent to finding a surface in a multidimensional space that provides a best fit to the training data..."; he compared the generalisation approach of layered networks to the approximation approach of RBF as: "Generalisation is equivalent to the use of multidimensional surface to interpolate the test data" (Haykin, 1999, p. 256).

A comprehensive history of the development of the RBF network can be found in Poggio & Girosi (1990).

Since RBF was originally developed from interpolation theory, it inherits that theory's approaches to building a smooth surface from sparse data, e.g., regularisation theory. Ill-posed problems are common in input-output mapping. One approach is to adopt a priori knowledge, which assumes that the mapping surfaces are "smooth" like a Gaussian function (Poggio & Girosi, 1990) and thereby recasts the task as a well-posed problem. In addition to over-fitting problems, this approach could be a remedy for a deficient database.
