• No results found

7 Generator Functions and Gradient Descent Learning

The effect of the generator function on gradient descent learning al- gorithms developed for reformulated RBF neural networks essentially relates to the criteria established in Section 5 for selecting generator functions. These criteria were established on the basis of the response h

j;k of the

jth radial basis function to an input vector x

k and the norm of the gradient r

x k

h

j;k, that can be used to measure the sensitivity of the radial basis function response h

j;k to an input vector x k. Since r x k h j;k = r vj h j;k, (46) gives kr v j h j;k k 2 =kx k v j k 2 2 j;k : (71)

According to (71), the quantitykx k v j k 2 2 j;k

can also be used to mea- sure the sensitivity of the response of the jth radial basis function to changes in the prototypev

j that represents its location in the input space. The gradient descent learning algorithms presented in Section 6 attempt to train a RBF neural network to implement a desired input-output map- ping by producing incremental changes of its adjustable parameters, i.e., the output weights and the prototypes. If the responses of the radial ba- sis functions are not substantially affected by incremental changes of the

the output weights and eventually the algorithm trains a single-layered neural network. Given the limitations of single-layered neural networks [28], such updates alone are unlikely to implement non-trivial input- output mappings. Thus, the ability of the network to implement a desired input-output mapping depends to a large extent on the sensitivity of the responses of the radial basis functions to incremental changes of their corresponding prototypes. This discussion indicates that the sensitivity measure kr v j h j;k k 2

is relevant to gradient descent learning algorithms developed for reformulated RBF neural networks. Moreover, the form of this sensitivity measure in (71) underlines the significant role of the gen- erator function, whose selection affects kr

v j h j;k k 2 as indicated by the definition of

j;k in (48). The effect of the generator function on gradient descent learning is revealed by comparing the responseh

j;k and the sen- sitivity measure kr v j h j;k k 2 = kr x k h j;k k 2

corresponding to the linear and exponential generator functions.

According to Figures 2 and 3, the response h

j;k of the

jth radial basis function to the input x

k diminishes very slowly outside the blind spot, i.e., askx k v j k 2

increases aboveB. This implies that the training vec- torx

k has a non-negligible effect on the response h

j;k of the radial basis function located at this prototype. The behavior of the sensitivity mea- sure kr vj h j;k k 2

outside the blind spot indicates that the update of the prototypev

j produces significant variations in the input of the upper as- sociative network, which is trained to implement the desired input-output mapping by updating the output weights.Figures 2and3also reveal the trade-off involved in the selection of the free parameterÆ in practice. As the value of Æ increases, kr

v j h j;k k 2

attains significantly higher values. This implies that the jth radial basis function is more sensitive to up- dates of the prototypev

j due to input vectors outside its blind spot. The blind spot shrinks as the value ofÆincreases butkr

v j h j;k k 2 approaches 0 quickly outside the blind spot, i.e., as the value ofkx

k v j k 2 increases aboveB. This implies that the receptive fields located at the prototypes shrink, which can have a negative impact on the gradient descent learn- ing. Increasing the value of Æ can also affect the number of radial ba- sis functions required for the implementation of the desired input-output mapping. This is due to the fact that more radial basis functions are re- quired to cover the input space. The receptive fields located at the proto- types can be expanded by decreasing the value of . However, 2

becomes flat as the value ofÆdecreases. This implies that very small val- ues ofÆ can decrease the sensitivity of the radial basis functions to the input vectors included in their receptive fields.

According toFigures 4 and5, the response of thejth radial basis func- tion to the inputx

k diminishes very quickly outside the blind spot, i.e., askx k v j k 2

increases aboveB. This behavior indicates that if a RBF network is constructed using exponential generator functions, the inputs x

kcorresponding to high values of kr v j h j;k k 2

have no significant effect on the response of the radial basis function located at the prototypev

j. As a result, the update of this prototype due tox

k does not produce sig- nificant variations in the input of the upper associative network that im- plements the desired input-output mapping. Figures 4 and 5 also indicate that the blind spot shrinks as the value of increases whilekr

v j h j;k k 2 reaches higher values. Decreasing the value of expands the blind spot butkr v j h j;k k 2

reaches lower values. In other words, the selection of the value of in practice involves a trade-off similar to that associated with the selection of the free parameterÆ when the radial basis functions are formed by linear generator functions.