
Neural network based model design by MOGA


2.7. Neural network based model design by MOGA

The problem of designing a neural network based model can be divided into two sub-problems, as follows [4]:

• Neural network structure: It denotes the network inputs, the number of hidden layers, and the number of neurons in each layer.

• Neural network parameters: They depend on the model chosen and are usually determined by a suitable learning algorithm.

Since the RBFNN models considered in this thesis were designed by a MOGA, the remainder of this section details the application of the MOGA to the design of RBFNN models for classification and regression problems.

The output of an RBFNN model is given by Eq. (2.57):

$$o[k] = w_{l+1} + \sum_{m=1}^{l} w_m \, e^{-\frac{\|\mathbf{i}_j[k] - \mathbf{C}(m)\|_2^2}{2\sigma_m^2}} \tag{2.57}$$

In Eq. (2.57), $o[k]$ and $\mathbf{i}_j[k]$ denote the model output and the $j$th input at time instant $k$, respectively. $\mathbf{w}$ represents the vector of the linear weights, $\mathbf{C}(m)$ refers to the vector (extracted from the $\mathbf{C}$ matrix) of the center associated with the $m$th hidden neuron, $\sigma_m$ is its corresponding spread, and $\|\cdot\|_2$ represents the Euclidean distance. The network parameters, which will be denoted as the parameter vector $\mathbf{p}$, are therefore $\mathbf{C}$, $\boldsymbol{\sigma}$ and $\mathbf{w}$. In order to design an RBFNN model that satisfies a set of defined goals, it is necessary to define a set of quality measures in the form of objectives for each sub-problem mentioned above.
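As an illustration of Eq. (2.57), the following minimal sketch evaluates the model output for a single input vector. It assumes the usual Gaussian basis function with a negative exponent; the function name `rbfnn_output` and the convention of storing the bias $w_{l+1}$ as the last element of the weight list are choices made here for illustration only.

```python
import math


def rbfnn_output(x, centers, spreads, weights):
    """Evaluate Eq. (2.57) for one input vector x.

    centers: list of l center vectors C(m)
    spreads: list of l spreads sigma_m
    weights: list of l + 1 linear weights; the last one is the bias w_{l+1}
             (storage order is an assumption of this sketch)
    """
    if not (len(centers) == len(spreads) == len(weights) - 1):
        raise ValueError("expected l centers, l spreads and l + 1 weights")
    output = weights[-1]  # bias term w_{l+1}
    # zip stops after the l hidden neurons, so the bias is not reused here
    for c, sigma, w in zip(centers, spreads, weights):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        output += w * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return output
```

When the input coincides with a center, the corresponding basis function evaluates to 1, so a single-neuron model returns $w_1 + w_2$ at that point.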

Assume that $\mathbf{D} = (\mathbf{X}, \mathbf{y})$ is a data set composed of $N$ input-output pairs, which is divided into a training set, $\mathbf{D}_t$, a generalization or testing set, $\mathbf{D}_g$, and a validation set, $\mathbf{D}_v$. Assume also that $F$ is the set of all possible input features (delayed values of the modeled and exogenous variables in time-series regression problems). The problem of designing an RBFNN model by MOGA can be expressed as follows:

The data set $\mathbf{D}$, the allowed range $d \in [d_m, d_M]$ of input features from $F$, and the range $n \in [n_m, n_M]$ of hidden neurons are given as design parameters to the MOGA. After the

$\mu_p$ and $\mu_s$ denote sets of objectives related to the RBFNN's parameters $\mathbf{p}$ and its structure, respectively. $\mu_s$ includes only one objective,

$$\mu_s = [O(\mu)] \tag{2.58}$$

which denotes the model complexity, a function of the number of input features and the number of hidden neurons.

Since the specification of $\mu_p$ is different in the classes of problems considered, the following subsections address the specification of $\mu_p$ for each class.

2.7.1. Specification of $\mu_p$ in classification problems

In classification problems, we are mainly interested in minimizing the $FP$ and $FN$ criteria (see Section 2.5). Hence, the corresponding objectives for $\mu_p$ are:

$$\mu_p = [FP_{\mathbf{D}_t}, FN_{\mathbf{D}_t}, FP_{\mathbf{D}_g}, FN_{\mathbf{D}_g}] \tag{2.59}$$

where $FP_{\mathbf{D}_t}$ and $FN_{\mathbf{D}_t}$ denote the $FP$ and $FN$ on the training set $\mathbf{D}_t$, respectively. Similarly, $FP_{\mathbf{D}_g}$ and $FN_{\mathbf{D}_g}$ refer to the $FP$ and $FN$ on the testing set $\mathbf{D}_g$, respectively.
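The objective vector of Eq. (2.59) can be sketched as follows. This sketch assumes binary labels (1 for the positive class, 0 for the negative class) and treats FP and FN as plain counts of false positives and false negatives; Section 2.5 may define them differently (for example, as rates), in which case only `fp_fn` would change. The function names are hypothetical.

```python
def fp_fn(y_true, y_pred):
    """Count false positives and false negatives (labels: 1 = positive, 0 = negative)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp, fn


def classification_objectives(train, test):
    """Return [FP_Dt, FN_Dt, FP_Dg, FN_Dg] as in Eq. (2.59).

    train and test are (y_true, y_pred) pairs for the training set D_t
    and the testing set D_g, respectively.
    """
    fp_t, fn_t = fp_fn(*train)
    fp_g, fn_g = fp_fn(*test)
    return [fp_t, fn_t, fp_g, fn_g]
```

All four entries are to be minimized, so a model that trades a lower FP for a higher FN is not automatically better or worse than another; both may be non-dominated.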

2.7.2. Specification of $\mu_p$ in regression problems

The specification of $\mu_p$ for regression problems relies on the minimization of the error between the model outputs and the desired values. Therefore, the corresponding objectives for $\mu_p$ are defined as:

$$\mu_p = [\varepsilon(\mathbf{D}_t), \varepsilon(\mathbf{D}_g)] \tag{2.60}$$

where $\varepsilon(\mathbf{D}_t)$ and $\varepsilon(\mathbf{D}_g)$ denote the Root Mean Square Errors (RMSE) of the model on the training set $\mathbf{D}_t$ and the testing set $\mathbf{D}_g$, respectively.
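A minimal sketch of the two regression objectives in Eq. (2.60) is given below. The function names are illustrative, and each data set is assumed to be available as a pair of equal-length sequences (desired values, model outputs).

```python
import math


def rmse(targets, outputs):
    """Root mean square error between desired values and model outputs."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and of equal length")
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets))


def regression_objectives(train, test):
    """Return [eps(D_t), eps(D_g)] as in Eq. (2.60).

    train and test are (targets, outputs) pairs for D_t and D_g.
    """
    return [rmse(*train), rmse(*test)]
```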

2.7.2.1. Specification of $\mu_p$ in time series prediction problems

Regarding time series prediction problems, the basic objectives specified for regression problems are also taken into account. Besides these, an additional objective, $\varepsilon(\mathbf{D}_s, PH)$, is included:

$$\mu_p = [\varepsilon(\mathbf{D}_t), \varepsilon(\mathbf{D}_g), \varepsilon(\mathbf{D}_s, PH)] \tag{2.61}$$

To understand $\varepsilon(\mathbf{D}_s, PH)$, assume that $\mathbf{E}(\mathbf{D}_s, PH)$ is an error matrix defined over the simulation set $\mathbf{D}_s$, as expressed in Eq. (2.62), where $\mathbf{D}_s$ is composed of a number of samples that are consecutive in time.

$$\mathbf{E}(\mathbf{D}_s, PH) = \begin{bmatrix} e[1,1] & e[1,2] & \cdots & e[1,PH] \\ e[2,1] & e[2,2] & \cdots & e[2,PH] \\ \vdots & \vdots & \ddots & \vdots \\ e[m-PH,1] & e[m-PH,2] & \cdots & e[m-PH,PH] \end{bmatrix} \tag{2.62}$$

where $e[i,j]$ is the model prediction error taken from instant $i$ of $\mathbf{D}_s$ at step $j$ within the prediction horizon $PH$. Denoting by $\rho(\cdot, i)$ the RMS function operating over the $i$th column of its argument matrix, $\varepsilon(\mathbf{D}_s, PH)$ is defined as:

$$\varepsilon(\mathbf{D}_s, PH) = \sum_{i=1}^{PH} \rho(\mathbf{E}(\mathbf{D}_s, PH), i) \tag{2.63}$$

This value is proportional to the area below the curve defined by $\rho(\mathbf{E}(\mathbf{D}_s, PH), i)$ for $i$ within the prediction horizon, reflecting the model accuracy over the complete prediction horizon for the data set considered.
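The computation in Eq. (2.63) can be sketched as below, assuming the error matrix of Eq. (2.62) is already available as a list of rows (one row per starting instant, one column per prediction step). The function names are illustrative; how the multi-step prediction errors themselves are generated is outside the scope of this sketch.

```python
import math


def column_rms(error_matrix, i):
    """rho(E, i): RMS over the i-th column of E (i is 1-based, as in the text)."""
    column = [row[i - 1] for row in error_matrix]
    return math.sqrt(sum(e * e for e in column) / len(column))


def horizon_error(error_matrix):
    """eps(D_s, PH) of Eq. (2.63): sum of the column-wise RMS values of E."""
    ph = len(error_matrix[0])  # prediction horizon = number of columns
    return sum(column_rms(error_matrix, i) for i in range(1, ph + 1))
```

Each term of the sum is the RMS error at one prediction step, so a model whose error grows quickly with the step index is penalized even if its one-step-ahead error is small.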

2.7.3. Model representation in MOGA

Each RBFNN model in the population has a chromosome representation consisting of two components. The first corresponds to the number of hidden neurons and the second to a string of integers, each one representing the index of a particular feature in $F$. The chromosome representation is shown in Fig. 2.17.


Fig. 2.17. Chromosome representation in MOGA.
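The two-component chromosome can be pictured with the following sketch. The class and function names are hypothetical, feature indices are taken as 0-based, and it is assumed that a chromosome does not repeat a feature index; none of these details are stated in the text.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Chromosome:
    n_neurons: int          # first component: number of hidden neurons
    feature_idx: List[int]  # second component: indices of features in F


def random_chromosome(n_range, d_range, n_features, rng=random):
    """Draw a chromosome with n in [n_m, n_M] neurons and d in [d_m, d_M] features."""
    n_m, n_max = n_range
    d_m, d_max = d_range
    n = rng.randint(n_m, n_max)  # randint is inclusive on both ends
    d = rng.randint(d_m, d_max)
    # Sample d distinct feature indices (distinctness is an assumption here)
    return Chromosome(n, sorted(rng.sample(range(n_features), d)))


def select_inputs(chromosome, feature_row):
    """Pick, from one row of all candidate features, those the chromosome encodes."""
    return [feature_row[i] for i in chromosome.feature_idx]
```

For example, `select_inputs(Chromosome(3, [0, 2]), [10, 20, 30])` keeps the first and third candidate features of the row.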

Before being evaluated in the MOGA, each model has its parameters determined by a Levenberg-Marquardt algorithm [32, 33] minimizing the error criterion in Eq. (2.38) that exploits the linear-nonlinear relationship of the RBFNN model parameters [34, 50]. The initial values of the nonlinear parameters ($\mathbf{C}$ and $\boldsymbol{\sigma}$) are chosen randomly or by the use of a clustering algorithm, $\mathbf{w}$ is determined as a linear least-squares solution, and the procedure is terminated using the early-stopping approach [17] within a maximum number of iterations.
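The linear step mentioned above (obtaining the linear weights by least squares once the centers and spreads are fixed) can be illustrated with the sketch below. It is not the Levenberg-Marquardt procedure of the thesis: it covers only the linear sub-problem, assumes Gaussian basis functions, and solves the normal equations by Gaussian elimination for the sake of a dependency-free example. A QR- or SVD-based solver would normally be preferred for numerical robustness. All function names are illustrative.

```python
import math


def design_matrix(inputs, centers, spreads):
    """One row per sample: Gaussian basis outputs followed by 1.0 for the bias."""
    rows = []
    for x in inputs:
        row = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * s ** 2))
               for c, s in zip(centers, spreads)]
        rows.append(row + [1.0])
    return rows


def solve(a, b):
    """Solve the square system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        w[r] = (m[r][n] - sum(m[r][k] * w[k] for k in range(r + 1, n))) / m[r][r]
    return w


def linear_weights(inputs, targets, centers, spreads):
    """Least-squares weights [w_1, ..., w_l, w_{l+1}] via the normal equations."""
    phi = design_matrix(inputs, centers, spreads)
    cols = len(phi[0])
    gram = [[sum(r[i] * r[j] for r in phi) for j in range(cols)] for i in range(cols)]
    rhs = [sum(r[i] * t for r, t in zip(phi, targets)) for i in range(cols)]
    return solve(gram, rhs)
```

Because the weights enter the model output linearly, this step has a closed-form solution, which is why only the centers and spreads need to be handled by the iterative nonlinear search.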

2.7.4. Model design cycle

There are three main actions in the model design cycle: problem definition, solution(s) generation and analysis of results. In the problem definition stage, the data sets, the ranges of features and neurons, and the objectives are defined. After this stage, the MOGA execution performs a search to obtain models that satisfy the predefined objectives and goals. In the third stage, the models obtained by the MOGA that lie in the Pareto front are analyzed. For this purpose, the performance of the models on the validation set (not involved in the training) is also considered and is of paramount importance. If good solutions are found, the process stops. Otherwise, based on the analysis of results, the search space can be reduced and/or the objectives and goals can be redefined, thereby restricting the trade-off surface coverage. A more detailed description of the application of the MOGA to the design of ANN models can be found, for instance, in [4, 24].
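The selection of the models that lie in the Pareto front can be sketched as a plain non-dominated filter over the objective vectors, assuming every objective is minimized. This sketch uses standard Pareto dominance only; it does not model the goals or priorities that the MOGA may attach to individual objectives, and the function names are illustrative.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(objectives):
    """Return the indices of the non-dominated objective vectors (minimization)."""
    return [
        i for i, a in enumerate(objectives)
        if not any(dominates(b, a) for j, b in enumerate(objectives) if j != i)
    ]
```

The indices returned identify the candidate models whose validation-set performance would then be inspected in the analysis stage.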
