
Transfer learning improves learning in one task (the target task) by transferring knowledge from one or more other tasks (the source tasks) [107, 108]. The benefit can appear as an improved initial condition for the learning process, an increased training rate or an improved asymptotic performance [107, 108]. Knowledge is transferred only from the source task to the target task, unlike in multi-task learning, where knowledge or information can be passed among all tasks [107, 109].

To avoid negative transfer, where the source task has a detrimental effect on the performance of the target task, the source and target tasks must be related. The source task's information must also be in a form that the target task can use; otherwise a “mapping” is needed [107]. It is therefore important to determine what information is to be transferred, how it can be transferred and when to transfer it [109].

Additional data or parameters from the source task can be transferred to the target task to improve its learning. For example, transfer learning has been used to improve the performance of neural networks by including additional output neurons in the network during training [110, 111]. Each additional output neuron introduces connections back to the common hidden neurons, and these connections influence the other connections in the network through the backward phase of the training algorithm. The extra output is removed after training. Improvements were seen on the training data for the target task [111], as well as an increase in convergence speed [110]. However, the approach can introduce negative transfer at later training epochs. Additional time series data sets have also been used to help with a multi-step ahead prediction task. The source tasks are trained up to the final prediction horizon and these models are then used to help train the target task, which is trained for fewer steps. After training, the target task is used to predict up to the prediction horizon [112]. Finally, weight values from other networks have been used as the starting point for training another network [113].
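The extra-output mechanism described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: a single tanh hidden layer, linear outputs, full-batch gradient descent and a synthetic pair of related tasks. The function names `train_with_auxiliary_output` and `predict_target`, and all hyperparameter values, are hypothetical and are not taken from [110, 111].

```python
import numpy as np

def train_with_auxiliary_output(X, y_target, y_source, n_hidden=8,
                                epochs=500, lr=0.05, seed=0):
    """Train a one-hidden-layer network with a target and an auxiliary output.

    Returns the weights with the auxiliary output removed, together with the
    target-task mean squared error recorded at every epoch.
    """
    rng = np.random.default_rng(seed)
    Y = np.column_stack([y_target, y_source])   # column 0: target, column 1: source
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, 2))
    b2 = np.zeros(2)
    target_mse = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                # hidden layer shared by both outputs
        err = H @ W2 + b2 - Y
        target_mse.append(float(np.mean(err[:, 0] ** 2)))
        # Backward phase: the errors of BOTH outputs flow into the shared hidden weights.
        grad_out = 2.0 * err / len(X)
        grad_H = (grad_out @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ grad_out)
        b2 -= lr * grad_out.sum(axis=0)
        W1 -= lr * (X.T @ grad_H)
        b1 -= lr * grad_H.sum(axis=0)
    # Remove the auxiliary output: keep only the column that serves the target task.
    return (W1, b1, W2[:, :1].copy(), b2[:1].copy()), target_mse

def predict_target(weights, X):
    """Evaluate the trained network on the target task only."""
    W1, b1, W2, b2 = weights
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Synthetic, related tasks (an assumption made purely for this sketch).
x = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
y_t = np.tanh(2.0 * x[:, 0])
y_s = y_t + 0.1 * x[:, 0]
weights, mse = train_with_auxiliary_output(x, y_t, y_s)
```

The sketch only shows where the auxiliary output enters training and where it is discarded; whether it helps or harms the target task depends on how closely the two tasks are related.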

It is important to determine the goals of the transfer method and the metric that will be used to judge whether the transfer has been a success [108]. This metric should include a decision on whether the computational cost of the source task is ignored or included in the total cost of developing the solution for the target task. For example, if the target task can be learnt on its own, without the source task, how does the time taken compare with the time taken when learning with the source task?
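The effect of this decision can be made concrete with a small bookkeeping example. The function name `transfer_time_saving` and the timings below are hypothetical, chosen only to show that the same transfer can look worthwhile or not depending on whether the source cost is counted.

```python
def transfer_time_saving(time_target_alone, time_target_with_transfer,
                         time_source, include_source_cost=True):
    """Return the time saved by transfer; a negative value means transfer cost more."""
    total_with_transfer = time_target_with_transfer
    if include_source_cost:
        total_with_transfer += time_source  # count the cost of learning the source task
    return time_target_alone - total_with_transfer

# Hypothetical timings in hours: the target alone takes 10, with transfer 6,
# and the source task itself takes 5 to learn.
saving_ignoring_source = transfer_time_saving(10.0, 6.0, 5.0, include_source_cost=False)
saving_including_source = transfer_time_saving(10.0, 6.0, 5.0, include_source_cost=True)
```

With these illustrative numbers the transfer saves 4 hours if the source cost is ignored, but costs 1 hour more overall if it is included.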

2.7 Summary

This chapter has discussed the background to the complex and challenging nature of aerodynamic optimisation tasks. Because evaluating aerodynamic performance using CFD is computationally expensive, surrogate models can be adopted to imitate CFD and so reduce the computation time of an aerodynamic optimisation search. Examples of different surrogate model types, and of where they have been used in aerodynamic optimisation tasks, have also been discussed.

A convergence based surrogate, which is an alternative to the traditional surrogates that map from the design space to the objective space, was introduced. This type of surrogate will be investigated further and forms the basis of the thesis. It has been selected because it does not rely on an initial sample of data points and has been shown to reduce computation time by using fewer CFD iterations. Previous work, however, has used single models to predict convergence, so ensembles of surrogates will be used here to provide confident predictions. Details of this modelling approach were also presented.

Chapters 3 and 4 introduce two novel convergence prediction methods. Because only a single data set is available for each CFD convergence history, diversity is created in Chapter 3 by using heterogeneous ensemble members with different input structures. A gradient based learning algorithm is used and training is monitored to ensure that suitable models are created. The method introduced in Chapter 4 uses a hybrid multi-objective evolutionary algorithm (H-MOEA) to optimise the network structures, as well as to train the networks. The result of this approach is a Pareto set of networks that can be used to construct an ensemble. Selection of ensemble members is considered for both methods.

Transfer learning was also introduced in this chapter as a method for improving surrogate learning and prediction performance. Although it is not the focus of the research in this thesis, Chapter 5 details a surrogate model that incorporates transfer learning to improve the learning and prediction of a more traditional parameter based surrogate. This work has been conducted to investigate a transfer learning methodology, and to allow the performance of the convergence based method to be compared with that of this more traditional methodology on a common data set.

Chapter 3

Ensemble of Heterogeneous RNN for Convergence Prediction¹

3.1 Introduction

Due to the expensive evaluation processes needed to determine aerodynamic performance (e.g. CFD), there can be a lack of data available to guide an aerodynamic optimisation task. To reduce the computational expense, a surrogate model can be used to imitate the performance evaluation process. As discussed in Chapter 2, the majority of the surrogate methods used during aerodynamic optimisation tasks rely on evaluated designs, which are then used to build the surrogate models. An alternative approach is to build a surrogate model based on the intermediate data of the expensive evaluation process and to predict the converged value by projecting the trend of the CFD convergence history. By learning the characteristics of the partially converged CFD data, the computational cost of each design evaluation can be reduced, potentially allowing more of the design space to be explored, or the same amount to be explored in less time.
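The idea of projecting a partial convergence history to its converged value can be illustrated with a deliberately simple stand-in for the learned surrogate. The sketch below assumes, purely for illustration, that the history converges geometrically, y_t = a + b r^t, and applies Aitken's delta-squared formula to three equally spaced samples to estimate the limit a. It is not the method developed in this chapter, and the function name `extrapolate_converged_value` and the synthetic history are hypothetical.

```python
import numpy as np

def extrapolate_converged_value(history):
    """Estimate the converged value from a partial convergence history.

    Assumes geometric convergence y_t = a + b * r**t (an illustrative
    assumption) and applies Aitken's delta-squared formula to three
    equally spaced samples ending at the latest iteration.
    """
    y = np.asarray(history, dtype=float)
    n = len(y)
    if n < 3:
        raise ValueError("need at least three iterations")
    half = (n - 1) // 2                          # spacing between the three samples
    y0, y1, y2 = y[n - 1 - 2 * half], y[n - 1 - half], y[n - 1]
    denom = y0 + y2 - 2.0 * y1
    if abs(denom) < 1e-15:
        return float(y2)                         # no curvature: fall back to the last value
    return float(y2 - (y2 - y1) ** 2 / denom)

# Synthetic history that converges to 0.3; only the first 15 iterations are "run".
iterations = np.arange(40)
full_history = 0.3 + 0.5 * 0.9 ** iterations
partial_history = full_history[:15]
estimate = extrapolate_converged_value(partial_history)
```

On this synthetic history the last available value is still well above the limit, while the extrapolated estimate recovers it. Real CFD histories are rarely this well behaved, which is why a learned model is used instead of a fixed functional form.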

This chapter introduces the first of the convergence based methods for predicting aerodynamic performance from CFD intermediate data. A methodology that constructs an ensemble of heterogeneous models has been developed, as it has been shown that an ensemble of predictors can reduce the effect of any accumulated errors and provide confidence in the predictions.
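A minimal sketch of how an ensemble can both damp individual errors and indicate confidence is to average the member predictions and report their spread. The equal weighting, the use of the sample standard deviation as a confidence indicator, and the function name `combine_ensemble` are assumptions made for this illustration; the combination rule used later in the chapter may differ.

```python
import numpy as np

def combine_ensemble(member_predictions):
    """Combine member predictions of shape (n_members, n_steps).

    Returns the equally weighted mean prediction and the sample standard
    deviation across members at each step.
    """
    preds = np.asarray(member_predictions, dtype=float)
    if preds.ndim != 2 or preds.shape[0] < 2:
        raise ValueError("expected at least two members, shape (n_members, n_steps)")
    mean = preds.mean(axis=0)
    spread = preds.std(axis=0, ddof=1)   # larger spread suggests lower confidence
    return mean, spread

# Three hypothetical members predicting two future values each.
mean, spread = combine_ensemble([[0.31, 0.30],
                                 [0.29, 0.30],
                                 [0.30, 0.30]])
```

In this example the members disagree on the first value and agree exactly on the second, so the spread is non-zero only for the first.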

¹ Some of the contents of this chapter have been published in the conference paper: C. Smith, J. Doherty, and Y. Jin, “Recurrent neural network ensembles for convergence prediction in surrogate-assisted evolutionary optimization”, Computational Intelligence in Dynamic and Uncertain Environments (CIDUE), 2013 IEEE SSCI, Singapore, pp. 916, April 2013.

Intermediate CFD convergence data can be considered a univariate series, as the performance information changes at each iteration, which is equivalent to a unit of time. Prediction of CFD convergence data is therefore similar to time series forecasting, which projects time series data into the future [73]. However, predicting a CFD convergence history is not the same as predicting a standard time series data set, as it is concerned with long-term prediction rather than the single step prediction typical of standard time series tasks. Therefore, information is first given on time series prediction, followed by techniques for long-term prediction tasks.
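The difference between single step and long-term prediction can be sketched with an iterated forecast: a one-step model is fitted to the available history, and its own outputs are then fed back as inputs to reach a distant horizon. The sketch below uses a linear autoregressive model fitted by least squares as a stand-in for the recurrent networks used in this chapter. The function names `fit_one_step_model` and `iterate_forecast` and the synthetic history are hypothetical.

```python
import numpy as np

def fit_one_step_model(history, order=1):
    """Fit y_t = c + w_1 * y_{t-order} + ... + w_order * y_{t-1} by least squares."""
    y = np.asarray(history, dtype=float)
    if len(y) <= order:
        raise ValueError("history is too short for the requested order")
    # Each row holds an intercept term followed by the `order` previous values.
    rows = [np.concatenate(([1.0], y[t - order:t])) for t in range(order, len(y))]
    coeffs, *_ = np.linalg.lstsq(np.array(rows), y[order:], rcond=None)
    return coeffs

def iterate_forecast(history, coeffs, n_steps):
    """Predict n_steps ahead by repeatedly applying the one-step model."""
    order = len(coeffs) - 1
    window = list(np.asarray(history, dtype=float)[-order:])
    forecast = []
    for _ in range(n_steps):
        next_value = float(coeffs[0] + np.dot(coeffs[1:], window))
        forecast.append(next_value)
        window = window[1:] + [next_value]   # feed the prediction back in as an input
    return np.array(forecast)

# Synthetic convergence history; the model only sees the first 15 iterations.
iterations = np.arange(40)
full_history = 0.3 + 0.5 * 0.9 ** iterations
partial_history = full_history[:15]
coeffs = fit_one_step_model(partial_history, order=1)
forecast = iterate_forecast(partial_history, coeffs, n_steps=25)
```

Because every predicted value becomes an input to the next prediction, any one-step error is carried forward and can accumulate over the horizon, which is the difficulty that separates long-term prediction from standard single step forecasting.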

As will be shown, recurrent neural networks perform well on time series forecasting tasks, so specific information on this type of surrogate model is then given. The methodology and the details of the first real-world CFD data set are provided next. Finally, the results are presented and discussed, including how they can be improved by selecting different models to construct the ensemble.