CHAPTER 7 GENERAL DISCUSSION

7.3 Self-Organized incremental model

In our work we have developed several models based on the Takagi-Sugeno neuro-fuzzy model. All of them are capable of online incremental learning. However, our aim was to find a model that does not rely on knowledge of the data that cannot be known ahead of time. Thus, we have adapted one of our models, ARTIST, into a self-adapting version. Here, all the structures, i.e. the rules, are organized automatically in order to follow the adaptive nature of all the parameters of ARTIST.

Besides neuro-fuzzy modeling, both of these models are based on the ART-2A neural network. This network has two hyper-parameters, vigilance and learning rate. The former decides whether a new cluster, in our case a fuzzy rule, is created; the latter sets the 'speed' at which the parameters used to derive the similarity to these clusters are adapted. We have developed formulas to adapt these parameters based on the responses of the model to the data and on its predictions.
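The roles of the two hyper-parameters can be illustrated with a minimal sketch of a generic ART-2A-style clustering step. This is not the ARTIST or SO-ARTIST implementation: the function names are hypothetical, vigilance and learning rate are kept fixed here (whereas the self-organized model adapts them), and vigilance is assumed to lie in [0, 1].

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean length (ART-2A works on normalized inputs)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def art2a_step(prototypes, sample, vigilance, learning_rate):
    """Present one sample and return the index of the cluster that absorbed it.

    `prototypes` is modified in place: either the winning prototype is moved
    toward the sample, or a new prototype (a new fuzzy rule) is appended.
    """
    x = normalize(sample)
    # For unit vectors the dot product is the cosine similarity.
    sims = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in prototypes]
    if sims:
        winner = max(range(len(sims)), key=sims.__getitem__)
        if sims[winner] >= vigilance:
            # Resonance: blend the winner with the input, then re-normalize.
            w = prototypes[winner]
            blended = [(1.0 - learning_rate) * w_i + learning_rate * x_i
                       for w_i, x_i in zip(w, x)]
            prototypes[winner] = normalize(blended)
            return winner
    # No prototype is similar enough: create a new cluster.
    prototypes.append(x)
    return len(prototypes) - 1


prototypes = []
stream = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [art2a_step(prototypes, s, vigilance=0.9, learning_rate=0.1) for s in stream]
print(labels)  # [0, 0, 1, 1]
```

A higher vigilance makes the similarity test stricter and therefore produces more clusters; with vigilance 0.999 the same four samples would each start their own cluster. This sensitivity is one reason why a fixed value is hard to choose without prior knowledge of the data.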

Since these parameters, especially the vigilance parameter, are responsible for the creation of new rules, it was necessary to find a way for the rules to re-organize. Thus, we have developed methods that allow rules to be merged, split and discarded, based on the changing parameters and the needs of the model to adapt.

All these solutions have contributed to the development of a self-organized model that does not rely on information about unseen data and can adapt itself to dynamic changes in the environment. However, they have also contributed to the complexity of the model and thus to its processing time.

7.4 The problem of missing data

As we have stated in the objectives of this thesis, in online learning the data are introduced in a sequence. Thus, at the beginning of learning the model lacks data, which results in higher variance and over-fitting. This is because the model learns to recognize specific examples rather than their expected, and thus more general, value. To increase the generalization power of the model at the beginning of the learning process, as well as after a new class is introduced, we generate synthetic data so that the parameters derived for each class are more representative and include some of the variance within the data.

In our work we have developed a framework that incorporates the generation of synthetic data into the online process in such a way that it does not create blocks of data of the same kind. At the same time, we have estimated the number of synthetic samples that need to be generated and the time span of their generation.
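One way to avoid blocks of same-class data is to route synthetic samples through a shared buffer and draw from it at random. The sketch below illustrates that idea under stated assumptions and is not the framework from the thesis: `jitter` is a hypothetical stand-in generator that adds Gaussian noise (it is not the handwriting-specific generator used in our work), and `n_synthetic` and `draws_per_step` are illustrative parameters.

```python
import random


def jitter(sample, rng, scale=0.05):
    """Hypothetical stand-in generator: Gaussian noise around a real sample."""
    return [x + rng.gauss(0.0, scale) for x in sample]


def interleave_with_synthetic(stream, n_synthetic, rng, draws_per_step=1):
    """Yield (features, label, is_synthetic) triples.

    Each real sample is yielded immediately. Its synthetic variants go into a
    shared buffer, and a few buffer entries are drawn at random after each
    real sample, so variants of one class are spread over time.
    """
    buffer = []
    for features, label in stream:
        yield features, label, False
        buffer.extend((jitter(features, rng), label, True)
                      for _ in range(n_synthetic))
        for _ in range(min(draws_per_step, len(buffer))):
            idx = rng.randrange(len(buffer))
            # Swap the drawn entry to the end so removal is O(1).
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Drain whatever is left once the real stream ends.
    rng.shuffle(buffer)
    yield from buffer


rng = random.Random(0)
stream = [([0.0, 0.0], "a"), ([1.0, 1.0], "b")]
mixed = list(interleave_with_synthetic(stream, n_synthetic=3, rng=rng))
```

In this sketch, `n_synthetic` corresponds to the number of synthetic samples generated per real sample and `draws_per_step` controls over how long a time span they are released; the values above are arbitrary.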

Using the synthetic data, we were able to avoid the problem of poor generalization at the beginning of the process, even more so in combination with our Elastic Memory Learning, which has helped to handle the amount of synthetic data to be used. However, this solution is partially restricted to handwritten gesture recognition, since the synthetic data generation is designed solely for handwriting.

We were able to tackle all the problems we posed within online learning, as well as to address all the objectives that arose from these problems. Our work has led to a number of publications in international journals and conferences that have shown its usefulness.

Finding a similarity metric that deals with the accuracy vs processing time problem is essential for online learning, especially for real-time processing. Many approaches choose either high performance, in the sense of accuracy, precision, recall, etc., or low computational cost. Usually, higher performance comes with higher complexity and, conversely, lower complexity comes with lower performance. We were able to propose and develop the Incremental Similarity measure, which we have applied to neuro-fuzzy models, and to show that it retains high performance while achieving lower complexity. We have evaluated this metric on a number of models and various data sets, showing its applicability to various aspects of the machine learning domain. While we consider that we have successfully accomplished our objective, we still see room for improvement. We believe that in the future, real-time processing will advance either through new methods for handling this problem or through the growing computational capacity of machines.

The forgetting of unused classes is an important issue that has not been widely tackled in the state of the art. However, it is a very natural occurrence, and thus we set the objective of tackling this problem. As we could see in previous chapters, we were able to successfully develop a solution for Recursive Least Squares (RLS). We have applied our solution to various models and evaluated it on various machine learning benchmarks, showing its robustness to the forgetting of unused classes. In many cases, we were able to almost completely eliminate forgetting from the learning process. On the other hand, we can see a slight drop in performance when we continue the learning process using all the classes. We explain this by the fact that our solution does not learn from all the examples but competitively chooses the proper ones, whereas the original RLS uses all the data. Nevertheless, this drop is not significant, and we can state that our solution achieves superior results.
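For reference, the original RLS mentioned above updates its parameters with every incoming example. The sketch below shows one step of the textbook RLS recursion for a scalar target; it is the baseline only and does not reproduce the competitive selection of examples used in our solution. The function name and the `forgetting` argument (the usual RLS forgetting factor, with 1.0 meaning no discounting of past data) are illustrative choices.

```python
def rls_update(theta, P, x, y, forgetting=1.0):
    """One recursive least squares step for a scalar target y ~ theta . x.

    theta: list of n weights; P: symmetric n x n list of lists (estimate of
    the inverse input correlation matrix). Returns updated copies of both.
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    error = y - sum(theta[i] * x[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * error for i in range(n)]
    # P <- (P - gain * x^T P) / forgetting; x^T P equals (P x)^T for symmetric P.
    new_P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return new_theta, new_P


# Fit y = 2*x1 - 3*x2 from a short stream, starting from zero weights.
theta = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]  # large initial P: weak confidence in theta
for x, y in [([1.0, 0.0], 2.0), ([0.0, 1.0], -3.0), ([1.0, 1.0], -1.0)]:
    theta, P = rls_update(theta, P, x, y)
```

Because every call changes all the weights, examples from the currently active classes keep pulling the parameters, which is how a class that stops appearing in the stream can gradually be forgotten.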

In our work we were able to develop a self-organized model based on neuro-fuzzy modeling. This model is capable of adding new rules to the ensemble and adding new classes to the complete system. Moreover, as part of the self-organizing mechanism, it is capable of determining when rules need to be merged, split or discarded. This is a consequence of the self-learning parameters that handle the rules and their generation. Since these parameters change over time, the model must be allowed to change its organization so that it is at its best given the knowledge of all the data seen up to that specific time. In the results we have shown that our proposed self-organized model outperforms the model whose free parameters have been fixed by cross-validation. Moreover, we have compared this model to a number of online and offline state-of-the-art models. Our model either outperforms them or sustains comparably high performance on all listed data sets. We need to note that the vast majority of the listed models need some initialization and do not start from scratch, while at the same time requiring cross-validation. Thus, the results of our model show that, through self-organization, the machine is capable of adapting dynamically to the current state of the environment, which in turn leads to superior results. This is because a fixed set of parameters and a fixed structure are the best setup only on average, not for every dynamic change in the learning process. Thus, in online learning we find it preferable to use dynamically structured models in order to cope with a dynamically changing environment.

To cope with the high variance at the beginning, which is caused by the lack of data, we have proposed the generation of synthetic data. We have developed a framework for real-time online processing with a randomized buffer that leads to significantly lower variance during the initial stages of data stream processing. However, we have shown that using synthetic data can magnify the forgetting factor. Thus, we have applied our solution to this problem and shown that the combination of synthetic data with the competitive nature of EML is a proper solution for both the high variance and the low performance. In the results we have shown that we can come close to eliminating the lowered performance while avoiding the unwanted forgetting of the other classes. Moreover, we have explored the length of the generation process as well as the number of synthetic examples to be generated, which has led to improved results for this application.

We have applied our solutions to various tasks and shown their usefulness. The problems that we have tackled are often ignored in the online learning field, where the highest emphasis is given to the sequential nature of the data. While this is an important aspect of online learning, models that ignore the other problems we have raised cannot properly cope with a real-time setup, where the data are processed in their natural order and cannot be shuffled. In conclusion, we have successfully addressed the objectives of this thesis and advanced the state-of-the-art techniques for online learning. Online learning is a very interesting field within machine learning, and its importance rises with the growing use of the internet, where data are created in large amounts and the need for dynamic models rises as well. There are a number of applications that can benefit from online learning, and an increase in the variety of online models will help to tackle the problems specific to these applications. In this work some problems of online learning have been listed; however, as the number of applications grows, new problems will need to be tackled. Online learning, in keeping with its own nature, is a dynamic field that needs to adapt to new situations in the real world. We believe that our research will help scientists in the future to develop solutions that will help society to go beyond all that we have ever thought.

Future work

Our work has mainly focused on the self-organized aspect of online learning applied to neuro-fuzzy models. We have also proposed a framework for the synthetic generation of data that has been applied to our online incremental model ARTIST. We believe that such a framework can be an interesting addition to the self-organized models. However, the kinematic model used for the generation is directly connected to human movements, and is thus not applicable to other domains. Being able to generate synthetic data is of great interest, especially in the case of learning from scratch or cold-start learning.

One of the flaws that we see in our solution is the approximation needed for the update of some parameters. This is caused by the choice of the optimization, RLS, and its application to the self-organized model. We believe that a better choice of optimization technique can lead to superior results. This choice can also be inspired by a change in the consequent learning, which may lead to simplified parameter learning.

To reduce the computational cost of the system, an interesting proposal is to add a feature selection mechanism to the model. One way is to choose the best features from the complete set of features, where the best features are the ones that contribute the most to the right decision.

Feature extraction is an open problem, since the pool of possible features from handwritten data can always be increased. A wider choice of features can also complement our previous proposal, feature selection.

In our work we have used a feature vector representation of the input data. Especially in the application to handwritten gestures, there are other interesting approaches to representing the data, such as Markov models or neural networks. Here, one gesture may be represented by one model, and these models are added incrementally into an ensemble model. The models themselves need to be able to update their parameters in an incremental manner as well, so that they can change with the increasing amount of data.

Another interesting task is to apply our solution, along with a proper data representation, to general gesture recognition, where the input consists of gestures produced by movement. There are, and always will be, many more challenges in the area of machine learning and of online learning specifically, and we hope that our research has contributed to resolving them.

Summary of contributions

In this work we have presented our solutions to the problems posed in the Introduction and followed the objectives resulting from them. As such, we were able to make a number of contributions to the state of the art. We list these contributions below:

• Incremental similarity for tackling the accuracy vs processing time problem,

• Elastic Memory Learning to tackle the forgetting of unused classes, which in an online learning setup are not distributed uniformly in the data stream and thus can become sparse,

• Self-Organized model to enable learning from scratch, learning on the fly and online learning in general, where the structures of the model are organized in an automated way along with the learning of all the parameters necessary for high performance,

• Framework for generating synthetic data for online learning to tackle the problem of high variance at the beginning of the learning process and every time a new class is added to the system.

Articles in peer reviewed journals

[1] M. Režnáková, L. Tencer, R. Plamondon, M. Cheriet, Forgetting of unused classes in missing data environment using artificially generated data. Application to on-line handwritten stroke recognition (under review), Pattern Recognition.

[2] M. Režnáková, L. Tencer, M. Cheriet, Elastic Memory Learning for Fuzzy Inference Models (under review), Applied Soft Computing.

[3] M. Režnáková, L. Tencer, M. Cheriet, Incremental Similarity for real-time on-line incremental learning systems, Pattern Recognition Letters 74 (2016), pp. 61-37.

[4] M. Režnáková, L. Tencer, M. Cheriet, SO-ARTIST: Self-Organized ART-2A inspired clustering for online Takagi Sugeno fuzzy models, Applied Soft Computing 31 (2015), pp. 132-152.

Articles in peer reviewed conference proceedings

[1] M. Režnáková, L. Tencer, R. Plamondon, M. Cheriet, The generation of synthetic handwritten data for improving on-line learning, in: 17th Biennial Conference of the International Graphonomics Society, 2015.

[2] M. Režnáková, L. Tencer, M. Cheriet, ARTIST: ART-2A driven Generation of Fuzzy Rules for Online Handwritten Gesture Recognition, in: Document Analysis and Recognition, 2013. ICDAR '13. 12th International Conference on, 2013.

[3] M. Režnáková, L. Tencer, M. Cheriet, Online handwritten gesture recognition based on Takagi-Sugeno fuzzy models, in: Information Science, Signal Processing and their Applications (ISSPA), 2012 11th International Conference on, 2012, pp. 1247-1252.