3.2.1
Overview
The goal of a supervised learning algorithm is to search a hypothesis space for a hypothesis that makes good predictions for a
particular problem (Sollich & Krogh, 1996). Even if the hypothesis space contains hypotheses that are well suited to a specific problem, it can still be very difficult to select
a good one (Sollich & Krogh, 1996). Ensembles address this difficulty by combining multiple hypotheses to form a better hypothesis for decision making.
In other words, an ensemble is a technique that combines weak learners to produce a strong learner. The term usually refers to methods that generate
multiple hypotheses using the same base learner. More broadly, ensemble learning is a process in which multiple models, such as classifiers or experts, are strategically created
and combined to solve a computational problem. Ensemble learning is mainly used to improve the classification or prediction performance of a model, or to reduce the risk of selecting a
poor model (Y. Liu & Yao, 1999). Recent machine learning approaches such as deep learning have also successfully combined multiple learners
as an ensemble to tackle dimensionality issues with continuous data (Tirumala, Ali, & Ramesh, 2016).
Evaluating the prediction of an ensemble requires more computation than evaluating the prediction of a single model. Ensembles may therefore be regarded as a way of
compensating for poor learning algorithms by performing a lot of extra computation (Chandra & Yao, 2006). For this reason, fast learning algorithms such as decision trees are
commonly deployed with ensembles. However, slower learning algorithms can take advantage of ensemble techniques as well (Chandra & Yao, 2006).
3.2.2
Ensemble Theory
An ensemble is itself a supervised learning algorithm: it can be trained and then deployed to make predictions. The trained ensemble therefore represents a single hypothesis.
However, this hypothesis is not necessarily contained in the hypothesis space of the
models from which it is built, so ensembles have more flexibility in the functions they can represent (L. I. Kuncheva & Whitaker, 2003). In theory, this flexibility allows an ensemble to overfit the
training data more than a single model would. In practice, however, ensemble techniques tend to reduce overfitting of the training data (Webb & Zheng, 2004).
In practice, ensembles tend to produce better results when there is significant diversity
among the models (L. I. Kuncheva & Whitaker, 2003). Many ensemble methods therefore focus on promoting diversity among the models they combine. At the same time,
combining a variety of strong learning algorithms is considered an effective way to produce a strong ensemble (L. I. Kuncheva & Whitaker,
2003).
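The notion of diversity among models can be made concrete with a pairwise measure. The sketch below computes the disagreement measure, that is, the fraction of examples on which exactly one of two classifiers is correct, averaged over all pairs. The correctness vectors are hypothetical values chosen for illustration, and the function names are ours rather than those of any particular library.

```python
from itertools import combinations


def disagreement(correct_a, correct_b):
    """Fraction of examples on which exactly one of two classifiers is correct."""
    differing = sum(a != b for a, b in zip(correct_a, correct_b))
    return differing / len(correct_a)


def mean_pairwise_disagreement(correctness):
    """Average the disagreement measure over all pairs of classifiers."""
    pairs = list(combinations(correctness, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical results: 1 means the classifier labelled that example correctly.
correctness = [
    [1, 1, 1, 0, 0, 1],  # classifier A
    [1, 0, 1, 1, 0, 1],  # classifier B
    [0, 1, 1, 1, 1, 0],  # classifier C
]

diversity = mean_pairwise_disagreement(correctness)
print(round(diversity, 3))
```

A value of 0 means that all classifiers are right and wrong on exactly the same examples; larger values indicate that the classifiers make their errors on different examples.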
3.2.3
Importance of Studying Ensembles
Studying ensemble learning methods is appealing in several ways (Parikh & Polikar, 2007). Firstly, ensemble methods are easy to implement, yet their
underlying dynamics are complex and can require hundreds of hours to study. A key example is the bagging algorithm, one of the most widely
used ensemble techniques, which can be implemented in a few lines of code.
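To illustrate how compact bagging can be, the following is a minimal sketch rather than a reference implementation. It assumes a one-dimensional binary classification task and uses a simple threshold rule (a decision stump) as the base learner. The names train_stump and bagging, the toy data and the choice of 25 models are all assumptions made for this example.

```python
import random
from collections import Counter


def train_stump(sample):
    """Fit a one-feature threshold classifier (a hypothetical weak learner)."""
    best = None
    for threshold, _ in sample:
        for polarity in (1, -1):
            # The stump predicts class 1 when polarity * (x - threshold) >= 0.
            errors = sum(
                (1 if polarity * (x - threshold) >= 0 else 0) != y
                for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    _, threshold, polarity = best
    return lambda x: 1 if polarity * (x - threshold) >= 0 else 0


def bagging(data, n_models=25, seed=0):
    """Train each model on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    models = [
        train_stump([rng.choice(data) for _ in data])
        for _ in range(n_models)
    ]

    def predict(x):
        # Combine the models by an unweighted majority vote.
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy one-dimensional data: class 1 for the larger values of x.
data = [(x / 10, int(x >= 5)) for x in range(10)]
ensemble = bagging(data)
print(ensemble(0.1), ensemble(0.9))
```

The two essential steps are the bootstrap resampling inside bagging and the majority vote inside predict. The base learner can be swapped for any other training procedure with the same interface.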
Secondly, the wide applicability of ensemble methods to research problems is also
an attractive aspect (Barkia, Elghazel, & Aussem, 2011). Ensemble methods developed in previous years remain relevant today and are likely to remain relevant to
new learning techniques in the future. Although there are various approaches to constructing ensembles, our research focuses primarily on the construction of SVM-based
ensemble methods. Working at this level allows us to contribute knowledge to various disciplines, such as artificial intelligence, machine
learning, financial forecasting and statistics, to name but a few. As machine learning continues to develop, ensemble methods continue to be applied even to
more advanced models (Barkia et al., 2011; Parikh & Polikar, 2007). With ensemble methods, researchers will be able to extract a little more valuable information.
Moreover, understanding why ensemble methods perform well is of fundamental importance.
If one understands why a method can be applied successfully to a certain problem, then developing a new tool for machine learning becomes a
far less difficult task.
3.2.4
Purpose of Ensemble Based Systems
The output of a learning algorithm based on a single hypothesis suffers from three basic problems. Firstly, a statistical problem arises when the learning algorithm
searches a hypothesis space that is too large for the amount of available training data. In such a scenario there may be several hypotheses that achieve the same accuracy on
the training data, and the learning algorithm has to select one of them to output. There is a risk that the selected hypothesis will not predict future data
points correctly. A simple vote of all the equally good classifiers can reduce this risk.
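The voting idea can be sketched in a few lines. In the example below, three hypothetical threshold rules all fit a small, made-up training set perfectly but disagree on unseen points, and an unweighted majority vote decides between them. The data, the thresholds and the function name majority_vote are assumptions chosen for illustration.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Return the label predicted by most classifiers (ties go to the first seen)."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical training set: one feature and a binary label.
train = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]

# Three threshold rules; each of them classifies the training set perfectly.
classifiers = [
    lambda x: int(x > 3.0),
    lambda x: int(x > 5.0),
    lambda x: int(x > 7.0),
]

for clf in classifiers:
    assert all(clf(x) == y for x, y in train)

# The rules disagree on unseen points between 3 and 7; the vote settles it.
individual = [clf(6.0) for clf in classifiers]
combined = majority_vote(classifiers, 6.0)
print(individual, combined)
```

Because the training data cannot distinguish between the three rules, choosing any single one of them is arbitrary. The vote avoids committing to one such arbitrary choice.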
Secondly, a computational problem arises when a learning algorithm fails to identify the best hypothesis within the hypothesis space. For example, for neural networks
and decision trees, finding the hypothesis that best fits the training data is computationally difficult, so heuristic methods (D. Opitz & Maclin, 1999) need to be
applied. However, these heuristic methods (such as gradient descent) can get stuck in local minima (Rokach, 2009) and hence fail to find the best hypothesis.
This computational problem can be reduced by using a weighted combination of several local minima, which lowers the risk of selecting the wrong local minimum to output.
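A weighted combination of this kind can be sketched as follows. The example assumes three models, each imagined to be the result of a heuristic search started from a different initial point, and weights each model by the inverse of its validation error. The text does not prescribe a weighting scheme, so this scheme, like the error values and outputs, is an assumption made purely for illustration.

```python
def weighted_average(predictions, weights):
    """Combine real-valued predictions using non-negative weights."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Hypothetical validation errors of three models, each imagined to be the
# result of gradient descent started from a different initial point.
validation_errors = [0.10, 0.20, 0.40]

# Assumed weighting scheme: weight each model by the inverse of its error.
weights = [1.0 / e for e in validation_errors]

# Hypothetical outputs of the three models for one unseen input.
predictions = [0.9, 0.7, 0.2]

combined = weighted_average(predictions, weights)
print(round(combined, 3))
```

The combined output always lies between the smallest and largest individual outputs and is pulled towards the models with lower validation error, so no single local minimum determines the final prediction on its own.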
Finally, a representational problem arises when the hypothesis space does not contain any hypothesis that provides a good approximation to the true function f.
This representation problem can be reduced by using a weighted vote of hypothesis (Rokach, 2009). This will enable the learning algorithm to form an accurate approx-