
Ensemble

In this section, four methods, i.e., bagging, boosting, stacked generalisation and mixture of experts, are compared against six characteristics of ensemble learning to help practitioners select the most suitable ensemble method for their specific research needs.

3.6.1 Predictive Performance - Accuracy

Predictive performance is considered to be the main feature for selecting an algorithm (Rokach, 2009; Statnikov, Aliferis, Tsamardinos, Hardin, & Levy, 2005). It is measured as accuracy, i.e., the percentage of correctly classified samples, which can be used to benchmark algorithms. In this regard, the bagging method is considered to have high accuracy because of its easy implementation and its ability to function on limited data sizes, i.e., fewer than 50 thousand samples (Zhao et al., 2007; Polikar, 2006). The boosting method has low accuracy because it suffers from problems and fails to understand complex composite classifiers (Rokach, 2009). The stacked generalisation method has low accuracy, as combining lower-level models into higher-level models is a complex task (Ting & Witten, 2011). The mixture of experts method also has low accuracy, as assigning weights to the classifiers from the output of Tier1 classifiers to Tier2 classifiers is a complex task too (Polikar, 2006).
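The accuracy measure used above, the percentage of correctly classified samples, can be sketched in a few lines. This is only an illustration: the function name `accuracy_percent` and the example labels are assumptions introduced here, not part of the compared methods.

```python
def accuracy_percent(y_true, y_pred):
    """Return the percentage of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    # Count the positions where the prediction matches the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


# Hypothetical labels, for illustration only.
true_labels = ["a", "b", "a", "a", "b"]
predicted_labels = ["a", "b", "b", "a", "b"]
print(accuracy_percent(true_labels, predicted_labels))  # 4 of 5 correct -> 80.0
```

The same function can be applied to the predictions of each ensemble method on a common test set to benchmark them against each other.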

3.6.2 Scalability

Scalability refers to the ability of a method to function on large data sets (Rokach, 2010). The bagging method has low scalability, as it operates on limited data sizes (Polikar, 2006). The boosting method operates on unlimited data sizes of more than one hundred thousand samples and hence has high scalability (Polikar, 2006). The stacked generalisation method operates on medium-sized training data of 50 thousand to one hundred thousand samples, resulting in medium scalability (Wolpert, 1992). The mixture of experts method functions on small data sizes and hence has low scalability (Nasrabadi, 2007).
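The data-size bands used in this comparison (fewer than 50 thousand samples, 50 thousand to one hundred thousand, and more than one hundred thousand) can be written as a small helper. This is a sketch under assumptions: the function name, the category labels and the treatment of the exact boundary values are choices made here for illustration.

```python
def data_size_category(n_samples):
    """Map a sample count to the data-size bands used in this comparison."""
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if n_samples < 50_000:
        return "limited"    # fewer than 50 thousand samples
    if n_samples <= 100_000:
        return "medium"     # 50 thousand to one hundred thousand (boundaries assumed inclusive)
    return "unlimited"      # more than one hundred thousand samples


for n in (10_000, 75_000, 250_000):
    print(n, data_size_category(n))
```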

3.6.3 Computational Cost

It is important to know the computational cost of a method, i.e., whether it produces results in a reasonable amount of time, which is often related to computational complexity (Granitto, Verdes, & Ceccatto, 2005). The bagging and boosting methods are less computationally complex: both obtain an ensemble of classifiers efficiently through robust training on the data, resulting in lower computational cost (Freund et al., 2003; Polikar, 2006). In the stacked generalisation method, training requires more time, and rectifying improper training at Tier2 is also time-consuming and complex, resulting in high computational cost (Wolpert, 1992). The mixture of experts method requires more time both for training the classifiers and for classifying a problem; the resulting high computational complexity leads to high computational cost (Polikar, 2006).

3.6.4 Usability

Machine learning is considered to be an iterative process (Ribeiro & Cardoso, 2008). To improve the performance of an ensemble system, practitioners change parameters to generate better classifiers. The bagging and boosting methods have high usability, as the parameters of both these algorithms are flexible for generating better classifiers (Polikar, 2006). The stacked generalisation method has low usability, as the weights assigned to Tier1 classifiers are not flexible once set (Polikar, 2006). The parameters for assigning weights to classifiers in the mixture of experts method are partially flexible for generating better classifiers, resulting in medium usability (Nasrabadi, 2007).

3.6.5 Compactness

Compactness can be measured by the ensemble size and the complexity of the classifiers in ensemble methods (Rokach, 2010). In this regard, the bagging method is highly compact because it works only on limited training data sizes and its results are easy to understand (Zhao et al., 2007; Polikar, 2006). The boosting method, on the other hand, has low compactness because it functions on unlimited data sizes; boosting of decision trees could result in thousands (or millions) of nodes, which are difficult to visualise (Polikar, 2006). Both the stacked generalisation and mixture of experts methods have medium compactness, as they operate on low to medium-sized training data (Mitchell et al., 1986; Wolpert, 1992).

3.6.6 Speed of Classification

Computational complexity plays an important role in the speed of classification. Speed of classification indicates the ability of a method to perform the classification in a certain time frame (Pfahringer, Holmes, & Kirkby, 2001). The bagging method results in robust classification because it operates on a limited data size of fewer than 50 thousand samples and its computational complexity is the lowest (Zhao et al., 2007; Polikar, 2006). The speed of classification for the boosting method is moderate compared with bagging because it operates on unlimited data sizes of more than one hundred thousand samples (Polikar, 2006). The stacked generalisation and mixture of experts methods (Mitchell et al., 1986; Wolpert, 1992) are slow in classification, as the way they train on data results in high computational complexity, and the end output of these classifiers is an ensemble that makes the decisions.

Table 3.1 provides a comparison of the ensemble methods. The table shows that the bagging algorithm has high accuracy compared with boosting, stacked generalisation and mixture of experts. In terms of scalability, the boosting algorithm has a high ability to handle more data compared with stacked generalisation, which has medium scalability. On the other hand, bagging and mixture of experts have low scalability. The computational cost of bagging and boosting is low compared with stacked generalisation and mixture of experts, which have a high computational cost to produce results in a reasonable amount of time. In terms of usability, bagging and boosting have high usability, whereas stacked generalisation has low usability and mixture of experts has medium usability, as their parameters are not fully flexible for generating better results. Bagging has high compactness, meaning it operates on limited data; its results are therefore easier to understand than those of boosting, stacked generalisation and mixture of experts, which have lower compactness. Bagging also has a high speed of classification compared with boosting, stacked generalisation and mixture of experts when performing the classification task in a certain time frame.

Characteristics          Bagging  Boosting  Stacked Generalisation  Mixture of Experts
Accuracy                 High     Low       Low                     Low
Scalability              Low      High      Medium                  Low
Computational Cost       Low      Low       High                    High
Usability                High     High      Low                     Medium
Compactness              High     Medium    Low                     Low
Speed of Classification  High     Medium    Low                     Low
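Since the comparison is meant to help practitioners select a method, the ratings in the table can be encoded as a lookup and filtered by requirement. The sketch below is illustrative only: the names `RATINGS` and `methods_matching` are assumptions introduced here, the values are copied from the table, and the table's "Less" rating for computational cost is recorded as "Low".

```python
# Ratings copied from the comparison table; "Less" is recorded as "Low".
RATINGS = {
    "Bagging": {
        "Accuracy": "High", "Scalability": "Low", "Computational Cost": "Low",
        "Usability": "High", "Compactness": "High", "Speed of Classification": "High",
    },
    "Boosting": {
        "Accuracy": "Low", "Scalability": "High", "Computational Cost": "Low",
        "Usability": "High", "Compactness": "Medium", "Speed of Classification": "Medium",
    },
    "Stacked Generalisation": {
        "Accuracy": "Low", "Scalability": "Medium", "Computational Cost": "High",
        "Usability": "Low", "Compactness": "Low", "Speed of Classification": "Low",
    },
    "Mixture of Experts": {
        "Accuracy": "Low", "Scalability": "Low", "Computational Cost": "High",
        "Usability": "Medium", "Compactness": "Low", "Speed of Classification": "Low",
    },
}


def methods_matching(requirements, ratings=RATINGS):
    """Return the methods whose rating equals the required rating for every listed characteristic."""
    return [
        method
        for method, row in ratings.items()
        if all(row[characteristic] == wanted for characteristic, wanted in requirements.items())
    ]


# Hypothetical requirements, for illustration only.
print(methods_matching({"Scalability": "High"}))
print(methods_matching({"Computational Cost": "Low", "Usability": "High"}))
```

Exact matching on ratings is a deliberate simplification; a practitioner might instead rank the methods or weight the characteristics by importance.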

3.7 Methods for Combining Outputs of SVMs based