3.6 Methods Comparison for Constructing SVM Ensemble

In this section, four methods (bagging, boosting, stacked generalisation and mixture of experts) are compared against six characteristics of ensemble learning to help practitioners select the most suitable ensemble method for their specific research needs.

3.6.1 Predictive Performance - Accuracy

Predictive performance is considered to be the main criterion for selecting an algorithm (Rokach, 2009; Statnikov, Aliferis, Tsamardinos, Hardin, & Levy, 2005). It is measured as accuracy, i.e., the percentage of correctly classified samples, which can be used to benchmark algorithms. In this regard, the bagging method is considered to have high accuracy because of its easy implementation and its ability to function on a limited data size, referring to fewer than 50 thousand samples (Zhao et al., 2007; Polikar, 2006). The boosting method has low accuracy because of the problems it suffers from and its failure to understand complex composite classifiers (Rokach, 2009). The stacked generalisation method has low accuracy, as combining lower-level models into higher-level models is a complex task (Ting & Witten, 2011). The mixture of experts method also has low accuracy, since assigning weights to the classifiers, from the output of the Tier 1 classifiers to the Tier 2 classifiers, is a complex task too (Polikar, 2006).
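As a minimal sketch of the accuracy measure described above, the following computes the percentage of correctly classified samples for individual classifiers and for their combined output. The majority-vote combiner, the function names and the toy labels are assumptions made for illustration only; they are not taken from the cited works.

```python
from collections import Counter


def accuracy_percent(y_true, y_pred):
    """Percentage of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def majority_vote(member_predictions):
    """Combine per-classifier label lists into one label per sample.

    Assumption: plain (unweighted) majority voting over the members.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*member_predictions)]


# Toy labels (hypothetical): five samples, three ensemble members.
y_true = [1, 0, 1, 1, 0]
members = [
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
]

for i, preds in enumerate(members, start=1):
    print(f"member {i}: {accuracy_percent(y_true, preds):.1f}%")

ensemble = majority_vote(members)
print(f"ensemble: {accuracy_percent(y_true, ensemble):.1f}%")
```

On these toy labels each member misclassifies different samples, so the vote can recover the correct label wherever at most one member is wrong.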

3.6.2 Scalability

Scalability refers to the ability of a method to function on large data sets (Rokach, 2010). The bagging method has low scalability, as it operates on a limited data size (Polikar, 2006). The boosting method operates on an unlimited data size, consisting of more than one hundred thousand samples, and hence has high scalability (Polikar, 2006). The stacked generalisation method operates on medium-sized training data, consisting of 50 thousand to one hundred thousand samples, resulting in medium scalability (Wolpert, 1992). The mixture of experts method functions on a small data size and hence has low scalability (Nasrabadi, 2007).
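The data-size bands used in this comparison can be written down as a small helper. This is only a sketch: the band names, the handling of the exact boundary values, and the reading of the mixture of experts method's "low data size" as the same band as bagging's "limited" size are assumptions, since the text does not fix them.

```python
def size_band(n_samples):
    """Map a sample count to the data-size bands used in this section."""
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    if n_samples < 50_000:
        return "limited"      # fewer than 50 thousand samples
    if n_samples <= 100_000:
        return "medium"       # 50 thousand to one hundred thousand samples
    return "unlimited"        # more than one hundred thousand samples


# Methods that this section describes as operating on each band.
# Assumption: mixture of experts is placed in the "limited" band.
OPERATES_ON = {
    "limited": ["bagging", "mixture of experts"],
    "medium": ["stacked generalisation"],
    "unlimited": ["boosting"],
}

for n in (20_000, 75_000, 250_000):
    band = size_band(n)
    print(n, band, OPERATES_ON[band])
```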

3.6.3 Computational Cost

It is important to know the computational cost of a method, i.e., whether it produces results in a reasonable amount of time, which is often related to its computational complexity (Granitto, Verdes, & Ceccatto, 2005). In terms of computational complexity, the bagging and boosting methods are the less complex. Both methods obtain an ensemble of classifiers efficiently through robust training on data, resulting in a lower computational cost (Freund et al., 2003; Polikar, 2006). In the stacked generalisation method, training on data requires more time, and rectifying improper training at Tier 2 is also more time-consuming and complex, hence resulting in a high computational cost (Wolpert, 1992). The mixture of experts method requires more time both for training the classifiers on data and for classifying the problem, hence resulting in high computational complexity and, in turn, a high computational cost (Polikar, 2006).
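Whether a method produces results "in a reasonable amount of time" can be checked empirically by timing a training run against a time budget. The sketch below is one possible way to do this; `timed`, `within_budget`, `toy_train` and the 5-second budget are hypothetical names and values chosen for illustration, and `toy_train` merely stands in for a real training routine.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def within_budget(elapsed_seconds, budget_seconds):
    """True if a run finished inside the time considered reasonable."""
    return elapsed_seconds <= budget_seconds


def toy_train(n_samples):
    # Stand-in for training a classifier: the work grows with n_samples.
    return sum(i * i for i in range(n_samples))


model, seconds = timed(toy_train, 10_000)
print(f"training took {seconds:.6f} s; "
      f"within 5 s budget: {within_budget(seconds, 5.0)}")
```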

3.6.4 Usability

Machine learning is considered to be an iterative process (Ribeiro & Cardoso, 2008). To improve the performance of an ensemble system, practitioners change parameters to generate better classifiers. The bagging and boosting methods have high usability, as the parameters of both these algorithms are flexible for generating better classifiers (Polikar, 2006). The stacked generalisation method has low usability, as the weights assigned to the Tier 1 classifiers are not flexible once they have been set (Polikar, 2006). The parameters for assigning weights to classifiers in the mixture of experts method are partially flexible for generating better classifiers, hence resulting in medium usability (Nasrabadi, 2007).

3.6.5 Compactness

Compactness can be measured by the ensemble size and the complexity of the classifiers in an ensemble method (Rokach, 2010). In this regard, the bagging method is highly compact because it works only on a limited training data size and its results are easy to understand (Zhao et al., 2007; Polikar, 2006). The boosting method, on the other hand, has low compactness due to its functioning on an unlimited data size; boosting of decision trees, for example, could result in thousands (or millions) of nodes, which are difficult to visualise (Polikar, 2006). Both the stacked generalisation and mixture of experts methods have medium compactness, as they operate on low to medium-sized training data (Mitchell et al., 1986; Wolpert, 1992).
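To make the two quantities behind compactness concrete, the sketch below reports the ensemble size together with a total complexity count. Using the number of decision-tree nodes as the complexity measure follows the boosting example in the text; the `Member` type, the function name and the node counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    n_nodes: int  # hypothetical complexity measure: nodes in a decision tree


def compactness_summary(members):
    """Return (ensemble size, total node count) for a list of Member objects."""
    return len(members), sum(m.n_nodes for m in members)


# Two made-up ensembles: a small one and a large boosted-tree-like one.
small = [Member(f"tree_{i}", 15) for i in range(10)]
large = [Member(f"tree_{i}", 2_000) for i in range(500)]

print(compactness_summary(small))  # (10, 150)
print(compactness_summary(large))  # (500, 1000000)
```

A smaller ensemble size and a lower total count both point towards a more compact ensemble that is easier to inspect.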

3.6.6 Speed of Classification

Computational complexity plays an important role in the speed of classification, which indicates the ability of a method to perform the classification within a certain time frame (Pfahringer, Holmes, & Kirkby, 2001). The bagging method results in robust classification because it operates on a limited data size, consisting of fewer than 50 thousand samples, and its computational complexity is the lowest (Zhao et al., 2007; Polikar, 2006). The speed of classification for the boosting method is moderate compared with bagging because it operates on an unlimited data size, consisting of more than one hundred thousand samples (Polikar, 2006). The stacked generalisation and mixture of experts methods (Mitchell et al., 1986; Wolpert, 1992) are slow in classification, as training on data results in high computational complexity, and the end output of these classifiers is an ensemble used to make decisions.

Table 3.1 provides a comparison of the ensemble learning methods. The table shows that the bagging algorithm has high accuracy compared with boosting, stacked generalisation and mixture of experts. In terms of scalability, the boosting algorithm has a high ability to handle more data compared with stacked generalisation, which has medium scalability. Bagging and mixture of experts, on the other hand, have low scalability. The computational cost of bagging and boosting is low compared with stacked generalisation and mixture of experts, which incur a high computational cost to produce results in a reasonable amount of time. In terms of usability, bagging and boosting have high usability, whereas stacked generalisation has low usability, as its parameters are not flexible for generating better results, and mixture of experts has medium usability, as its parameters are only partially flexible. Bagging has high compactness, meaning that it operates on limited data; therefore, its results are easier to understand compared with boosting, stacked generalisation and mixture of experts, which have lower compactness. Bagging also has a high speed of classification compared with boosting, stacked generalisation and mixture of experts in performing the classification task within a certain time frame.

| Characteristics | Bagging | Boosting | Stacked Generalisation | Mixture of Experts |
|---|---|---|---|---|
| Accuracy | High | Low | Low | Low |
| Scalability | Low | High | Medium | Low |
| Computational Cost | Low | Low | High | High |
| Usability | High | High | Low | Medium |
| Compactness | High | Medium | Low | Low |
| Speed of Classification | High | Medium | Low | Low |
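One way a practitioner might use this comparison is to encode the ratings and rank the four methods on the characteristics that matter for a given study. The sketch below does this under stated assumptions: the numeric scores (High = 2, Medium = 1, Low = 0), the equal weighting of characteristics, and the inversion of computational cost (a lower cost scores higher) are illustrative choices and not part of the original comparison.

```python
SCORE = {"High": 2, "Medium": 1, "Low": 0}  # assumed numeric scale

# Ratings transcribed from the comparison table. "Low" under
# "Computational Cost" denotes the lowest cost rating in the table.
RATINGS = {
    "Bagging": {
        "Accuracy": "High", "Scalability": "Low",
        "Computational Cost": "Low", "Usability": "High",
        "Compactness": "High", "Speed of Classification": "High",
    },
    "Boosting": {
        "Accuracy": "Low", "Scalability": "High",
        "Computational Cost": "Low", "Usability": "High",
        "Compactness": "Medium", "Speed of Classification": "Medium",
    },
    "Stacked Generalisation": {
        "Accuracy": "Low", "Scalability": "Medium",
        "Computational Cost": "High", "Usability": "Low",
        "Compactness": "Low", "Speed of Classification": "Low",
    },
    "Mixture of Experts": {
        "Accuracy": "Low", "Scalability": "Low",
        "Computational Cost": "High", "Usability": "Medium",
        "Compactness": "Low", "Speed of Classification": "Low",
    },
}

LOWER_IS_BETTER = {"Computational Cost"}


def rank_methods(priorities):
    """Rank methods by the summed score over the chosen characteristics.

    Returns a list of (method, score) pairs, best first. Ties keep the
    order in which the methods appear in RATINGS.
    """
    totals = []
    for method, ratings in RATINGS.items():
        total = 0
        for characteristic in priorities:
            value = SCORE[ratings[characteristic]]
            if characteristic in LOWER_IS_BETTER:
                value = 2 - value  # a low cost should count in favour
            total += value
        totals.append((method, total))
    return sorted(totals, key=lambda pair: -pair[1])


print(rank_methods(["Scalability"]))
print(rank_methods(["Accuracy", "Speed of Classification"]))
```

Weighting all characteristics equally is the simplest choice; a study that cares far more about one characteristic would need to adjust the scoring accordingly.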

3.7 Methods for Combining Outputs of SVMs based

In document Support Vector Machine (SVM) aggregation modelling for spatio-temporal air pollution analysis (Page 71-75)