2.4 A review of Evolutionary Algorithms for Feature Selection in a Data

2.4.2 Fitness Function

Another component of GAs is the fitness function, which evaluates the quality of each individual. The vast majority of GAs for feature selection follow the wrapper approach, where the fitness function involves the predictive performance of a classifier built using the features selected by the corresponding individual. However, the filter approach can also be used, in which case fitness is computed without using a classifier's performance [34].
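To make the wrapper idea concrete, the following is a minimal sketch of a wrapper fitness function, assuming a bit-string individual and a 1-NN classifier evaluated by leave-one-out accuracy. The function names and the toy dataset are illustrative assumptions and are not taken from any of the reviewed papers.

```python
from math import dist

def select_columns(row, mask):
    """Keep only the feature values whose bit in the mask is 1."""
    return [v for v, bit in zip(row, mask) if bit]

def wrapper_fitness(mask, X, y):
    """Leave-one-out accuracy of a 1-NN classifier on the selected features."""
    if not any(mask):
        return 0.0  # an empty feature subset cannot classify anything
    reduced = [select_columns(row, mask) for row in X]
    correct = 0
    for i, query in enumerate(reduced):
        # Nearest neighbour among all other instances (Euclidean distance).
        nearest = min((j for j in range(len(reduced)) if j != i),
                      key=lambda j: dist(query, reduced[j]))
        correct += y[nearest] == y[i]
    return correct / len(X)

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

fitness_informative = wrapper_fitness([1, 0], X, y)  # only feature 0
fitness_all = wrapper_fitness([1, 1], X, y)          # both features
```

On this toy data the individual that keeps only the informative feature scores higher than the one that keeps both, which is the behaviour a wrapper-based GA exploits. Every fitness evaluation requires running the classifier, which is why the wrapper approach is the more expensive of the two.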

Several types of feature ranking techniques are used in the literature, such as the Between-group to Within-group sum of squares ratio (BW ratio) [15, 47], entropy-based ranking [108], information gain [5, 6, 15], t-statistics [108], the relative proximity degree [82] and the Wilcoxon rank sum test [75].
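As an illustration of one of these ranking techniques, the sketch below computes a BW ratio per feature and ranks the features by it. It assumes the commonly used definition (between-group sum of squares divided by within-group sum of squares, computed per feature); the cited papers may use a variant, and the function names and data are hypothetical.

```python
from statistics import mean

def bw_ratio(values, labels):
    """Between-group to within-group sum of squares ratio for one feature."""
    overall = mean(values)
    between = within = 0.0
    for cls in set(labels):
        members = [v for v, l in zip(values, labels) if l == cls]
        centre = mean(members)
        between += len(members) * (centre - overall) ** 2
        within += sum((v - centre) ** 2 for v in members)
    # A feature that is constant within every class has zero within-group spread.
    return between / within if within else float("inf")

def rank_features(X, y):
    """Return feature indexes ordered from highest to lowest BW ratio."""
    scores = [bw_ratio([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 4.0], [2.0, 6.0], [9.0, 5.0], [10.0, 5.0]]
y = [0, 0, 1, 1]
ranking = rank_features(X, y)
```

A filter step would typically keep only the top-ranked features from such a list before any classifier is trained.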

A search method using the correlation coefficient as the evaluation function [15] and a search for the Markov blanket [125] of the class attribute are examples of search-based methods following the filter approach for feature selection.
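A correlation-based evaluation function for a whole feature subset can be sketched as follows. This example uses the well-known CFS merit heuristic (high feature-class correlation, low feature-feature correlation) as a stand-in; it is an assumption chosen for illustration and is not claimed to be the evaluation function used in [15].

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for a constant vector."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va and vb else 0.0

def cfs_merit(mask, X, y):
    """CFS-style merit of the feature subset encoded by a bit-string mask."""
    cols = [[row[j] for row in X] for j, bit in enumerate(mask) if bit]
    k = len(cols)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(c, y)) for c in cols) / k
    # Mean absolute feature-feature correlation over all pairs.
    pairs = [(a, b) for i, a in enumerate(cols) for b in cols[i + 1:]]
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)

# Hypothetical toy data: features 0 and 1 are identical (redundant).
X = [[1, 1, 3], [2, 2, 1], [3, 3, 4], [4, 4, 2]]
y = [0, 0, 1, 1]
merit_single = cfs_merit([1, 0, 0], X, y)
merit_redundant = cfs_merit([1, 1, 0], X, y)
```

No classifier is trained here, so such a function can serve directly as the fitness of a filter-based search. In this toy case, adding a perfectly redundant copy of a feature leaves the merit unchanged.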

[Figure 2.10: General scheme of GAs based on the filter approach. Flowchart stages: Individual Pool → Evaluate each individual using fitness function → Apply genetic operators → Survival selection]
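The loop depicted in Figure 2.10 can be sketched in a few lines. The code below is a minimal, illustrative GA and not a reconstruction of any reviewed method: it assumes bit-string individuals, binary tournament parent selection, single-point crossover, bit-flip mutation and an elitist survival step that keeps the best individuals among parents and offspring. The per-feature scores and the 0.3 penalty in the toy filter fitness are hypothetical.

```python
import random

def run_ga(fitness, n_features, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve bit-string feature masks; returns (best individual, best-fitness history)."""
    rng = random.Random(seed)
    # Individual pool: random bit strings, one bit per feature.
    pool = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Evaluate each individual using the fitness function.
        scored = sorted(pool, key=fitness, reverse=True)
        history.append(fitness(scored[0]))
        # Apply genetic operators.
        offspring = []
        while len(offspring) < pop_size:
            p1 = max(rng.sample(scored, 2), key=fitness)  # binary tournament
            p2 = max(rng.sample(scored, 2), key=fitness)
            cut = rng.randint(1, n_features - 1)          # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]  # bit-flip
            offspring.append(child)
        # Survival selection: keep the best pop_size of parents plus offspring.
        pool = sorted(scored + offspring, key=fitness, reverse=True)[:pop_size]
    return max(pool, key=fitness), history

# Hypothetical filter fitness: reward high-scoring features, penalise each selected one.
SCORES = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02]

def filter_fitness(individual):
    return sum(s - 0.3 for s, bit in zip(SCORES, individual) if bit)

best, history = run_ga(filter_fitness, len(SCORES))
```

Because the fitness is passed in as a function, the same loop covers the wrapper scheme as well: only the evaluation step changes, from a data-only score to a classifier's accuracy.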

based on the accuracy of a classifier [6, 14, 40, 49, 70, 73, 86]. Some papers combine the accuracy of the classifier with another criterion in the fitness function. For instance, [70] uses the accuracy of k-NN and the ratio of the number of features selected by the individual to the total number of features in the dataset; [14] uses the accuracy, the simplicity of the decision tree (tree size) and the number of features in the feature subset; and [22] uses the accuracy of an SVM and the number of selected features. Table 2.1 lists the different types of fitness functions used by many GAs proposed in the literature.
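One simple way to combine accuracy with a subset-size criterion is a weighted sum, sketched below. The weighted-sum form and the default weight of 0.9 are assumptions made for illustration; the cited papers may aggregate the two criteria differently.

```python
def combined_fitness(accuracy, mask, weight=0.9):
    """Weighted sum of classifier accuracy and feature-subset compactness.

    accuracy: classifier accuracy in [0, 1] for the subset encoded by mask
    mask:     bit string, 1 = feature selected
    weight:   hypothetical trade-off parameter; higher values favour accuracy
    """
    proportion = sum(mask) / len(mask)  # selected features / total features
    return weight * accuracy + (1 - weight) * (1 - proportion)

# Two individuals with equal accuracy: the smaller subset gets the higher fitness.
small_subset = combined_fitness(0.8, [1, 0, 0, 0])
full_subset = combined_fitness(0.8, [1, 1, 1, 1])
```

With such a function, two individuals of equal accuracy are ranked by how few features they use, which pushes the search towards compact subsets.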

[Figure: flowchart. Individual Pool → Run classification algorithm on each individual → Evaluate fitness function → Apply genetic operators → Survival selection]

Table 2.1: A summary of the literature on Genetic Algorithms for Feature Selection in a data preprocessing phase

| References | Feature Selection Approach | Ind. Rep. | Fitness Function | Crossover | Mutation | Other Operation |
|---|---|---|---|---|---|---|
| [69] | Filt & Wrap | List of feature indexes | BW ratio for filter approach and the accuracy of k-NN for wrapper approach | Dynamic | Dynamic | Elitist strategy |
| [5] | Filt & Wrap | Bit string | Information content for filter approach and the accuracy of Decision Tree, the classification cost for wrapper approach | not mentioned | not mentioned | not mentioned |
| [70] | Wrap | Bit string | The accuracy of k-NN | Adaptive probability | Adaptive probability | Elitist strategy |
| [120] | Filt & Wrap | Bit string | PCA for filter approach and the accuracy of MLNB for wrapper approach | Uniform | not mentioned | Elitist strategy |
| [6] | Wrap | Bit string | The accuracy of Decision Tree and size of the feature subset | not mentioned | not mentioned | not mentioned |
| [108] | Filt & Wrap | Bit string | Entropy based, T-statistics, SVM-recursive elimination for filter approach and the accuracy of SVM for wrapper approach | Single-point | Bit-flip | not mentioned |
| [40] | Wrap | Bit string | The accuracy of GRNN | Half uniform | Bit-flip | Simulated Annealing |
| [14] | Wrap | List of feature indexes | The accuracy and simplicity of Decision Tree | Uniform | Bit-flip | Delete Feature |
| [86] | Wrap | Bit string | Feature subset cardinality and the accuracy of 1-NN | Multi-point | Bit-flip | Problem-specific operation |
| [83] | Filt & Wrap | Bit string | The relative proximity degree for filter approach and the accuracy of k-NN for wrapper approach | Multiple-point | Bit-flip | not mentioned |
| [73] | Wrap | Bit string | The accuracy of SVM | Single-point | Bit-flip | not mentioned |
| [15] | Filt & Wrap | Bit string | The correlation based feature weights for each feature for filter approach and the accuracy of k-NN for wrapper approach | Standard | Bit-flip | Taguchi method |
| [22] | Filt & Wrap | Bit string | M Ranked method for filter approach and the accuracy of SVM for wrapper approach | Single-point | Bit-flip | not mentioned |
| [117] | Filt & Wrap | Bit string | Information Gain for filter approach and the accuracy of k-NN for wrapper approach | Two-point | Bit-flip | not mentioned |
| [51] | Filt & Wrap | Bit string | Cosine amplitude method and alpha cut method for filter approach and the accuracy of SVM for wrapper approach | One-point | Multi-uniform | Elitist strategy |
| [75] | Filt & Wrap | Bit string | Wilcoxon rank sum test for filter approach and the accuracy of SVM for wrapper approach | Double one-point | Bit-flip | not mentioned |
| [50] | Wrap | List of feature indexes | The accuracy of ANN | One-point | Bit-flip | Speciation, Elitist strategy |
| [47] | Filt & Wrap | 2 parts bit string | BW ratio, the correlation coefficient, the Fisher's discriminant criterion for filter approach and the accuracy of SVM | Specialized | Specialized | Elitist strategy |

Considering the feature selection approach, most works listed in the second column of the table use the filter and wrapper approaches together, in a sequential fashion. The advantage of applying the filter approach before the GA is that it reduces the number of features in the feature space, which makes the subsequent wrapper approach feasible. In contrast, applying only the wrapper approach to all original features would be much more computationally expensive. On the other hand, works such as [125] do not need the filter approach (for feature elimination), because the datasets mined in those papers have no more than 100 features, which does not seem too large for a wrapper-based GA for feature selection.
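The effect of the sequential filter-then-wrapper arrangement on the search space can be sketched as follows. The filter scores, the cut-off k = 4 and the helper names are hypothetical; the point is only that the GA operates on bit strings over the retained features and that its individuals are decoded back to original feature indexes.

```python
def prefilter(scores, k):
    """Filter step: return the (sorted) indexes of the k highest-scoring features."""
    top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
    return sorted(top)

def decode(individual, kept):
    """Map a bit string over the kept features back to original feature indexes."""
    return [j for j, bit in zip(kept, individual) if bit]

# Hypothetical filter scores for 8 original features (e.g. from a ranking technique).
scores = [0.02, 0.91, 0.15, 0.77, 0.40, 0.05, 0.63, 0.11]

kept = prefilter(scores, k=4)            # features passed on to the wrapper-based GA
selected = decode([1, 0, 1, 1], kept)    # one GA individual in the reduced space

search_space_full = 2 ** len(scores)     # candidate subsets without the filter step
search_space_reduced = 2 ** len(kept)    # candidate subsets after the filter step
```

Since the number of candidate subsets grows as 2^n in the number of features n, each feature removed by the filter halves the space the wrapper-based GA has to explore, in addition to making each classifier run cheaper.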