Chapter 3 Machine Learning and Statistical Tools
3.5 Statistical Tool Selection
Pattern recognition techniques are growing rapidly in healthcare organisations, as they have been demonstrated to be more effective than standard clinical statistical methods [113]. These tools have been utilised since the 1970s for many purposes in medical applications [114]. They are normally characterised according to the kind of learning process used to produce the output value. Lin [115] proposed a robust diagnosis approach for liver disease treatment using classification and regression trees, and Lee et al. [116] designed a computer-aided diagnosis system for assessing pulmonary nodules using a linear discriminant classifier (LDC) and feature selection. Similarly, Dan et al. [117] effectively classified Parkinson's disease with an SVM model, using structural images and functional magnetic resonance imaging (fMRI) as inputs (features). They obtained a sensitivity of 78.95%, a specificity of 92.59%, and an accuracy of 86.96%. In a strategic protocol for the early detection of neurodegenerative diseases (NDDs), data patterns within supervised learning can be classified utilising statistical techniques, template matching, and neural networks [118].
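The sensitivity, specificity, and accuracy figures quoted above are all derived from the counts in a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the counts are hypothetical and chosen only for illustration, and they are not taken from any of the cited studies.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: diseased cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not from any cited study).
sens, spec, acc = diagnostic_metrics(tp=8, fn=2, tn=18, fp=2)
print(f"sensitivity={sens:.2%}  specificity={spec:.2%}  accuracy={acc:.2%}")
```

Reporting all three values matters on imbalanced clinical data, where a classifier can reach high accuracy while missing most of the positive cases.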
Iram [67] proposed a new method for the discrimination analysis of gait signals of various neurodegenerative diseases, such as Amyotrophic Lateral Sclerosis, Parkinson's, and Huntington's. This includes extracting applicable features, solving the problems of missing entries and imbalanced datasets and, most importantly, classifying multiclass datasets. Eleven models were nominated for the discrimination and classification of gait signals, representing Bayes normal classification, linear, and non-linear methods. Results showed that three classifiers provided the highest accuracy rates: the Linear Discriminant Classifier (LDC), the Uncorrelated Normal Density based Classifier (UDC), and the Parzen Classifier, with 62.5%, 65%, and 60% accuracy, respectively. Furthermore, in statistical analysis, each data pattern is represented in a multi-dimensional space that is split into regions, one for each individual class.
3.5.1 Feature Selection and Feature Extraction
In the pattern recognition and machine learning domain, dimensionality reduction is a significant area in which a number of approaches have been proposed [119]. The pattern recognition technique involves two important phases: feature selection and feature extraction. Features are the input variables, or attributes, of a dataset, chosen to provide an optimal representation of a particular field [120]. Features can be characterised as relevant, irrelevant, or redundant. In this research, the main purpose of distinguishing these types of features is to improve the predictive accuracy of classifiers and to obtain high performance from learning algorithms. A further major objective is to avoid overfitting, which would otherwise require further analysis. Figure 3-7 shows the procedure of feature extraction and feature selection.
Figure 3-7: Feature extraction and feature selection procedure
Feature selection techniques offer a good way to improve prediction performance, reduce computation time, and provide a better understanding of the SCD medical dataset in machine learning algorithms or pattern recognition applications [121]. Polat et al. [122] proposed a robust feature selection technique known as kernel F-score feature selection (KFFS), applied as a pre-processing step for clinical data. In KFFS, the features of medical datasets are transformed to a kernel space by a Radial Basis Function (RBF). The authors indicated that KFFS yields promising outcomes compared with the other selected methods. Santos et al. [123] developed a new approach that uses feature selection to deal with large datasets, based on ensemble classifiers. The authors indicated the usefulness of the proposed approach for developing better classification algorithms, using a number of classification algorithms and current performance evaluation metrics, specifically the area under the ROC curve, sensitivity, and false positive rate. Harb and Desuky [124] proposed two well-known approaches, the filter and the wrapper, based on Particle Swarm Optimization (PSO) as a feature selection technique for clinical data. They selected a number of algorithms to compare accuracy and performance against another feature selection technique based on a genetic algorithm. Three medical datasets were used in their experiment. The outcomes showed that the proposed PSO enhanced the classification accuracy rate over the other classification models. Rajeswari and Pede [125] analysed classification approaches based on feature selection using association and correlation mechanisms. The aim of their study was to select the correlated features of clinical data, which can be beneficial for clinical decision support systems. They confirmed that, after removal of some features from the medical dataset, the performance and accuracy of the classifier improved.
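The idea of removing features on the basis of correlation can be illustrated with a simple redundancy check. The sketch below is not the method of any cited author; it is a minimal illustration that assumes a Pearson correlation threshold of 0.95 and uses hypothetical column names and values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def drop_redundant(columns, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with a kept one."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical columns: feature_b is almost a rescaled copy of feature_a.
columns = {
    "feature_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feature_b": [2.0, 4.0, 6.0, 8.0, 10.1],
    "feature_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = drop_redundant(columns)
print(kept)
```

The result depends on the order in which features are visited, so in practice the features would usually be ordered by relevance before such a check is applied.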
In feature selection, it is important to optimise the model so as to improve or maintain classification accuracy while simplifying classifier complexity. A study conducted by Dash and Liu [126] indicated that the feature selection algorithm can be separated into six steps, as shown in Algorithm 3.1 [127].
Algorithm 3.1: Feature selection procedure
1. Select a criterion function, 𝑓(𝑥).
2. Choose a subset 𝑥′ of the complete feature set X.
3. Construct a model with the candidate subset 𝑥′.
4. Calculate 𝑓(𝑥′).
5. Repeat with various subsets 𝑥′ ⊂ X.
6. Choose the subset 𝑥′ that minimises 𝑓(𝑥′).
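The steps of Algorithm 3.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the criterion function is taken to be the leave-one-out error rate of a nearest-centroid classifier, the search is exhaustive, and the function names and toy data are hypothetical rather than drawn from the SCD dataset.

```python
from itertools import combinations


def nearest_centroid_loo_error(rows, labels, subset):
    """Criterion f(x'): leave-one-out error of a nearest-centroid model
    built on the feature indices in `subset` (an assumed criterion)."""
    errors = 0
    for i in range(len(rows)):
        # Step 3: construct the model (class centroids) without sample i.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(rows, labels)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(subset))
            for k, f in enumerate(subset):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1
        centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        # Classify the held-out sample by its closest centroid.
        point = [rows[i][f] for f in subset]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        errors += predicted != labels[i]
    return errors / len(rows)


def select_features(rows, labels, criterion):
    """Steps 2 to 6: evaluate every non-empty subset and keep the minimiser."""
    n_features = len(rows[0])
    best_subset, best_score = None, float("inf")
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):  # steps 2 and 5
            score = criterion(rows, labels, subset)            # steps 3 and 4
            if score < best_score:  # ties keep the smaller, earlier subset
                best_subset, best_score = subset, score
    return best_subset, best_score                             # step 6


# Toy data: only feature 0 separates the two classes.
rows = [
    [1.0, 5.0, 40.0], [1.2, 3.0, 10.0], [0.9, 4.0, 30.0], [1.1, 6.0, 20.0],
    [3.0, 5.5, 15.0], [3.2, 3.5, 35.0], [2.9, 4.5, 25.0], [3.1, 6.5, 45.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
best_subset, best_score = select_features(rows, labels, nearest_centroid_loo_error)
print(best_subset, best_score)
```

An exhaustive search evaluates every non-empty subset, so its cost grows exponentially with the number of features; with hundreds of features, a heuristic search strategy would be needed instead.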
Two important procedures are taken into consideration when selecting the correct feature subsets. First, the possible feature subsets are searched based on the robustness of the objective function, which is part of the search space, as shown in Figure 3-8. Then, the feature subsets are selected in association with the objective function. Once this module is completed, the final feature subsets are ready to be used by the machine learning algorithm.
Figure 3-8: Feature selection procedure
Feature selection is the procedure of identifying and removing irrelevant and redundant features from the proposed clinical datasets [128]. Our datasets have hundreds of features related to SCD. Many of these features may be irrelevant, carry redundant information, or be considered unimportant by healthcare professionals when diagnosing SCD patients. This situation increases the processing time of classification and may lead to further complications. Subset-search techniques can generate better outcomes than approaches that do not deal with feature redundancy; however, the computational cost of the subset search makes them inefficient for high-dimensional data. Feature selection methods are divided into three categories [129-131]:
(i) Filters, which extract important features from the full dataset without involving any learning algorithm.
(ii) Wrappers, which utilise learning methods to examine which features are effective and useful.
(iii) Embedded approaches, which integrate the feature selection step into model building.
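As an illustration of the filter category, the sketch below ranks features by a plain two-class F-score computed directly from the data, with no learning algorithm involved. It is a minimal sketch and not the kernel variant (KFFS) discussed earlier; the function names and toy values are hypothetical.

```python
from statistics import mean, variance


def f_score(values, labels):
    """Two-class F-score of one feature: between-class separation of the
    means divided by the sum of the within-class sample variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    if within == 0:  # both classes constant: guard against division by zero
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return feature indices sorted by decreasing F-score, plus the scores."""
    n_features = len(rows[0])
    scores = [f_score([r[j] for r in rows], labels) for j in range(n_features)]
    ranking = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranking, scores


# Toy data: feature 0 differs strongly between classes, feature 1 barely.
rows = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 5.5], [3.2, 3.5], [2.9, 4.5]]
labels = [0, 0, 0, 1, 1, 1]
ranking, scores = rank_features(rows, labels)
print(ranking, [round(s, 4) for s in scores])
```

A wrapper would instead score each candidate subset by training and evaluating a classifier on it, which is more expensive but accounts for interactions between features.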
Feature extraction, in turn, comprises decreasing the amount of resources needed to represent a large set of clinical data. The extracted features are the basic index for regression, detection, and classification in the domain of biomedical signal processing [67]. Analysing data with many variables normally needs a large amount of memory as well as computation power. Moreover, a large number of variables is highly likely to cause the classification model to overfit the training samples and generalise poorly to new samples.