Parameters of Ordex - Multivariate Correlation Analysis for Supervised Feature Selection in High-Dimensional Data

In this chapter (cf. Equation 5.1 in Section 5.3), we defined the ordinality at time t in a time series X as

$$O_d(X, t) = \big(\mathrm{rank}(X[t]),\ \mathrm{rank}(X[t-1]),\ \ldots,\ \mathrm{rank}(X[t-(d-1)])\big).$$

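In the original work [BP02] that introduced the property of ordinality, the computation of an ordinality at t involves an additional delay parameter τ ≥ 1. That is, the d ranked values are taken τ time steps apart,

$$\big(\mathrm{rank}(X[t]),\ \mathrm{rank}(X[t-\tau]),\ \ldots,\ \mathrm{rank}(X[t-(d-1)\tau])\big),$$

which reduces to the definition above for τ = 1.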

The τ parameter was also used in our work for all synthetic and real-world experiments in Section 5.8, and we list the parameter values in Table 5.5.

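| Dataset | d | M | m | α | o | τ |
|---|---|---|---|---|---|---|
| EMG limb sen | 5 | 200 | 3 | 0.5 | 10 | 4 |
| EMG limb pie | 5 | 300 | 3 | 0.1 | 15 | 4 |
| EMG limb mar | 5 | 200 | 3 | 0.3 | 20 | 4 |
| Character | 5 | 300 | 3 | 0.1 | 50 | 10 |
| Activity recognition | 5 | 100 | 2 | 0.3 | 20 | 3 |
| User Movement | 5 | 300 | 3 | 0.8 | 30 | 4 |
| Occupancy | 5 | 100 | 3 | 0.5 | 10 | 4 |
| Bosch | 5 | 200 | 3 | 0.1 | 10 | 4 |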

Table 5.5: Real world data experiment parameter settings
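To make the role of the delay parameter concrete, the following is a minimal Python sketch of an ordinality computation with order d and delay τ. It assumes the usual reading of [BP02], in which the d values are sampled τ steps apart going back from t. The function name `ordinality`, the convention that rank 1 denotes the smallest value, and the tie-breaking by window position are illustrative assumptions, not taken from this chapter.

```python
def ordinality(x, t, d, tau=1):
    """Ranks of x[t], x[t - tau], ..., x[t - (d - 1) * tau] within that window.

    Assumptions (hypothetical conventions): rank 1 is the smallest value,
    and ties are broken by position in the window.
    """
    if d < 1 or tau < 1:
        raise ValueError("d and tau must both be >= 1")
    if t >= len(x) or t - (d - 1) * tau < 0:
        raise IndexError("window does not fit into the time series")

    # Collect the d values, starting at t and stepping back by tau.
    window = [x[t - k * tau] for k in range(d)]

    # Sort window positions by value (ties: earlier position first).
    order = sorted(range(d), key=lambda k: (window[k], k))

    # Assign rank r to the window position that holds the r-th smallest value.
    ranks = [0] * d
    for r, k in enumerate(order, start=1):
        ranks[k] = r
    return tuple(ranks)


if __name__ == "__main__":
    series = [4, 7, 9, 10, 6, 11, 3]
    for tau in (1, 2, 3):
        print(tau, ordinality(series, t=6, d=3, tau=tau))
```

With τ = 1 the sketch uses consecutive values, which matches the definition without delay; larger τ spreads the same d samples over a longer stretch of the series.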

5.11 Summary

In this chapter we proposed a feature-based time series classification approach called ordex that is based purely on the ordinality of the raw time series. Ordex extracts features based on the co-occurrence of ordinalities from multiple dimensions. Hence, the interactions between multiple dimensions in the time series data are taken into account. The extracted features are evaluated for relevance using our novel and efficient scoring methodology. In addition to the relevance, we evaluate the monotonicity of the extracted features to estimate feature redundancy. Finally, the relevance and redundancy scores are combined to express both the importance of a feature to the classification task and its novelty with respect to the other extracted features. By scoring relevance and non-redundancy, ordex achieves better prediction quality with fewer features.

A comparison with various state-of-the-art feature-based algorithms on the synthetic and real-world datasets shows that our method is suitable for multivariate time series datasets. For high-dimensional time series, our approach efficiently converts the relevant ordinalities into features. Therefore, in real-world applications, relevant and novel ordinalities are encoded into static features. These features can be used for various data mining tasks and analyses.

The approaches presented in the previous chapters, i.e., RaR (cf. Chapter 4) and ordex, involve a high number of Monte Carlo iterations. The large number of computations, in addition to the dimensionality, makes it difficult to understand the feature selection algorithm. In the forthcoming chapter we introduce a software framework that helps the user to understand multivariate correlations. The aim of such a software tool is to make multivariate correlation analysis transparent.

Understanding Multivariate Correlations

6.1 Motivation

Feature selection aims to score the importance of a feature based on its correlation with the target. However, a causal relationship cannot be inferred from every correlation. For example, both wrinkles and cancer risk increase with age, but wrinkles do not cause cancer, or vice versa [LWIG06]. Here, the correlation between wrinkles and cancer risk arises because both depend on age; it does not indicate causality. This makes it essential for domain experts to understand the multivariate correlations in a dataset and decide whether a correlation is causal or not. In this context, the number of feature combinations grows exponentially with the dimensionality of the feature space. This hinders the user’s understanding of the feature-target relevance and feature-feature redundancy.
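The exponential growth mentioned above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `count_feature_subsets` is hypothetical, and it counts every non-empty subset of n features by brute-force enumeration, which agrees with the closed form 2^n − 1.

```python
from itertools import combinations


def count_feature_subsets(n_features):
    """Count all non-empty subsets of n_features features by enumeration."""
    features = range(n_features)
    return sum(
        1
        for size in range(1, n_features + 1)
        for _ in combinations(features, size)
    )


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        # The enumeration matches the closed form 2**n - 1.
        assert count_feature_subsets(n) == 2 ** n - 1
        print(n, count_feature_subsets(n))
```

Each additional feature roughly doubles the number of candidate combinations, which is why a manual review of all of them quickly becomes infeasible.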

In order to provide a smaller yet predictive subset of features, a large variety of existing approaches [GE03, Qui14, RŠK03, KMB12, SBS+17] compute the relevance of each feature to the target class, as well as the redundancy between features. However, the user does not get an overview of all correlations in the dataset. Furthermore, the selection process is non-transparent because these algorithms do not explain why a feature is relevant or redundant. Hence, the first challenge in explaining the feature selection process is to present relevance and redundancy jointly in an informative layout. The second challenge is to guide the user in understanding how features are correlated, as opposed to merely returning a correlation score. We address these two challenges by contributing a Framework for Exploring and Understanding Multivariate Correlations (FEXUM)∗ that provides:

(1) A visual embedding of feature correlations (relevances and redundancies).
(2) User-reviewable multivariate correlations.

This leads to a more comprehensible selection process in comparison to the state-of-the-art tools listed in Table 6.1. While most tools focus on fully automated statistical selection of features, with FEXUM we aim at explaining the feature selection algorithm. KNIME is a renowned tool that offers filter-based feature selection using linear correlation and variance measures. However, without customized extensions, it does not address feature redundancy during selection. RapidMiner and Weka take redundancy into account, but do not provide an overview of all feature correlations. Additionally, they do not explain the reason for the relevance of a feature.

Table 6.1: Comparison of feature selection tools (KNIME, RapidMiner, Weka, and FEXUM) by support for relevance, redundancy, correlation overview, and correlation explanation

∗ Adapted by permission from Springer Nature: Framework for Exploring and Understanding Multivariate Correlations, in the proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2017 [KRT+17].

FEXUM is a web application that can be accessed directly from a web browser. We achieve this by basing our infrastructure on AngularJS and the Django web framework. To ensure scalability for large datasets, we distribute computations to multiple machines with Celery.
