3.4 Sparse Partial Least Squares (SPLS)
3.5.3 Sparse CCA and Sparse PLS
The literature lacks a unified naming convention, so it is not always clear which algorithm is applied in a given SCCA/SPLS study. For instance, some studies claim to solve an SCCA optimisation problem when in fact they solve an approximation of it that is closer, or even equivalent, to SPLS. The applications of SCCA and SPLS are therefore described together in this section.
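As an illustration of how such an approximation blurs the distinction, one widely used simplification can be written out as follows. The notation is an assumption made here for illustration (data matrices X and Y with centred columns, weight vectors u and v, and sparsity bounds c_u and c_v) and may differ from the notation used elsewhere in this chapter.

```latex
% SCCA: the variance constraints make the objective a correlation.
\[
\max_{\mathbf{u},\mathbf{v}} \; \mathbf{u}^\top \mathbf{X}^\top \mathbf{Y} \mathbf{v}
\quad \text{s.t.} \quad
\mathbf{u}^\top \mathbf{X}^\top \mathbf{X} \mathbf{u} \le 1, \;\;
\mathbf{v}^\top \mathbf{Y}^\top \mathbf{Y} \mathbf{v} \le 1, \;\;
\|\mathbf{u}\|_1 \le c_u, \;\;
\|\mathbf{v}\|_1 \le c_v
\]

% Approximation: treat X^T X and Y^T Y as identity matrices.
% The variance constraints become plain L2-norm constraints, so the
% objective is now a covariance, i.e. an SPLS-type problem.
\[
\max_{\mathbf{u},\mathbf{v}} \; \mathbf{u}^\top \mathbf{X}^\top \mathbf{Y} \mathbf{v}
\quad \text{s.t.} \quad
\|\mathbf{u}\|_2^2 \le 1, \;\;
\|\mathbf{v}\|_2^2 \le 1, \;\;
\|\mathbf{u}\|_1 \le c_u, \;\;
\|\mathbf{v}\|_1 \le c_v
\]
```

Under this simplification the within-view covariance structure is ignored, which is why a method presented as SCCA may in practice behave like SPLS.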
SCCA/SPLS has been used in several genetics studies. Genetics deals with high-dimensional data, so the ability to select smaller and more interpretable subsets of genes is an advantage. Indeed, several of the papers that proposed the first SCCA/SPLS approaches used such datasets to demonstrate their performance [Waaijenborg et al., 2008, Lê Cao et al., 2008, Parkhomenko et al., 2009, Witten et al., 2009, Witten and Tibshirani, 2009].
The use of SCCA/SPLS is still not as common in neuroimaging. However, there are a few examples in the literature of combining neuroimaging information with genetics. Le Floch et al. [2012] applied SPLS to a dataset comprising 1 054 068 Single Nucleotide Polymorphisms (SNPs) and 34 fMRI Regions of Interest (ROIs) from subjects performing a general cognitive assessment fMRI task. The framework included a dimensionality reduction step, in which a univariate filter was applied to the SNPs prior to applying SPLS. The first two weight vector pairs were computed, and sparsity was only applied to the filtered SNPs (not to the fMRI ROIs) [Le Floch et al., 2012]. Both the sparsity hyper-parameter and the univariate filter threshold were optimised. SPLS was also compared with other methods, including PLS and CCA. Other SPLS studies using genetic and neuroimaging data include those by Lin et al. [2014] and Grellmann et al. [2015], which applied it to datasets from patients with schizophrenia. Unlike the study by Le Floch et al. [2012], these studies used whole-brain data and did not apply a dimensionality reduction step. Moreover, Lin et al. [2014] applied a group sparse version of SPLS, which took into account a
SCCA/SPLS has been used in studies related to neurodegenerative diseases. Avants et al. [2010] applied the method proposed by Witten and Tibshirani [2009] to the study of Frontotemporal Dementia (FTD) and Alzheimer’s Disease (AD), using imaging data in two views: structural T1-weighted MRI and Diffusion Tensor Imaging (DTI). One of the most interesting aspects of the study is that it fitted a linear regression model using the projections of both views to predict the scores of neuropsychological tests. The results showed that the projections for each disease were associated with the corresponding clinical neuropsychological test, i.e. the images from patients with AD showed a significant association with the Mini-Mental State Examination (MMSE) and no significant association with verbal fluency, while the reverse was observed for patients with FTD [Avants et al., 2010]. Despite these interesting results, the study had some limitations: the sparsity levels were set a priori; only the first weight vector pair was considered; and only unidirectional associations were investigated, i.e. positivity constraints were imposed on the weights in order to improve the interpretability of the model [Avants et al., 2010]. However, these constraints may also limit the relationships captured by the weight vector pairs. Neuroimaging data have also been used in both views in other settings. For example, Rosa et al. [2015] used SPLS to estimate the similarity between two Arterial Spin Labeling (ASL) datasets acquired from the same subjects under different drugs, and Sui et al. [2015b] used it to analyse a dataset comprising T1-weighted structural MR images and DTI data.
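The regression step described for Avants et al. [2010], in which the projections of both views are used to predict a neuropsychological score, can be sketched as below. This is a minimal illustration and not the authors' pipeline: the function name, the synthetic data, and the hand-set non-negative sparse weight vectors u and v (standing in for weights estimated by SCCA) are all assumptions.

```python
import numpy as np

def regress_score_on_projections(X, Y, u, v, score):
    """Ordinary least squares of a clinical score on the projections of two views.

    Hypothetical helper: u and v are assumed to have been estimated
    beforehand (e.g. by SCCA) and are treated as fixed here.
    """
    # Projection (latent score) of each view onto its weight vector.
    proj_x = X @ u
    proj_y = Y @ v
    # Design matrix: intercept plus one column per view projection.
    design = np.column_stack([np.ones(len(score)), proj_x, proj_y])
    coef, *_ = np.linalg.lstsq(design, score, rcond=None)
    # Coefficient of determination on the fitted data.
    fitted = design @ coef
    ss_res = np.sum((score - fitted) ** 2)
    ss_tot = np.sum((score - score.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): two views measured on 50 subjects.
rng = np.random.default_rng(0)
n, p, q = 50, 30, 20
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))

# Sparse, non-negative weights, mimicking the positivity constraint.
u = np.zeros(p)
u[:5] = 1.0 / np.sqrt(5)
v = np.zeros(q)
v[:4] = 0.5

# Simulated clinical score that depends on both projections.
score = 2.0 + 1.5 * (X @ u) - 0.5 * (Y @ v) + 0.01 * rng.standard_normal(n)

coef, r2 = regress_score_on_projections(X, Y, u, v, score)
```

In a real analysis the weights and the regression would need to be estimated on separate data, or within a cross-validation loop, to avoid an optimistic estimate of the association.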
In another study by Avants et al., neurodegenerative diseases were again the focus. However, this time the authors used SCCA to find correlations between structural MR data (grey matter) and clinical variables from the Philadelphia Brief Assessment of Cognition (PBAC) [Avants et al., 2014]. This test contains 20 variables grouped into 5 psychometric sub-scales, which test different cognitive and behavioural/comportment deficits. Avants et al. [2014] performed 5 tests, each one with the MR images as one view and the clinical variables of a specific sub-scale as the other view. The results showed that the dimensionality reduction provided by SCCA enhanced the ability to detect associations between multivariate psychometric batteries and network-level grey matter density measures, and the authors claimed to be the first to have shown this [Avants et al., 2014]. However, this study applied sparsity on the image voxels only and, once again, imposed positivity constraints on the weight vectors.
Other studies have applied SPLS to study the relationship between clinical scores and brain regions. Olson Hunt et al. [2014] applied the method to a dataset comprising brain ROIs (X) and the final score of the Modified Mini-Mental State Examination (Y). The authors then fitted several models with different numbers of components ({u1, ..., uk}) and sparsity levels for X, in order to determine how often each ROI was selected across all the models.
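The idea of counting how often each ROI is selected across models can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of Olson Hunt et al. [2014]: the sparse weights are obtained by soft-thresholding the covariance vector for a univariate Y, the sparsity level is expressed as a fraction of the largest absolute covariance, X is deflated after each component, and the data are synthetic. All function names are hypothetical.

```python
import numpy as np

def soft_threshold(a, lam):
    """Element-wise soft-thresholding operator."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def spls_weights(X, y, n_components, sparsity):
    """Sparse PLS weights for a univariate y (simplified sketch).

    sparsity is assumed to lie in [0, 1): the threshold is this
    fraction of the largest absolute covariance between X and y.
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()
    n_features = X.shape[1]
    weights = []
    for _ in range(n_components):
        cov = X.T @ y
        u = soft_threshold(cov, sparsity * np.abs(cov).max())
        norm = np.linalg.norm(u)
        if norm == 0:
            break
        u = u / norm
        # Deflate X by removing the part explained by the score t = Xu.
        t = X @ u
        X = X - np.outer(t, t @ X) / (t @ t)
        weights.append(u)
    return np.array(weights).reshape(-1, n_features)

def selection_frequency(X, y, component_grid, sparsity_grid):
    """Fraction of fitted models in which each feature has a non-zero weight."""
    counts = np.zeros(X.shape[1])
    n_models = 0
    for k in component_grid:
        for s in sparsity_grid:
            W = spls_weights(X, y, k, s)
            # A feature counts as selected if any component uses it.
            counts += (np.abs(W) > 0).any(axis=0)
            n_models += 1
    return counts / n_models

# Synthetic stand-in data (assumption): 20 "ROIs", two of them informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + rng.standard_normal(200)

freq = selection_frequency(X, y, component_grid=(1, 2, 3),
                           sparsity_grid=(0.3, 0.5, 0.7))
```

Here freq[j] is the proportion of the nine fitted models in which feature j received a non-zero weight in at least one component. Features selected across most settings are less likely to be artefacts of one particular choice of hyper-parameters.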
SPLS can also be used in a supervised classification setting. In this case, the variables of Y are categorical, and the method is known as Sparse Partial Least Squares Discriminant Analysis (SPLS-DA). Labus et al. [2015] applied SPLS-DA to study pain, using a dataset of patients with Irritable Bowel Syndrome (IBS) and controls; the features in X were derived from structural brain ROIs, while Y contained the labels. The sparsity was set by stability selection, and the predictive ability of the final model was assessed on a hold-out dataset. This ability was characterised by supervised classification metrics: sensitivity, specificity, positive predictive value, and negative predictive value.
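The four metrics listed above follow directly from the confusion matrix of the hold-out predictions. The sketch below uses made-up labels purely for illustration and assumes that the positive class (coded as 1) corresponds to patients and the negative class (coded as 0) to controls; the function name is hypothetical.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV for binary labels coded as 0/1."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))    # patients predicted as patients
    tn = int(np.sum(~y_true & ~y_pred))  # controls predicted as controls
    fp = int(np.sum(~y_true & y_pred))   # controls predicted as patients
    fn = int(np.sum(y_true & ~y_pred))   # patients predicted as controls

    def ratio(num, den):
        # Return NaN when a metric is undefined (empty denominator).
        return num / den if den > 0 else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Made-up hold-out labels and predictions (assumption, for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Unlike sensitivity and specificity, the positive and negative predictive values depend on the proportion of patients in the hold-out set, so they should be interpreted with that proportion in mind.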
Note that there have been other extensions of SPLS/SCCA methods with different kinds of sparsity penalties, which are usually chosen based on a priori assumptions regarding data structure [Lin et al., 2014, Du et al., 2015]. However, these specific adaptations are beyond the scope of this thesis.