were requested to look at pictures displayed on the screen and listen to a corresponding auditory guide. Two types of pictures were displayed for 7 s each in random order: emotionally neutral pictures (landscapes, everyday objects, etc.) and negative pictures (car crashes, natural disasters, etc.). An auditory guide was presented for 1 s after the picture appeared, instructing the participant to “Look”: view the neutral pictures; to “Maintain”: view the negative pictures as they normally would; or to “Reappraise”: view the negative pictures while attempting to reduce their negative emotion by reinterpreting the meaning of the pictures [16, 17]. A separate session of eight minutes of eyes-open resting-state EEG was recorded for later manifold learning. All EEG data were preprocessed using Brain Vision Analyzer (Brain Products, Gilching, Germany), first by segmenting task trials into 7 s segments. A sliding window of 0.5 s with a step size of 0.05 s was applied to create the dynamic data. (The first and last five time points were discarded, resulting in 130 time points per session.) Frequencies of interest were set from 1 Hz to 50 Hz in increments of 1 Hz. The final output for each subject was averaged over trials within the same task. Resting-state data were processed with the same parameters. Because the resting-state recording was not trial-based, it serves only as a basis for creating contrast in manifold learning. Thus, the manifold properties associated with the resting-state connectomes in Euclidean space are not included in our final analyses.
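The sliding-window construction described above can be sketched in a few lines of NumPy. The 500 Hz sampling rate is a hypothetical choice for illustration, since the excerpt does not state it:

```python
import numpy as np

# Assumed parameters: the excerpt gives a 7 s trial, a 0.5 s window and a
# 0.05 s step; the 500 Hz sampling rate is a hypothetical choice.
fs = 500                            # samples per second (assumed)
trial = np.random.randn(7 * fs)     # one 7 s EEG trial, single channel

win = int(0.5 * fs)                 # 0.5 s window -> 250 samples
step = int(0.05 * fs)               # 0.05 s step  -> 25 samples

# Start index of every full window that fits inside the trial.
starts = np.arange(0, trial.size - win + 1, step)
windows = np.stack([trial[s:s + win] for s in starts])

print(windows.shape)                # (131, 250) under these assumptions
```

The exact window count depends on the sampling rate and on how partial edge windows are handled, which is presumably why the pipeline above also discards the first and last few time points.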
Abstract: In remote sensing, hyperspectral and polarimetric synthetic aperture radar (PolSAR) images are the two most versatile data sources for a wide range of applications, such as land use land cover classification. However, the fusion of these two data sources receives less attention than many others because of scarce data availability and a relatively challenging fusion task caused by their distinct imaging geometries. Among the existing fusion methods (including manifold-learning-based, kernel-based, ensemble-based, and matrix-factorization-based approaches), manifold learning is one of the most celebrated techniques for the fusion of heterogeneous data. Therefore, this paper aims to promote research in hyperspectral and PolSAR data fusion by providing a comprehensive comparison of existing manifold-learning-based fusion algorithms. We conducted experiments on 16 state-of-the-art manifold learning algorithms that address two important research questions in manifold-learning-based fusion of hyperspectral and PolSAR data: (1) in which domain should the data be aligned—the data domain or the manifold domain; and (2) how should existing labeled data be used when formulating a graph to represent a manifold—in a supervised, semi-supervised, or unsupervised manner. The performance of the algorithms was evaluated via multiple accuracy metrics of land use land cover classification over two data sets. Results show that the algorithms based on manifold alignment generally outperform those based on data alignment (data concatenation). Semi-supervised manifold alignment fusion algorithms perform best among all. Experiments using multiple classifiers show that they outperform the benchmark data-alignment-based algorithms by ca. 3% in terms of overall classification accuracy.
A crucial step for manifold learning methods is the inverse mapping (pre-image). For diffusion maps, there are currently two existing solutions. In this work, a new approach to constructing the pre-image map is introduced, in which only linear algebra is involved (based on a continuous state space spectral analysis of certain Markov operators defined on a graph). Unlike the existing solutions, which involve complex optimization algorithms, our method is computationally cheap and free from issues such as local optima trapping and sensitivity to initial guesses. The existing methods were developed for 2-D and 3-D shape analysis and would be computationally infeasible for the problems considered in this thesis. Furthermore, for techniques with existing pre-image solutions, e.g., kernel PCA and Isomap, a new general framework for the pre-image map is also introduced (chapter 4).
Multi-atlas segmentation has been widely used to segment various anatomical structures. The success of this technique partly relies on the selection of atlases that are best mapped to a new target image after registration. Recently, manifold learning has been proposed as a method for atlas selection. Each manifold learning technique seeks to optimize a unique objective function. Therefore, different techniques produce different embeddings even when applied to the same data set. Previous studies used a single technique in their method and gave no reason for the choice of the manifold learning technique employed, nor theoretical grounds for the choice of the manifold parameters. In this study, we compare side by side the results given by three manifold learning techniques (Isomap, Laplacian Eigenmaps, and Locally Linear Embedding) on the same data set. We assess the ability of these three techniques to select the best atlases to combine in the framework of multi-atlas segmentation. First, a leave-one-out experiment is used to optimize our method on a set of 110 manually segmented atlases of hippocampi and to find the manifold learning technique and associated manifold parameters that give the best segmentation accuracy. Then, the optimal parameters are used to automatically segment 30 subjects from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). For our dataset, the selection of atlases with Locally Linear Embedding gives the best results. Our findings show that selecting atlases with manifold learning leads to segmentation accuracy close to or significantly higher than that of the state-of-the-art method, and that accuracy can be increased by fine-tuning the manifold learning process.
Manifold learning may be seen as a procedure aiming at capturing the degrees of freedom and structure characterizing a set of high-dimensional data, such as images or patterns. The usual goals are data understanding, visualization, classification, and the computation of means. In a linear framework, this problem is typically addressed by principal component analysis (PCA). We propose here a nonlinear extension of PCA. First, the reduced variables are determined in the metric multidimensional scaling framework. Second, regression of the original variables with respect to the reduced variables is achieved using a piecewise linear model. Both steps parameterize the (noisy) manifold holding the original data. Finally, we address the projection of data onto the manifold. The problem is cast in a Bayesian framework. An application of the proposed approach to standard data sets such as the COIL-20 database is presented.
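The first step of this procedure, determining the reduced variables via metric MDS, can be sketched with classical MDS. This is an illustrative implementation on synthetic data, not the authors' code; the piecewise linear regression step is omitted:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (metric) MDS: embed points from a pairwise-distance matrix.

    D : (n, n) matrix of pairwise Euclidean distances.
    Returns an (n, k) coordinate matrix whose pairwise distances
    approximate D (exactly, when D is Euclidean of rank <= k).
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]            # k largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Points on a 2-D plane embedded in 5-D: MDS recovers them up to rotation.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 5))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Y = classical_mds(D, k=2)
D_rec = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
print(np.allclose(D, D_rec))                 # True: distances are preserved
```

Because the synthetic points lie exactly on a 2-D subspace, the recovered pairwise distances match the originals; on noisy data the match is only approximate.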
In this paper, we investigate the use of manifold learning techniques to enhance the separation properties of standard graph kernels. The idea stems from the observation that when we perform multidimensional scaling on the distance matrices extracted from the kernels, the resulting data tend to be clustered along a curve that wraps around the embedding space, a behaviour which suggests that long-range distances are not estimated accurately, resulting in an increased curvature of the embedding space. Hence, we propose to use a number of manifold learning techniques to compute a low-dimensional embedding of the graphs in an attempt to unfold the embedding manifold and increase the class separation. We perform an extensive experimental evaluation on a number of standard graph datasets using the shortest-path, graphlet, random walk, and Weisfeiler-Lehman kernels. We observe the most significant improvement in the case of the graphlet kernel, which fits with the observation that neglecting the locational information of the substructures leads to a stronger curvature of the embedding manifold. On the other hand, the Weisfeiler-Lehman kernel partially mitigates the locality problem by using node label information, and thus does not clearly benefit from manifold learning. Interestingly, our experiments also show that the unfolding of the space seems to reduce the performance gap between the examined kernels.
they occupy the whole space in an anisotropic manner: a manifold assumption then makes sense. Manifold learning algorithms assume that the original high-dimensional data actually lie on an embedded lower-dimensional manifold. As in numerous other image processing or computer vision classification problems, one expects the data to live in a Riemannian or statistical manifold, the geometric understanding of which will help the interpretation of the classification model (decomposition of a class into several sub-classes, computation of the distance between classes, etc.). Thus, from a practitioner's point of view, it appears sensible to have a classifier built on this manifold assumption. Techniques like PCA or Minimum Noise Fraction can be applied to hyperspectral images in order to determine their intrinsic dimension. The mapping of the data from high- to low-dimensional spaces can also be learnt with algorithms such as ISOMAP or locally linear embedding. Their applications to hyperspectral image data have been proposed recently, and the results indicate that such data can be efficiently represented and characterised in low-dimensional spaces. Indeed, this has been identified as a way to deal with the geometry of hyperspectral data (complex and dominated by nonlinear structures) and thus to further improve the classification accuracy.
This paper focuses on localization in wireless sensor networks using a methodology named manifold learning. Localization of a wireless sensor network is essential for the many applications that need to know a node’s actual position. There are also many factors that affect the performance of any algorithm designed for localization; these factors are discussed in this paper. Localization has become a hot topic of research, and many new versions of and increments to the existing algorithms have been proposed to improve performance. We make use of two methods, applied one after the other. Both techniques come under manifold learning: the first is Locally Linear Embedding; the second is its incremental version, known as Incremental Locally Linear Embedding.
We present an efficient approach for broadcast news story segmentation using a manifold learning algorithm on latent topic distributions. The latent topic distribution estimated by Latent Dirichlet Allocation (LDA) is used to represent each text block. We employ Laplacian Eigenmaps (LE) to project the latent topic distributions into low-dimensional semantic representations while preserving the intrinsic local geometric structure. We evaluate two approaches employing LDA and probabilistic latent semantic analysis (PLSA) distributions, respectively. The effects of different amounts of training data and different numbers of latent topics on the two approaches are studied. Experimental results show that our proposed LDA-based approach can outperform the corresponding PLSA-based approach. The proposed approach provides the best performance with the highest F1-measure of 0.7860.
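A minimal sketch of the LDA-then-Laplacian-Eigenmaps pipeline, using scikit-learn's SpectralEmbedding as a stand-in for LE; the toy corpus and all parameter choices here are illustrative only, not those of the paper:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.manifold import SpectralEmbedding

# Toy "text blocks" standing in for broadcast news transcript segments.
blocks = [
    "election vote parliament government",
    "government minister election policy",
    "match goal striker football league",
    "football season match coach win",
    "storm rain flood weather warning",
    "weather forecast storm wind rain",
]

# Step 1: represent each block by its LDA latent topic distribution.
counts = CountVectorizer().fit_transform(blocks)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
topics = lda.fit_transform(counts)      # (n_blocks, n_topics), rows sum to 1

# Step 2: Laplacian Eigenmaps (spectral embedding) of the topic distributions,
# preserving local geometric structure in a low-dimensional space.
emb = SpectralEmbedding(n_components=2, n_neighbors=3, random_state=0)
low_dim = emb.fit_transform(topics)
print(low_dim.shape)                    # (6, 2)
```

In the actual system these low-dimensional representations would then feed a story-boundary detector.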
Isometric feature mapping (Isomap) is a well-known manifold learning algorithm. Its approach is to find the geodesic distances between neighboring data points using shortest-path distances. It then uses the Multidimensional Scaling (MDS) method (which, given a dissimilarity matrix D ∈ R^{n×n}, constructs a set of points whose Euclidean distances match those in D) to find points in a low-dimensional Euclidean space that match the nearest-neighbor geodesic distances found in the first step.
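The two steps just described can be sketched directly; this is a minimal illustrative Isomap built from a k-nearest-neighbor graph, graph shortest paths, and classical MDS, not a production implementation:

```python
import numpy as np
from sklearn.datasets import make_s_curve
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=10, k=2):
    """Minimal Isomap: kNN graph -> geodesic distances -> classical MDS."""
    # Step 1: geodesic distances approximated by shortest paths through
    # a k-nearest-neighbor graph with Euclidean edge weights.
    G = kneighbors_graph(X, n_neighbors, mode="distance")
    D = shortest_path(G, directed=False)
    # Step 2: classical MDS on the geodesic distance matrix.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Points on a curved 2-D surface embedded in 3-D are "unrolled" to 2-D.
X, _ = make_s_curve(n_samples=1000, random_state=0)
Y = isomap(X, n_neighbors=10, k=2)
print(Y.shape)  # (1000, 2)
```

The neighborhood size controls the trade-off between short-circuiting the manifold (too large) and disconnecting the graph (too small).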
Though manifold learning has been successfully applied in many areas, such as data visualization, dimensionality reduction, and speech recognition, little research has combined information theory with geometrical learning. In this paper, we carry out a bold exploration in this field and propose a new approach to face recognition, in which the intrinsic α-Rényi entropy of the face image obtained from manifold learning is used as the characteristic measure during recognition. The new algorithm is tested on the ORL face database, and the experiments yield satisfying results.
Abstract. Linear dimensionality reduction plays a very important role in side channel attacks, but it is helpless against the non-linear leakage of masking implementations. Increasing the order of masking makes the attack complexity grow exponentially, which makes research on nonlinear dimensionality reduction very meaningful. However, related work has seldom been studied. A kernel function was first introduced into Kernel Discriminant Analysis (KDA) at CARDIS 2016 to realize nonlinear dimensionality reduction. This was a milestone for attacking masked implementations. However, KDA is supervised and noise-sensitive. Moreover, several parameters and a specialized kernel function need to be set and customized. Different kernel functions, parameters, and training results have a great influence on the attack efficiency. In this paper, the high-dimensional non-linear leakage of a masking implementation is considered as a high-dimensional manifold, and manifold learning is introduced into side channel attacks for the first time to realize nonlinear dimensionality reduction. Several classical and practical manifold learning solutions, such as ISOMAP, Locally Linear Embedding (LLE), and Laplacian Eigenmaps (LE), are given. The experiments are performed on simulated unprotected, first-order, and second-order masking implementations. Compared with supervised KDA, the manifold learning schemes introduced here are unsupervised, and fewer parameters need to be set. This makes manifold-learning-based nonlinear dimensionality reduction very simple and efficient for attacking masked implementations.
Current methods for mapping networks to hyperbolic space are based on maximum likelihood estimations or manifold learning. The former approach is very accurate but slow; the latter improves efficiency at the cost of accuracy. Here, we analyse the strengths and limitations of both strategies and assess the advantages of combining them to efficiently embed big networks, allowing for their examination from a geometric perspective. Our evaluations in artificial and real networks support the idea that hyperbolic distance constraints play a significant role in the formation of edges between nodes. This means that challenging problems in network science, like link prediction or community detection, could be more easily addressed under this geometric framework.
Repeated evaluations of expensive computer models in applications such as design optimization and uncertainty quantification can be computationally infeasible. For partial differential equation (PDE) models, the outputs of interest are often spatial fields, leading to high-dimensional output spaces. Although emulators can be used to find faithful and computationally inexpensive approximations of computer models, there are few methods for handling high-dimensional output spaces. For Gaussian process (GP) emulation, approximations of the correlation structure and/or dimensionality reduction are necessary. Linear dimensionality reduction will fail when the output space is not well approximated by a linear subspace of the ambient space in which it lies. Manifold learning can overcome the limitations of linear methods if an accurate inverse map is available. In this paper, we use kernel PCA and diffusion maps to construct GP emulators for very high-dimensional output spaces arising from PDE model simulations. For diffusion maps, we develop a new inverse map approximation. Several examples are presented to demonstrate the accuracy of our approach. Keywords: Parameterized partial differential equations, Gaussian process emulation, High dimensionality, Manifold learning, Inverse mapping, Kernel PCA, Diffusion maps
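As an illustration of the kernel PCA side of such a pipeline, the sketch below reduces synthetic "output fields" with scikit-learn's KernelPCA and maps the low-dimensional coordinates back using its built-in approximate inverse transform; the data and all parameter choices are hypothetical stand-ins, not the paper's setup:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Hypothetical stand-in for PDE output fields: each "simulation" is a
# 500-dimensional spatial field that actually varies with one parameter t.
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
grid = np.linspace(0, 1, 500)
outputs = np.sin(np.outer(t, grid))      # (200 fields, 500 spatial points)

# Kernel PCA with a learned (approximate) inverse map back to output space;
# a GP emulator would then be trained on the low-dimensional coordinates Z.
kpca = KernelPCA(n_components=5, kernel="rbf", fit_inverse_transform=True)
Z = kpca.fit_transform(outputs)          # low-dimensional coordinates
recon = kpca.inverse_transform(Z)        # approximate pre-images

print(Z.shape, recon.shape)              # (200, 5) (200, 500)
```

The `fit_inverse_transform` option learns the pre-image map by kernel ridge regression, which is one of the simpler alternatives to the dedicated inverse-map constructions discussed in the paper.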
were determined by the detected red patches (e.g., vehicle taillights) and their corresponding pairs. Unfeasible taillight pairs were pre-filtered and eliminated using geometric rules. It is necessary to identify the vehicle type because the locations of the ROI differ across vehicle types. Before the second module, non-panel regions, e.g., vehicle windows or other reflecting areas, were eliminated. Second, the color histograms in an ROI were classified using a trained classifier. A manifold learning algorithm called nearest feature line embedding (NFLE) reduces the dimensionality of the color features in order to reduce illumination effects. NFLE discovers the intrinsic manifold structure of the data by considering the relationships among samples. Not only is the dimensionality reduced, but the illumination effects are reduced as well. Finally, the vehicle colors were determined by the dominant colors in the ROI. Seven colors (red, yellow, blue, green, black, white, and gray) were identified in this study.
ment is volatile and uncertain. But stock charts reflect the difference in trends between different stocks in their prices: whether they are rising or dropping, and the degree of their movements. That is to say, though stock data are high-dimensional and nonlinear, there are only three underlying parameters. As for how to find these underlying parameters, manifold learning provides a good solution.
The main challenge raised by this paper is the need to develop manifold-learning algorithms that have low computational demands, are robust against noise, and have theoretical convergence guarantees. Existing algorithms are only partially successful: normalized-output algorithms are efficient but are not guaranteed to converge, while Isomap is guaranteed to converge but is computationally expensive. A possible way to achieve all of these goals simultaneously is to improve the existing normalized-output algorithms. While it is clear that, due to the normalization constraints, one cannot hope for geodesic distance preservation or neighborhood structure preservation, success as measured by other criteria may be achieved. A suggested improvement for LEM appears in Gerber et al. (2007), yet this improvement is both computationally expensive and lacks a rigorous consistency proof. We hope that future research finds additional ways to improve the existing methods, given the improved understanding of the underlying problems detailed in this paper.
Supervised manifold learning methods learn data representations by preserving the geometric structure of the data while enhancing the separation between data samples from different classes. In this work, we propose a theoretical study of supervised manifold learning for classification. We consider nonlinear dimensionality reduction algorithms that yield linearly separable embeddings of the training data and present generalization bounds for this type of algorithm. A necessary condition for satisfactory generalization performance is that the embedding allow the construction of a sufficiently regular interpolation function in relation to the separation margin of the embedding. We show that for supervised embeddings satisfying this condition, the classification error decays at an exponential rate with the number of training samples. Finally, we examine the separability of supervised nonlinear embeddings that aim to preserve the low-dimensional geometric structure of the data based on graph representations. The proposed analysis is supported by experiments on several real data sets.
Manifold Learning (ML) is a class of algorithms seeking a low-dimensional non-linear representation of high-dimensional data. Thus, ML algorithms are most applicable to high-dimensional data and require large sample sizes to accurately estimate the manifold. Despite this, most existing manifold learning implementations are not particularly scalable. Here we present a Python package that implements a variety of manifold learning algorithms in a modular and scalable fashion, using fast approximate neighbor searches and fast sparse eigendecompositions. The package incorporates theoretical advances in manifold learning, such as the unbiased Laplacian estimator introduced by Coifman and Lafon (2006) and the estimation of the embedding distortion by the Riemannian metric method introduced by Perrault-Joncas and Meila (2013). In benchmarks, even on a single-core desktop computer, our code embeds millions of data points in minutes, and takes just 200 minutes to embed the main sample of galaxy spectra from the Sloan Digital Sky Survey—consisting of 0.6 million samples in 3750 dimensions—a task which has not previously been possible.
In the previous section, we discussed two sampling-based techniques that generate approximations of kernel matrices. Although we analyzed the effectiveness of these techniques for approximating singular values, singular vectors, and low-rank matrix reconstruction, we have yet to discuss their effectiveness in the context of actual machine learning tasks. In fact, the Nyström method has been shown to be successful on a variety of learning tasks, including Support Vector Machines (Fine and Scheinberg, 2002), Gaussian Processes (Williams and Seeger, 2000), Spectral Clustering (Fowlkes et al., 2004), Kernel Ridge Regression (Cortes et al., 2010), and, more generally, approximating regularized matrix inverses via the Woodbury approximation (Williams and Seeger, 2000). In this section, we discuss how approximate embeddings can be used in the context of manifold learning, relying on the sampling-based algorithms from the previous section to generate an approximate SVD. We present the largest study to date for manifold learning, and provide a quantitative comparison of Isomap and Laplacian Eigenmaps for large-scale face manifold construction on clustering and classification tasks.
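The Nyström column-sampling idea referred to above can be sketched on a synthetic kernel matrix. A linear kernel is chosen here so that the matrix is exactly low-rank and the approximation becomes exact, an illustrative special case rather than the general setting:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
K = X @ X.T                      # linear kernel: rank 10, exactly low-rank

# Nystrom approximation: sample m landmark points, keep only the
# corresponding columns C and the intersection block W, then K ~ C W^+ C^T.
m = 50
idx = rng.choice(X.shape[0], size=m, replace=False)
C = K[:, idx]                    # (n, m) sampled columns
W = K[np.ix_(idx, idx)]          # (m, m) landmark-landmark block
K_approx = C @ np.linalg.pinv(W) @ C.T

# Because rank(K) = 10 <= m, the reconstruction is exact up to numerics;
# for genuinely full-rank kernels the approximation error is nonzero.
print(np.allclose(K, K_approx))  # True
```

Only the n×m matrix C and the m×m block W are ever formed, which is what makes the method attractive for the large-scale manifold learning experiments described above.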