4.1.2 Locally Linear Embedding
Designed by Roweis and Saul (2000), the locally linear embedding (LLE) algorithm is a local, nonlinear manifold learning method for mapping $\vec{X}_i$ to $\vec{Y}_i$, where $\vec{X}_i$ is the high-dimensional input data and $\vec{Y}_i$ is the lower-dimensional output data. This is done via a weight matrix $\vec{W}_{ij}$.
CHAPTER 4. MANIFOLD LEARNING IN SPECTRAL IMAGE ANALYSIS 74
If $n$ data points are being mapped from a $d$-dimensional space to an $m$-dimensional space with $m \ll d$, then $\vec{X}_i$ is a $d \times n$ matrix, $\vec{Y}_i$ is an $m \times n$ matrix, and $\vec{W}_{ij}$ is an $n \times n$ matrix. LLE does not have an intrinsic dimensionality estimation step, so the user must choose a method for determining $m$. There are three main steps to LLE: (1) nearest neighbor search, (2) constrained least-squares local reconstruction, and (3) spectral embedding [3, 52]. They are outlined below and illustrated in Figure 4.2.
1. Nearest neighbor search. Saul and Roweis designed LLE to allow for any initial graph structure as chosen by the user, but their traditional implementation has only one free parameter controlled by the user: k, the number of nearest neighbors per local neighborhood. After the dimensionality reduction transformation is performed, the user must also choose their own method for determining the dimensionality of the transformed data, i.e., the number of transformed bands to select. While the choice of k may seem innocuous, it can have a dramatic effect on the performance of LLE. Should k be too small, then the neighborhoods will be underdefined and the points will not be well-constructed in the manifold space. Should k be too large, then the neighborhoods will not satisfy the locally linear criterion and the manifold will not be successfully recovered. This does not imply, however, that there is always one perfect k value for a set of data. In fact, the opposite is usually true: different k values are appropriate for different regions of the data (due to variations in density, curvature, etc., as one moves along the manifold). This was analyzed in Section 3.3 with the presentation of adaptive nearest neighbors, but for now we are concentrating on the original implementation of LLE which only designates one single k value for all of the neighborhoods. One important property of LLE is that it requires a connected graph, so if the selection of k is too low to generate a connected graph, then a connection such as the addition of a minimum spanning tree [22] or the addition of edges between connected components [3] must be imposed. As an alternative, LLE
may be implemented on individual connected components. However, it can be difficult to stitch those results together in the manifold coordinates.

Figure 4.2: Roweis’s diagram illustrating the steps of the locally linear embedding (LLE) algorithm [51].
2. Constrained least-squares local reconstruction. After the local $k$-neighborhood of each of the $n$ image pixels is determined, the next step is to estimate the reconstruction of each pixel $x_i$ as a linear combination of its $k$ neighbors:
$$\hat{x}_i = \sum_{j=1}^{k} w_{ij} x_j, \tag{4.1}$$
where the scalar weights $w_{ij}$ are constrained to $\sum_j w_{ij} = 1$. Here, $w_{ij}$ is equal to the contribution of the $j$th pixel to the reconstruction of the $i$th pixel [58]. If $x_j$ does not belong to the neighborhood of $x_i$, then the matrix element $w_{ij} = 0$. Subject to these constraints, the optimal reconstruction weights are computed by minimizing the following cost function:
$$E(W) = \sum_{i=1}^{n} \left\| \vec{X}_i - \sum_{j=1}^{k} W_{ij} \vec{X}_j \right\|^2. \tag{4.2}$$
Eq. 4.2 sums the squared distances between each of the pixels and their respective reconstructions.
3. Spectral embedding. After the optimized matrix of weight values $\hat{W}$ is computed in Step 2, the weights are used to calculate the embedded coordinates $\vec{Y}$ in the $m$-dimensional manifold space. Because the transformation in LLE preserves local linearity (the transformation is invariant to scale, rotation, and translation), the weights for each of the neighborhoods in the spectral space will be identical in the manifold space. As in the prior step, LLE once again minimizes a cost function; however, this time it is the weights
that are fixed and the embedded coordinates $\vec{Y}_i$ that are optimized:
$$\Phi(Y) = \sum_{i=1}^{n} \left\| \vec{Y}_i - \sum_{j=1}^{k} \hat{W}_{ij} \vec{Y}_j \right\|^2. \tag{4.3}$$
As shown in Eq. 4.3, the embedded coordinates are computed entirely from the geometric information encoded by the weights. The original input data $\vec{X}_i$ are not needed in this step. Roweis and Saul (2000, 2003) show that this second cost function is minimized by solving a sparse $n \times n$ eigenvalue problem. They demonstrate that, of the bottom $m + 1$ eigenvectors, the $m$ with non-zero eigenvalues constitute an ordered set of $m$ embedding coordinates, which are given as $\vec{Y}_i$. The single bottom eigenvector of all 1s (which corresponds to an eigenvalue of 0) is ignored. The multiplicity of this zero eigenvalue is equal to the number of connected components in the data, which is why LLE requires the data to be one connected component. If LLE is applied to data with more than one connected component, then the multiplicity will be greater than one. In all implementations in this research, however, we impose connectivity on the graph.
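The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this research: it assumes Euclidean distances, a brute-force neighbor search, and a dense eigensolver. The function name `lle_sketch`, its parameters, and the regularization term `reg` on the local Gram matrix are hypothetical and introduced here only for illustration; the regularization is added for numerical stability when $k > d$. The weights are obtained from the standard closed-form solution of the sum-to-one constrained least-squares problem (solve $C w = \mathbf{1}$, then rescale). The sketch does not impose graph connectivity, so it presumes that $k$ is large enough for the neighborhood graph to be connected.

```python
import numpy as np


def lle_sketch(X, k, m, reg=1e-3):
    """Illustrative LLE on a d x n data matrix X (columns are points).

    Returns (Y, W): Y is the m x n embedding, W is the n x n weight matrix.
    `reg` is an assumed regularization factor, not taken from the text.
    """
    d, n = X.shape

    # Step 1: nearest neighbor search (brute force, Euclidean distance).
    sq = np.sum(X ** 2, axis=0)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(D2, np.inf)          # a point is not its own neighbor
    nbrs = np.argsort(D2, axis=1)[:, :k]  # indices of the k nearest neighbors

    # Step 2: constrained least-squares reconstruction weights (Eq. 4.2).
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[:, nbrs[i]] - X[:, [i]]     # neighbors centered on x_i, d x k
        C = Z.T @ Z                       # local Gram matrix, k x k
        C += (reg * np.trace(C) + 1e-12) * np.eye(k)  # assumed regularizer
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()       # enforce sum_j w_ij = 1

    # Step 3: spectral embedding (Eq. 4.3).
    # Phi(Y) is minimized by the bottom eigenvectors of M = (I - W)^T (I - W).
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    Y = eigvecs[:, 1:m + 1].T             # skip the bottom (eigenvalue 0) vector
    return Y, W
```

For example, calling `lle_sketch(X, k=8, m=2)` on a $3 \times n$ array returns a $2 \times n$ embedding together with the $n \times n$ weight matrix. Because $M$ is sparse when $k \ll n$, a practical implementation would typically replace the dense eigensolver with a sparse one.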