Inverse Laplace Transform methods - Signal sampling and processing in magnetic resonance applic

It was shown in Sections 2.9-2.12 that the vectorised NMR signal, Yv, from either a 1D T1,

T2, D experiment, or a 2D NMR correlation experiment encoded with the joint distribution

of any two of these parameters, can be written in the form:

𝒀_v = 𝑲 𝑼_v+ 𝑬_v , (3.1)

where K is a kernel function which depends on the type of experiment, Uv is the vectorised

distribution that needs to be extracted, and Ev represents the noise vector that corrupts the

signal. Methods for extracting Uv from a knowledge of Yv and K are known collectively

as Inverse Laplace Transform (ILT) methods; the reason for the name is related to the kernel function being composed of exponentially decaying functions.

𝑼v = 𝑲−𝟏 𝒀v , (3.2)

is not suitable for three major reasons. The inverse of the kernel function may not exist as K is not necessarily a square matrix; it typically has more columns than rows which means that there are an infinite number of solutions for Uv. Even if K is designed to be a square

matrix, its condition number, κ(K), defined as the ratio of its largest to the smallest singular value [1], is so large that small changes in the acquired signal Yv (due to the random nature

of noise) can lead to very large changes in the reconstructed Uv. The large condition

number is caused by the constituent exponentially decaying functions of K being very similar to each other. The fact that K may not be a square matrix or that it has a large condition number makes the extraction of Uv an ill-conditioned inverse problem.

Physically, Uv should not have any negative entries because it represents amounts of

material. However, there is nothing constraining Uv not to have negative entries in the

solution according to Eq. (3.2).

3.1.1. Non-Negative Least Squares method

The non-negativity constraint was first addressed with the Non-Negative Least Squares (NNLS) method developed by Lawson and Hanson [2], who used an active set method to estimate Uv from the following minimization problem:

𝑼v = arg min 𝑼v≥𝟎

2‖𝑲 𝑼v− 𝒀v‖2

2_, _(3.3)

where the inequality is element-wise. The Lp-norm of a vector a, ||a||p, is defined as:

‖𝒂‖𝑝 = (∑𝑛𝑖=1|𝑎𝑖|𝑝)1/𝑝 . (3.4)

Faster algorithms for imposing the non-negativity constraint have been developed [3], but they are all based in solving the minimization problem in Eq. (3.3). In this work, the routine lsqnonneg in MATLAB [4] was used to implement the NNLS method.

Although the NNLS method has been very successful, its main disadvantage is that it does not address the ill-conditioning of the inverse problem. As a result, solutions are unstable to noise and the instability increases with the noise level.

3.1.2. Tikhonov regularization

Tikhonov regularization [5] is the most commonly used method to address the ill-conditioning issue of the inverse problem, although other methods exist [6]. In

Tikhonov regularization, the distribution Uv is estimated from the following minimization

problem: 𝑼v = arg min 𝑼≥𝟎 ( 1 2‖𝑲 𝑼v − 𝒀v‖2 2_{+ 𝛼‖𝑹 𝑼} v‖22) , (3.5)

where α is a regularization parameter and R is a regularization matrix. The first term is referred to as the fidelity term, while the second term is referred to as the penalty term. There are different explanations as to why Tikhonov regularization improves the reconstructions. The simplest explanation is that it narrows down the range of potential solutions of Uv, by searching for solutions that have a minimal penalty term. Another

explanation comes from the solution to Eq. (3.5), neglecting the non-negativity constraint:

𝑼_v = (𝑲T_{𝑲 + 𝛼𝑹}T_𝑹)−1_𝑲T_𝒀

v . (3.6)

The matrix being inverted in Eq. (3.6) is generally of a much lower condition number than the matrix that needs to be inverted for the solution of Eq. (3.3) which, when neglecting the non-negativity constraint, is:

𝑼v = (𝑲T𝑲)−1𝑲T𝒀v . (3.7)

This renders Tikhonov regularization reconstructions more stable to noise than NNLS reconstructions. Other explanations can be given in terms of a singular value decomposition [7] or overfitting argument [8].

For 1D NMR experiments, the regularization matrix, R, is either taken to be the identity matrix or the second order derivative matrix; the second order derivative matrix is a tridiagonal matrix with entries 1, -2, 1 on diagonals of number -1, 0, 1. The choice of the second order derivative matrix, also known as the Phillips-Twomey method [9, 10], constrains Uv to be smooth. For 2D NMR correlation experiments, R is almost exclusively

chosen to be the identity matrix, because of the difficulty of defining the second order derivative in two dimensions.

The regularization parameter, α, controls the amount of regularization imposed on the reconstructed distribution. A larger value of α leads to smaller values of the penalty term in Eq. (3.5) (hence, smoother distribution) and larger values of the fidelity term (hence, worse fit to the experimental data). This is known as the bias-variance trade-off. There is, therefore, a need to rationalize the choice of α. The Generalized Cross Validation (GCV) technique [11] is used in this work to choose α, although other techniques such as the

L-curve [12] and the Morozov Discrepancy Principle [13] exist. The choice was made mainly on the statistical basis of GCV. GCV is a leave-one-out cross-validation technique: one data point is used as a validation set while the other data points are used as the training set. The process is repeated for all possible choices of the validation set. The average test error over all possible validation sets is then referred to as the GCV score. A GCV score for a range of values of α, GCV(α), is calculated and the best regularization parameter is chosen to be the one that minimizes the GCV(α). An analytical solution exists for GCV(α) [11]:

GCV(𝛼) =‖𝑲 𝑼v(𝛼)−𝒀v‖22

(𝑛−𝑑𝑓(𝛼))2 , (3.8)

where Uv(α) is the reconstructed distribution for the choice of α, n is the size of Uv and

df(α) is referred to as the number of fitted parameters, or degrees of fr eedom, given by:

df(𝛼) = 𝑲(𝑲T_{𝑲 + 𝛼 𝑹}T_𝑹)−1_𝑲T_. _(3.9)

GCV is a model selection method, similar to the Akaike Information Criterion [14], Bayesian Information Criterion [15] or Cp method [16], in the sense that it invokes a

parsimony argument; as the regularization parameter α changes, the number fitted parameters df(α) varies, and model selection methods decide on the optimal α based on a trade-off between fidelity to the data and df(α). Although GCV has been the most commonly used model selection method, the other methods have also been used in the inversion of NMR data [17]. Practically, all these methods give very similar results, but GCV is easier to calculate as it does not require a knowledge of the noise level in the data.

Several algorithms have been developed over the years to numerically solve Eq. (3.5) [18-23]. In this work, Eq. (5) is cast as a quadratic programming problem [24] and the routine quadprog in MATLAB [4] is subsequently applied.

3.1.3. Data compression

For 2D NMR correlation experiments, the data size can be very large, particularly when one of the distributions being extracted is the T2 relaxation time constant. As a result, the

numerical computation of Eqs. (3.3) and (3.5) can be time and memory consuming. It is, therefore, common to perform a pre-compression of the NMR data before feeding it to the appropriate ILT algorithm. The compression technique used in this work is the one developed by Venkataramanan et al. [21]. As discussed in Section 2.12, the acquired NMR data for a 2D NMR correlation experiment can be written in a matrix form:

𝒀 = 𝑲₁ 𝑼 𝑲₂T+ 𝑬 . (3.10)

Let the singular value decomposition of K1 and K2 be U1Σ1V1T and U2Σ2V2T, where U1,

U2, V1 and V2 are orthogonal matrices of respective sizes p × s1, q × s2, n × s1 and m × s2,

while Σ1 and Σ2 are diagonal square matrices of respective sizes s1 × s1 and s2 × s2 with

sorted values on the diagonals (known as singular values). Because of the large condition number of K1 and K2, the singular values span orders of magnitude and there are only a

few important singular values. The compression is based on the truncation of the singular values such that only the most important ones are kept; the choice of how many are kept is typically made based on the noise level of the NMR signal. Let the truncated diagonal matrices be Σ1t and Σ2t, of respective sizes s1t × s1t and s2t × s2t. The corresponding

truncated orthogonal matrices U1t, U2t, V1t and V2t are of respective sizes p × s1t, q × s2t,

n × s1t and m × s2t. Pre-multiplying by U1tT and post-multiplying by U2t both sides of

Eq. (3.10), the following expression is obtained:

𝒀_t = 𝑲_1𝑡 𝑼 𝑲_2𝑡T+ 𝑬_t , (3.11)

where Yt = U1tTYU2t, K1t = Σ1t V1tT, K2t = Σ2t V2tT and Et = U1tTEU2t. Eq. (3.11) is of the

same form as Eq. (3.10) but the size of matrices involved is much smaller. Therefore, by applying the above compression method to the NMR signal and the kernel matrices, the reconstruction of distributions becomes less time and memory demanding.

Apart from making the reconstruction of distributions less time and memory demanding, the data compression technique described has also the ‘side-effect’ of making the inverse problem less ill-conditioned; the condition number of K1t and K2t is smaller than the

condition number of K1 and K2.

In document Signal sampling and processing in magnetic resonance applications (Page 56-60)