
Sample principal component analysis

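In practice, only realizations of $Y$ are available, so the population covariance matrix $\Sigma_Y$ and its eigenvalues cannot be computed, and the corresponding sample quantities must be analyzed instead. Let $S_Y$ be the sample covariance matrix of $Y$, let $\tilde\lambda_{Y,1} \ge \dots \ge \tilde\lambda_{Y,M}$ be its eigenvalues, and let $\tilde v_{Y,1}, \dots, \tilde v_{Y,M}$ be the corresponding normalized eigenvectors. Then $\tilde v_{Y,1}^T Y, \dots, \tilde v_{Y,M}^T Y$ are the sample principal components of $Y$, which have the property [44]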

$$s^2(\tilde v_{Y,i}^T Y) = \tilde\lambda_{Y,i}, \qquad i = 1, \dots, M. \tag{3.13}$$
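Property (3.13) can be checked in one line of algebra. The short derivation below assumes that $s^2$ denotes the sample variance and that $\tilde v_{Y,i}$ is a unit-norm eigenvector of $S_Y$ with eigenvalue $\tilde\lambda_{Y,i}$:

```latex
\begin{align*}
s^2(\tilde v_{Y,i}^T Y)
  &= \tilde v_{Y,i}^T S_Y \tilde v_{Y,i} \\
  &= \tilde v_{Y,i}^T \bigl(\tilde\lambda_{Y,i} \tilde v_{Y,i}\bigr) \\
  &= \tilde\lambda_{Y,i} \, \|\tilde v_{Y,i}\|^2
   = \tilde\lambda_{Y,i}.
\end{align*}
```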

When working with the sample principal components, the following problems should be solved:

1. how to check Assumption 1 using $\tilde\lambda_{Y,1}, \dots, \tilde\lambda_{Y,M}$;

2. when Assumption 1 holds, how to estimate $q_p$, the number of eigenvalues not affected by the original image;

3. when Assumption 1 holds, how to construct an estimator of the noise variance $\sigma^2$ using $\tilde\lambda_{Y,1}, \dots, \tilde\lambda_{Y,M}$.

There are several solutions and they are described below.

3.4.1 Bartlett’s test

In order to check Assumption 1 and estimate qp, Bartlett’s test [8], which tests the equality of several consecutive population eigenvalues, can be utilized.

Let

$$H_{0k}: \lambda_{Y,M-k+1} = \dots = \lambda_{Y,M} \tag{3.14}$$

be the hypothesis that the last $k$ eigenvalues of $\Sigma_Y$ are equal, and

$$H_{1k}: \lambda_{Y,M-k+1} > \lambda_{Y,M} \tag{3.15}$$

be the alternative hypothesis. $H_{0k}$ can be tested against $H_{1k}$ using Bartlett's test [8, 44], in which $H_{0k}$ is rejected at significance level $\alpha$ if

N0 chi-squared distribution with ν degrees of freedom at point 1 − α, and ν = (k + 2)(k − 1)/2.

When Assumption 1 holds, $\lambda_{Y,M-m+1} = \dots = \lambda_{Y,M}$. Therefore, $H_{0m}$ is a necessary condition for the fulfillment of Assumption 1. Formally, this condition is not sufficient. In practice, however, the nonzero eigenvalues of $\Sigma_X$ are distinct [44], because the vector $X$ represents image structures, which are very unlikely to have the same variance in different directions. Hence $H_{0m}$ practically means that $\delta_{X,p} = 0$, and test (3.16) with $k = m$ is a reliable check of Assumption 1.

Repeating test (3.16) for $k = 2, 3, 4, \dots$ until $H_{0k}$ is rejected allows estimating $q_p$ as the largest $k$ for which $H_{0k}$ is accepted. However, since test (3.16) is applied more than once whenever $H_{02}$ is not rejected, the overall significance level of the sequence of tests differs from the significance level $\alpha$ of each individual test.

Moreover, the tests are not independent and their number is a random variable, so the overall significance level is unknown [44].
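The sequential procedure for estimating $q_p$ can be sketched as follows. This is a minimal illustration, not the exact test of the text: it assumes the common textbook form of Bartlett's statistic with the correction factor $N - (2M + 11)/6$, which may differ in detail from (3.16), and it replaces the exact chi-squared quantile with the Wilson-Hilferty approximation to stay dependency-free. The function names `bartlett_statistic`, `chi2_quantile` and `estimate_qp` are hypothetical.

```python
import math
from statistics import NormalDist


def bartlett_statistic(eigvals, k, n_samples):
    """Textbook Bartlett statistic for equality of the last k eigenvalues.

    eigvals: sample eigenvalues in descending order, all strictly positive.
    Assumption: the correction factor N - (2M + 11) / 6 is used.
    """
    m_dim = len(eigvals)
    tail = eigvals[-k:]
    n_eff = n_samples - (2 * m_dim + 11) / 6
    log_arith_mean = math.log(sum(tail) / k)
    mean_log = sum(math.log(v) for v in tail) / k
    # k * (ln of arithmetic mean - ln of geometric mean) is non-negative
    # and equals zero exactly when all k eigenvalues coincide.
    return n_eff * k * (log_arith_mean - mean_log)


def chi2_quantile(nu, prob):
    """Wilson-Hilferty approximation of the chi-squared quantile.

    An approximation only; an exact quantile function is preferable
    when a statistics library is available.
    """
    z = NormalDist().inv_cdf(prob)
    c = 2 / (9 * nu)
    return nu * (1 - c + z * math.sqrt(c)) ** 3


def estimate_qp(eigvals, n_samples, alpha=0.05):
    """Largest k for which H0k is accepted, testing k = 2, 3, ... in turn."""
    q_est = 1
    for k in range(2, len(eigvals) + 1):
        nu = (k + 2) * (k - 1) // 2  # degrees of freedom for k eigenvalues
        threshold = chi2_quantile(nu, 1 - alpha)
        if bartlett_statistic(eigvals, k, n_samples) > threshold:
            break  # H0k rejected: stop the sequence of tests
        q_est = k
    return q_est
```

Because the loop stops at the first rejection, the number of tests actually performed depends on the data, which is why the overall significance level of the sequence is not controlled by `alpha`.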

CHAPTER 3. SIGNAL-INDEPENDENT NOISE PARAMETER ESTIMATION

3.4.2 Eigenvalue difference

Another way to check Assumption 1 is to examine the difference $\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M}$. The properties of the distribution of this difference can be obtained from the following theorem.

Theorem 1. If Assumption 1 is satisfied, then the following asymptotic bound holds for all $i = M - q_p + 1, \dots, M$:

where C0 does not depend on the distributions of X and N.

The formal proof is given in the Appendix.

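Using the result of Theorem 1, let us construct an asymptotic bound for the probability that the difference $\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M}$ is at least a threshold $T_\lambda \sigma^2/\sqrt{N}$. From Markov's inequality,

$$P\left(\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M} \ge \frac{T_\lambda \sigma^2}{\sqrt{N}}\right) \le \frac{\sqrt{N}}{T_\lambda \sigma^2} \, E\left(\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M}\right). \tag{3.19}$$

Using the triangle inequality, we have

$$\begin{aligned}
\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M} &= |\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M}| \\
&= |\tilde\lambda_{Y,M-m+1} - \sigma^2 + \sigma^2 - \tilde\lambda_{Y,M}| \\
&\le |\tilde\lambda_{Y,M-m+1} - \sigma^2| + |\sigma^2 - \tilde\lambda_{Y,M}|.
\end{aligned} \tag{3.20}$$

From monotonicity of the expected value,

$$E(\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M}) \le E(|\tilde\lambda_{Y,M-m+1} - \sigma^2|) + E(|\sigma^2 - \tilde\lambda_{Y,M}|). \tag{3.21}$$

Therefore, when Assumption 1 holds, applying the bound of Theorem 1 to both terms gives, for $N \ge N_0$,

$$E(\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M}) \le \frac{2 C_0 \sigma^2}{\sqrt{N}}. \tag{3.22}$$

Substituting (3.22) into (3.19) bounds the probability in (3.19) by $2C_0/T_\lambda$, which can be made small by choosing $T_\lambda$ large. Hence a condition for the fulfillment of Assumption 1 can be written as follows:

$$\tilde\lambda_{Y,M-m+1} - \tilde\lambda_{Y,M} < \frac{T_\lambda \sigma^2}{\sqrt{N}}. \tag{3.24}$$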
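Condition (3.24) translates into a one-line check. The sketch below is illustrative only: the function name `assumption1_check` is hypothetical, and because $\sigma^2$ is unknown in practice, the argument `sigma2` is assumed to be an estimate of it supplied by the caller. The threshold factor `t_lambda` is a free parameter.

```python
import math


def assumption1_check(eigvals, m, n_samples, sigma2, t_lambda):
    """Evaluate condition (3.24) on descending-sorted sample eigenvalues.

    sigma2 stands in for the noise variance; passing an estimate of it
    is an assumption of this sketch, not part of condition (3.24).
    """
    # eigvals[-m] is lambda_{Y,M-m+1}, eigvals[-1] is lambda_{Y,M}
    diff = eigvals[-m] - eigvals[-1]
    return diff < t_lambda * sigma2 / math.sqrt(n_samples)
```

A larger `t_lambda` makes the check more permissive, in line with the bound $2C_0/T_\lambda$ on the probability of violating the condition when Assumption 1 holds.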

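A question may arise whether condition (3.24) can be made stronger. When the original image $x$ is zero, the limiting joint probability density function of $(\tilde\lambda_{Y,i} - \sigma^2)\sqrt{N}/\sigma^2$, $i = 1, \dots, M$, and therefore the limiting expected values $C_i$ of these quantities, are independent of $\sigma$ and $N$. Hence, for large $N$,

$$E(\tilde\lambda_{Y,i}) = \sigma^2 + \sigma^2 C_i/\sqrt{N}.$$

As a result, (3.22) is a tight upper bound, and (3.24) cannot be improved by changing the exponents of $\sigma$ or $N$.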

3.4.3 Estimators of the noise variance

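Let

$$\sigma^2_{est,k} = \frac{1}{k} \sum_{i=M-k+1}^{M} \tilde\lambda_{Y,i}, \qquad k \in \{1, \dots, q_p\},$$

be the average of the last $k$ sample eigenvalues. When Assumption 1 holds, it follows from Theorem 1 that $E(|\sigma^2_{est,k} - \sigma^2|) \to 0$ as $N \to \infty$, i.e. $\sigma^2_{est,k}$ converges in mean to $\sigma^2$. Therefore, the noise variance can be estimated as $\sigma^2_{est,k}$. Since convergence in mean implies convergence in probability, $\sigma^2_{est,k}$ is a consistent estimator of the noise variance for any $k \in \{1, \dots, q_p\}$. Substituting specific values of $k$ gives two important special cases: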

1. $k = 1$:

$$\sigma^2_{est,k} = \tilde\lambda_{Y,M}, \tag{3.33}$$

i.e. the last sample eigenvalue is used as a noise variance estimator.

The advantage of this estimator shows when the estimate of $q_p$ is larger than its actual value, or when the check of Assumption 1 passes although Assumption 1 does not actually hold. In these cases, the last $q_p$ eigenvalues of $S_Y$ are affected by the original image, and correct estimation of the noise variance cannot be guaranteed. $\tilde\lambda_{Y,M}$ is less affected by the original image than the other sample eigenvalues, because the variance of $Y$ along $\tilde v_{Y,M}$ is the smallest among all directions. Hence it is preferable to use $\tilde\lambda_{Y,M}$ in order to minimize the estimation error when Assumption 1 does not hold.

The disadvantage of this estimator is that it is the smallest order statistic of the sample eigenvalues representing the noise. This can be illustrated by the case when the original image $x$ is zero. Then the population variance of the projections of $Y$ onto all directions equals $\sigma^2$, i.e. $p = 1$, $q_1 = M$ and $\delta_{Y,1} = \sigma^2$. On the other hand, the sample variances of the projections of $Y$ for a finite sample of size $N$ cannot be the same in all directions, i.e. the sample eigenvalues are almost surely different. Since $\tilde\lambda_{Y,M}$ is the smallest sample eigenvalue, it has a negative bias, i.e. its expected value is smaller than $\sigma^2$. As can be seen from Fig. ??(b), the smaller $N$ is, the larger the spread of the sample eigenvalues and hence the larger the bias of $\tilde\lambda_{Y,M}$. As a result, using $\tilde\lambda_{Y,M}$ can considerably underestimate the noise variance for very small images satisfying Assumption 1.

2. $k = q_p$:

$$\sigma^2_{est,k} = \frac{\tilde\lambda_{Y,M-q_p+1} + \dots + \tilde\lambda_{Y,M}}{q_p}, \tag{3.34}$$

i.e. the average of all sample eigenvalues corresponding to the noise is used as a noise variance estimator.

The advantage of this estimator is that it utilizes all information provided by PCA, i.e. all sample eigenvalues corresponding to the noise are used. Consequently, this estimator has the smallest bias among all $\sigma^2_{est,k}$, provided that Assumption 1 holds. This allows accurate estimation of the noise variance for small $N$, i.e. for small images. Additionally, one can select a small subset of the image blocks and use only this subset for noise variance estimation without significant loss of accuracy, which reduces the computation time.

Table 3.1: Estimates of $q_i$ and $\delta_{Y,i}$ for the image shown in Fig. 3.2. The estimates of $\delta_{Y,i}$ have been computed with an accuracy of $10^{-4}$.

 i   qi   δY,i           i   qi   δY,i        i   qi   δY,i
 1   1    87541.0010     8   1    2.3485      15  1    0.0085
 2   1    1115.0010      9   1    1.2356      16  1    0.0051
 3   1    612.8945       10  1    0.7212      17  1    0.0047
 4   1    61.4036        11  1    0.1731      18  1    0.0014
 5   1    46.7131        12  1    0.1080      19  1    0.0007
 6   1    25.0071        13  1    0.0571      20  1    0.0002
 7   1    3.5403         14  1    0.0157      21  5    0.0000

The disadvantage of this estimator is that it relies on the estimate of qp. If the estimate of qp is larger than its true value, some sample eigenvalues affected by the original image are used, which leads to overestimation of the noise variance.
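The trade-off between the two special cases can be illustrated with a toy simulation. The sketch below assumes the zero-image case with $M = 2$, so that $Y$ is pure Gaussian noise, $q_p = M$, and the eigenvalues of the 2 × 2 sample covariance matrix have a closed form. All function names are hypothetical, and the setup is chosen for brevity rather than realism.

```python
import math
import random


def sigma2_est(eigvals, k):
    """Average of the last k sample eigenvalues (descending order).

    k = 1 corresponds to (3.33), k = q_p to (3.34).
    """
    return sum(eigvals[-k:]) / k


def sample_eigvals_2d(samples):
    """Eigenvalues (descending) of the 2x2 sample covariance matrix."""
    n = len(samples)
    mean0 = sum(s[0] for s in samples) / n
    mean1 = sum(s[1] for s in samples) / n
    a = sum((s[0] - mean0) ** 2 for s in samples) / (n - 1)
    c = sum((s[1] - mean1) ** 2 for s in samples) / (n - 1)
    b = sum((s[0] - mean0) * (s[1] - mean1) for s in samples) / (n - 1)
    half_trace = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return [half_trace + radius, half_trace - radius]


def simulate_zero_image(n_samples, sigma=1.0, trials=500, seed=0):
    """Mean of both estimators over repeated pure-noise samples (x = 0)."""
    rng = random.Random(seed)
    sum_last, sum_avg = 0.0, 0.0
    for _ in range(trials):
        samples = [(rng.gauss(0, sigma), rng.gauss(0, sigma))
                   for _ in range(n_samples)]
        eigvals = sample_eigvals_2d(samples)
        sum_last += sigma2_est(eigvals, 1)  # last eigenvalue only
        sum_avg += sigma2_est(eigvals, 2)   # average of all noise eigenvalues
    return sum_last / trials, sum_avg / trials
```

Calling `simulate_zero_image` with a small and a large `n_samples` shows the mean of the $k = 1$ estimator staying below the true variance and moving closer to it as the sample grows, while the mean of the $k = q_p$ estimator stays close to the true variance in both cases.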

3.4.4 Example

Let us consider an example of sample PCA.

Since Bartlett’s test can be used to test the equality of any consecutive eigenvalues [44], the estimation of qi can be continued in the way described in Section 3.4.1 for i = p − 1, . . . , 1. Besides, the average of the sample eigenvalues can be used to estimate δY,i for i = p − 1, . . . , 1 as well.

The image shown in Fig. 3.2 was taken as a test image. This is the standard test image 'Cameraman', which has been blurred with a Gaussian kernel in order to remove possible noise. The estimates of $q_i$ and $\delta_{Y,i}$ computed using 5 × 5 blocks are shown in Table 3.1, and the nonzero estimates of $\delta_{Y,i}$ are plotted in Fig. 3.3. As one can see, all nonzero eigenvalues are distinct and decay approximately exponentially. The last five eigenvalues equal zero, which means that PCA can be used for noise variance estimation for this image.

3.4.5 Summary

All in all, sample PCA provides the following possibilities for assessment of the population properties:

1. for checking Assumption 1: test (3.16) with k = m; check (3.24);

2. for estimating qp: sequence of tests (3.16) with k = 2, 3, . . .;

3. for estimating $\sigma^2$: estimators (3.33) and (3.34).


Figure 3.2: 512 × 512 'Cameraman' image blurred with a Gaussian kernel with standard deviation σ = 2.

Figure 3.3: Semi-logarithmic plot of the nonzero estimates of $\delta_{Y,i}$ for the image shown in Fig. 3.2 (horizontal axis: $i$; vertical axis: $\ln \delta_{Y,i}$).

These tests and estimators can be combined in different ways, which results in a family of methods. In the next two sections, two efficient algorithms from this family, which provide good trade-offs between accuracy and computation time, are described.