Discussion and further considerations - Machine Learning in Multi-frame Image Super-resolution

The images of Figures 6.7 to 6.9 show that our prior offers a qualitative improvement over the generic prior, especially when few input images are available (see the top row in each set of image results). In general, larger patch sizes (11×11 pixels) give smaller errors for the noisy inputs, while small patches (5×5) are better for the less noisy images.

Quantitatively, our method gives an RMS error of approximately 25 grey levels from only 2 input images with 2 grey levels of additive Gaussian noise on the “text” input images, whereas the best Huber-MAP prior super-resolution image for that image set and noise level uses all 10 available input images, and still has an RMS error score of almost 30 grey levels.

Figure 6.7: “Text” results: Huber-MRF and texture-based. Top: super-resolution results using the Huber-MRF prior on datasets with 2, 5, or 10 images and a noise standard deviation of 2, 6 or 12 grey levels. Bottom: The texture-based prior on the same nine datasets.

Figure 6.8: “Brick” results: Huber-MRF and texture-based. Top: super-resolution results using the Huber-MRF prior on datasets with 2, 5, or 10 images and a noise standard deviation of 2, 6 or 12 grey levels. Bottom: The texture-based prior on the same nine datasets.

Figure 6.9: “Beads” results: Huber-MRF and texture-based. Top: super-resolution results using the Huber-MRF prior on datasets with 2, 5, or 10 images and a noise standard deviation of 2, 12 or 32 grey levels. Bottom: The texture-based prior on the same nine datasets. Note how well the texture-based version performs even on the extremely noisy case (right-hand column of both sets).

Figure 6.10: Ground truth leaf image and a “bad” texture match. The ground truth “leaf” image (120×120 pixels, left) is used to generate the low-resolution image data. The “spiral” image (80×80 pixels, right) is a poor choice of image from which to build a patch dictionary for super-resolution reconstruction.

Figure 6.11: Reconstruction using a “bad” texture. These four 120×120 super-resolution images of the “leaf” image are reconstructed using different values of the prior strength parameter β: 0.01, 0.04, 0.16, 0.64, from left to right, using a patch dictionary formed from the “spiral” image.

Figure 6.12 plots the RMS errors from the Huber-MAP and sample-based priors against each other. In all cases, the sample-based method fares better, with the difference most notable in the text example.
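The RMS error figures quoted above are in grey levels, i.e. the root of the mean squared per-pixel difference between a reconstruction and the ground truth. The following is a minimal sketch of that measure, assuming images are stored as lists of rows of grey levels; the function name `rms_error` and the toy values are illustrative assumptions and are not taken from the thesis.

```python
import math

def rms_error(estimate, ground_truth):
    """Root-mean-square difference between two equal-sized greyscale images.

    Images are lists of rows of grey levels, so the result is also
    expressed in grey levels.
    """
    if len(estimate) != len(ground_truth):
        raise ValueError("images must have the same number of rows")
    total, count = 0.0, 0
    for est_row, gt_row in zip(estimate, ground_truth):
        if len(est_row) != len(gt_row):
            raise ValueError("rows must have the same length")
        for e, g in zip(est_row, gt_row):
            total += (e - g) ** 2
            count += 1
    return math.sqrt(total / count)

# Toy 2x2 example (hypothetical values, not thesis data).
truth = [[10, 20], [30, 40]]
recon = [[13, 16], [30, 40]]
print(rms_error(recon, truth))  # squared errors 9 + 16 over 4 pixels -> 2.5
```

For real images an array library would be the more natural choice; plain lists are used here only to keep the sketch dependency-free.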

Further work on improving the computational complexity of the texture-based prior algorithm still needs to be carried out. For finding the gradient with respect to the high-resolution image pixels, the same k-nearest-neighbour variation introduced in [28], where multiple neighbourhoods are found for each pixel under consideration, could be adopted to smooth this response, which may also lead to better super-resolution image outputs.

Figure 6.12: Comparison of RMS errors (plot of texture-based RMS against Huber-MRF RMS, in grey levels, with an equal-error line and separate markers for text, brick and bead images). The error with respect to ground truth is measured for each image in Figures 6.7, 6.8 and 6.9 (18 images per figure), and the result from each texture-based prior image is plotted against the Huber-MAP prior error for the corresponding dataset (i.e. identical noise and number of images). This gives nine datapoints per texture type, as shown. In every case, the error associated with the texture-based prior is lower than the Huber-MRF error.
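To make the idea of finding several neighbourhoods per pixel concrete, the sketch below performs a brute-force k-nearest-neighbour search of a patch dictionary. It is only an illustration under stated assumptions: the helper names (`extract_patches`, `k_nearest_patches`), the squared-Euclidean distance and the toy texture are hypothetical, and the specific variation of [28] is not reproduced here.

```python
import heapq

def extract_patches(image, size):
    """Return every size x size patch of a 2D list image as a flat tuple."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            patches.append(tuple(image[r + i][c + j]
                                 for i in range(size) for j in range(size)))
    return patches

def squared_distance(p, q):
    """Squared Euclidean distance between two flattened patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_nearest_patches(query, dictionary, k):
    """Return the k dictionary patches closest to the query (brute force)."""
    return heapq.nsmallest(k, dictionary,
                           key=lambda p: squared_distance(p, query))

# Toy 4x4 "texture" (hypothetical) giving nine 2x2 dictionary patches.
texture = [[0, 0, 9, 9],
           [0, 0, 9, 9],
           [9, 9, 0, 0],
           [9, 9, 0, 0]]
dictionary = extract_patches(texture, 2)
query = (0, 0, 0, 1)
neighbours = k_nearest_patches(query, dictionary, k=2)
print(neighbours)
```

The brute-force search costs one distance evaluation per dictionary patch for every query, which is exactly the kind of cost that motivates the further work on computational complexity mentioned above.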

Since in general the textures for the prior will not be invariant to rotation and scaling, consideration should be given to the orientation of the super-resolution image frame, i.e. so that the scene shares its horizontal and vertical directions with the image set from which the prior’s patch dictionary was sampled. The optimal patch size will be a function of the image textures as well as noise levels in the input dataset, so learning it as a parameter of an extended model is another direction of interest.

Finally, handling multiple image textures from different dictionaries in the same optimization would allow the texture-based prior to be applied in more situations. Existing specialist super-resolution techniques for faces rely on registering the low-resolution images precisely so that different face regions (eyes, nose etc.) occupy specific pixels in the super-resolution image [18]. A texture-based prior could make use of this by learning a dictionary from many registered faces, and only checking the patches from face regions within a few pixels of the current candidate patch. Similarly, any application that incorporates both recognition and super-resolution could draw patches from more specialized dictionaries as regions in an image are recognized (e.g. field, road or water textures for satellite images).

Conclusions and future research directions

7.1 Research contributions

The two main themes around which the research presented in this thesis has been based are firstly the benefits of using sensible and applicable image priors to help the super-resolution reconstruction process, and secondly the ways in which a super-resolution image estimate can be improved by considering uncertainties in the underlying image registrations and other parameters as part of the image estimation process.

Chapter 3 discusses the interdependent structure of multi-frame super-resolution, and explores the way errors induced in one part of the problem have coupled effects in the estimation of other components when the registrations and imaging parameters are estimated in a sequential manner before the main image reconstruction is carried out. No other surveys of super-resolution have touched on the way in which different sub-components of the problem interrelate in this way.

In Chapter 4 the question of how to handle the uncertainty in the parameter estimates in order to avoid the errors presented in the previous chapter is answered with an algorithm for making a simultaneous point estimate of all the values concerned: the super-resolution image, geometric and photometric registration parameters, and parameters for a non-Gaussian image prior. The work presented shows that image registration accuracy can be improved by taking the high-resolution image estimate into account, and these improvements are significant enough to be visible easily even on noisy input data from a DVD source.

A second key point in Chapter 4 is that the selection of parameter values for a parametric image prior, such as the Huber-MRF, can be made with reference to the input data. As well as taking another degree of trial-and-error out of the super-resolution problem, the scheme we present avoids the flaw in the cross-validation suggestion of previous authors. Because it is handled in the same iterative framework as the image registration, mis-registered input images, which would cause serious problems for such methods, are avoided here.
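As a reminder of what such a prior parameter controls, the sketch below evaluates one common form of the Huber penalty over neighbouring-pixel differences. It is an illustration only: the threshold name `alpha`, the function names and the choice of horizontal and vertical first differences are assumptions, and the exact parameterization used in the thesis is defined in its earlier chapters rather than here.

```python
def huber(x, alpha):
    """Huber penalty: quadratic for |x| <= alpha, linear beyond it."""
    if abs(x) <= alpha:
        return x * x
    return 2.0 * alpha * abs(x) - alpha * alpha

def huber_prior_energy(image, alpha):
    """Sum of Huber penalties over horizontal and vertical pixel differences."""
    rows, cols = len(image), len(image[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                energy += huber(image[r][c + 1] - image[r][c], alpha)
            if r + 1 < rows:
                energy += huber(image[r + 1][c] - image[r][c], alpha)
    return energy

# Toy images (hypothetical values): a flat patch and a sharp vertical edge.
flat = [[5, 5], [5, 5]]
edge = [[0, 10], [0, 10]]
print(huber_prior_energy(flat, alpha=1.0))  # no gradients, so zero energy
print(huber_prior_energy(edge, alpha=1.0))  # two steps of 10, each costing 19
```

A small `alpha` penalizes the edge almost linearly, which preserves sharp transitions, while a large `alpha` makes the penalty quadratic over a wider range and favours smoother images; this sensitivity is why choosing the value from the data is attractive.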

An alternative answer to the question of handling uncertainty in the parameter estimates is presented in Chapter 5, which takes the Bayesian approach of integrating uncertainty out of the problem. We propose a scheme for marginalizing over residual uncertainty in the registrations, leaving an objective function that can be optimized directly with respect to the variables of interest, which are the intensity values of the super-resolution image pixels. As in the previous chapter, the approach is successful because it considers the super-resolution problem as a whole. In addition to the registration parameters estimated in the previous chapter, some slight improvement is also shown here in marginalizing over the parameter governing the point-spread function size. On the theme of correct choices of image priors, this method also shows an additional advantage over previous Bayesian super-resolution techniques because where they are limited to Gaussian priors for reasons of tractability, this marginalization is viable for a wide range of image priors, including, but not limited to, Huber-MAP and texture-based priors.

Finally, Chapter 6 explores the possibilities of using image patches to improve the prior term in the MAP approach. Patches have been popular in single-image super-resolution methods, and the approach here shows how the patch-based methods can be brought into multi-frame super-resolution problems. In this work, we take the particular case of highly-textured scenes and show how a prior based on samples from images known to have similar textures to the target high-resolution image can be used to achieve a very significant improvement in output image quality over standard “smoothness” priors.
