Image Reconstruction Process - Machine learning techniques for high dimensional data

The image reconstruction process consists of the following three steps: generating low bit depth images, registering generated low bit depth images, and fusing aligned low bit depth images.

Generation of Low Bit Depth Images

Here, low bit depth imagery refers to a sequence of 2-bit images. To divide the pixel intensity range (i.e. 0 to 255) equally into four sub-ranges for 2-bit image encoding (i.e. {00b, 01b, 10b, 11b}), three quantisation thresholds are set at 64, 128 and 192. This results in four equal pixel intensity sub-ranges: 0 to 63, 64 to 127, 128 to 191, and 192 to 255. To minimise the quantisation error on reconstruction, the grey levels after quantisation are chosen to be the mid-points of the sub-ranges, i.e. 32, 96, 160 and 224. Low bit depth images can be sensitive to noise fluctuating near the thresholds, which causes large intensity differences on reconstruction, although these fluctuations can carry useful information once averaged [29]. The conversion from an 8-bit depth image to a 2-bit depth image is straightforward: preserve the 2 MSBs of the full bit depth image, always set the third MSB to one, and set the remaining 5 bits to zero. For example, the 8-bit value 100 (01100100b) becomes 96 (01100000b). Examples of two full bit depth images and the corresponding 2-bit depth images are shown in Figure 5.1.
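The bit manipulation described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this work; the function name `to_2bit_depth` and the sample values are assumptions introduced for the example.

```python
import numpy as np

def to_2bit_depth(image_8bit):
    """Quantise an 8-bit image to the four grey levels 32, 96, 160 and 224."""
    image_8bit = np.asarray(image_8bit, dtype=np.uint8)
    # Keep the two MSBs (mask 1100 0000b) and clear the lower six bits.
    two_msbs = image_8bit & 0xC0
    # Always set the third MSB (0010 0000b = 32) so that each value sits
    # at the mid-point of its sub-range.
    return two_msbs | 0x20

# Hypothetical sample intensities at the edges of the four sub-ranges.
samples = np.array([0, 63, 64, 127, 128, 191, 192, 255], dtype=np.uint8)
print(to_2bit_depth(samples).tolist())
# [32, 32, 96, 96, 160, 160, 224, 224]
```

Every intensity in a sub-range maps to the same mid-point, so the output carries only 2 bits of information per pixel even though it is stored in an 8-bit array.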

Registration of Low Bit Depth Images

Figure 5.1: Examples of two 8-bit depth images and the corresponding 2-bit depth images: Gaussian noise is present in the 8-bit depth images, and the image structures are preserved in the 2-bit depth images. Panels: (a) aerial image; (b) aerial image in 2-bit.

Mathematically, a projective transformation (up to 8 degrees of freedom) is required to describe the geometrical alignment between two images of the same 2D scene. However, the projective transformation is easily affected by small errors, and any small distortion accumulates over a number of images [287]. To overcome this problem, a similarity transformation (up to 4 degrees of freedom) is used instead of a projective transformation for images captured by small or micro-scale UAVs. A similarity transformation consists of a uniform scaling, a rotation, and a translation. Under a similarity transformation, angles between lines are preserved, which excludes more complex transformations, such as shearing. As long as the UAV holds an approximate nadir view (looking downwards), using a similarity transformation to approximate the geometrical alignment between two images leads to less deformation of the resulting images [287, 293]. Since only a similarity transformation needs to be estimated, the Phase Correlation (Fourier) method is employed here. Phase Correlation (PC) has a low computational complexity, because it can be computed efficiently for images using the Fast Fourier Transform (FFT) [17, 152].
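As an illustration of why Phase Correlation is inexpensive, the sketch below estimates only the translational component between two images, using two forward FFTs and one inverse FFT. It is a simplified example under stated assumptions, not the registration procedure used in this work: it assumes a purely cyclic integer shift, and the rotation and uniform scaling of a similarity transformation would need an additional step (commonly a log-polar resampling of the magnitude spectra), which is not shown. The function name `phase_correlation_shift` is hypothetical.

```python
import numpy as np

def phase_correlation_shift(reference, sensed):
    """Estimate the integer shift (dy, dx) with sensed ~ roll(reference, (dy, dx))."""
    f_ref = np.fft.fft2(np.asarray(reference, dtype=np.float64))
    f_sen = np.fft.fft2(np.asarray(sensed, dtype=np.float64))
    # Normalised cross-power spectrum: only the phase difference is kept.
    cross_power = f_sen * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    # The inverse transform peaks at the displacement between the images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks beyond half the image size correspond to negative shifts.
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, correlation.shape))

# Hypothetical test data: a random image and a cyclically shifted copy.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))
print(phase_correlation_shift(image, shifted))  # (5, -3)
```

Real aerial frames overlap only partially rather than wrapping around, so in practice the correlation peak is less sharp than in this synthetic case.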

Fusion of Low Bit Depth Images

The image details of the target scene discarded by quantisation may be reconstructed by fusing a number of properly aligned low bit depth images of the target area [29]. The resultant fused image is easier to analyse or interpret than any individual source image. Because image contrast changes from image to image, images of the same target scene taken at different times, from different viewpoints, and/or by different sensors are likely to show large variations in grey level at the same pixel location. These crossings of the quantisation thresholds contain information about the original image, which can be recovered by averaging [29]. After quantisation, image details initially appear to have vanished. However, once the images have been aligned by the image registration process, the spatial and intensity differences between them contribute more to the recovery of finer image details than one would expect from a single image.
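The recovery of information by averaging can be illustrated numerically for a single pixel. In the sketch below, the true grey level (120), the Gaussian noise level (standard deviation 20) and the number of images (200) are illustrative assumptions, not values taken from the experiments; `quantise_2bit` is a hypothetical helper that applies the 2-bit quantisation described earlier.

```python
import numpy as np

def quantise_2bit(values):
    """Map intensities in [0, 255] to the mid-points 32, 96, 160 and 224."""
    clipped = np.clip(values, 0, 255).astype(np.uint8)
    return ((clipped & 0xC0) | 0x20).astype(np.float64)

rng = np.random.default_rng(0)
true_level = 120.0   # assumed true intensity of one pixel
noise_sigma = 20.0   # assumed noise standard deviation
num_images = 200     # assumed number of aligned images

# Each image observes the same pixel with independent noise, then quantises it.
noisy = true_level + rng.normal(0.0, noise_sigma, size=num_images)
quantised = quantise_2bit(noisy)

# A single noise-free quantised image can only report the level 96.
single_error = abs(quantise_2bit(np.array([true_level]))[0] - true_level)
# Averaging mixes the levels 96 and 160 in proportion to the threshold crossings.
fused_error = abs(quantised.mean() - true_level)
print(single_error, fused_error)
```

Because the true level lies close to the threshold at 128, the noise pushes a fraction of the observations into the next sub-range, and the average of the quantised values lands much closer to 120 than the single quantised level does.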

Here, the 2-bit images are combined using a simple weighted sum. The weight associated with each 2-bit image depends on the quality of its registration result, which is measured by the Normalised Cross Correlation (NCC) between the reference image $I_R$ and the registered sensed image $\tilde{I}_S$:

$$\mathrm{NCC} = \frac{\sum_x \sum_y \left(I_R(x, y) - \mu_R\right)\left(\tilde{I}_S(x, y) - \mu_S\right)}{\sqrt{\sum_x \sum_y \left(I_R(x, y) - \mu_R\right)^2 \, \sum_x \sum_y \left(\tilde{I}_S(x, y) - \mu_S\right)^2}} \tag{5.1}$$

where $\mu_R$ and $\mu_S$ are the mean intensities of the corresponding images, computed as:

$$\mu_R = \frac{1}{N} \sum_x \sum_y I_R(x, y) \tag{5.2}$$

$$\mu_S = \frac{1}{N} \sum_x \sum_y \tilde{I}_S(x, y) \tag{5.3}$$

where N is the number of pixels in the image. The NCC result can also be used to identify poorly registered images by applying a threshold. Any poorly aligned image should be removed from the fusion process, since it is likely to degrade the quality of the final fused result [29].
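The NCC measure and the rejection of poorly registered images can be sketched as follows. The code uses the standard zero-mean NCC, in which the denominator is the square root of the product of the two sums of squared deviations. The threshold value of 0.5 and the function names are assumptions made for illustration; no specific threshold is stated in the text above.

```python
import numpy as np

def ncc(reference, registered):
    """Zero-mean Normalised Cross Correlation between two equally sized images."""
    ref = np.asarray(reference, dtype=np.float64)
    reg = np.asarray(registered, dtype=np.float64)
    ref_dev = ref - ref.mean()   # I_R - mu_R
    reg_dev = reg - reg.mean()   # I_S - mu_S
    denominator = np.sqrt(np.sum(ref_dev ** 2) * np.sum(reg_dev ** 2))
    if denominator == 0.0:
        # Assumption: a constant image carries no structure to correlate.
        return 0.0
    return float(np.sum(ref_dev * reg_dev) / denominator)

def is_well_registered(reference, registered, threshold=0.5):
    """Hypothetical acceptance test; the threshold is an illustrative choice."""
    return ncc(reference, registered) >= threshold

# Hypothetical data: an image against itself and against unrelated noise.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
print(ncc(image, image))                                  # close to 1
print(is_well_registered(image, rng.random((32, 32))))
```

An NCC close to 1 indicates that the registered image follows the same intensity structure as the reference, while values near 0 suggest that the alignment has failed.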

Given $m$ aligned images $I_1, I_2, \ldots, I_m$, the pixel intensity value of the resultant fused image $I_0$ at the pixel location $(x, y)$ can be computed by:

$$I_0 = \sum_{i=1}^{m} w_i I_i \tag{5.4}$$

where the weight $w_i$ can be obtained as:

$$w_i = \begin{cases} \mathrm{NCC}_i, & \text{if } I_i \text{ has a valid pixel intensity value at the pixel location } (x, y) \\ 0, & \text{otherwise} \end{cases} \tag{5.5}$$

where $\mathrm{NCC}_i$ represents the NCC result between the aligned image $I_i$ ($i = 1, \ldots, m$) and its reference image. Additionally, the weights $\{w_i\}_{i=1}^{m}$ are normalised such that:

$$\sum_{i=1}^{m} w_i = 1 \tag{5.6}$$
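The weighted fusion can be sketched as below. The sketch makes two assumptions that are not stated explicitly above: validity is supplied as a boolean mask per image, and the normalisation of the weights is applied separately at each pixel location, since the set of images with valid data changes from pixel to pixel. It also assumes positive NCC values, which holds once poorly registered images have been removed. The function name `fuse_images` and the sample values are hypothetical.

```python
import numpy as np

def fuse_images(aligned_images, ncc_scores, valid_masks):
    """Fuse m aligned images with NCC weights normalised at every pixel."""
    images = np.asarray(aligned_images, dtype=np.float64)  # shape (m, H, W)
    masks = np.asarray(valid_masks, dtype=bool)            # shape (m, H, W)
    scores = np.asarray(ncc_scores, dtype=np.float64)      # shape (m,)
    # w_i = NCC_i where image i has valid data at (x, y), and 0 otherwise.
    weights = scores[:, None, None] * masks
    # Normalise so that the weights sum to one at each pixel.
    total = weights.sum(axis=0)
    safe_total = np.where(total > 0, total, 1.0)  # pixels with no data stay 0
    return (weights * images).sum(axis=0) / safe_total

# Hypothetical 1 x 2 images: the second image has no data at the second pixel.
images = [[[96.0, 160.0]], [[160.0, 32.0]]]
masks = [[[True, True]], [[True, False]]]
print(fuse_images(images, [0.75, 0.25], masks))  # [[112. 160.]]
```

At the first pixel both images contribute (0.75 × 96 + 0.25 × 160 = 112), whereas at the second pixel only the first image is valid, so its value is passed through unchanged.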

5.3 Experimental Evaluation
