5.3 Correlation estimation with reference image

Given the set of $K$ atoms $\{g_{\gamma_i}\}$ that approximate the first image $\hat{I}_1$, the disparity or motion estimation problem consists in finding the corresponding visual patterns in the second image $I_2$, which is available only through its compressed random measurements $\hat{Y}_2$. This is equivalent to estimating the correlation between the images $I_1$ and $I_2$ under the joint sparsity model based on local geometrical transformations described in Section 5.2.3. We describe here the proposed regularized optimization framework that leads to the estimation of the correlation between the images $I_1$ and $I_2$.

5.3.1 Regularized energy function

We are looking for a set of $K$ atoms in $I_2$ that corresponds to the $K$ visual features $\{g_{\gamma_i}\}$ selected in the first image. We denote their parameters by $\Lambda$, where $\Lambda = (\gamma'_1, \gamma'_2, \ldots, \gamma'_K)$ for some $\gamma'_i$, $1 \le i \le K$. We propose to select this set of atoms in a regularized energy minimization framework, as a trade-off between the set that approximates $I_2$ well and the set that results in smooth local transformations between the images.

The energy model E proposed in our scheme is expressed as

$$E(\Lambda) = E_d(\Lambda) + \alpha_1 E_s(\Lambda), \tag{OPT-1}$$

where $E_d$ and $E_s$ represent the data and smoothness terms respectively, and $\alpha_1$ is the regularization parameter that balances them. The solution is the set of $K$ atom parameters $\Lambda^*$ that minimizes the energy $E$, i.e.,

$$\Lambda^* = \arg\min_{\Lambda \in \mathcal{S}} E(\Lambda), \tag{5.4}$$

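where $\mathcal{S}$ represents the search space given by

$$\mathcal{S} = \{(\gamma'_1, \gamma'_2, \ldots, \gamma'_K) \mid \gamma'_i = \delta\gamma_i \circ \gamma_i,\ 1 \le i \le K,\ \delta\gamma_i \in \mathcal{L}\}, \tag{5.5}$$

where $\{\gamma_i\}$ are the parameters of the atoms in the reference image. The set $\mathcal{L} \subset \mathbb{R}^5$ is given by $\mathcal{L} = [-\delta t_x, \delta t_x] \times [-\delta t_y, \delta t_y] \times [-\delta\theta, \delta\theta] \times [-\delta s_x, \delta s_x] \times [-\delta s_y, \delta s_y]$, where $\delta t_x$, $\delta t_y$, $\delta\theta$, $\delta s_x$, $\delta s_y$ determine the search window size for the translation parameters $t_x, t_y$, the rotation $\theta$, and the scales $s_x, s_y$ respectively. We describe below the two cost functions used in OPT-1.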
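To make the search space concrete, the following Python sketch discretizes the window $\mathcal{L}$ around one reference atom and selects the minimizer of OPT-1 by exhaustive search over a list of candidates. The function names, the grid resolution `steps`, and the composition rule (offsets added to translation and rotation, scales multiplied by `1 + ds`) are illustrative assumptions; the text does not specify here how $\circ$ acts or how the minimization is carried out.

```python
import itertools


def search_window(gamma, half_widths, steps=3):
    """Candidate parameters gamma' around one reference atom.

    gamma       : (tx, ty, theta, sx, sy) of the atom in the reference image
    half_widths : (dtx, dty, dtheta, dsx, dsy), the half-sizes of the window L
    ASSUMPTION  : offsets are added to translation and rotation, and scales
                  are multiplied by (1 + ds); the source does not define this.
    """
    axes = []
    for h in half_widths:
        if steps == 1:
            axes.append([0.0])  # only the zero offset
        else:
            # 'steps' evenly spaced offsets covering [-h, h]
            axes.append([-h + 2.0 * h * k / (steps - 1) for k in range(steps)])
    tx, ty, theta, sx, sy = gamma
    return [
        (tx + d[0], ty + d[1], theta + d[2], sx * (1.0 + d[3]), sy * (1.0 + d[4]))
        for d in itertools.product(*axes)
    ]


def minimize_energy(candidates, data_cost, smooth_cost, alpha1):
    """Return the candidate Lambda that minimizes E = E_d + alpha1 * E_s."""
    return min(candidates, key=lambda lam: data_cost(lam) + alpha1 * smooth_cost(lam))
```

With `steps` samples per parameter, one atom already has `steps ** 5` candidates, and a joint exhaustive search over $K$ atoms grows exponentially in $K$, so this sketch only illustrates the objective and is not a practical solver.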

5.3.2 Data cost function

The data fidelity term measures, in the compressed domain, the accuracy of the sparse approximation of the second image with geometric atoms. The decoder receives the measurements $\hat{Y}_2$, which are the quantized projections of $I_2$ onto a sensing matrix $\Phi$. For each set of $K$ atom parameters $\Lambda = \{\gamma'_i\}$, the data cost function $E_d$ reports the error between the measurements $\hat{Y}_2$ and the orthogonal projection of $\hat{Y}_2$ onto the column space of $\Psi_\Lambda$, the matrix formed by the compressed versions of the atoms, i.e., $\Psi_\Lambda = \Phi[g_{\gamma'_1} | g_{\gamma'_2} | \ldots | g_{\gamma'_K}]$. This orthogonal projection is given by $\Psi_\Lambda \Psi_\Lambda^\dagger \hat{Y}_2$, where $\dagger$ denotes the pseudo-inverse operator. More formally, the data cost is computed as

$$E_d(\Lambda) = \|\hat{Y}_2 - \Psi_\Lambda \Psi_\Lambda^\dagger \hat{Y}_2\|_2^2 = \|\hat{Y}_2 - \Psi_\Lambda c\|_2^2. \tag{5.6}$$

The data cost function given in Eq. (5.6) first calculates the coefficients $c = \Psi_\Lambda^\dagger \hat{Y}_2$ and then measures the distance between the observations $\hat{Y}_2$ and $\Psi_\Lambda c$. In other words, the data cost function $E_d$ accounts for the intensity variations between the images by estimating the coefficients $c$ of the warped atoms.
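The computation in Eq. (5.6) can be sketched in a few lines of NumPy. The function name and argument layout are assumptions made for illustration: `Phi` is the $M \times N$ sensing matrix, `atoms` is a list of $K$ warped atoms given as images with $N$ pixels each, and `Y2_hat` is the vector of $M$ measurements. The coefficients are obtained with `numpy.linalg.lstsq`, which returns the minimum-norm least-squares solution and therefore coincides with $\Psi_\Lambda^\dagger \hat{Y}_2$.

```python
import numpy as np


def data_cost(Y2_hat, Phi, atoms):
    """Data cost of Eq. (5.6) and the coefficients c (hypothetical helper).

    Y2_hat : (M,) measurement vector
    Phi    : (M, N) sensing matrix
    atoms  : list of K arrays with N pixels each (the warped atoms)
    """
    # Stack the vectorized atoms as columns: G is N x K.
    G = np.column_stack([np.asarray(a, dtype=float).ravel() for a in atoms])
    # Compressed atoms: Psi = Phi [g_1 | ... | g_K], an M x K matrix.
    Psi = Phi @ G
    # c = pinv(Psi) @ Y2_hat, computed as a least-squares solve.
    c, *_ = np.linalg.lstsq(Psi, Y2_hat, rcond=None)
    # Residual between the measurements and their projection onto span(Psi).
    r = Y2_hat - Psi @ c
    return float(r @ r), c
```

A least-squares solve is used instead of forming the pseudo-inverse explicitly; both give the same coefficients, and the solve avoids building an extra $K \times M$ matrix for every candidate $\Lambda$.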

However, when the measurements are quantized, the coefficient vector $c$ fails to properly account for the error introduced by the quantization. A quantized measurement only provides the index of the quantization interval that contains the actual measurement value, and the actual value could be any point in that interval. Let $Y_{2,p}$ be the $p$th coordinate of the original measurement, and $\hat{Y}_{2,p}$ be the corresponding quantized value. It should be noted that the joint decoder has access only to the quantized value $\hat{Y}_{2,p}$, and not to the original value $Y_{2,p}$. Hence, the joint decoder knows only that the quantized measurement lies within the quantization interval, i.e., $\hat{Y}_{2,p} \in \mathcal{R}_{\hat{Y}_p} = (r_p, r_{p+1}]$, where $r_p$ and $r_{p+1}$ define the lower and upper bounds of the quantizer bin $Q_p$. We propose to refine the data term by computing a coefficient vector $\tilde{c}$ as the best solution over all the valid measurement values in the quantization interval, i.e., $\tilde{Y}_2 \in \mathcal{R}_{\hat{Y}}$, where $\mathcal{R}_{\hat{Y}}$ is the Cartesian product of all the quantized regions $\mathcal{R}_{\hat{Y}_p}$, i.e., $\mathcal{R}_{\hat{Y}} = \prod_p \mathcal{R}_{\hat{Y}_p}$. The coefficients $\tilde{c}$ and the measurements $\tilde{Y}_2$ can be jointly estimated by solving the following optimization problem:

$$(\tilde{c}, \tilde{Y}_2) = \arg\min_{\tilde{c},\, \tilde{Y}_2} \|\tilde{Y}_2 - \Psi_\Lambda \tilde{c}\|_2^2 \quad \text{s.t.} \quad \tilde{Y}_2 \in \mathcal{R}_{\hat{Y}}. \tag{5.7}$$

By following the steps in Proposition 1 described in Chapter 4, it can be shown that the Hessian of the objective function $h(\tilde{Y}_2, \tilde{c}) = \|\tilde{Y}_2 - \Psi_\Lambda \tilde{c}\|_2^2$ is positive semidefinite, i.e., $\nabla^2 h \succeq 0$, and hence that the objective function $h$ is convex (but not strictly convex). The region $\mathcal{R}_{\hat{Y}}$ is also a convex set, since each region $\mathcal{R}_{\hat{Y}_p} = (r_p, r_{p+1}]$, $\forall p$, is an interval and the Cartesian product of convex sets is convex. Therefore, the optimization problem given in Eq. (5.7) is convex, but not strictly convex, as the Hessian of the objective function is not positive definite. Finally, the data fidelity term given in Eq. (5.6) can be modified with the estimated coefficients $\tilde{c}$ and the measurements $\tilde{Y}_2$:

$$\tilde{E}_d(\Lambda) = \|\tilde{Y}_2 - \Psi_\Lambda \tilde{c}\|_2^2, \tag{5.8}$$

where $\tilde{E}_d(\Lambda)$ represents the robust data function that efficiently accounts for the quantization noise.
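One simple way to approach Eq. (5.7) is alternating minimization: for fixed $\tilde{Y}_2$ the optimal $\tilde{c}$ is the least-squares solution, and for fixed $\tilde{c}$ the optimal $\tilde{Y}_2$ is the projection of $\Psi_\Lambda \tilde{c}$ onto the quantization box. The sketch below follows this idea. It is not necessarily the solver used in this work, and the function name, the fixed iteration count, and the use of closed bins $[r_p, r_{p+1}]$ in the clipping step are assumptions.

```python
import numpy as np


def robust_data_cost(Psi, lower, upper, Y_init, n_iter=100):
    """Alternating-minimization sketch for Eq. (5.7) (assumed solver).

    Psi    : (M, K) matrix of compressed atoms
    lower  : (M,) lower bin edges r_p of the received quantization bins
    upper  : (M,) upper bin edges r_{p+1}
    Y_init : (M,) starting point, e.g. the quantized measurements
    Returns the cost of Eq. (5.8), the coefficients and the measurements.
    """
    P = np.linalg.pinv(Psi)              # K x M pseudo-inverse, reused in the loop
    Y = np.clip(Y_init, lower, upper)    # make the starting point feasible
    for _ in range(n_iter):
        c = P @ Y                            # best coefficients for the current Y
        Y = np.clip(Psi @ c, lower, upper)   # closest point to Psi c inside the box
    c = P @ Y                            # coefficients matching the final Y
    r = Y - Psi @ c
    return float(r @ r), c, Y
```

Neither of the two steps can increase the objective, so when the loop is started from the quantized measurements the returned cost is never larger than the cost of Eq. (5.6).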

5.3.3 Smoothness cost function

The goal of the smoothness term $E_s$ is to penalize the atom transformations so that neighboring atoms undergo coherent transformations. In other words, atoms in the same neighborhood are likely to undergo similar transformations when the correlation between the images is due to object or camera motion. Instead of directly penalizing the transformations $\{F_i\}$ (or equivalently $\{\delta\gamma_i\}$) so that they tend to be coherent for neighboring atoms, we propose to generate a dense motion field from the atom transformations and to penalize the motion (or disparity) field so that it is coherent for adjacent pixels. This regularization is easier to handle than one defined on the set of transformations $F_i$, and it directly corresponds to the physical constraints that explain the formation of correlated images. For a given transformation field $f = (f^h, f^v, f^\theta, f^a, f^b)$, we compute the horizontal component $m^h$ and the vertical component $m^v$ of the motion field as

$$\begin{bmatrix} m^h(z) \\ m^v(z) \end{bmatrix} = \begin{bmatrix} m(z) - t_x(z) \\ n(z) - t_y(z) \end{bmatrix} - \underbrace{\begin{bmatrix} f^a(z) & 0 \\ 0 & f^b(z) \end{bmatrix}}_{S} \underbrace{\begin{bmatrix} \cos(f^\theta(z)) & \sin(f^\theta(z)) \\ -\sin(f^\theta(z)) & \cos(f^\theta(z)) \end{bmatrix}}_{R} \underbrace{\begin{bmatrix} m(z) - t_x(z) - f^h(z) \\ n(z) - t_y(z) - f^v(z) \end{bmatrix}}_{T}, \tag{5.9}$$

where $(m(z), n(z))$ represent the Euclidean coordinates, and $t_x(z)$ and $t_y(z)$ represent the translation parameters at location $z$. The matrices $S$, $R$ and $T$ represent the grid transformations due to scale, rotation and translation changes respectively. Finally, the smoothness cost $E_s$ in OPT-1 is computed as

$$E_s = \sum_{z, z' \in \mathcal{N}} V_{z,z'} = \sum_{z, z' \in \mathcal{N}} \min\left( |m^h(z) - m^h(z')| + |m^v(z) - m^v(z')|,\ \tau \right), \tag{5.10}$$

where $z$ and $z'$ are adjacent pixel locations, and $\mathcal{N}$ represents the usual 4-pixel neighborhood. The parameter $\tau$ sets an upper limit on the smoothness penalty term, and thus helps to preserve the discontinuities in the motion field [130].
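Equations (5.9) and (5.10) translate directly into array operations. In the sketch below, every argument of `motion_field` is an array with one entry per pixel. The function names are hypothetical, and each pair of adjacent pixels is counted once in the sum, which is an assumption about how the neighborhood $\mathcal{N}$ is enumerated.

```python
import numpy as np


def motion_field(m, n, tx, ty, fh, fv, ftheta, fa, fb):
    """Motion components (m^h, m^v) of Eq. (5.9), evaluated at every pixel."""
    u = m - tx                      # coordinates relative to the atom position
    v = n - ty
    p = u - fh                      # T: grid shifted by the translation field
    q = v - fv
    cs, sn = np.cos(ftheta), np.sin(ftheta)
    pr = cs * p + sn * q            # R: rotation by the angle field
    qr = -sn * p + cs * q
    mh = u - fa * pr                # S: scaling, then difference to the original grid
    mv = v - fb * qr
    return mh, mv


def smoothness_cost(mh, mv, tau):
    """Truncated L1 smoothness of Eq. (5.10) on a 4-pixel neighborhood.

    ASSUMPTION: each pair of adjacent pixels contributes once.
    """
    cost = 0.0
    for axis in (0, 1):             # vertical neighbors, then horizontal neighbors
        d = np.abs(np.diff(mh, axis=axis)) + np.abs(np.diff(mv, axis=axis))
        cost += np.minimum(d, tau).sum()
    return float(cost)
```

For a pure translation (zero rotation and unit scales) the motion field is constant and the cost is zero, while a sharp motion boundary contributes at most $\tau$ per pixel pair, which is how the truncation preserves discontinuities.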

We now describe the methodology to estimate the dense transformation field $f$ from the set of atom transformations. Given the $i$th pair of atoms $g_{\gamma_i}$ and $g_{\gamma'_i}$, with $\gamma_i = (t_x, t_y, \theta, s_x, s_y)$ and $\gamma'_i = (t'_x, t'_y, \theta', s'_x, s'_y)$ in the images $I_1$ and $I_2$ respectively, we first calculate the local transformation captured by these atoms, given by

$$\delta\gamma_i = (t'_x - t_x,\ t'_y - t_y,\ \theta' - \theta,\ s'_x / s_x,\ s'_y / s_y). \tag{5.11}$$

It should be noted that all the pixels in the support of the atom $g_{\gamma_i}$ take the transformation value $\delta\gamma_i$, i.e., $f(z) = \delta\gamma_i,\ \forall z \in Z_i$, where $Z_i$ denotes the set of pixels in the support of the atom $g_{\gamma_i}$, given as

$$Z_i = \{ z = (m, n) \mid |g_{\gamma_i}(m, n)| > \epsilon \}, \tag{5.12}$$

where $\epsilon > 0$ is a constant. A local transformation is established in the same way (see Eq. (5.11)) for all the atom pairs. Then, the transformations $\{\delta\gamma_k\}$ captured by the $K$ pairs of atoms are fused together to estimate a global transformation field $f$. For a given location $z$, we first assign relative weights $\{w_z^{(k)}\}$ to each candidate transformation $\delta\gamma_k$ based on the response of the $k$th atom at the pixel location $z$. Then, the fusion process is simply implemented by choosing the most confident transformation for each pixel position $z$. Mathematically, we write the transformation at position $z$ as

In document Distributed Compressed Representation of Correlated Image Sets (Page 84-87)