Depth from Differential Gaussian Aperture Modification

Chapter 3 describes in detail the effect of changes in blur scale, but it does not discuss the effect of changing the aperture filter between shots, another common way of of generating a defocus change for depth inference. This

appendix will describe how a differential change in a Gaussian aperture filter can be used to extract scene information. There are two key differences between the results in this appendix and those in chapter 3. The first is that the proof of Gaussian uniqueness does not apply to this scenario, which permits a differential shift approach that works for any filter. The second is that, unlike the

unambiguous results from the defocus brightness constancy constraint, there is a side of focal plane ambiguity that arises when recovering depth from such

situations as an overall aperture scaling, a stretching along one axis, rotation of an anisotropic Gaussian, or a combination of all of these.

In this appendix, the Gaussian filter will be generalized beyond its covariance (a,b,c)to include a variable mean(μx_,_μy₎_{and overall transmittance factor}_d.

That is, it can now stretch, rotate, shift, and dim, according to:

From this filter, a corresponding post-processing operation for depth recovery can be derived. Recall that, with no scene motion or change in camera

parameters, image changeItis simplykt∗P. The goal is to recover a

depth-dependent scalarvand a depth-blind post-processing operatorm(x,y) such that

It =v m∗I (B.2)

for any underlying pinhole imageP. This universality requirement implies that the operatormmust take the blur kernelkto something proportional to its time derivative:

kt∗P =v m∗k∗P, ∀P (B.3)

→kt =v m∗k. (B.4)

In the frequency domain, this convolution takes the form of a multiplication, so that, withFindicating the Fourier transform,

v m=F−1[F[kt]/F[k]]. (B.5)

Combining equations B.1 and B.5, the general form ofvmcan be seen to contain the blur scaleσraised to the zeroth, first, and second powers:

v m= ( Σ2 2 (bbt −2act−2cat)−dt ) δ−σ(μx_t∂_x+μ_ty∂_y) +σ2∇_Σ2,t, (B.6)

where the notation∇2

Σ,tindicates a new warped Laplacian, which now depends

not only on the Gaussian widthsa,_b,_c_{but also on their time derivatives:}

∇2 Σ,t =− Σ4 4 (4c 2_a t−2bcbt+b2ct)∂xx − Σ4 4 (−4bcat +b 2 bt+4acbt−4abct)σ2∂xy − Σ4 4 (b 2_a t−2abbt+4a2ct)σ2∂yy. (B.7)

The multiple powers ofσin equation B.6 are a problem because it makes the expression forvmimpossible to split into the product of a termvthat depends on σwith a functionm(x,y)that does not. However, this equation can still reveal depth with some manipulation and by respecting certain physical limitations.

The first step towards a useful constraint is removing the first,

depth-independent Dirac delta term fromvm. This term reflects the overall change in light efficiency of the aperture as it simultaneously shrinks or broadens (throughat,bt, andct) while brightening or dimming (throughdt). Because the

lightness changeΔL, with

ΔL=Σ

2 (bbt −2act−2cat)−dt, (B.8)

depends only on known quantities, it can be accounted for with a simple pre-processing step:

It−(ΔL)I=v′m′∗I, (B.9)

wherev′m′is a new operator that now contains onlyσandσ2terms:

v′m′ =−σ(μx_t∂_x+μy_t∂_y)−σ2∇_Σ2,t. (B.10)

The combination ofσpowers in equation B.10 still prevents the desired depth-blind factoring into aσ-dependentv′and aσ-independentm′. It points to an underlying limitation on the kinds of aperture changes that can reveal depth in the desired way. Specifically, it leaves two possibilities: 1) when theσ2term vanishes, thenσand therefore depth can be recovered from the first spatial image derivatives, and 2) when theσterm vanishes, thenσ2_{and therefore depth up to a}

side of focal plane ambiguity can be recovered from the second spatial image derivatives.

In the first case,at =bt =ct =0, and the aperture filter is a Gaussian that

undergoes a lateral shift(μx_t,μyt). Then the light-efficiency-corrected image

changeIt−(ΔL)Itakes the form

It+dtI=−σ (

μx_tIx+μy_tIy )

, (B.11)

which reveals depth directly through the equation

Z= ( 1 Zf − It+dtI fμx tIx +fμ y tIy )₋₁ . (B.12)

This is essentially a differential version of depth from stereo, where the baseline is theσ-magnified filter shift. Depth from defocus and stereo are known to share fundamental similarities [96], and this approach has been used for pinhole [51]

and pillbox apertures [111], and would work in the same way for any filterκ. In the second case, depth can be recovered up to a side of focal plane ambiguity as long as the Gaussian does not shift laterally(_μx

t =μ y

t =0). Then,

image values are constrained by

It−(ΔL)I=−σ2∇Σ2,tI, (B.13)

and per-pixel depth is recovered, up to a side of focal plane ambiguity, by

Z= ( 1 Zf ± √ It−(ΔL)I −f2∇2 Σ,tI )−1 . (B.14)

Colophon

T

his thesis was typeset using LA_{TEX, originally developed by Leslie} Lamport and based on Donald Knuth’s TEX. The body text is set in 11 point Arno Pro, designed by Robert Slimbach in the style of book types from the Aldine Press in Venice, and issued by Adobe in 2007. A template, which can be used to format a PhD thesis with this look and feel, has been released under the permissive mit (x11) license, and can be found online at

github.com/suchow/or from the author at [email protected].

In document A Theory of Depth From Differential Defocus (Page 113-117)