Chapter 3 describes in detail the effect of changes in blur scale, but it does not discuss the effect of changing the aperture filter between shots, another common way of of generating a defocus change for depth inference. This
appendix will describe how a differential change in a Gaussian aperture filter can be used to extract scene information. There are two key differences between the results in this appendix and those in chapter 3. The first is that the proof of Gaussian uniqueness does not apply to this scenario, which permits a differential shift approach that works for any filter. The second is that, unlike the
unambiguous results from the defocus brightness constancy constraint, there is a side of focal plane ambiguity that arises when recovering depth from such
situations as an overall aperture scaling, a stretching along one axis, rotation of an anisotropic Gaussian, or a combination of all of these.
In this appendix, the Gaussian filter will be generalized beyond its covariance (a,b,c)to include a variable mean(μx,μy)and overall transmittance factord.
That is, it can now stretch, rotate, shift, and dim, according to:
From this filter, a corresponding post-processing operation for depth recovery can be derived. Recall that, with no scene motion or change in camera
parameters, image changeItis simplykt∗P. The goal is to recover a
depth-dependent scalarvand a depth-blind post-processing operatorm(x,y) such that
It =v m∗I (B.2)
for any underlying pinhole imageP. This universality requirement implies that the operatormmust take the blur kernelkto something proportional to its time derivative:
kt∗P =v m∗k∗P, ∀P (B.3)
→kt =v m∗k. (B.4)
In the frequency domain, this convolution takes the form of a multiplication, so that, withFindicating the Fourier transform,
v m=F−1[F[kt]/F[k]]. (B.5)
Combining equations B.1 and B.5, the general form ofvmcan be seen to contain the blur scaleσraised to the zeroth, first, and second powers:
v m= ( Σ2 2 (bbt −2act−2cat)−dt ) δ−σ(μxt∂x+μty∂y) +σ2∇Σ2,t, (B.6)
where the notation∇2
Σ,tindicates a new warped Laplacian, which now depends
not only on the Gaussian widthsa,b,cbut also on their time derivatives:
∇2 Σ,t =− Σ4 4 (4c 2a t−2bcbt+b2ct)∂xx − Σ4 4 (−4bcat +b 2 bt+4acbt−4abct)σ2∂xy − Σ4 4 (b 2a t−2abbt+4a2ct)σ2∂yy. (B.7)
The multiple powers ofσin equation B.6 are a problem because it makes the expression forvmimpossible to split into the product of a termvthat depends on σwith a functionm(x,y)that does not. However, this equation can still reveal depth with some manipulation and by respecting certain physical limitations.
The first step towards a useful constraint is removing the first,
depth-independent Dirac delta term fromvm. This term reflects the overall change in light efficiency of the aperture as it simultaneously shrinks or broadens (throughat,bt, andct) while brightening or dimming (throughdt). Because the
lightness changeΔL, with
ΔL=Σ
2
2 (bbt −2act−2cat)−dt, (B.8)
depends only on known quantities, it can be accounted for with a simple pre-processing step:
It−(ΔL)I=v′m′∗I, (B.9)
wherev′m′is a new operator that now contains onlyσandσ2terms:
v′m′ =−σ(μxt∂x+μyt∂y)−σ2∇Σ2,t. (B.10)
The combination ofσpowers in equation B.10 still prevents the desired depth-blind factoring into aσ-dependentv′and aσ-independentm′. It points to an underlying limitation on the kinds of aperture changes that can reveal depth in the desired way. Specifically, it leaves two possibilities: 1) when theσ2term vanishes, thenσand therefore depth can be recovered from the first spatial image derivatives, and 2) when theσterm vanishes, thenσ2and therefore depth up to a
side of focal plane ambiguity can be recovered from the second spatial image derivatives.
In the first case,at =bt =ct =0, and the aperture filter is a Gaussian that
undergoes a lateral shift(μxt,μyt). Then the light-efficiency-corrected image
changeIt−(ΔL)Itakes the form
It+dtI=−σ (
μxtIx+μytIy )
, (B.11)
which reveals depth directly through the equation
Z= ( 1 Zf − It+dtI fμx tIx +fμ y tIy )−1 . (B.12)
This is essentially a differential version of depth from stereo, where the baseline is theσ-magnified filter shift. Depth from defocus and stereo are known to share fundamental similarities [96], and this approach has been used for pinhole [51]
and pillbox apertures [111], and would work in the same way for any filterκ. In the second case, depth can be recovered up to a side of focal plane ambiguity as long as the Gaussian does not shift laterally(μx
t =μ y
t =0). Then,
image values are constrained by
It−(ΔL)I=−σ2∇Σ2,tI, (B.13)
and per-pixel depth is recovered, up to a side of focal plane ambiguity, by
Z= ( 1 Zf ± √ It−(ΔL)I −f2∇2 Σ,tI )−1 . (B.14)
Colophon
T
his thesis was typeset using LATEX, originally developed by Leslie Lamport and based on Donald Knuth’s TEX. The body text is set in 11 point Arno Pro, designed by Robert Slimbach in the style of book types from the Aldine Press in Venice, and issued by Adobe in 2007. A template, which can be used to format a PhD thesis with this look and feel, has been released under the permissive mit (x11) license, and can be found online atgithub.com/suchow/or from the author at [email protected].