Computer Vision cmput 428/615
3D Modeling from images 3D Modeling from images
Martin Jagersand Martin Jagersand
Pinhole camera
• Central projectionCentral projection
• Principal point & aspectPrincipal point & aspect
0 1 1 0 0
0 0 0
0 0 0
~ 1
) / , / ( )
, , (
Z Y X f
f
Z fY fX y
x
Z fY Z fX Z
Y
X T T
1 1 0
0 0 1 1 0
1 1
1
~ 1
y x p c
p c c
p y
c p x
v u
y y
x x
y y
x
x c
px
py
The projection matrix:
cam
cam y
y
x x
I K
p c f p c
f
X 0 x
X x
]
| [
0 1 0
0
0 0
0 0
Projective camera
• Camera rotation and translationCamera rotation and translation
• The projection matrixThe projection matrix
tX X
t
XX R cam cam RT RT
tX
x KRT I
P In general:
•P is a 3x4 matrix with 11 DOF
•Finite: left 3x3 matrix non-singular
•Infinite: left 3x3 matrix singular
Properties: P=[M p4]
•Center:
•Principal ray (projection direction)
0 0 ,
1 0
4
1
d d p C
C C
M M P
) 3
det( m v M
•Infinite cameras where the last row of Infinite cameras where the last row of P is P is (0,0,0,1)(0,0,0,1)
•Points at infinity are mapped to points at infinityPoints at infinity are mapped to points at infinity
Affine cameras
z z
y x
T
y x
t d
t t t R
d t t K
P
0
0
t k
j i 0
j i
d0 t
j
k i
{camera}
Z
X
{world} Y
x0
•ErrorError
)
( 0
0
x x
x
x
persp proj
aff d
Good approximation:
• small compared to d0
• point close to principal ray
X
X’
xaff xpersp
Camera calibration
•11 DOF => at least 6 points11 DOF => at least 6 points
•Linear solutionLinear solution
– Normalization requiredNormalization required – Minimizes algebraic errorMinimizes algebraic error
•Nonlinear solutionNonlinear solution
– Minimize geometric error (pixel re-projection)Minimize geometric error (pixel re-projection)
•Radial distortionRadial distortion
– Small near the center, increase towards peripherySmall near the center, increase towards periphery
1
0 min
p
p A
i
i P X
x
known known ?
...
1 1 2 2
K r K r
r
Application: raysets
Gortler and al.; Microsoft Lumigraph
t
v
s
(s,t) u
(u,v)
H-Y Shum, L-W He; Microsoft Concentric mosaics
Ck Cl
C0
Li
Lj vi
vj
i j
•Projection equationProjection equation xxii=P=PiiXX
•Resection:Resection:
– xxii,X P,X Pii
Multi-view geometry - resection
Given image points and 3D points calculate camera projection matrix.
• Projection equationProjection equation xxii=P=PiiXX
• Intersection:Intersection:
– xxii,P,Pi i XX
Multi-view geometry - intersection
Given image points and camera projections in at least 2 views calculate the 3D points (structure)
• Projection Projection equation equation xxii=P=PiiXX
• Structure from Structure from motion (SFM) motion (SFM)
– xxii P Pii,,XX
Multi-view geometry - SFM
Given image points in at least 2 views calculate the 3D points (structure) and camera projection matrices (motion)
•Estimate projective structure
•Rectify the reconstruction to metric (autocalibration)
Depth from stereo
•Calibrated aligned camerasCalibrated aligned cameras
Z
(0,0) (d,0) X Z=f
xl xr
r l
r l
x x
Z df
d x X
X f x Z f
( )
Disparity d
Trinocular Vision System
(Point Grey Research)
Application: depth based reprojection
3D warping,
3D warping, McMillanMcMillan
Plenoptic modeling,
Plenoptic modeling, McMillan & BishopMcMillan & Bishop
Application: depth based reprojection
Layer depth images,
Layer depth images, Shade et al. Shade et al.
Image based objects,
Image based objects, Oliveira & Bishop Oliveira & Bishop
Affine camera factorization
3D structure from many images
The affine projection equations are The affine projection equations are
1
j j
j
y i
x i ij
ij
Z Y X
P P y
x
0001 1
1
j j
j y
i x i ij
ij
Z Y X P
P y
x
~
~
4 4
j j
j y
i x i ij
ij y
i ij
x i ij
Z Y X P
P y
x P
y
P x
Orthographic factorization
The ortographic projection equations are The ortographic projection equations are where
where m ij Pi M j , i 1,...,m , j 1,...,n
All equations can be collected for all
All equations can be collected for all i and i and jj where
where
n
mn m m
m
n n
M ,..., M
, M ,
, m
m m
m m
m
m m
m
2 1
2 1
2 1
2 22
21
1 12
11
M
P P
P P
m
M P m
~ M
~ m
j j
j y j
i x i i
ij ij ij
Z Y
X P ,
, P y
x P
Note that P and M are resp. 2mx3 and 3xn matrices and therefore the rank of m is at most 3
(Tomasi Kanade’92)
Orthographic factorization
Factorize
Factorize mm through singular value decomposition through singular value decomposition An affine reconstruction is obtained as follows
An affine reconstruction is obtained as follows
V T
U m
V T
M U
P ~
~ ,
(Tomasi Kanade’92)
n
mn m m
m
n n
M ,..., M
, M m
m m
m m
m
m m
m
min 2 1 2
1
2 1
2 22
21
1 12
11
P P P
Closest rank-3 approximation yields MLE!
~ 0
~
~ 1
~
~ 1
~
T T 1
T T 1
T T 1
y i x
i
y i y
i
x i x
i
P P
P P
P P
Q Q
Q Q
Q Q
~ 0
~
~ 1
~
~ 1
~
T T T
y i x
i
y i y
i
x i x
i
P P
P P
P P
C C C
A metric reconstruction is obtained as follows A metric reconstruction is obtained as follows
Where A is computed from Where A is computed from
Orthographic factorization
Factorize
Factorize mm through singular value decomposition through singular value decomposition An affine reconstruction is obtained as follows
An affine reconstruction is obtained as follows
V T
U m
V T
M U
P ~
~ ,
M Q M
Q P
P ~
~ 1,
0 1 1
T T T
y i x i
y i y i
x i x i
P P
P P
P
P 3 linear equations per view on symmetric matrix C (6DOF)
Q can be obtained from C
through Cholesky factorisation and inversion
(Tomasi Kanade’92)
Weak perspective factorization
[D. Weinshall]
[D. Weinshall]
•Weak perspective cameraWeak perspective camera
•Affine ambiguityAffine ambiguity
•Metric constraintsMetric constraints
j i s M s
ˆ ) ˆ )(
ˆ ( ˆ
ˆ MQQ 1X MQ Q 1X
W
ˆ 0 ˆ
ˆ ˆ
ˆ
ˆ 2
j i
j j
i i
s QQ s
s s
QQ s
s QQ s
T T
T T
T T
Extract motion parameters
Eliminate scale
Compute direction of camera axis k = i x j
parameterize rotation with Euler angles
Full perspective factorization
The camera equations The camera equations
for a fixed image
for a fixed image ii can be written in matrix form can be written in matrix form asas
where where
n j
m
j i
i ij
i jm M , 1,..., , 1,...,
λ P
M P m i i i
i i im
i
m im
i i
i m m m
λ ,..., λ
, λ diag
M ,..., M
, M ,
,..., ,
2 1
2 1
2 1
M
m
Perspective factorization
All equations can be collected for all All equations can be collected for all ii as as
where
where m P M
m n
n P
P P P
m m m
m , ...
...
2 1 2
2 1 1
In these formulas
In these formulas mm are known, but are known, but ii,,PP and and MM are are unknown
unknown
Observe that
Observe that PMPM is a product of a 3 is a product of a 3mmx4 matrix and a x4 matrix and a 4x4xnn matrix, i.e. it is a rank 4 matrix matrix, i.e. it is a rank 4 matrix
Perspective factorization algorithm
Assume that i are known, then PM is known.
Use the singular value decomposition PM=U VT
In the noise-free case
S=diag(1,2,3,4,0, … ,0)
and a reconstruction can be obtained by setting:
P=the first four columns of U.
M=the first four rows of V.
Iterative perspective factorization
When i are unknown the following algorithm can be used:
1. Set ij=1 (affine approximation).
2. Factorize PM and obtain an estimate of P and M.
If 5 is sufficiently small then STOP.
3. Use m, P and M to estimate i from the camera equations (linearly) mi i=PiM
4. Goto 2.
In general the algorithm minimizes the proximity measureNote that structure and motion recovered P(,P,M)=5
up to an arbitrary projective transformation
N-view geometry Affine factorization
(HZ Ch 17, 18)
[Tomasi &Kanade ’92]
[Tomasi &Kanade ’92]
•Affine camera Affine camera
•ProjectionProjection
•n points, n points, m views: measurement matrixm views: measurement matrix
]
| [M t
P M 2x3 matrix; t 2D vector
t
Z Y X y M
x
n
m m
n m
n
M M
W X X
x x
x x
1 1
1
1 1
1
~
~
~
~
W: Rank 3
X M V
D U
W
UDV W
T n m
T
ˆ ˆ ˆ 2 3 3 3 3
Assuming isotropic zero-mean Gaussian noise, factorization achieves ML affine reconstruction.
t x x
~
Projective factorization
Homogeneous coord &scale factors
[Sturm & Triggs’96][ Heyden ‘97 ] [Sturm & Triggs’96][ Heyden ‘97 ]
•Measurement matrixMeasurement matrix
•Known projective depthKnown projective depth
– Projective ambiguity Projective ambiguity
•Iterative algorithmIterative algorithm
– Reconstruct withReconstruct with
– Reestimate depth and iterate Reestimate depth and iterate
n
m m
n m n m
m
n n
P P
W X X
x x
x x
1 1
1 1
1 1 1
1 1 1
3mxn matrix
Rank 4
i
j
X P V
D U
W
UDV W
T n m
T
ˆ ˆ ˆ 2 4 4 4 4
1
i
j i
j
Further Factorization work
Factorization with uncertainty Factorization with uncertainty
Factorization for dynamic scenes Factorization for dynamic scenes
(Irani & Anandan, IJCV’02)
(Costeira and Kanade ‘94) (Bregler et al. 2000,
Brand 2001)
Sequential 3D structure from motion using 2 and 3 view geom
• Initialize structure and motion from two viewsInitialize structure and motion from two views
• For each additional viewFor each additional view
– Determine poseDetermine pose
– Refine and extend structureRefine and extend structure
• Determine correspondences robustly by jointly Determine correspondences robustly by jointly estimating matches and epipolar geometry
estimating matches and epipolar geometry
Images
2 view geometry
Epipolar geometry and
Fundamental matrix F
The epipolar geometry
C,C’,x,x’ and X are coplanar
The epipolar plane
All points on project on l and l’
The epipolar planes
Family of planes and lines l and l’
Intersection in e and e’
The epipoles
epipoles e,e’
= intersection of baseline with image plane
= projection of projection center in other image
= vanishing point of camera motion direction
an epipolar plane = plane containing baseline (1-D family) an epipolar line = intersection of epipolar plane with image
(always come in corresponding pairs)
Example: converging cameras
Example: motion parallel with image plane
Example: forward motion
e e’
The fundamental matrix F
algebraic representation of epipolar geometry
x l'
we will see that this mapping is (singular)
correlation (i.e. projective mapping from points to lines) represented by the fundamental matrix F
The fundamental matrix F
algebraic derivation (of existence)
λ P x λCX
PP I
e' P'P F
x P P' C
P'
l
(note: doesn’t work for C=C’ F=0)
x P
λ
X
e' H
F
H K1RK
Alternatively can write:
The fundamental matrix F
geometric derivation
x H x' π
x' e'
l'
e' Hπx Fxmapping from 2-D to 1-D family (rank 2) Step 1: X on a plane
Step 2: epipolar line l’
The fundamental matrix F
correspondence condition
0 Fx
x'T
The fundamental matrix satisfies the condition that for any pair of corresponding points x↔x’ in the two images
x'T l'0
The fundamental matrix F
F is the unique 3x3 rank 2 matrix that satisfies x’TFx=0 for all x↔x’
(i) Transpose: if F is fundamental matrix for (P,P’), then FT is fundamental matrix for (P’,P)
(ii) Epipolar lines: l’=Fx & l=FTx’
(iii) Epipoles: on all epipolar lines, thus e’TFx=0, x
e’TF=0, similarly Fe=0
(iv) F has 7 d.o.f. , i.e. 3x3-1(homogeneous)-1(rank2)
(v) F is a correlation, projective mapping from a point x to a line l’=Fx (not a proper correlation, i.e. not invertible)
Fundamental matrix, summary
•Algebraic representation of epipolar geometryAlgebraic representation of epipolar geometry
Step 1: X on a plane Step 2: epipolar line l’
x x
e
x e
x e l
x x
F H
H
] ' [
' ] ' [ ' ' '
'
0 ' x x FT
F
•3x3, Rank 2, det(F)=0
•Linear sol. – 8 corr. Points (unique)
•Nonlinear sol. – 7 corr. points (3sol.)
•Very sensitive to noise & outliers
[Faugeras ’92, Hartley ’92 ]
Epipolar lines:
Epipoles:
Projection matrices:
[ '] ' | '
'
]
| [
0 ' 0
' '
e v
e e
0 e e
x l
x l
T T
T
F P
I P
F F
F F
(i) Correspondence geometry: Given an image point x in
the first view, how does this constrain the position of the corresponding point x’ in the second image?
(ii) Camera geometry (motion): Given a set of corresponding image points {xi ↔x’i}, i=1,…,n, what are the cameras P and P’ for the two views?
(iii) Scene geometry (structure): Given corresponding image points xi ↔x’i and cameras P, P’, what is the position of (their pre-image) X in space?
F Relates to three questions:
Relating 3D geometry and 2D images
The Fundamental Matrix F
Computing F; 8 pt alg
0 Fx
x'T
separate known from unknown
0 '
' '
' '
' x f11 x yf12 x f13 y x f21 y yf22 y f2 3 x f31 yf3 2 f33 x
x' x, x' y, x ,' y'x, y' y, y ,' x, y 1, f11, f12, f13, f21, f22, f23, f31, f32, f33T 0
(data) (unknowns)
(linear)
0 Af
0 1 f
' '
' '
' '
1 '
' '
' '
'1 1 1 1 1 1 1 1 1 1 1 1
n n
n n
n n
n n
n n n
n x x y x y x y y y x y
x
y x
y y
y x
y x
y x x
x
0 1
´
´
´
´
´
´
1
´
´
´
´
´
´
1
´
´
´
´
´
´
33 32 31 23 22 21 13 12 11
2 2
2 2
2 2
2 2
2 2 2
2
1 1
1 1
1 1
1 1
1 1 1
1
f f f f f f f f f
y x
y y
y y
x x
x y x
x
y x
y y
y y
x x
x y x
x
y x
y y
y y
x x
x y x
x
n n
n n
n n
n n
n n n
n
8-point algorithm
Solve for nontrivial solution using SVD:
0 Af USVT
A USVT SVT x Vx
Var subst: y Vx Now Min Sy y 0,0,...,0,1T
Hence x = last vector in V
(Also 3,4,N view geometry. HZ 15,16)
•Trifocal tensor (3 view geometry)Trifocal tensor (3 view geometry)
] , , [
: T1 T2 T3
T 3x3x3 tensor;
27 params. (18 indep.)
0 ]
"
[ ) (
] ' [
"
] 3 , 2 , 1 ['
x x
x
l l l
i
i i
T
T T T
T lines
points
•Quadrifocal tensor (4 view geometry) Quadrifocal tensor (4 view geometry) [Triggs ’95]
•Multiview tensors Multiview tensors [Hartley’95][ Hayden ‘98]
There is no additional constraint between more than 4 images. All the constraints There is no additional constraint between more than 4 images. All the constraints
can be expressed using F,triliear tensor or quadrifocal tensor.
can be expressed using F,triliear tensor or quadrifocal tensor.
[Hartley ’97][Torr & Zisserman ’97][ Faugeras ’97]
Using Fundamental Matrix F to compute structure and motion
e F ea e
P
0 I
P
T x
2 1
Epipolar geometry Projective calibration
1 0
2 Fm
mT
compatible with F
Yields correct projective camera setup
(Faugeras´92,Hartley´92)
Obtain structure through triangulation
Use reprojection error for minimization Avoid measurements in projective space
1. Compute P1 and P2 2. Triangulate 3D points
M P
mi1 i1
2D-2D
2D-3D 2D-3D
mi mi+1 M
new view
Determine coordinates of 3D Points compatible with P1 and P2
Structure from images:
3D Point reconstruction