Computer Vision cmput 428/615

(1)

Computer Vision cmput 428/615

Basic 2D and 3D geometry Basic 2D and 3D geometry

and Camera models and Camera models

Martin Jagersand

(2)

The equation of projection

Mathematically:

•Cartesian coordinates:Cartesian coordinates:

•Projectively: x = PXProjectively: x = PX

(x, y, z)  ( f x

z , f y z ) How do we develop a consistent mathematical

framework for projection calculations?

Intuitively:

(3)

Challenges in Computer Vision:

What images don’t provide

lengths

depth

(4)

Distant objects are smaller

(5)

Visual ambiguity

• Will the scissors cut the paper in the Will the scissors cut the paper in the middle?

middle?

(6)

Ambiguity

• Will the scissors cut the paper in the middle? Will the scissors cut the paper in the middle?

NO! NO!

(7)

Visual ambiguity

• Is the probe contacting the wire? Is the probe contacting the wire?

(8)

Ambiguity

• Is the probe contacting the wire? Is the probe contacting the wire? NO! NO!

(9)

Visual ambiguity

• Is the probe contacting the wire? Is the probe contacting the wire?

(10)

Ambiguity

• Is the probe contacting the wire? Is the probe contacting the wire? NO! NO!

(11)

History of Perspective

Roman Roman

Prehistoric:

(12)

Perspective: Da Vinci

(13)

Visualizing perspective: Dürer

Perspectograph

1500’s

(14)

Parallel lines meet

common to draw image plane in front of the focal point

Centre of projection Image

plane

3D world

(15)

Perspective Imaging Properties

90

Challenges with measurements in multiple images:

• Distances/angles change

• Ratios of dist/angles change

• Parallel lines intersect

(16)

What is preserved?

horizon

Invariants:

• Points map to points

• Intersections are preserved

• Lines map to lines

• Collinearity preserved

• Ratios of ratios (cross ratio)

• Horizon

What is a good way to represent imaged geometry?

(17)

Vanishing points

• each set of parallel lines (=direction) meets each set of parallel lines (=direction) meets at a different point

at a different point

– The The vanishing point vanishing point for this direction for this direction – How would you show this? How would you show this?

• Sets of parallel lines on the same plane lead Sets of parallel lines on the same plane lead to to collinear collinear vanishing points. vanishing points.

– The line is called the The line is called the horizon horizon for that plane for that plane

(18)

Geometric properties of projection

• Points go to points Points go to points

• Lines go to lines Lines go to lines

• Planes go to whole image Planes go to whole image

• Polygons go to polygons Polygons go to polygons

• Degenerate cases Degenerate cases

– line through focal point to line through focal point to point

point

– plane through focal point plane through focal point to line

to line

(19)

Polyhedra project to polygons

• (because lines project to (because lines project to lines)

lines)

(20)

Junctions are constrained

• This leads to a This leads to a

process called “line process called “line

labelling”

– one looks for consistent one looks for consistent sets of labels, bounding sets of labels, bounding polyhedra

polyhedra

– disadv - can’t get the lines disadv - can’t get the lines and junctions to label

and junctions to label from real images

from real images

(21)

Back to projection

• Cartesian coordinates: Cartesian coordinates: _{(x, y, z)} _{ ( f} ^x

z , f y z )

We will develop a framework to express projection as x=PX, where x is 2D image projection, P a

projection matrix and X is 3D world point.

(22)

Basic geometric transformations:

Translation

• A translation is a straight line movement of an A translation is a straight line movement of an object from one postion to another.

object from one postion to another.

A point A point (x,y) is transformed to the point (x,y) is transformed to the point (x’,y’) by adding the (x’,y’) by adding the translation distances

translation distances T T

_x_x

and and T T

_y_y

: :

x’ = x + T x’ = x + T

_x_x

y’ = y + T y’ = y + T

_y_y

z’ = z + T z’ = z + T

_z_z

(23)

Coordinate rotation

• Example: Around y-axis Example: Around y-axis

p

⁰

= x

⁰

y

⁰

z

⁰

" #

=

cos÷ 0 sin÷

0 1 0

à sin÷ 0 cos÷

2 4

3 5 x y z

" #

= R

_y

p

X’

Z’

X

P

(24)

Euler angles

• Note: Successive rotations. Order matters. Note: Successive rotations. Order matters.

R = R _z R _y R _x

= cos' à sin' 0 sin' cos' 0

0 0 1

" # cos÷ 0 sin÷

0 1 0

à sin÷ 0 cos÷

2 4

3 5 1 0 0

0 cos à sin 0 sin cos 2

4

3

5

(25)

Rotation and translation

• Translation Translation t’ in new t’ in new o’ coordinates o’ coordinates

p

⁰

=

cos÷ 0 sin÷

0 1 0

à sin÷ 0 cos÷

2 4

3 5 p + t

⁰

X’

Z’

X

P

(26)

Basic transformations

Scaling

• A scaling transformation alters the scale of an object. A scaling transformation alters the scale of an object.

Suppose a point (x,y) is transformed to the point (x',y') by Suppose a point (x,y) is transformed to the point (x',y') by

a scaling with scaling factors S

_x_x

and S and S

_y_y

, then: , then:

x' = x S x' = x S

_x_x

y' = y S y' = y S

_y_y

z' = z S z' = z S

_z_z

• A uniform scaling is produced if S A uniform scaling is produced if S

_x_x

= S = S

_y_y

= S = S

_z_z

. .

(27)

Basic transformations

Scaling

The previous scaling transformation leaves the origin The previous scaling transformation leaves the origin unaltered. If the point (x

unaltered. If the point (x

_f_f

,y ,y

_f_f

) is to be the fixed point, the ) is to be the fixed point, the transformation is:

transformation is:

x' = x

_f_f

+ (x - x + (x - x

_f_f

) S ) S

_x_x

y' = y

_f_f

+ (y - y + (y - y

_f_f

) S ) S

_y_y

This can be rearranged to give:

x' = x S

_x_x

+ (1 - S + (1 - S

_x_x

) x ) x

_f_f

y' = y S

_y_y

+ (1 - S + (1 - S

_y_y

) y ) y

_f_f

(28)

Affine Geometric Transforms

In general, a point in n-D space transforms by P’ = rotate(point) + translate(point)

In 2-D space, this can be written as a matrix equation:

 

 



 

 

 



 



 



 

 



 





ty tx y

x Cos

Sin

Sin Cos

y x

) ( )

(

) ( )

( '

'



In 3-D space (or n-D), this can generalized as a matrix equation:

p’ = R p + T or p = R

^t

(p’ – T)

(29)

A Simple 2-D Example

p = (1,0)’

Suppose we rotate the coordinate system through 45 degrees (note that this is measured relative to the rotated system!

 

 



 

 

 





 

 



 



 



 

 



 





) 4 / (

) 4 / ( '

'

0 1 )

4 / ( )

4 / (

) 4 / ( )

4 / ( '

'



Sin Cos y

x

Cos Sin

Sin Cos

y

p = (0,1)’

x

 

 



 

 

 





 

 



 



 



 

 



 





) 4 / (

) 4 / ( '

'

1 0 )

4 / ( )

4 / (

) 4 / ( )

4 / ( '

'



Cos Sin y

x

Cos Sin

Sin Cos

y

x

(30)

Matrix representation and Homogeneous coordinates

• Often need to combine several transformations to build Often need to combine several transformations to build the total transformation.

the total transformation.

• So far using affine transforms need both add and multiply So far using affine transforms need both add and multiply

• Good if all transformations could be represented as matrix Good if all transformations could be represented as matrix multiplications then the combination of transformations multiplications then the combination of transformations

simply involves the multiplication of the respective simply involves the multiplication of the respective

matrices matrices

• As translations do not have a 2 x 2 matrix representation, As translations do not have a 2 x 2 matrix representation, we introduce homogeneous coordinates to allow a 3 x 3 we introduce homogeneous coordinates to allow a 3 x 3

matrix representation.

(31)

How to translate a 2D point:

• Old way: Old way:

• New way: New way:

(32)

Relationship between 3D

homogeneous and inhomogeneous

• The Homogeneous coordinate corresponding to the point The Homogeneous coordinate corresponding to the point (x,y,z) is the triple (x

(x,y,z) is the triple (x

_h_h

, y , y

_h_h

, z , z

_h_h

, w) where: , w) where:

x

_h_h

= wx = wx y

y

_h_h

= wy = wy z

z

_h_h

= wz = wz

We can (initially) set w = 1.

• Suppose a point P = (x,y,z,1) in the homogeneous Suppose a point P = (x,y,z,1) in the homogeneous coordinate system is mapped to a point

coordinate system is mapped to a point

P' = (x',y',z’,1) by a transformations, then the P' = (x',y',z’,1) by a transformations, then the

transformation can be expressed in matrix form.

(33)

• For the basic transformations we have: For the basic transformations we have:

– Translation Translation

– Scaling Scaling

P

⁰

=

x

⁰

y

⁰

z

⁰

w 2 6 4

3 7 5 =

1 0 0 T

_x

0 1 0 T

_y

0 0 1 T

_z

0 0 0 1 2

6 4

3 7 5

x y z w 2 6 4

3 7 5

P

⁰

=

x

⁰

y

⁰

z

⁰

w 2 6 4

3 7 5 =

s

_x

0 0 0 0 s

_y

0 0 0 0 s

_z

0 0 0 0 1 2

6 4

3 7 5

x y z w 2 6 4

3 7 5

Matrix representation and

Homogeneous coordinates

(34)

Geometric Transforms

Using the idea of homogeneous transforms, we can write:

T p

p R 



 



 

1 0

0 ' 0

R and T both require 3 parameters.

= cos' à sin' 0 sin' cos' 0

0 0 1

" # cos÷ 0 sin÷

0 1 0

à sin÷ 0 cos÷

2 4

3 5 1 0 0

0 cos à sin 0 sin cos 2

4

3 R 5

(35)

Geometric Transforms

If we compute the matrix inverse, we find that

1 ' 0

0 0

' '

T p R

p R 



 



 



R and T both require 3 parameters. These correspond

to the 6 extrinsic parameters needed for camera calibration

(36)

Rotation about a Specified Axis Rotation about a Specified Axis

• It is useful to be able to rotate about any axis in It is useful to be able to rotate about any axis in 3D space

3D space

• This is achieved by composing 7 elementary This is achieved by composing 7 elementary transformations (next slide)

transformations (next slide)

(37)

Rotation through  about Specified Rotation through  about Specified Axis

Axis

x y

z

x y

z rotate through requ’d angle, 

x y

z

x y

z P2

P1 x y

z

P2 P1

x y

z initial position translate P1

to origin

rotate so that P2 lies on z-axis (2 rotations)

rotate axis

to orig orientation translate back

(38)

Comparison:

• Homogeneous coordinates Homogeneous coordinates

– Rotations and translations are represented in a uniform way Rotations and translations are represented in a uniform way

– Successive transforms are composed using matrix products: Successive transforms are composed using matrix products: y = y = Pn..P2P1x

Pn..P2P1x

• Affine coordinates Affine coordinates

– Non-uniform representations: Non-uniform representations: y = Ax + b y = Ax + b

– Difficult to keep track of separate elements Difficult to keep track of separate elements

(39)

Camera models and projections Geometry part 2.

• Using geometry and homogeneous Using geometry and homogeneous transforms to describe:

transforms to describe:

– Perspective projection Perspective projection

– Weak perspective projection Weak perspective projection – Orthographic projection Orthographic projection

x y z

x y z x

y

(40)

The equation of projection

• Cartesian coordinates: Cartesian coordinates:

– We have, by similar triangles, that (x, y, z) -> We have, by similar triangles, that (x, y, z) ->

(f x/z, f y/z, -f) (f x/z, f y/z, -f)

– Ignore the third coordinate, and get Ignore the third coordinate, and get

(x, y, z)  ( f x

z , f y

z )

(41)

The camera matrix

• Homogenous coordinates for 3D Homogenous coordinates for 3D

– four coordinates for 3D point four coordinates for 3D point

– equivalence relation (X,Y,Z,T) is the same as (k X, k Y, k Z,k T) equivalence relation (X,Y,Z,T) is the same as (k X, k Y, k Z,k T)

• Turn previous expression into HC’s Turn previous expression into HC’s

– HC’s for 3D point are (X,Y,Z,T) HC’s for 3D point are (X,Y,Z,T) – HC’s for point in image are (U,V,W) HC’s for point in image are (U,V,W)

U V W





 





 

1 0 0 0

0 1 0 0

0 0 1

f 0













X Y

Z T







 







 

) , ( ) ,

( )

, ,

( u v

W V W W U

V

U  

(42)

• Issue Issue

– camera may not be at the origin, looking down the z-axis camera may not be at the origin, looking down the z-axis – extrinsic parameters extrinsic parameters

– one unit in camera coordinates may not be the same as one one unit in camera coordinates may not be the same as one unit in world coordinates

unit in world coordinates

– intrinsic parameters - focal length, principal point, intrinsic parameters - focal length, principal point, aspect ratio, angle between axes, etc.

aspect ratio, angle between axes, etc.

Camera parameters



 









 







 







 







 







 







 







 







 

 





 







T Z Y X

W V U

parameters extrinsic

ng representi

tion Transforma

model projection

ng representi

tion Transforma

parameters intrinsic

ng representi

tion Transforma

Note: f moved from proj to intrinsics!

(43)

Intrinsic Parameters

Intrinsic Parameters describe the conversion from metric to pixel coordinates (and the reverse)

x

_mm

= - (x

_pix

– o

_x

) s

_x

y

_mm

= - (y

_pix

– o

_y

) s

_y

p M

w y x o

s f

o s

f w

y x

mm y

y

x x

pix

int

1 0

0 / 0

0 /

 

 





 







 







 









 

 





 







or

Note: Focal length is a property of the camera and can be

incorporated as above

(44)

Example:

A real camera

• Laser range finder Laser range finder • Camera Camera

(45)

Relative location Camera-Laser

• Camera Camera • Laser Laser

R=10deg

T=(16,6,-9)’

(46)

In homogeneous coordinates

• Rotation: Rotation: • Translation Translation

T =

1 0 0 16 0 1 0 6 0 0 1 à 9 0 0 0 1 0

B @

1 C A

R =

cosà 10 0 sinà 10

0 1 0

à sinà 10 0 cosà 10 2

4

3

5

(47)

Full projection model

• Camera internal Camera internal parameters

parameters

• Camera Camera projection projection

0:985 0 à 0:174 0

0 1 0 0

0:174 0 0:985 0

0 0 0 1

0 B @

1 C A

1 0 0 16

0 1 0 6

0 0 1 à 9

0 0 0 1

0 B @

1 C A

0:6612 à 10:55

108:0 1 0

B @

1 C A =

22262 16755 97:47

!

p

_camera

= 1278:6657 0 256

0 1659:5688 240

0 0 1

! 1 0 0 0 0 1 0 0 0 0 1 0

!

Extrinsic rot and translation

(48)

• Issue Issue

– camera may not be at the origin, looking down the z-axis camera may not be at the origin, looking down the z-axis – extrinsic parameters extrinsic parameters

– one unit in camera coordinates may not be the same as one one unit in camera coordinates may not be the same as one unit in world coordinates

unit in world coordinates

– intrinsic parameters - focal length, principal point, intrinsic parameters - focal length, principal point, aspect ratio, angle between axes, etc.

aspect ratio, angle between axes, etc.

Camera parameters



 









 







 







 







 







 







 







 







 

 





 







T Z Y X

W V U

parameters extrinsic

ng representi

tion Transforma

model projection

ng representi

tion Transforma

parameters intrinsic

ng representi

tion Transforma

Note: f moved from proj to intrinsics!

(49)

Result

• Camera image Camera image • Laser measured 3D Laser measured 3D structure

structure

(50)

Hierarchy of different camera models

Camera center

Image plane Object plane

X

₀

(origin) x

_persp

Perspective: non-linear

x

_parap

Para-perspective: lin

x

_orth

Orthographic: lin, no scaling

Weak perspective: linear approx

x

_wp

(51)

Orthographic projection

y v

x u



(52)

The fundamental model for orthographic projection

U V W



  

 

1 0 0 0 0 1 0 0 0 0 0 1



  

 

X Y Z T







 







 

(53)

Perspective and Orthographic Projection

perspective Orthographic

(parallel)

(54)

Weak perspective

Z f

T

Ty v

Tx u

 /



• Issue Issue 

– perspective effects, but not perspective effects, but not over the scale of individual over the scale of individual objects

objects

– collect points into a group at collect points into a group at about the same depth, then about the same depth, then divide each point by the depth divide each point by the depth of its group

of its group – Adv: easy Adv: easy

– Disadv: wrong Disadv: wrong

(55)

The fundamental model for weak perspective projection



 









 







 







 







 

 





 







T Z Y X

Z f

W V U

* /

0 0

0 0 0

1 0

0 0

0 1

Note Z* is a fixed value, usually mean distance to scene

(56)

Weak perspective projection

 







 







 







 











k t

t

y x

/ 1 0

r r 1 α

α

P

^2T ₂

1 1T

(7dof)

Weak perspective projection for an

arbitrary camera pose R,t

(57)

1. Affine camera=camera with principal plane coinciding with 

_∞

2. Affine camera maps parallel lines to parallel lines

3. No center of projection, but direction of projection P

_A

D=0

(point on 

_∞

)

Affine camera

 







 







 







 









k t

t s

y x

A

/ 1 0

r r 1 α

α

P

^2T ₂

1

(8dof)

1T

 





 



 

1 0

0 0

P m

¹¹₂₁

m

¹²₂₂

m

¹³₂₃

t

₂¹

t m

m m

A

   ⁴ ⁴ ^affine 

1 0 0

0 1 0 0 0 0 0 0 affine 1

3 3

P 

 





 



 

A



Full Affine linear camera

(58)

Hierarchy of camera models

Camera center

Image plane Object plane

X

₀

(origin) x

_persp

Perspective:

x

_parap

Para-perspective:

First order approximation of perspective

 







 







 







 









z y x

persp

t t t f

f P

k j i 1 x

_orth

Orthographic:

 







 









T

1

y x

orth

t

t P

0 j i

Weak perspective:

x

_wp



































1

1 ^T

y x

wp t

t k

k P

0 j i

(59)

Camera Models

C

aff

C

inj

C

proj

C

sim

reconstruction up to a bijection of task space

up to a projective transformation of task space

up to an affine transformation of task space

up to a similarity (scaled Euclidean transformation)

• Internal calibration: Internal calibration:

• Weak calibration: Weak calibration:

• Affine calibration: Affine calibration:

• Stratification of stereo vision: Stratification of stereo vision:

- - characterizes the reconstructive certainty of characterizes the reconstructive certainty of

weakly, affinely, and internally calibrated stereo rigs weakly, affinely, and internally calibrated stereo rigs

(60)

Visual Invariance



_inj



_inj



_inj



_inj



_proj



_sim



_aff



_aff



_proj



_proj

(61)

Perspective Camera Model Structure

Assume R and T express camera in world coordinates, then

T p R

p R

^w

c





 



 

 0 0 0 1

'

Combining with a perspective model (and neglecting internal parameters) yields

p f

T R

f R R R p

M

u

^y_z ^w

x

z y x w

c

 







 











 '

' '

'

Note the M is defined only up to a scale factor at this point! If M is

viewed as a 3x4 matrix defined up to scale, it is called the projection

matrix.

(62)

Perspective Camera Model Structure

Assume R and T express camera in world coordinates, then

T p R

p R

^w

c





 



 

 0 0 0 1

'

Combining with a weak perspective model (and neglecting internal parameters) yields

p f

T P R

T R

T R R

R p

M

u

_z ^y ^w

x y

x w

c



 







 









 ( )

' ' 0

' '

Where is the nominal distance to the viewed object P

(63)

Other Models

• The The affine camera affine camera is a generalization of weak is a generalization of weak perspective.

perspective.

• The The projective camera projective camera is a generalization of the is a generalization of the perspective camera.

perspective camera.

• Both have the advantage of being linear models on real Both have the advantage of being linear models on real and projective spaces, respectively.

and projective spaces, respectively.

• But in general will recover structure up to an affine or But in general will recover structure up to an affine or projective transform only. (ie distorted structure)

projective transform only. (ie distorted structure)

(64)

Camera Internal Calibration Recall: Intrinsic Parameters

Intrinsic Parameters describe the conversion from metric to pixel coordinates (and the reverse)

x

_mm

= - (x

_pix

– o

_x

) s

_x

y

_mm

= - (y

_pix

– o

_y

) s

_y

p M

w y x o

s

o s

w y x

mm y

y

x x

pix

int

1 0

0 / 1 0

0 /

1  

 





 







 







 









 

 





 







or

(65)

CAMERA INTERNAL CALIBRATION

Known distance d

known regular offset r

x i

i

x x

i i

s x d x

r

s o

d x rk

) (

1











A simple way to get scale parameters; we can compute the optical center as the numerical center and therefore have the intrinsic parameters

Compute Sx

Focal length = 1/ Sx

(66)

Camera calibration

• Issues: Issues:

– what are intrinsic parameters of what are intrinsic parameters of the camera?

the camera?

– what is the camera matrix? what is the camera matrix?

(intrinsic+extrinsic) (intrinsic+extrinsic)

• General strategy: General strategy:

– view calibration object view calibration object – identify image points identify image points – obtain camera matrix by obtain camera matrix by

minimizing error minimizing error

– obtain intrinsic parameters from obtain intrinsic parameters from camera matrix

camera matrix

• Error minimization: Error minimization:

– Linear least squares Linear least squares

– easy problem numerically easy problem numerically – solution can be rather bad solution can be rather bad – Minimize image distance Minimize image distance

– more difficult numerical more difficult numerical problem

problem

– solution usually rather good, solution usually rather good, but can be hard to find

but can be hard to find – start with linear least start with linear least

squares squares

– Numerical scaling is an issue Numerical scaling is an issue

(67)

Stereo Vision

• GOAL: GOAL: Passive 2- Passive 2- camera system for camera system for

triangulating 3D triangulating 3D

position of points in position of points in

space to generate a space to generate a

depth map of a depth map of a

world scene.

• Humans use stereo Humans use stereo vision to obtain

vision to obtain depth

depth

(68)

Stereo depth calculation:

Simple case, aligned cameras

Z

(0,0) (d,0) X

f XL XR

Z = (f/XL) X Z= (f/XR) (X-d)

(f/XL) X = (f/XR) (X-d) X = (XL d) / (XL - XR)

Z = **d*f (XL - XR)**

DISPARITY=

DISPARITY= (XL - XR) (XL - XR)

Similar triangles:

Solve for X:

Solve for Z:

(69)

Epipolar constraint

Special case: parallel cameras – epipolar lines are parallel and aligned with rows

(70)

Stereo measurement example:

• Left image Left image

Resolution = 1280 x 1024 Resolution = 1280 x 1024

pixels pixels

f = 1360 pixels f = 1360 pixels

• Right image Right image Baseline d = 1.2m Baseline d = 1.2m

Q: How wide is the hallway

(71)

How wide is the hallway?

General strategy

• Similar triangles: Similar triangles:

• Need depth Need depth Z Z

• Then solve for Then solve for W W

f v

W  Z W

Z

f

v

(72)

How wide is the hallway?

Steps in solution:

1. 1. Compute focal length f in meters from pixels Compute focal length f in meters from pixels

2. 2. Compute depth Z using stereo formula (aligned Compute depth Z using stereo formula (aligned camera planes)

camera planes)

3. 3. Compute width: Compute width:

Z = **d*f (XL - XR)**

f

Z v

W 

(73)

Focal length:

0.224m is 1280 pixels

0.238m 224

. 0 1280 *

1360 

 f

f = 1360 pixels

Here screen projection is metric image plane.

(74)

How wide…

Depth calculation

Disparity:

Disparity: XL – XR = 0.07m XL – XR = 0.07m

(Note in the disparity calculation the choice of reference (Note in the disparity calculation the choice of reference (here the edge) doesn’t matter. But in the case of say (here the edge) doesn’t matter. But in the case of say X-coordinate calculation it should be w.r.t. the center X-coordinate calculation it should be w.r.t. the center of the image as in the stereo formula derivation