Fast 3D Reconstruction using Structured Light Methods
RODRIGUES, Marcos <http://orcid.org/0000-0002-6083-1303>
Available from Sheffield Hallam University Research Archive (SHURA) at: http://shura.shu.ac.uk/5037/
This document is the author deposited version. You are advised to consult the publisher's version if you wish to cite from it.
Published version
RODRIGUES, Marcos (2011). Fast 3D Reconstruction using Structured Light Methods. In: International Conference on Medical Image Computing and Computer Assisted Intervention, Toronto, Canada, 18-22 September. (Unpublished)
Copyright and re-use policy
See http://shura.shu.ac.uk/information.html
Sheffield Hallam University Research Archive
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery
Marcos A Rodrigues
Geometric Modelling and Pattern Recognition Research Group Sheffield Hallam University, Sheffield, UK
www.shu.ac.uk/gmpr
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Structured light scanning
Chapter 2 – Prior work
and featureless is debatable. It may contain a number of strong features such as the eyes and eyebrows, but may also have smooth featureless parts such as the cheeks and forehead. However since the saliency of features on arbitrary (unknown) faces can vary significantly, and since stereo matching is a complex and challenging task, we decide not to focus on this scanning method.
2.1.4 Structured light scanning
The method ofstructured light scanning[35] operates by projecting a pattern of light onto the target surface. A camera then records the pattern as it reflects from the surface. The shape of the captured pattern is combined with the spatial relationship between the light source and the camera, to determine the 3D position of the surface along the pattern. Figure 2.2 shows an example where the pattern consists of a single horizontal line. A synthetic recorded image of a grid pattern of dots is also shown.
camera
light source
target object image capturedby camera pattern of dotsprojecting a
(a) (b)
Figure 2.2: A structured light scanner projects a pattern of light, in this case a horizontal stripe, onto the target surface. A camera records the reflected pattern. The image marked (b) shows a captured pattern of dots.
Early structured light systems originated from laser range finding and used a single stripe of laser light to measure a particular “slice” of the target object [99], as shown in Figure 2.2. The target surface would then be moved in relation to the scanner to measure another part, thus generating a sequence of surface parts that could be stitched together. The availability
16
1. Operates by projecting a pattern of light onto the target surface
2. A camera records the pattern as it reflects from the surface
3. The shape of the captured pattern is combined with the spatial relationship
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Important characteristics
• The cost of the system is fairly low as it requires only a light
source, such as a slide or LCD projector, and a standard digital camera (for a visible spectrum solution).
• The speed of acquisition allows for rapid, almost instantaneous,
acquisition.
• The accuracy of the system can be controlled by the projected
pattern and camera resolution.
• The process can be made completely non-‐intrusive by
implementing a near-‐infrared (NIR) emitter and an NIR sensitive sensor to capture the image.
• As with stereo vision, the system is constrained only by the
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Coding schemes
a) Time-‐multiplexing projects a series of luminance patterns over
time so that every recorded point is encoded by a sequence of intensities. Only for motionless objects.
b) Colour coding stripes are used to differentiate between the
recorded stripes. It can cause problems where the surface exhibits weak or ambiguous reflection properties.
c) Uneven distribution in stripe width reduces the resolution and
can cause difficulties on a surface with variable depth. d) Projecting uncoded stripes is the preferred method.
Chapter 2 – Prior work
suited. It is opaque, fairly diffuse [34] and, apart from small features, fairly smooth and monotone. Specular highlights (or glossiness) may occur due to the presence of oily sebum on the skin [87], but we have not found this to be a major concern.
Various projected patterns have been considered such as dots [97], horizontal and vertical stripes [32, 54, 101], and diagonal stripes [14]. An advantage of stripes over dots, and the reason why we choose stripes, is that an improved sample resolution is achievable. Note that for the camera to sense pattern deformity the stripes need to be normal (not parallel) to the plane spanned by the camera and projector axes (this follows from the epipolar constraint which we discuss in section 3.5.3).
For a direct and unambiguous mapping between image space and 3D space it is important to know which captured pattern element corresponds to which projected element. Many coding schemes have been proposed to distinguish between different captured stripes (see [107] for a review). Figure 2.3 shows some examples. One proposed scheme, called time-multiplexing, projects a series of luminance patterns over time so that every recorded point is encoded by a sequence of intensities [32, 122]. This method can be robust and accurate but, because it requires multiple scans over time, is suitable only for motionless objects. To avoid the need for multiple scans single static patterns have been proposed where colour [102], stripe width [13, 81], or a combination of both [134] is used to differentiate between the recorded stripes. An uneven distribution in stripe width, however, reduces the resolution and can cause difficulties on a surface with variable depth. Colour coding can cause problems as well where the surface exhibits weak or ambiguous reflection properties.
(a) time-multiplexing (b) colour coding (c) variable width (d) uncoded stripes
Figure 2.3: Different coding schemes (a)–(c) which may aid in solving the correspondence problem in a structured light scanning system.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Advantages of uncoded stripes
• It allows for “one-‐shot” scanning, such that the acquisition is
nearly instantaneous and can be performed at frame rate (unlike time-‐multiplexing methods),
• the maximum resolution from a single scan can be achieved
with a sufficiently dense pattern of stripes (unlike variable width coding), and
• it can be applied consistently to a variety of different object
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
A scanner layout
Chapter 3 – Structured light scanning
3.1 THE LAYOUT OF OUR SCANNER
Figure 3.1 illustrates the basic concept of a structured light scanner which utilizes a pattern of horizontal stripes. A projector is utilized to cast the pattern onto the surface of a target object, and a camera captures an image of the scene. Information retrieved from the image is combined with the geometric relationship between the projector and camera in order to reconstruct a cloud of points in 3D. This is possible because each stripe corresponds to a sheet of light originating from the centre of the projector lens and the image reveals where those sheets hit the target surface.
recorded image
←−
projector camera
target object
[image:7.842.502.624.129.286.2]reconstructed point cloud
Figure 3.1: A series of parallel stripes is projected onto the surface of an object. A recorded image of this scene reveals the shape of the surface along those stripes, and can be used to reconstruct a 3D point cloud.
We define a Cartesian coordinate system spanning a 3D space in which the scanner is cali-brated and in which the surface can be reconstructed. Refer to Figure 3.2(a). The axes are chosen in relation to the projector such that: the X-axis coincides with the central projector
axis; the X–Y plane coincides with the horizontal sheet of light cast by the projector; and
the system origin is at an arbitrary, but fixed and known, distance Dp >0 from the centre of the projector lens. We call the space spanned by these axes thesystem space.
29
A series of parallel stripes is projected onto the surface of an
object. A recorded image of this scene reveals the shape of
the surface along those stripes, and can be used to
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Defining the coordinate system
a) We define a coordinate system in relation to the projector.
b) Indices are assigned to planes emanating from the projector.
The planes project to evenly spaced stripes on the Y –Z plane. Chapter 3 – Structured light scanning
projector X
Y Z
Dp
horizontal sheet of light
projector
X Z
W
−2
−1 0
1 2 3
stripe indices
[image:8.842.212.631.106.288.2](a) defining the coordinate system (b) assigning stripe indices
Figure 3.2: (a) We define a coordinate system in relation to the projector. (b) Indices are assigned to planes emanating from the projector. The planes project to evenly spaced stripes on theY–Z plane.
Each projected stripe lies in a specific plane that originates from the projector, and the position and shape of a certain stripe in such a plane depend on the surface it hits. Figure 3.2(b) shows the arrangement of these planes as viewed down theY-axis. They are all parallel to theY-axis and their intersections with theY–Z plane are evenly spaced. To discriminate between the planes we assign successive indices as shown. Note that the horizontal plane containing the projection axis (and therefore also theX-axis) has an index of 0. The distance W between two consecutive stripes on the Y–Z plane can be measured and enables us, for example, to write the point of intersection between the Z-axis and a plane with index n as (0,0, W n) in system coordinates.
The position of the camera is fixed in system space on top of the projector, as shown in Figure 3.3. We require that:
• the central camera axis lies in the X–Z plane, and is parallel to the X-axis (therefore
also to the projector axis);
• the centre of the camera lens is at point (Dp,0, Ds) in system space, with Ds>0; • and the camera is not “tilted” (rotated about its central axis) with respect to the
projector, such that any point in theX–Z plane is captured to the central image column.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
The relationship between camera and projector
The camera is positioned at a distance Ds above the projector so that its central axis lies in the X–Z plane and is parallel to the
projector axis.
Chapter 3 – Structured light scanning
camera
projector
Y
Z
camera
projector
X
Z
[image:9.842.247.606.155.277.2]Ds
Figure 3.3: The camera is positioned at a distanceDs above the projector so
that its central axis lies in the X–Z plane and is parallel to the projector axis.
Note that we fix the camera to be parallel to the projector. This differs from conventional structured light systems, such as those described in [35, 47, 54, 91, 101, 122, 134], where the camera and projector axes angle towards each other and intersect at a certain point close to the target surface. A number of reasons why we favour the parallel configuration is presented at the end of this chapter (section 3.5).
3.2 MAPPING IMAGE POINTS TO SURFACE POINTS
Next we derive formulae for mapping points on stripes in the recorded image to their cor-responding surface points. These formulae will be used to map points on all the captured stripes in the image, thereby producing a point cloud in system space that represents a dis-crete sampling of the target surface. A mesh structure can then be fitted to such a point cloud to produce a surface model. The process of mapping points in the image to points in system space consists of the following three steps:
(1) an image point (or pixel) is transformed to a point on the sensor plane of the camera in system space;
(2) the effects of radial lens distortion are corrected for;
(3) the intersection of the appropriate camera ray and the plane cast from the projector, corresponding to a specific stripe index, is found.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Mapping image points to surface points
Three steps:
1. An image point (or pixel) is transformed to a point on the
sensor plane of the camera in system space;
2. the effects of radial lens distortion are corrected for;
3. the intersection of the appropriate camera ray and the plane
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Image space to system space
The position of a pixel in the image at row r and column c is
transformed to coordinates (v,h), and then to a point p in system space.
Note that we express the pixel size specifically as PF so that F cancels in some of the equations that follow.
Chapter 3 – Structured light scanning
3.2.1 Image space to system space
The recorded image is generated in thesensor plane of the camera which is behind the lens perpendicular to the camera axis. It may, for example, consists of an array of charge-coupled device (CCD) sensors that capture light and convert it into electrical signals which are then used to produce a digital image. We consider such an image as a matrix withM rows andN
[image:11.842.195.623.107.258.2]columns containing as elements the greyscale information (integer values between 0 and 255) of pixels. We will write (r, c) to denote the pixel at row r and columnc.
Figure 3.4 illustrates how a pixel (r, c) in the image is mapped to a point p on the sensor
plane of the camera in system space. The pixel is first transformed to centred coordinates (v, h), with
v=r−12(M + 1) and h=c−12(N + 1). (3.1)
Since the focal point of the camera is located at (Dp,0, Ds) in system space, the centre of the
sensor plane is located at c = (Dp+F,0, Ds). Here F is the focal length of the camera, as
illustrated in Figure 3.4(c). Assuming that each pixel on the sensor plane is a square of size
P F ×P F, we can write the coordinates of pointp as
p=c+ (0,−hP F, vP F). (3.2)
Note that we express the pixel size specifically as P F so that F cancels in some of the
equations that follow.
M
N
r
c
(a) image space
1 2M 1 2N v h (b) centred F camera ←−p
(c) system space
Figure 3.4: The position of a pixel in the image at row r and column c is
transformed to coordinates (v, h), and then to a point p in system space.
32
Chapter 3 – Structured light scanning
3.2.1 Image space to system space
The recorded image is generated in the sensor plane of the camera which is behind the lens
perpendicular to the camera axis. It may, for example, consists of an array of charge-coupled device (CCD) sensors that capture light and convert it into electrical signals which are then
used to produce a digital image. We consider such an image as a matrix with M rows and N
columns containing as elements the greyscale information (integer values between 0 and 255)
of pixels. We will write (r, c) to denote the pixel at row r and column c.
Figure 3.4 illustrates how a pixel (r, c) in the image is mapped to a point p on the sensor
plane of the camera in system space. The pixel is first transformed to centred coordinates (v, h), with
v = r − 12(M + 1) and h = c − 12(N + 1). (3.1)
Since the focal point of the camera is located at (Dp,0, Ds) in system space, the centre of the
sensor plane is located at c = (Dp + F,0, Ds). Here F is the focal length of the camera, as
illustrated in Figure 3.4(c). Assuming that each pixel on the sensor plane is a square of size
P F × P F, we can write the coordinates of point p as
p = c + (0,−hP F, vP F). (3.2)
Note that we express the pixel size specifically as P F so that F cancels in some of the
equations that follow.
M
N
r
c
(a) image space
1 2M 1 2N v h (b) centred F camera
←−
p(c) system space
Figure 3.4: The position of a pixel in the image at row r and column c is
transformed to coordinates (v, h), and then to a point p in system space.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Radial distortion correction
A recorded image can suffer from radial lens distortion, which can be corrected (in an estimated sense) by a transformation.
We use the Tsai camera model:
k is the lens distortion coefficient which should be determined for a specific camera and lens prior to scanning.
Chapter 3 – Structured light scanning
3.2.2 Radial distortion correction
The coordinates of a surface point will be found by triangulation. To accomplish this we need to model the camera as an idealpinhole camera and assume the lens of the camera projects light rays linearly onto its sensor. In practice however a lens typically distorts the image slightly [46]. This distortion is radial meaning that it is greater for points further away from the centre of the image. Figure 3.5 illustrates the effect on a synthetic image.
⇒
V
H
(vu, hu) (vd, hd)
r
Figure 3.5: A recorded image can suffer from radial lens distortion, which can be corrected (in an estimated sense) by a transformation.
We will use the Tsai camera model [119] to correct for radial lens distortion. The model approximates the undistorted position (vu, hu) of a pixel from its distorted location (vd, hd)
given by (3.1), as follows:
hu =hd(1 +kr2),
vu =vd(1 +kr2),
with r=
�
h2d+v2d. (3.3)
In the model k is the lens distortion coefficient which should be determined for a specific
camera and lens prior to scanning. A value of k = 0 would imply no distortion, and the
greaterk is the greater the distortion.
We assume that the lens of the projector does not distort its output. This is true for many LCD projectors which automatically apply correction by an inverse distortion. It can also be verified by observing that the projection of a rectangle onto a plane normal to the projector axis remains a rectangle (i.e. the edges remain straight and rectilinear).
33
Chapter 3 – Structured light scanning
3.2.2 Radial distortion correction
The coordinates of a surface point will be found by triangulation. To accomplish this we need
to model the camera as an ideal pinhole camera and assume the lens of the camera projects
light rays linearly onto its sensor. In practice however a lens typically distorts the image
slightly [46]. This distortion is radial meaning that it is greater for points further away from
the centre of the image. Figure 3.5 illustrates the effect on a synthetic image.
⇒
V
H
(vu, hu) (vd, hd)
r
Figure 3.5: A recorded image can suffer from radial lens distortion, which can be corrected (in an estimated sense) by a transformation.
We will use the Tsai camera model [119] to correct for radial lens distortion. The model
approximates the undistorted position (vu, hu) of a pixel from its distorted location (vd, hd)
given by (3.1), as follows:
hu = hd(1 +kr2),
vu = vd(1 +kr2),
with r =
�
h2d +vd2. (3.3)
In the model k is the lens distortion coefficient which should be determined for a specific
camera and lens prior to scanning. A value of k = 0 would imply no distortion, and the
greater k is the greater the distortion.
We assume that the lens of the projector does not distort its output. This is true for many LCD projectors which automatically apply correction by an inverse distortion. It can also be verified by observing that the projection of a rectangle onto a plane normal to the projector axis remains a rectangle (i.e. the edges remain straight and rectilinear).
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Mapping from (v, h, n) to (x, y, z)
The geometry of the scanner, as viewed down the Y -‐axis (left) and down the Z-‐axis (right), is used to map points in the image to
corresponding surface points in system space Chapter 3 – Structured light scanning
3.2.3 Mapping from (v, h, n) to (x, y, z)
After a point in the captured image is corrected for in terms of radial distortion and mapped to a point in system space, it is mapped to a surface point. Figure 3.6 shows schematic diagrams of the scanner and target object in system space as viewed downY-axis (left) and Z-axis (right). The system parametersW,Dp,Ds and F are also shown.
A pixel (r, c) on a captured stripe is transformed to pointpon the sensor plane. It then follows
that the corresponding surface point q = (x, y, z) lies on the line through p and the focal
point of the camera. Suppose, for now, it is also known that the particular captured stripe has indexn(we consider the significant problem of establishing stripe indices in Chapter 4).
Then q is also on the plane parallel to the Y-axis through points (Dp,0,0) and (0,0, W n). The surface pointqis therefore the intersection of that plane and the camera ray.
X Z
Dp
W n
Ds
vP F F
x z
stri p e
n
p
q
camera pro
jector
θ
φ
X
Y
Dp
hP F F
x
y
p q
[image:13.842.220.598.99.375.2]camera
Figure 3.6: The geometry of the scanner, as viewed down the Y-axis (left)
and down theZ-axis (right), is used to map points in the image to
correspond-ing surface points in system space.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Mapping surface points
By triangulation:
Note that by construction
Combining these expressions:
Chapter 3 – Structured light scanning
Based on those arguments we can find expressions for the coordinates x, y and z of point q.
Similar triangles in Figure 3.6 (left) yield
z
Dp −x
= W n
Dp
=⇒ z = W n
Dp
(Dp −x), (3.4)
and
vP F
F =
Ds −z
Dp − x
=⇒ z = Ds − vP (Dp −x). (3.5)
Note that by construction Dp > x, Dp > 0 and F > 0.
Combining (3.4) and (3.5) yields
W n
Dp
(Dp −x) = Ds −vP (Dp − x) =⇒ x = Dp −
DpDs
vP Dp+ W n
, (3.6)
which can be substituted into (3.4) to give
z = W n
Dp
�
Dp− Dp +
DpDs
vP Dp +W n
�
= W nDs
vP Dp +W n
. (3.7)
Also, by similar triangles in Figure 3.6 (right),
y
Dp −x
= hP F
F =⇒ y = hP(Dp −x) =
hP DpDs
vP Dp +W n
. (3.8)
We can show that vP Dp + W n �= 0 by referring to Figure 3.6 (left), and observing that
vP = tan(φ) and −W n/Dp = tan(−θ). Since both φ and θ are in (−π/2,π/2) it follows
that vP = −W n/Dp if and only if φ = −θ. But if φ = −θ then the ray emanating from the
camera is parallel to the plane containing stripe n, so that they can never intersect at a point
q. Therefore such a scenario will never occur, and vP Dp +W n �= 0. In fact, for the camera
ray to intersect the stripe plane at a point in front of the scanner (i.e. such that x < Dp), the
angle φ must be strictly larger than −θ, so that
vP Dp+ W n > 0. (3.9)
To recapitulate,
x = Dp(1−K),
y = hP DpK,
z = W nK,
with K = Ds
vP Dp + W n
. (3.10)
These formulae can be applied to map a pixel with distortion corrected image coordinates (v, h) that lies on stripe n, to its corresponding surface point (x, y, z) in system space.
35 Chapter 3 – Structured light scanning
Based on those arguments we can find expressions for the coordinates x, y and z of point q. Similar triangles in Figure 3.6 (left) yield
z Dp −x
= W n Dp
=⇒ z = W n Dp
(Dp −x), (3.4)
and
vP F F =
Ds −z
Dp −x =⇒ z =Ds−vP (Dp −x). (3.5) Note that by construction Dp > x, Dp >0 and F > 0.
Combining (3.4) and (3.5) yields W n
Dp (
Dp −x) = Ds −vP (Dp −x) =⇒ x =Dp − DpDs vP Dp +W n
, (3.6)
which can be substituted into (3.4) to give
z = W n Dp
�
Dp −Dp +
DpDs vP Dp +W n
�
= W nDs
vP Dp +W n. (3.7)
Also, by similar triangles in Figure 3.6 (right), y
Dp −x = hP F
F =⇒ y =hP(Dp −x) =
hP DpDs
vP Dp +W n. (3.8)
We can show that vP Dp + W n �= 0 by referring to Figure 3.6 (left), and observing that vP = tan(φ) and −W n/Dp = tan(−θ). Since both φ and θ are in (−π/2,π/2) it follows that vP =−W n/Dp if and only if φ = −θ. But if φ = −θ then the ray emanating from the
camera is parallel to the plane containing stripe n, so that they can never intersect at a point
q. Therefore such a scenario will never occur, and vP Dp +W n �= 0. In fact, for the camera ray to intersect the stripe plane at a point in front of the scanner (i.e. such that x < Dp), the
angle φ must be strictly larger than −θ, so that
vP Dp +W n > 0. (3.9)
To recapitulate,
x =Dp(1−K), y =hP DpK, z =W nK,
with K = Ds vP Dp +W n
. (3.10)
These formulae can be applied to map a pixel with distortion corrected image coordinates (v, h) that lies on stripe n, to its corresponding surface point (x, y, z) in system space.
35 Chapter 3 – Structured light scanning
Based on those arguments we can find expressions for the coordinates x, y and z of point q. Similar triangles in Figure 3.6 (left) yield
z
Dp−x
= W n
Dp
=⇒ z = W n
Dp
(Dp−x), (3.4)
and
vP F
F =
Ds −z
Dp −x
=⇒ z = Ds −vP (Dp−x). (3.5)
Note that by construction Dp > x, Dp > 0 and F > 0.
Combining (3.4) and (3.5) yields
W n
Dp
(Dp− x) = Ds −vP (Dp−x) =⇒ x = Dp −
DpDs
vP Dp +W n
, (3.6)
which can be substituted into (3.4) to give
z = W n
Dp
�
Dp −Dp +
DpDs
vP Dp +W n
�
= W nDs
vP Dp +W n
. (3.7)
Also, by similar triangles in Figure 3.6 (right),
y
Dp −x
= hP F
F =⇒ y = hP(Dp−x) =
hP DpDs
vP Dp +W n
. (3.8)
We can show that vP Dp + W n �= 0 by referring to Figure 3.6 (left), and observing that vP = tan(φ) and −W n/Dp = tan(−θ). Since both φ and θ are in (−π/2,π/2) it follows
that vP = −W n/Dp if and only if φ = −θ. But if φ = −θ then the ray emanating from the
camera is parallel to the plane containing stripe n, so that they can never intersect at a point
q. Therefore such a scenario will never occur, and vP Dp +W n �= 0. In fact, for the camera
ray to intersect the stripe plane at a point in front of the scanner (i.e. such that x < Dp), the
angle φ must be strictly larger than −θ, so that
vP Dp +W n > 0. (3.9)
To recapitulate,
x = Dp(1−K),
y = hP DpK,
z = W nK,
with K = Ds
vP Dp +W n
. (3.10)
These formulae can be applied to map a pixel with distortion corrected image coordinates (v, h) that lies on stripe n, to its corresponding surface point (x, y, z) in system space.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Fixing system variables
1. Choose a calibration plane:
1. This plane will coincide with the Y –Z plane in system space and the
scanner will be positioned and calibrated in relation to it. 2. Calibrate the projector [Dp, W, centre stripe]:
1. The projector is positioned in front of the calibration plane such that its
central axis is normal to that plane. The parameter Dp can then be
measured directly as the perpendicular distance between the focal point of the projector and the calibration plane.
2. The parameter W is measured as the spacing between two consecutive
stripes on the calibration plane. Greater accuracy can be achieved by measuring a width W∗ over m stripes, and calculating W = W∗/m.
3. It is also necessary to identify and mark one reference stripe of a known
index, so that the other stripes in the image can be assigned indices
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Fixing system variables
3. Calibrate the camera [P]: a horizontal distance d is marked on
the calibration plane and recorded by the camera. The number of pixels, p, between the endpoints of d is determined. Then by triangulation:
Chapter 3 – Structured light scanning
[image:16.842.220.616.113.255.2]X Y Dp F pP F d camera
Figure 3.7: The central vertical column of the projection pattern should be captured to the central image column (left), regardless of depth. Similar trian-gles can be used (right) to determine the value of parameter P.
and in the exact centre of the image both vertically and horizontally. An image of this chart is recorded, and two points (v1, h1) and (v2, h2) on the same vertical grid line are selected
such that h1 �=h2. Since they lie on the same vertical grid line they should have the same
undistorted h-value, hence by (3.3),
h1
�
1 +k(h21+v12)�=h2
�
1 +k(h22+v22)� =⇒ k= h2−h1
h31−h32+h1v21−h2v22
. (3.11)
The procedure can be repeated for different pairs of points in the image, and along horizontal grid lines, which can then also give an indication of the consistency and suitability of the distortion model.
The last parameter to determine isP which indicates the size of a pixel on the sensing plane of
the camera. Refer to Figure 3.7 (right). A horizontal distance dis marked on the calibration
plane and recorded by the camera. The endpoints of this distance are corrected by (3.3), and the number of pixels, p, between them is determined. Then by similar triangles,
pP F F =
d Dp
=⇒ P = d pDp
. (3.12)
This concludes the calibration procedure.
38 Chapter 3 – Structured light scanning
X Y Dp F pP F d camera
Figure 3.7: The central vertical column of the projection pattern should be
captured to the central image column (left), regardless of depth. Similar trian-gles can be used (right) to determine the value of parameter P.
and in the exact centre of the image both vertically and horizontally. An image of this chart is recorded, and two points (v1, h1) and (v2, h2) on the same vertical grid line are selected
such that h1 �= h2. Since they lie on the same vertical grid line they should have the same
undistorted h-value, hence by (3.3),
h1 �1 + k(h21 + v12)� = h2 �1 + k(h22 + v22)� =⇒ k = h2 − h1
h31 − h32 + h1v12 − h2v22
. (3.11)
The procedure can be repeated for different pairs of points in the image, and along horizontal grid lines, which can then also give an indication of the consistency and suitability of the distortion model.
The last parameter to determine is P which indicates the size of a pixel on the sensing plane of
the camera. Refer to Figure 3.7 (right). A horizontal distance d is marked on the calibration
plane and recorded by the camera. The endpoints of this distance are corrected by (3.3), and the number of pixels, p, between them is determined. Then by similar triangles,
pP F
F =
d
Dp
=⇒ P = d
pDp.
(3.12)
This concludes the calibration procedure.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Choosing system parameter values
Dp: the distance between the projector and the system origin
W: the width between successive stripes on the calibration plane
Ds: the distance between the camera and the projector
P: the pixel size on the sensor plane of the camera
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Viewing volume
The viewing volume of the scanner is the space visible by both the camera and the projector. This volume is likely to increase as Ds is decreased.
Thus, Ds should be fixed as small as possible, i.e. have the camera as close to the projector as possible, because it may (1) increase the viewing volume and (2) decrease the likelihood of occlusions.
Chapter 3 – Structured light scanning
The remaining parameter that can be modified is Ds. We propose to fix Ds as small as
possible, i.e. have the camera as close to the projector as possible, because it may (1) increase the viewing volume and (2) decrease the likelihood of occlusions. These two concepts are discussed in more detail next.
The target surface has to be in the field of view of both the projector (so that stripes can be projected onto it) and the camera (so that the projected stripes can be recorded). The space visible from both the projector and the camera is called the viewing volume of the scanner. Usually the viewing volume can be bounded by a plane parallel to the Y–Z plane, because of a maximum focus distance of either the projector or the camera. Figure 3.8 illustrates. It seems, from the figure, that as Ds is decreased the viewing volume increases. There may
however be a lower bound on an optimal Ds due to the asymmetric field of view of the
projector. It depends on the specific equipment used, but as a general rule of thumb we say that a smallDs should yield a large viewing volume.
b
oun
ding
pla
ne
Ds
projector camera
viewing volume
−→
camera’s field of view
←−
projector’s field of view
Ds
projector camera
[image:18.842.199.625.102.223.2]viewing volume
Figure 3.8: The viewing volume of the scanner is the space visible by both the camera and the projector. This volume is likely to increase asDs is decreased.
The second reason for choosing a small distance between the camera and projector is to reduce the likelihood ofocclusions, which are regions of the target surface not visible from either the projector or the camera, or both. We define the following three types:
• camera occlusion : a region visible from the projector but not from the camera; • projector occlusion : a region visible from the camera but not from the projector; • scanner occlusion : a region not visible from either the projector or the camera.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Occlusions
• camera occlusion : a region visible from the projector but not
from the camera;
• projector occlusion : a region visible from the camera but not
from the projector;
• scanner occlusion: a region not visible from either the
projector or the camera.
Chapter 3 – Structured light scanning
The remainder of the surface, that which is visible from both the camera and projector, is the
scannable region. Occlusions are a well-known problem in surface scanning (see e.g. [31, 91]).
[image:19.842.183.606.108.248.2]Such regions cannot be reconstructed because sufficient information cannot be retrieved.
Figure 3.9 shows an example of a spherical target surface, and the different occlusion types.
As the camera is moved closer to the projector the occluded regions reduce in size, and
(in theory) setting Ds to zero would yield a maximum scannable area, and no camera or
projector occlusions. It seems therefore thatDsshould be fixed as small as possible to reduce
the likelihood of occlusions and increase the scannable area.
projector camera
Ds
←−
scanner occlusion
−→
projector occlusion
←−
camera occlusion
projector camera
Ds
←−
scanner occlusion
−→
projector occlusion
←−
camera occlusion
Figure 3.9: Occlusions, i.e. areas not visible from the projector and/or the camera, can be reduced by positioning the camera closer to the projector.
Of course, the size and location of occlusions depend on the shape of the specific target
surface. The effect of an increasing Ds on the scannable region of a synthetic face model is
shown in Figure 3.10†, for an arrangement similar to the one in Figure 3.1. It illustrates that,
as expected, the area of the scannable region decreases as Ds is increased.
†The pictures in Figure 3.10 were generated by testing angles. Suppose the focal points of the projector and
camera are positioned atpandcrespectively. A surface pointxwith outward normalnis then visible from
the projector if the line segment between pand x does not intersect the surface, and if the angle between
(p−x) and nlies in (−90◦,90◦), i.e. if n·(p−x) >0. Similarly, x is visible from the camera if the line
segment between c andx does not intersect the surface and n·(c−x) >0. The scannable region is then
the union of all points visible from both the projector and the camera. We did not take into account their
bounded fields of view which would reduce the scannable area even more.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Occlusions and angle of projection
• Scannable area is decreased as Ds is increased
• Size of scannable region may also depend on angle of projection
Chapter 3 – Structured light scanning
Ds= 50 Ds= 100 Ds= 250 Ds= 400 Ds= 600 Ds= 1,000
Figure 3.10: The area of the scannable region decreases as the distance Ds (shown in mm) between the camera and projector is increased, as illustrated here on a synthetic face model.
3.3.3 The angle of projection
So far we have considered as the projection pattern an array of evenly spaced horizontal stripes. As mentioned in section 2.1.4, an advantage of continuous stripes over a dot pattern is the improved sample resolution attained. We now consider the angle at which these stripes are projected on the face.
For the camera to record a sensible deformation caused by the shape of the target surface, and for the mapping in (3.10) to be applicable, the stripes have to be projected parallel to theY-axis. Figure 3.7 (left) illustrates this well, where the projected horizontal stripe reveals
[image:20.842.165.629.285.461.2]the shape of the surface it hits while no deformation is visible for the vertical one. We can however freely rotate the entire scanner, i.e. the projector and camera as a combined whole, which enables the projection of stripes vertically, diagonally, or indeed at any desired angle. An optimal angle of projection depends entirely on the shape of the target surface.
Figure 3.11 illustrates the effect of rotating the scanner on the size of the scannable region of a face. Because of the sides of the face and nose, both large surface regions which are almost parallel to theX–Z plane, the horizontal configuration shown in Figure 3.11(a) seems
favourable. The projection of vertical stripes requires the camera to be next to (as opposed to on top of) the projector with respect to the face, which can easily cause the sides of the face and nose to be occluded from either the projector or the camera.
42
Chapter 3 – Structured light scanning
(a) horizontal stripes (b) diagonal stripes (c) vertical stripes
Figure 3.11: An illustration of the difference in size of the scannable region (shown in dark) of a typical face attained from projecting (a) horizonal, (b) diagonal and (c) vertical stripes.
3.4 ANALYSIS OF THE RESOLUTION
In this section we explore the resolution of the scanner. By resolution we mean the smallest measurable distance that the scanner is able to recognize, or into how small an interval the reconstructed points can be resolved. It is therefore related to the sample density, and also gives an indication of the achievable degree of accuracy.
The stripes project to a discrete set of curves on the surface of the target object, and the captured image consists of a discrete array of pixels in which we locate recorded stripes. The resolution of the scanner would therefore be directly related to the spacing of the projected stripes and the resolution of the pixel space. For a theoretical surface point we will measure the change (e.g. inmm) caused by a perturbation of a single sample unit, which can be either a stripe index or one pixel. We distinguish between the following three resolutions:
(1) thevertical resolution measured along theZ-axis, affected by the stripe index;
(2) thehorizontal resolution measured along the Y-axis, affected by the horizontal
dimen-sion of the pixel space;
(3) thedepth resolution measured along the X-axis, affected by the vertical dimension of
the pixel space.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Resolution
• The vertical resolution (left) of the scanner is affected by the
spacing of the projected stripes. The horizontal resolution (right) is dependent on the horizontal dimension of the pixel space.
Chapter 3 – Structured light scanning
Refer to Figure 3.12 (left). The vertical resolution is dependent on the spacing of the stripes and clearly also on the depth, or location in the X-direction, of the target surface. Consider
an arbitrary surface point (x, y, z) on stripen, and letδzdenote the change of this point along
the Z-axis caused by perturbing the stripe index from n to n+ 1. It follows from similar
triangles that
δz= W Dp
(Dp−x). (3.15) This value is the smallest measurable distance (i.e. resolution) of the scanner along theZ-axis,
and it gets smaller as the target surface is moved closed to the projector.
X Y Z projector plane of interest ← −
δz X
Y Z cam plane of interest ←− δy image plane
Figure 3.12: The vertical resolution (left) of the scanner is affected by the spacing of the projected stripes. The horizontal resolution (right) is dependent on the horizontal dimension of the pixel space.
For the horizontal resolution refer to Figure 3.12 (right). One row of pixels maps to a uni-formly spaced row of points in a particular fixed depth. Consider one such point (x, y, z)
coinciding with pixel (v, h) in the image, and let δy denote the change of the surface point
along the Y-direction caused by moving fromh toh+ 1. Similar triangles yield
δy=P(Dp−x). (3.16) This expresses the smallest measurable horizontal distance (resolution) of the scanner along the Y-axis. Again, this distance decreases as the target surface is moved nearer to the
projector.
44 Chapter 3 – Structured light scanning
Refer to Figure 3.12 (left). Thevertical resolution is dependent on the spacing of the stripes and clearly also on the depth, or location in theX-direction, of the target surface. Consider
an arbitrary surface point (x, y, z) on stripe n, and letδz denote the change of this point along
the Z-axis caused by perturbing the stripe index from n to n + 1. It follows from similar
triangles that
δz = W Dp
(Dp−x). (3.15)
This value is the smallest measurable distance (i.e. resolution) of the scanner along theZ-axis,
and it gets smaller as the target surface is moved closed to the projector.
X Y Z projector plane of interest ← −
δz X
Y Z cam plane of interest ←− δy image plane
Figure 3.12: The vertical resolution (left) of the scanner is affected by the spacing of the projected stripes. The horizontal resolution (right) is dependent on the horizontal dimension of the pixel space.
For the horizontal resolution refer to Figure 3.12 (right). One row of pixels maps to a uni-formly spaced row of points in a particular fixed depth. Consider one such point (x, y, z)
coinciding with pixel (v, h) in the image, and let δy denote the change of the surface point
along the Y-direction caused by moving from h to h+ 1. Similar triangles yield
δy =P(Dp−x). (3.16)
This expresses the smallest measurable horizontal distance (resolution) of the scanner along the Y-axis. Again, this distance decreases as the target surface is moved nearer to the
projector.
44
Chapter 3 – Structured light scanning
Refer to Figure 3.12 (left). The vertical resolution is dependent on the spacing of the stripes and clearly also on the depth, or location in the X-direction, of the target surface. Consider
an arbitrary surface point (x, y, z) on stripe n, and letδz denote the change of this point along
the Z-axis caused by perturbing the stripe index from n to n+ 1. It follows from similar
triangles that
δz = W Dp
(Dp −x). (3.15)
This value is the smallest measurable distance (i.e. resolution) of the scanner along theZ-axis,
and it gets smaller as the target surface is moved closed to the projector.
X Y Z projector plane of interest ← −
δz X
Y Z cam plane of interest ←− δy image plane
Figure 3.12: The vertical resolution (left) of the scanner is affected by the spacing of the projected stripes. The horizontal resolution (right) is dependent on the horizontal dimension of the pixel space.
For the horizontal resolution refer to Figure 3.12 (right). One row of pixels maps to a uni-formly spaced row of points in a particular fixed depth. Consider one such point (x, y, z)
coinciding with pixel (v, h) in the image, and let δy denote the change of the surface point
along the Y-direction caused by moving from h to h+ 1. Similar triangles yield
δy =P(Dp−x). (3.16)
This expresses the smallest measurable horizontal distance (resolution) of the scanner along the Y-axis. Again, this distance decreases as the target surface is moved nearer to the
projector.
44
Chapter 3 – Structured light scanning
The ratio between the horizontal and vertical resolutions is given by
δz δy =
W
P Dp
, (3.17)
which, with our current system parameters of Dp = 790mm, P = 0.0006 and W = 3.08mm,
yields a value of about 6.5. Therefore a local region of the reconstructed surface, where the depth is approximately constant, would be roughly 6.5 times more densely sampled in the horizontal dimension (along a stripe) than in the vertical (across the stripes).
Next we investigate the depth resolution, measured along the X-axis, which depends on the
vertical dimension of the pixel space. If a captured stripe has no vertical variation in the pixel space it implies that the target surface along that stripe has no variation in depth. As the depth of a particular surface point on a stripe changes the corresponding pixel of that stripe moves in the vertical direction (i.e. up or down).
Consider a surface point (x, y, z) on stripe n coinciding with pixel (v, h) in the image. We
aim to find an expression that measures the difference in depth, δx, of this point caused by
perturbing v to v+ 1, as Figure 3.13 (left) illustrates. Using the formula for x in (3.10),
δx = DpDs
vP Dp+W n −
DpDs
(v+ 1)P Dp+W n
= P Dp2Ds
[vP Dp+W n][(v+ 1)P Dp +W n]
. (3.18)
projector camera
image plane
stripe n
[image:21.842.214.623.104.268.2]1 p ixe l δx X Z x x� A B C D E F G H image plane 1p x 1p x n1 n2
Figure 3.13: The depth resolution is measured as the change δx along the
X-axis caused by a perturbation of one pixel in the vertical direction. It can
be shown that for a given X-position, δx is independent of the stripe index.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Building 3D models
• The image contains a darker stripe which is the centre of the system
• Steps to 3D reconstruction: locate stripes, index stripes, map to 3D
space.
Chapter 4 – Building 3D face models
[image:22.842.131.728.94.282.2]including sub-pixel estimation, the incorporation of black stripes, generating a mesh surface from the output 3D point cloud and mapping stripe-free texture onto it.
Figure 4.1 shows an image of a face recorded by our structured light scanner. The dense pattern of horizontal stripes can be seen in the close-up (left). The single darker stripe across the upper lip is a reference stripe which has a known index. The algorithms we discuss in this chapter are designed to establish arelative indexing, and in section 4.6.1 we will use the reference stripe to shift from relative to absolute indices.
Figure 4.1: A captured image in which the projected stripes can be seen. A close-up of an area around the nose tip and right nostril is shown on the left.
In Chapter 3 we derived equations for mapping a single pixel in such an image to its corre-sponding point in system space. The process of reconstructing an entire surface consists of the following three steps, illustrated in Figure 4.2 on a small image region:
(1) pixels in the image that contain the vertical centres of stripes are located;
(2) these pixels are indexed, by which we mean a stripe indexnis allocated to each (in the
figure the different indices are depicted as different colours);
(3) every stripe pixel now has a position (v, h) and an index n, and the mapping (3.10) is
applied to each to determine a collection of points in system space that can be connected by a mesh surface.
54 Chapter 4 – Building 3D face models
captured image (1) locate stripe pixels (2) index stripes (3) map to 3D space
Figure 4.2: The stripes in a 2D image captured by the scanner have to be (1)
located and then (2) indexed in order to (3) reconstruct a 3D surface.
4.1 LOCATING STRIPES IN THE IMAGE
[image:22.842.269.560.100.268.2]Firstly we need to locate the stripes in the image. This can be achieved by a simple and fast column-by-column scan through the image, marking potential stripe pixels along the way. Figure 4.3 illustrates the basic principle on a small image region. Potential stripe pixels are
determined aslocal maxima in the greyscale luminance values of every column. The output
of the procedure can be a binary matrix P such thatP(r, c) is 1 if pixel (r, c) is marked as a
potential stripe pixel, and 0 otherwise.
column
image
grey
scale
row output
Figure 4.3: Stripe pixels are determined as local maxima in the greyscale
values of every column in the image.
[image:22.842.203.639.335.446.2]Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Locating stripes
• Stripe pixels are determined as local maxima in the greyscale
values of every column in the image:
Chapter 4 – Building 3D face models
captured image (1) locate stripe pixels (2) index stripes (3) map to 3D space
Figure 4.2: The stripes in a 2D image captured by the scanner have to be (1)
located and then (2) indexed in order to (3) reconstruct a 3D surface.
4.1 LOCATING STRIPES IN THE IMAGE
Firstly we need to locate the stripes in the image. This can be achieved by a simple and fast column-by-column scan through the image, marking potential stripe pixels along the way. Figure 4.3 illustrates the basic principle on a small image region. Potential stripe pixels are
determined aslocal maxima in the greyscale luminance values of every column. The output
of the procedure can be a binary matrixP such thatP(r, c) is 1 if pixel (r, c) is marked as a
potential stripe pixel, and 0 otherwise.
column
image
grey
scale
row output
Figure 4.3: Stripe pixels are determined as local maxima in the greyscale
values of every column in the image.
55 Chapter 4 – Building 3D face models
We define a local maximum primarily to be any pixel (r, c) satisfying
A(r, c) > A(r − 1, c) and A(r, c) > A(r + 1, c). (4.1)
Here A(r, c) indicates the greyscale value of a pixel (r, c). To reduce the effects of noise in the
image we can set a threshold � below which local maxima are ignored. Our algorithm also
has an amplitude threshold α which indicates the smallest permissible difference between the
luminance values of a local maximum and its preceding local minimum.
The complete pseudo-code of our procedure for locating stripe pixels is given in Appendix B.
It traverses every pixel of every column in an input M × N image once, without retracing,
[image:23.842.210.648.105.222.2]and the computational complexity is therefore linear in the number of pixels. In Chapter 6 we provide some experimental results on the performance and speed of our methods in a biometric setting (where large numbers of faces need to be scanned and recognized).
Figure 4.4 shows the located stripe pixels for the image in Figure 4.1. The next step is to index these pixels, i.e. to decide which of the located stripe pixels correspond to which of the projected stripes. Finding correct indices may be rather simple for flat and “uncomplicated” surfaces but, as the close-up in Figure 4.4 suggests, a definitive solution can be much harder to determine in areas where the curving or bending of the target surface causes discontinuities in the recorded stripes.
Figure 4.4: The located stripes for the image in Figure 4.1. A close-up of the same area around the nose is shown on the left.
56 Chapter 4 – Building 3D face models
We define a local maximum primarily to be any pixel (r, c) satisfying
A(r, c)> A(r−1, c) and A(r, c)> A(r+ 1, c). (4.1)
HereA(r, c) indicates the greyscale value of a pixel (r, c). To reduce the effects of noise in the image we can set a threshold � below which local maxima are ignored. Our algorithm also
has an amplitude thresholdα which indicates the smallest permissible difference between the
luminance values of a local maximum and its preceding local minimum.
The complete pseudo-code of our procedure for locating stripe pixels is given in Appendix B. It traverses every pixel of every column in an input M ×N image once, without retracing, and the computational complexity is therefore linear in the number of pixels. In Chapter 6 we provide some experimental results on the performance and speed of our methods in a biometric setting (where large numbers of faces need to be scanned and recognized).
Figure 4.4 shows the located stripe pixels for the image in Figure 4.1. The next step is to index these pixels, i.e. to decide which of the located stripe pixels correspond to which of the projected stripes. Finding correct indices may be rather simple for flat and “uncomplicated” surfaces but, as the close-up in Figure 4.4 suggests, a definitive solution can be much harder to determine in areas where the curving or bending of the target surface causes discontinuities in the recorded stripes.
Figure 4.4: The located stripes for the image in Figure 4.1. A close-up of the same area around the nose is shown on the left.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Indexing the stripes
• Starting from the centre stripe, different indices are depicted as
alternating colours in a repeating sequence
• A number of algorithms have been implemented based on flood
filling (Robinson), MST-‐Maximum Spanning Tree (Brink)
• This is a problem that can be improved
Chapter 4 – Building 3D face models
Figure 4.11: Result of applying the MST indexing algorithm on the stripe pixel array in Figure 4.4. Different indices are depicted as alternating colours in a repeating sequence, and the same area around the nose is shown on a larger scale.
A remaining issue is that of the connectivity of the graph G. A tree that spans all the
vertices in G can be found only if G is connected, such that every possible pair of vertices
is connected by a series of edges. Since no relationship can be established between two disconnected components ofG, there is also no conceivable way (in an uncoded pattern) to
relate their indices. For this reason we will always find an MST of the largest connected component. The image in Figure 4.4, for example, contains a number of remote vertices in the bottom right corner that could not be indexed.
This concludes the description of the MST indexing algorithm for establishing indices in an uncoded structured light scanning system. The method should be far less likely to cause errors than those mentioned in the previous section. The result is affected neither by a specific starting point nor by the order in which individual stripe pixels are visited, and a single missing stripe pixel can no longer have a major impact on the outcome. However, as is shown in the next section, there is a problem inherent to the indexing of uncoded stripe patterns. We claim that in general this problem cannot be solved without incorporating some additional distinguishing information in the projected stripes.
Marcos A Rodrigues – Fast 3D Reconstruction using Structured Light Methods
Tutorial on 3D Surface Reconstruction in Laparoscopic Surgery. 22nd September ,Toronto, Canada
Undetectable index shifts
• Sudden drop in
surface depth
• Critical depth
changes
Chapter 4 – Building 3D face models
4.5 UNDETECTABLE INDEX SHIFTS
An image which causes our indexing algorithm to assign incorrect indices to some stripe pixels is shown in Figure 4.12(a). The problem occurs in an area around the nose. A close-up of this region is shown in (b) and the stripe pixels in (c). A number of indices determined by the MST algorithm is shown in (d) and a correct indexing in (e). The reason for the incorrect indexing is quite clear. Parts of two adjacent stripes appear as one continuing stripe in the image, meeting where the arrow points in Figure 4.12(d). The MST indexing algorithm unknowingly joins these pixels together in a singlewe–connected group and proceeds to index
them with the same index. If the edge of the nose were able to create a break in that seemingly continuous stripe, as is illustrated in (e), the algorithm can index correctly.
We call the phenomenon observed in Figure 4.12 which causes incorrect indexing an unde-tectable index shift. In this section we first establish when it can occur, and then argue why it is quite likely to arise in the scanning of faces. We also discuss a number of approaches for dealing with the problem.
−→
←− edge of nose
(a) captured image
(b) close-up (c) stripe pixels
[image:25.842.240.732.98.513.2](d) incorrect indexing (e) correct indexing
Figure 4.12: For this image the MST algorithm indexes incorrectly across
the edge of the nose. This is due to a sudden drop in surface depth (from the nose to the cheek) which causes two different stripes to align vertically.
67
Chapter 4 – Building 3D face models
4.5.1 Critical depth changes
We aim to find conditions under which an undetectable index shift may occur, i.e. when parts of two stripes appear on the same row in the recorded image. Figure 4.13 (left) shows two adjacent stripes n and n+ 1 projected onto a target surface across an abrupt drop in the
depth. Suppose that this change in surface depth, which we denote by ∆x, causes the left
part of stripen+ 1 and the right part of stripento be captured to equal vertical coordinates
in the image, as shown.
cam
projector
∆x
stripen
stripen+1
view from the camera
n
n+1 n
n+1 A B C D E F G H n1
n1+1 n2 n2
+1
Figure 4.13: An abrupt change ∆xin surface depth (left) causes two succes-sive stripes n and n+ 1to align from the camera’s point of view (middle). It can be shown that the value of∆x is independent of n(right).
We call ∆x a critical change in depth and proceed to derive an explicit expression for it.
Consider surface pointA on stripe n+ 1before the drop in surface depth, and surface point B on stripenafter the drop in depth. These points are assumed to map to the same vertical
coordinate, say v, in the image. If we define x to be the depth of A and x� the depth of B
then, using (3.10),
∆x=x−x�=Dp
�
1− Ds
vP Dp+W(n+ 1)
� −Dp
�
1− Ds vP Dp+W n
�
. (4.8)
From equations (3.4) and (3.5) we have
v = Ds−
W n
Dp(Dp−x)
P(Dp−x)
= Ds
P(Dp−x) −
W n P Dp
, (4.9)