The reconstruction method presented in this thesis assumes that the silhouette images are captured under a particular type of motion, e.g., a circular motion or an approximate circular motion, in order to facilitate the projection estimation. However, recent vision systems can determine the camera motion from a sequence of images. It is therefore possible to realise a more practical reconstruction system that adaptively selects effective images for SfS from a video clip. Furthermore, object detection and segmentation become more efficient if the motion is analysed.
The goal of this thesis is the generation of a 3D object model from 2D images; it does not address texture generation, which is essential for generating a photo-realistic model. Although SfS methods use many object images, estimating the true texture from them is a challenging task. This is because the images are captured under different illumination conditions, and the shading effects across multiple images complicate the prediction of the true colour of a 3D point. Moreover, to avoid using the texture of a part of the object that is occluded in the current viewing direction, the multiple views used in SfS should be ordered according to viewing direction before the true texture is estimated. For example, two images at 0° and 180° rotation cannot be used simultaneously when approximating the colour of a reconstructed 3D point. In addition, a 3D smoothing algorithm and normal vector estimation are important for enhancing the quality of the visualisation.
Finally, it is worth exploring a matching method that extends the proposed clique descriptor to 3D object recognition. When the true texture information corresponding to a 3D position is available, a new feature descriptor can be defined by associating photometric features with 3D geometric features, which can further enhance the performance of vision-based recognition.
Appendix A
List of publications
1. Shin, D. and Tjahjadi, T., “Clique descriptor of maximally stable regions,” submitted to the 12th IAPR international workshop on structural and syntactic pattern recognition, Jun. 2008.
2. Shin, D. and Tjahjadi, T., “Similarity invariant Delaunay graph matching,” submitted to the 12th IAPR international workshop on structural and syntactic pattern recognition, Jun. 2008.
3. Shin, D. and Tjahjadi, T., “Local hull-based surface construction from octree,” IEEE Transactions on Image Processing, vol. 17, no. 9, Aug. 2008, pp. 1251-1260.
4. Shin, D. and Tjahjadi, T., “Triangular mesh generation of octrees of non-convex 3D objects,” The 18th International Conference on Pattern Recognition (ICPR2006), Aug. 2006, pp. 950-953.
5. Shin, D. and Tjahjadi, T., “3D Object reconstruction from multiple views in approximate circular motion,” Proceedings of IEEE SMC UK-RI Chapter Conference on Applied Cybernetics 2005, Sept. 2005, pp. 70-75.
Appendix B
Preliminary projective geometry
In a perspective view, two parallel lines converge to a point, and the size of an object is scaled according to the focal length of a camera. To express these phenomena efficiently in mathematical notation, the homogeneous representation has been developed in a projective space. This section introduces some preliminary projective geometry, including the mathematical notation for geometric primitives in a projective space.
B.0.1 Geometric primitives in 2D projective space
Suppose that an image captures a point in a 3D Euclidean space $\mathbb{R}^3$. The position of the point in the image plane is then determined by the intersection of the image plane with the ray that starts from the camera centre and passes through the point. A point notation should account for this pinhole camera geometry and always indicate the identical point even when the image plane changes its position while the camera position remains fixed. Furthermore, the notation should be able to represent the image of a point at infinity, called a vanishing point. A homogeneous notation describes a point $\vec{p}$ in a 2D projective space $\mathbb{P}^2$ as a vector defined by two positional elements and a scaling element, i.e., $\vec{p} = [x\ y\ s]^T$.
Therefore, a normalised homogeneous point $\bar{p}$, which has $s = 1$, is equivalent to a point in the Euclidean image plane, whereas a point $\vec{p}$ with $s = 0$ is a point at infinity.
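The homogeneous point notation can be illustrated with a minimal Python sketch. The function names `to_homogeneous` and `normalise` and the tolerance `eps` are assumptions introduced here for illustration only. The sketch shows that any non-zero scaling element names the same image point, and that a zero scaling element marks a point at infinity, which has no normalised form.

```python
from typing import Optional, Tuple

Point2D = Tuple[float, float]
HPoint = Tuple[float, float, float]  # homogeneous point [x, y, s]


def to_homogeneous(pt: Point2D, s: float = 1.0) -> HPoint:
    """Lift a Euclidean image point to [x*s, y*s, s] (illustrative helper)."""
    if s == 0:
        raise ValueError("s = 0 is reserved for points at infinity")
    x, y = pt
    return (x * s, y * s, s)


def normalise(p: HPoint, eps: float = 1e-12) -> Optional[HPoint]:
    """Return the normalised point with s = 1, or None for a point at infinity."""
    x, y, s = p
    if abs(s) < eps:
        return None  # s = 0: the point lies at infinity
    return (x / s, y / s, 1.0)


p = to_homogeneous((2.0, 3.0), s=4.0)  # same image point, different scale
p_bar = normalise(p)                   # normalised form with s = 1
vanishing = normalise((1.0, 2.0, 0.0)) # direction (1, 2), point at infinity
```

Returning `None` for a point at infinity is a design choice of this sketch; the homogeneous vector itself remains a valid representation of such a point.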
A line in $\mathbb{P}^2$ can also be described by a vector with three elements. For example, the coefficients of a line equation $ax + by + c = 0$ define a homogeneous line representation $\vec{l} = [a\ b\ c]^T$, which becomes $\vec{l} = [ka\ kb\ kc]^T$ when it is scaled by a non-zero $k$ and still represents the same line. Thus, a line and a point in $\mathbb{P}^2$ take the same form in the homogeneous representation, so that any theorem devised for a point in $\mathbb{P}^2$ can be applied to a line and vice versa, i.e., a line is dual to a point [3]. Expressing point and line geometry in this way is more convenient than a traditional Euclidean vector notation. For example, the intersection point $\vec{x}$ of two lines, $\vec{l}_1$ and $\vec{l}_2$, is expressed by a simple cross product of the two vectors,
\[ \vec{x} = \vec{l}_1 \times \vec{l}_2 = [\vec{l}_1]_\times \vec{l}_2, \tag{B.1} \]
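where $[\cdot]_\times$ denotes a 3-by-3 skew-symmetric matrix defined by the three line coefficients with zero diagonal elements, so that $[\vec{l}_1]_\times^T = -[\vec{l}_1]_\times$ and
\[ [\vec{l}_1]_\times = \begin{bmatrix} 0 & -c_1 & b_1 \\ c_1 & 0 & -a_1 \\ -b_1 & a_1 & 0 \end{bmatrix}. \tag{B.2} \]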
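When a point $\vec{x}$ lies on a line $\vec{l}_1$, they satisfy
\[ \vec{x}^T \vec{l}_1 = 0, \tag{B.3} \]
and a line $\vec{l}_1$ passing through two points $\vec{x}_1$ and $\vec{x}_2$ is represented as
\[ \vec{l}_1 = [\vec{x}_1]_\times \vec{x}_2. \tag{B.4} \]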
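The relations (B.1) to (B.4) can be checked numerically with a short Python sketch. The helper names (`cross`, `dot`, `skew`, `mat_vec`) and the example lines are assumptions chosen for illustration. The sketch intersects the lines $x = 1$ and $y = 2$, verifies the incidence relation, shows that two parallel lines meet at a point at infinity, and joins two points into a line.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def cross(u: Vec3, v: Vec3) -> Vec3:
    """Cross product of two homogeneous 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u: Vec3, v: Vec3) -> float:
    """Inner product, used for the incidence test of (B.3)."""
    return sum(a * b for a, b in zip(u, v))


def skew(l: Vec3) -> List[List[float]]:
    """Skew-symmetric matrix of (B.2) built from the coefficients [a, b, c]."""
    a, b, c = l
    return [[0.0, -c, b],
            [c, 0.0, -a],
            [-b, a, 0.0]]


def mat_vec(m: List[List[float]], v: Vec3) -> Vec3:
    """Product of a 3-by-3 matrix with a 3-vector."""
    return tuple(dot(tuple(row), v) for row in m)


l1 = (1.0, 0.0, -1.0)  # line x = 1
l2 = (0.0, 1.0, -2.0)  # line y = 2
l3 = (1.0, 0.0, -3.0)  # line x = 3, parallel to l1

x = cross(l1, l2)                 # (B.1): intersection, here the point (1, 2)
x_via_skew = mat_vec(skew(l1), l2)  # same result through the matrix form
at_infinity = cross(l1, l3)       # parallel lines: scaling element is zero
join = cross((0.0, 0.0, 1.0), (1.0, 1.0, 1.0))  # (B.4): line through two points
```

Here `join` evaluates to the coefficients of $-x + y = 0$, the line through the origin and the point $(1, 1)$.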
A conic is a second-order 2D curve produced by the intersection of an image plane with a quadric (e.g., a sphere, an elliptical surface, or another quadratic surface defined in $\mathbb{P}^3$). Thus, conics include the circle, the ellipse, the hyperbola and the parabola in $\mathbb{P}^2$. Since a conic is a second-order polynomial in an image plane, it is sufficient to describe a conic with the six coefficients of the polynomial, which are stored in a 3-by-3 symmetric matrix. For example, if a point $\vec{x}$ lies on a conic $C$, it satisfies
\[ \vec{x}^T C \vec{x} = 0, \tag{B.5} \]
where $C$ is defined by the coefficients of $ax^2 + bxy + cy^2 + dx + ey + f = 0$, i.e.,
\[ C = \begin{bmatrix} a & b/2 & d/2 \\ b/2 & c & e/2 \\ d/2 & e/2 & f \end{bmatrix}. \tag{B.6} \]
Similar to the third element of a 2D point or a line, the last element $f$ of a conic $C$ in (B.6) becomes a scaling factor. In particular, tangent lines can also be used to define a conic, by virtue of the point and line duality, and this conic representation is called a dual conic.
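A minimal Python sketch of the conic representation follows. It builds the symmetric matrix of (B.6) for the unit circle and tests (B.5) for a homogeneous point. The tangent line $\vec{l} = C\vec{x}$ and the use of the adjugate of $C$ as the dual conic are standard results for non-degenerate conics that are assumed here rather than derived in this appendix, and all function names are illustrative.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conic_matrix(a: float, b: float, c: float,
                 d: float, e: float, f: float) -> Matrix:
    """Symmetric matrix of (B.6) for ax^2 + bxy + cy^2 + dx + ey + f = 0."""
    return [[a, b / 2, d / 2],
            [b / 2, c, e / 2],
            [d / 2, e / 2, f]]


def quad_form(m: Matrix, v: Sequence[float]) -> float:
    """Evaluate v^T M v, the left-hand side of (B.5)."""
    return sum(v[i] * m[i][j] * v[j] for i in range(3) for j in range(3))


def mat_vec(m: Matrix, v: Sequence[float]) -> tuple:
    """Product of a 3-by-3 matrix with a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def adjugate(m: Matrix) -> Matrix:
    """Adjugate of a 3-by-3 matrix (transposed cofactors, cyclic indexing)."""
    cof = [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
            - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[cof[j][i] for j in range(3)] for i in range(3)]


C = conic_matrix(1.0, 0.0, 1.0, 0.0, 0.0, -1.0)  # unit circle x^2 + y^2 = 1
x = (3.0, 4.0, 5.0)           # homogeneous form of the point (0.6, 0.8)
on_conic = quad_form(C, x)    # zero when x lies on the conic
tangent = mat_vec(C, x)       # assumed standard result: tangent line l = C x
C_dual = adjugate(C)          # dual conic of a non-degenerate conic, up to scale
on_dual = quad_form(C_dual, tangent)  # zero when the line is tangent
```

The point is given with scaling element 5 rather than 1, which also shows that (B.5) holds for any non-zero scaling of a homogeneous point.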