7 EVALUATION OF SUITABILITY OF EXISTING TECHNOLOGIES FOR UNDERTAKING ROUTINE
7.2 Potential approaches for image processing and display
7.2.2 Three-dimensional image-stitching techniques
Description of method
By identifying the same features in multiple images, it is possible to calculate the 3-D shape of a scene; several systems do this using different but broadly similar approaches. Such systems determine the 3-D locations of the identified features and use these to create a point cloud of the scene, onto which the rest of the image data can be displayed. The result may be a full 3-D model, as can be achieved with software such as Autodesk 123D Catch® (Lo Brutto & Meli, 2012), or a form of 2.5-D model, such as that produced by Photosynth (http://www.photosynth.net/), in which the individual 2-D images are aligned and oriented as if three-dimensional.
As part of the assessment of systems for potential use in routine visual inspections, the capabilities of Photosynth were investigated. Photosynth is a web-based system provided by Microsoft Live Labs. The application can create 2-D panoramas in a way similar to that described in Section 7.2.1, but can also produce more advanced pseudo-3-D reconstructions. These reconstructions are useful for displaying scenes where the images have been taken from different locations and show different views of the scene. The system requires no special equipment, and needs no information about the camera settings, location, or orientation for any of the images. It can also combine images from multiple sources when creating a model.
The application looks for common features in the series of photographs, and uses these to generate a point cloud and 3-D model of the scene (Snavely, 2008). The system then aligns the images with the point cloud, and allows the user to navigate images in a way which gives a more interactive experience than simply clicking on an array of images side by side, or one after the other. The user can change viewpoint, or look round features as desired, as long as images are available and have been correctly aligned.
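The geometric idea behind turning matched features into a point cloud can be illustrated with a two-view triangulation. The sketch below is not Photosynth's implementation: it assumes the camera centres and viewing directions are already known, whereas a system of this kind has to estimate the camera poses and the feature positions together. The function name `triangulate_midpoint` and the example coordinates are hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Estimate a 3-D point from two viewing rays (midpoint method).

    c1, c2: camera centres (x, y, z).
    d1, d2: ray directions from each camera towards the same feature
            (they do not need to be unit length).
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # approaches zero when the rays are parallel
    if abs(denom) < eps:
        raise ValueError("rays are parallel; depth cannot be recovered")

    # Parameters of the closest point on each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]

    # With noisy data the rays rarely intersect, so take the midpoint.
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Hypothetical example: two cameras 10 m apart viewing the same feature.
point = triangulate_midpoint((0, 0, 0), (5, 0, 20), (10, 0, 0), (-5, 0, 20))
```

Repeating this for every feature seen in two or more images gives a sparse set of 3-D points. The parallel-ray check also shows why images taken from different locations are needed: without a baseline between viewpoints, depth cannot be recovered.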
The Photosynth online documentation and help (http://photosynth.net/help.aspx) suggests that the feature matching algorithms may struggle when confronted with scenes containing too many or too few features. This means that Photosynth may struggle to accurately locate images of relatively blank surfaces, such as concrete, or highly patterned surfaces, such as a masonry structure, as was also seen with the 2-D processing software (Section 7.2.1).
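One way to see why both extremes cause difficulty is the nearest-neighbour ratio test commonly used in feature matching (Lowe, 2004). Whether Photosynth applies exactly this test is an assumption; the sketch only illustrates the general failure mode. Descriptors are reduced to hypothetical 2-D tuples for readability, and the threshold of 0.8 is an illustrative value.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Match descriptors from image A to image B, keeping a match only
    when the best candidate is clearly closer than the second best."""
    matches = []
    if len(desc_b) < 2:
        return matches  # too few features: nothing reliable to match
    for i, da in enumerate(desc_a):
        ranked = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Distinctive features: each has one obvious partner.
distinct = ratio_test_matches(
    [(0, 0), (10, 0), (0, 10)],
    [(0.1, 0), (10, 0.1), (0, 10.1)],
)

# Repetitive pattern (e.g. near-identical bricks): every candidate looks
# alike, so all matches are rejected as ambiguous.
repetitive = ratio_test_matches(
    [(1, 1), (1, 1), (1, 1)],
    [(1.0, 1.01), (1.0, 1.02), (1.01, 1.0)],
)

# Blank surface (e.g. plain concrete): no features detected at all.
blank = ratio_test_matches([], [])
```

In this toy example the distinctive set yields three matches, while the repetitive and blank sets yield none, which leaves too little information to locate the image.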
To assess the potential usefulness of the system, a trial was undertaken. A standard point-and-shoot digital camera was used to take 347 photos of a bridge, following the advice on the Photosynth website: imaging outwards in a circle from the middle of the road under the bridge, then inwards from the edges of the bridge, then filling in gaps on the east and west approaches to the bridge. The image collection took approximately 45 minutes. The images were then supplied to the stitching software, which took several hours to process them and produce the output model.
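The scale of this trial can be put in rough numbers. The capture rate follows directly from the figures above. The pair count assumes, purely for illustration, that every image is compared with every other image; the matching strategy actually used by the software is not documented here.

```python
from math import comb

n_images = 347
capture_minutes = 45

# Average time available per photograph during collection.
seconds_per_image = capture_minutes * 60 / n_images  # about 7.8 s

# Number of image pairs if every image were compared with every other
# (an assumption about the matching strategy, not a documented fact).
candidate_pairs = comb(n_images, 2)  # 60031
```

Under that assumption the number of comparisons grows roughly with the square of the number of images, which would be consistent with processing taking far longer than collection.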
Figure 70 shows some of the images presented within the Photosynth application. The sequence presents progressively closer and more detailed views of the north-east corner of the bridge. Within each display the main image can be seen, together with other images showing the same or overlapping parts of the scene. Selecting one of these neighbouring images changes the display to focus on it instead, and presents a new set of alternative images and views. This demonstrates how the images can be viewed and manipulated to see both contextual and detailed views of the structure. Figure 71 shows a plan view of the point cloud of the bridge, created using image pixel data alone.
Figure 70: Series of images, displayed in Photosynth browser window, moving progressively closer to part of east end of north abutment. Also
Figure 71: Point cloud of bridge showing a selection of ‘highlight’ views. The green circle shows the location of the current image (which is displayed above the green circle). The white triangular shape shows the location from where the image was taken, and the field of view of the image. The smaller images shown on the right of the display show features which have been marked and labelled as ‘highlights’ by the inspector. These can be zoomed to by either clicking on the images, or by clicking on the point cloud tags.
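The triangular field-of-view marker described in the caption of Figure 71 can be reproduced from a camera's estimated plan position, heading, and horizontal field of view. The sketch below is an illustration only; the function name `view_triangle`, its parameters, and the example values are hypothetical and are not taken from Photosynth.

```python
import math


def view_triangle(position, heading_deg, fov_deg, reach):
    """Return the three plan-view vertices of a field-of-view marker.

    position:    (x, y) of the camera in plan.
    heading_deg: viewing direction, measured anticlockwise from the +x axis.
    fov_deg:     horizontal field of view of the image.
    reach:       drawn length of the triangle's sides (display choice only).
    """
    x, y = position
    heading = math.radians(heading_deg)
    half = math.radians(fov_deg) / 2
    left = (x + reach * math.cos(heading + half),
            y + reach * math.sin(heading + half))
    right = (x + reach * math.cos(heading - half),
             y + reach * math.sin(heading - half))
    return [(x, y), left, right]


# Hypothetical camera at the origin, looking along +x with a 90 degree view.
triangle = view_triangle((0.0, 0.0), 0.0, 90.0, 1.0)
```

The apex of the triangle marks where the image was taken, and the angle between the two sides corresponds to the image's field of view.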
Benefits of method
Because no information about camera position or orientation is required, such an approach would be compatible with free-form, unsystematic collection of images with normal cameras. This could be done as part of a standard GI;
Interacting with the models (when they have been successfully created) enables inspectors to get a sense of being at the bridge;
The system can incorporate images from multiple sources and cameras, and can provide more detail when necessary;
Models (when successfully created) become quite intuitive to navigate with very little practice.
Problems with method
Struggles to create models if there are too many, or too few, features in the images;
Mis-location of images can cause confusing effects and produce parts of the model which are hard to navigate;
No way of correcting or over-ruling mistakes;
Interface does not allow inspection results to be overlaid on the images or model, and producing inspection tools to work with 3-D models is not a trivial task.
Systems such as Photosynth can produce impressive visualisations of scenes. Where absolute accuracy is not required, these can be informative and can help to mentally place a remote observer in a scene or location. However, they are susceptible to similar problems to the 2-D visualisation systems, in that they can struggle to locate images accurately and reliably within their models, particularly when presented with too many or too few features. It is also difficult to interact with the models and to mark the locations of defects in a quantitative and objective manner. It is felt that the benefits of the models produced are largely cosmetic and that adoption of such tools is, at the present time, unnecessary in routine visual inspection of bridges.