Discussion - Testing and evaluating non extractive sampling platforms to assess deep water rock

The composition of benthic habitats along a 4.6 km AUV track at the O’Hara Bluff site was predicted, with a generally high degree of confidence, using 26 automatically extracted image features in conjunction with the random forests ensemble classifier. With the exception of the rugosity feature, which requires stereo imagery, all feature extraction techniques presented in this study can be applied to any kind of imagery and to other (non-biological) disciplines, e.g., to classify sediment morphology in geological studies. Classifying images by habitat type with random forests on these image features proved successful and is considerably more cost-effective than traditional techniques such as manually assigning habitat classes. Cost-effectiveness is particularly important because the full predictor dataset produced the most accurate ensemble classifier, and large datasets may rapidly exceed the capacity of non-automated methods: our 4.6 km single-site dataset alone comprised 3586 images, each with 26 predictors. The finding that the full predictor set performed best is consistent with the view of Breiman (2002), who stated that “newer methods in machine learning thrive on variables – the more the better. There is no need for variable selection, . . . ”.

To understand the reasons for habitat misclassification, wrongly classified images were reviewed manually. The majority of misclassified images were partly exposed or under-exposed, which reduced the information obtained by the feature extraction routines. The remaining images were wrongly or inconsistently labelled by the human annotator. The causes of poor exposure were twofold: (i) uneven illumination and (ii) substrate reflectivity. Uneven, partial or insufficient illumination occurred mostly over extremely rugged terrain and during obstacle avoidance manoeuvres over highly complex reef environments with vertical drops of several metres. Different reflectivity and absorption properties of the substrates (e.g., sand, shell rubble, macroalgae, sponges) also caused under- or over-exposure. Whereas sand has a relatively high reflectivity, so that images of this habitat class may be slightly over-exposed, thick macroalgal cover has a low reflectivity, resulting in slightly under-exposed images. Although this was a minor effect, it was evident in the imagery. With respect to the poorly performing patch-gap summaries (PG), it is conceivable that predictor performance could be improved by adjusting the threshold used to convert grey-scale images to black-and-white images according to the overall brightness of each image. The current method used a constant threshold value of 128.

Wrong or inconsistent labelling of imagery by the human annotator occurred primarily for three reasons. First, the threshold that distinguished the habitat classes ‘screw shell rubble’ and ‘screw shell rubble/sand’ was defined as >50% and <50% screw shell cover, respectively. Without actually measuring the area, this estimate could be out by up to 20% (personal communication, Mark Green, CSIRO) and will differ between observers. The same principle applies to the habitat class ‘Ecklonia’, which was scored when Ecklonia radiata cover exceeded 50% of the image. Second, camera orientation affects the perception of sloping surfaces and appeared to account for the high misclassification rate of ‘low relief reef’ as ‘high relief reef’ and vice versa. The rule distinguishing ‘high’ and ‘low’ relief reef was >20 cm or <20 cm elevation change, respectively. In some situations this is a difficult annotation decision and is prone to error with a downward-looking field of view. Third, distinguishing between the habitat classes ‘patch reef’ and ‘reef-sand ecotone’ may in some cases require knowledge of neighbouring images, that is, whether a small isolated reef is ‘patch reef’ or the beginning of a larger reef and therefore ‘reef-sand ecotone’. Although this knowledge is available to a human annotator, random forests cannot use it because it classifies habitats on an image-by-image basis.
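The brightness-dependent threshold suggested above for the patch-gap summaries can be sketched as follows. This is a minimal illustration, not the method used in this study: it assumes that the ‘overall brightness’ is the mean pixel value, and the tiny image, the helper names and the white-pixel fraction used as a stand-in for a patch-gap summary are all hypothetical.

```python
from statistics import mean

FIXED_THRESHOLD = 128  # constant value reported in the text


def binarise(grey, threshold):
    """Convert a grey-scale image (rows of 0-255 ints) to black (0) / white (1)."""
    return [[1 if px >= threshold else 0 for px in row] for row in grey]


def mean_brightness(grey):
    """Overall brightness of the image, assumed here to be the mean pixel value."""
    return mean(px for row in grey for px in row)


def white_fraction(binary):
    """Share of white pixels: a crude, hypothetical stand-in for a patch-gap summary."""
    pixels = [px for row in binary for px in row]
    return sum(pixels) / len(pixels)


# Hypothetical 3x4 under-exposed image: a checkerboard texture
# in which every pixel value is well below 128.
dark = [
    [20, 60, 20, 60],
    [60, 20, 60, 20],
    [20, 60, 20, 60],
]

fixed = binarise(dark, FIXED_THRESHOLD)
adaptive = binarise(dark, mean_brightness(dark))

print(white_fraction(fixed))     # 0.0: all structure lost
print(white_fraction(adaptive))  # 0.5: light and dark patches separated
```

On this under-exposed example the constant threshold of 128 maps every pixel to black, whereas the mean-based threshold keeps the light/dark structure that a patch-gap summary would need.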

Jensen (2004) advocates the use of a hybrid neural network – classification tree system to reduce classification error. We cannot use our data to validate this proposal, but we were able to demonstrate differences in prediction accuracy between tree-based classification methods: CART, QUEST and random forests. Although Pal and Mather (2003) report that QUEST outperformed CART on terrestrial remotely sensed imagery, our results show the opposite: CART outperformed QUEST. Notably, random forests outperformed both CART and QUEST and also offered valuable built-in assessment tools to evaluate model performance. The first tool offers insights into ‘variable (predictor) importance’ and can be useful for excluding predictors that contribute little or nothing to prediction accuracy. Using this tool, the patch-gap summaries proved to be of little importance for overall accuracy but were essential for increasing the classification accuracy of the specific habitat class ‘screw shell rubble/sand’. However, an overall increase in accuracy will also be reflected in habitat-specific prediction accuracy.
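The idea behind variable importance can be illustrated by permuting one predictor at a time and recording the resulting drop in accuracy, both overall and for a single habitat class. The sketch below is only an illustration under stated assumptions: the rule-based `predict` function stands in for a trained random forests model, and the predictor names, cut-offs, class labels and data are invented.

```python
import random


# Hypothetical stand-in for a trained classifier; the feature names,
# cut-offs and data below are invented for illustration only.
def predict(row):
    brightness, patch_gap = row
    if brightness < 100:
        return "reef"
    return "rubble/sand" if patch_gap > 0.5 else "sand"


def accuracy(rows, labels, only=None):
    """Overall accuracy, or accuracy for a single class if `only` is given."""
    pairs = [(r, y) for r, y in zip(rows, labels) if only in (None, y)]
    return sum(predict(r) == y for r, y in pairs) / len(pairs)


def permutation_importance(rows, labels, col, only=None, repeats=50, seed=0):
    """Mean drop in accuracy after shuffling one predictor column."""
    rng = random.Random(seed)
    base = accuracy(rows, labels, only)
    drops = []
    for _ in range(repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
        drops.append(base - accuracy(permuted, labels, only))
    return sum(drops) / repeats


rows = [(40, 0.2), (60, 0.8), (80, 0.4), (70, 0.6),      # reef
        (150, 0.1), (170, 0.3), (160, 0.2), (180, 0.4),  # sand
        (155, 0.7), (165, 0.9)]                          # rubble/sand
labels = [predict(r) for r in rows]  # labels follow the rule, so baseline accuracy is 1.0

for name, col in [("brightness", 0), ("patch_gap", 1)]:
    overall = permutation_importance(rows, labels, col)
    per_class = {c: permutation_importance(rows, labels, col, only=c)
                 for c in sorted(set(labels))}
    print(name, round(overall, 3), {c: round(v, 3) for c, v in per_class.items()})
```

In this toy example, shuffling `patch_gap` leaves the ‘reef’ class untouched but can degrade ‘rubble/sand’, which shows how a predictor's importance for one class can differ from its overall importance.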

The discrepancy between the estimated error rate of the bootstrapped training data set and the prediction error rate could be remedied by ‘growing’ more trees, thereby giving random forests a better chance to learn. However, depending on the nature of the data, growing more trees does not always lower the estimated error rate. In our case, increasing the number of trees did not change the estimated error rate. In some cases, increasing the number of images in the training data set can help decrease the estimated error rate, although it also increases the human workload. Since this paper presents methods to decrease human workload, a balance must be struck between workload and accuracy.
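Whether additional trees still lower the estimated error can be checked by computing the out-of-bag error for ensembles of increasing size. The following is a toy sketch, assuming a bagged ensemble of one-split trees (stumps) as a stand-in for random forests and synthetic two-class data; none of the names or values come from this study.

```python
import random
from collections import Counter


def fit_stump(sample, rng):
    """Fit a one-split 'tree' on one randomly chosen feature (toy stand-in)."""
    col = rng.randrange(len(sample[0][0]))
    best = None
    for x, _ in sample:
        cut = x[col]
        left = Counter(y for xs, y in sample if xs[col] <= cut)
        right = Counter(y for xs, y in sample if xs[col] > cut)
        # Training errors if each side predicts its majority class.
        errors = len(sample) - max(left.values()) - max(right.values(), default=0)
        if best is None or errors < best[0]:
            l_lab = left.most_common(1)[0][0]
            r_lab = right.most_common(1)[0][0] if right else l_lab
            best = (errors, col, cut, l_lab, r_lab)
    return best[1:]


def stump_predict(stump, x):
    col, cut, l_lab, r_lab = stump
    return l_lab if x[col] <= cut else r_lab


def oob_error(data, n_trees, seed=0):
    """Out-of-bag error: each item is scored only by trees that never saw it."""
    rng = random.Random(seed)
    n = len(data)
    votes = [Counter() for _ in data]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        stump = fit_stump([data[i] for i in idx], rng)
        for i in set(range(n)) - set(idx):           # out-of-bag items
            votes[i][stump_predict(stump, data[i][0])] += 1
    scored = [(v.most_common(1)[0][0], data[i][1]) for i, v in enumerate(votes) if v]
    return sum(p != y for p, y in scored) / len(scored) if scored else float("nan")


# Synthetic, overlapping two-class data (hypothetical feature values).
gen = random.Random(1)
data = [((gen.gauss(0, 1), gen.gauss(0, 1)), "sand") for _ in range(30)] + \
       [((gen.gauss(1.5, 1), gen.gauss(1.5, 1)), "reef") for _ in range(30)]

for n_trees in (10, 50, 200):
    print(n_trees, round(oob_error(data, n_trees), 3))
```

If the printed error stops changing as the number of trees grows, the remaining error reflects the data (here, the class overlap) rather than the ensemble size.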

Prediction accuracy is related to the number of habitat classes. Reducing our habitat classes to two, i.e., rock and sand, would have resulted in virtually 100% prediction accuracy (Friedman, 2010). Lucieer and Pederson (2008) found similar improvements in accuracy by reducing the number of classes from 3 to 2 (72% vs 81%). It should be noted, though, that misclassifications are usually random. For example, assuming a 100 m stretch of homogeneous habitat, a minimum of 5 images out of ~80 images would have been misclassified at a significance level of 5%. These 5 images would have been placed randomly along the 100 m stretch and would not have changed the overall impression of the habitat composition. As with many other predictions in statistical modelling, it is good practice to provide an error estimate for every predicted value.

The entire method described in this paper, image feature extraction and classification, could easily be extended to full coverage acoustic data sets, e.g., gridded bathymetry and geo-referenced visual ground-truthing data such as digital stills and video. Our predicted habitat map provides a snapshot of the habitat composition that is bound to change over time. In its current form, it already provides evidence of the extent of the invasive New Zealand screw shell Maoricolpus roseus (Allmon et al., 1994).

The use of the AUV in conjunction with the new methods described in this study is not restricted to Tasmanian deep-water rocky reefs. Other highly complex and vulnerable habitats such as coral reefs can be monitored to assess storm damage to corals or the extent of coral bleaching events. Not only do images provide a permanent record that can be reviewed or analysed at a later date, their collection is also non-extractive, leaving habitats unchanged. In contrast to extractive sampling methods such as dredging, our non-invasive, geo-referenced imaging technique allows us to revisit the same sampling area in the future. Thus, image-based methods, supported by cost-effective methods of habitat mapping with known estimates of uncertainty, will underpin frameworks for monitoring programs to assess environmental change and management performance. Their applications will include determining the direct impact of bottom fishing methods and subsequent changes, particularly the changes that occur when previously disturbed areas are protected within marine reserves. With regard to habitat mapping, every subsequent, congruent survey could use the initial random forests model, thereby eliminating the need for human image annotation. In this context, we acknowledge the need for an uncertainty measure to account for prediction inaccuracies; here it is incorporated (rather coarsely) in the permuted habitat distributions. Future advances in image content recognition, or information extraction methods more sophisticated than those presented in this paper, might yield better habitat maps than are currently possible.
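One common way to attach an uncertainty measure to each predicted image, different from the permuted habitat distributions used here, is the share of trees in the ensemble that voted for the winning class. The sketch below is a minimal illustration under stated assumptions: the per-tree votes, the image names and the review cut-off `MIN_SHARE` are hypothetical.

```python
from collections import Counter


def vote_summary(tree_votes):
    """Return (predicted class, share of trees voting for it) for one image."""
    counts = Counter(tree_votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(tree_votes)


# Hypothetical votes from a 10-tree ensemble for three images.
images = {
    "img_001": ["sand"] * 10,
    "img_002": ["patch reef"] * 6 + ["reef-sand ecotone"] * 4,
    "img_003": ["low relief reef"] * 5 + ["high relief reef"] * 4 + ["sand"],
}

MIN_SHARE = 0.7  # assumed cut-off below which an image is flagged for review

for name, votes in images.items():
    label, share = vote_summary(votes)
    flag = "review" if share < MIN_SHARE else "ok"
    print(name, label, share, flag)
```

Images with a low vote share, such as the hypothetical ‘patch reef’ versus ‘reef-sand ecotone’ case, could then be passed to a human annotator instead of being accepted automatically.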

It is also conceivable that combining co-located acoustic backscatter data and AUV imagery would surpass the accuracy of the map presented in this study. These parameters could then be used to predict habitats outside the sampled area using MBES data, e.g., Rattray et al. (2009). Currently, acoustic backscatter analysis can distinguish hard and soft substrate with a high degree of confidence but would be insufficient as the sole source for distinguishing the 9 habitat classes used in this study. Backscatter data would be another predictor in the random forests algorithm, which would determine whether backscatter is a strong predictor for certain habitat classes. It should be stated that most of the above-mentioned advantages are closely linked to the design of the AUV itself. Geo-referenced imagery, a calibrated stereo camera system held at a virtually constant height above the seafloor, altimeters and depth sensors are all required to implement the techniques described here. In conclusion, the AUV served as an excellent, stable and mature platform in this study to autonomously survey benthic habitats. While this AUV is relatively large, Singh et al. (2004b) report successful AUV deployment from a 42 ft (12.8 m) vessel equipped with an A-frame during their investigation of coral reef habitats in Puerto Rico, indicating the high utility of AUV platforms for habitat mapping in many different environments.

Chapter 4

Beyond diver’s depth: evaluating stereo baited underwater video systems as a tool to monitor deep-water temperate reef fish assemblages on the continental shelf

In document Testing and evaluating non extractive sampling platforms to assess deep water rocky reef ecosystems on the continental shelf (Page 94-100)