2.2 Application of Computer Vision Techniques to Species Classification
2.2.3 Bird Species Classification using Motion Features
All the methods in Section 2.2.2 use single images and appearance-based models for classification; however, bird species also exhibit distinguishing behaviours (flight style, movement, poses, etc.) which could also support robust automated identification. This is particularly relevant to the identification of birds in flight, especially at a distance, where appearance-based features such as colour tend to attenuate whilst motion features remain discernible. In ecology, conservation and biology, motion-based vision techniques have been applied to tracking flying species and analysing their kinematics, behaviours and flight trajectories (Breslav et al., 2012). The analysis of bat motion using computer vision techniques has been widely studied for applications in these fields. Bats are nocturnal, and hence most studies have used thermal cameras to analyse their motion. Currently, research on bats focuses mostly on wing beat modelling (Breslav et al., 2012) and on counting bats as they emerge from caves (Hristov et al., 2010; Betke et al., 2008).
The most relevant studies using motion features with bird species are Duberstein et al. (2012), Cullinan et al. (2015) and Matzner et al. (2015), mentioned above, which have attempted to use motion features to differentiate between small numbers of species. Duberstein et al. (2012) explored the wing beat frequencies and flight patterns of bird and bat species, but did not use wing beat frequency for classification. Instead, they extracted descriptive statistics from the flight patterns of four species of birds and bats. These statistics (minimum, maximum, mean, standard deviation, quartiles and interquartile range) were used to cluster species; because the number of features was small, the classifier was fast. This work was redeveloped by Cullinan et al. (2015) for the broad classification of bats, swallows, terns and gulls. However, they used only 48 tracks from a 5-minute video recorded with a thermal camera, and such cameras can be expensive and difficult to deploy in the field. Matzner et al. (2015) enlarged the dataset of Cullinan et al. (2015) and applied the same techniques as Duberstein et al. (2012) and Cullinan et al. (2015) to classify species into categories such as bats, swallows, terns and gulls.
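As an illustration of the descriptive statistics listed above, the following minimal sketch summarises one flight track as a small feature vector. It is not the implementation used in the cited studies: the function name `track_features`, the choice of per-frame speed as the measured quantity, the quartile interpolation method and the sample values are all assumptions made for the example.

```python
from statistics import fmean, pstdev, quantiles


def track_features(values):
    """Summarise one flight track as a small, fixed-length feature vector.

    `values` is a hypothetical per-frame measurement for a single track
    (for example speed in pixels per frame); at least two samples are needed.
    """
    # Quartiles by linear interpolation between the sorted samples
    # (the interpolation method is an assumption of this sketch).
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "max": max(values),
        "mean": fmean(values),
        "std": pstdev(values),  # population standard deviation
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,  # interquartile range
    }


# Toy track: eight per-frame speed samples (made-up numbers).
speeds = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
features = track_features(speeds)
```

Each track is reduced to eight numbers regardless of its length, which is consistent with the observation that a small feature set keeps the subsequent clustering or classification step fast.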
Hitherto, work on flight patterns has been carried out either using recordings from tracking radar (Liechti and Bruderer, 2002; Bruderer et al., 2010; Zaugg et al., 2008) during the migration seasons of bird species, or using thermal cameras (Duberstein et al., 2012; Cullinan et al., 2015; Matzner et al., 2015). Most of the thermal camera studies recorded in the evening, when species are migrating; thermal cameras are also well suited to nocturnal species such as bats. Huang et al. (2013) used a graphical model with saliency to classify 9 species of birds by extracting Scale-Invariant Feature Transform (SIFT) and colour features, which were used to train different Support Vector Machine (SVM) classifiers, achieving a 73.8% classification rate, which is comparable to the results in Cullinan et al. (2015).
Characteristics of Flying Birds
Bird species can be identified by the speed at which they flap their wings, as well as by their wing beat patterns. The shape of a bird’s wings determines the way it flies. Smaller-winged birds tend to fly faster to maintain the same lift as those with larger wings (Cochran et al., 2008).
Most birds fly by combining flapping, gliding and soaring. The type of flight pattern produced by a particular bird species depends on its size, weight, wing span and shape. Smaller birds usually fly using short bursts of flapping alternated with intervals in which the wings are folded against the body. This flight pattern is known as "flap-bounding" flight (Tobalske et al., 2009). These birds usually abandon the conventional flap-glide pattern and glide only when they need to decelerate during flight. Examples of bird species with this characteristic flight pattern are finches and sparrows.
Some other small birds, such as swifts, swallows and martins, glide most of the time but occasionally flap to fly faster. Other species, such as finches and woodpeckers, have an undulating flight pattern, a roller-coaster style in which the bird flaps its wings during the rising phase and then glides as it descends.
Budgerigars fly using flap-gliding and tend to fly at only two distinct speeds. They switch between a high speed and a low speed suited to safe manoeuvring in a cluttered environment (Schiffner and Srinivasan, 2016). Birds such as ravens and hawks have flap-glide or flap-soar flight characteristics, in which flapping is interrupted by occasional periods of soaring or gliding. Interestingly, cockatiels blend the traditional flap-gliding with flap-bounding.
Some birds, such as parakeets, gulls, pigeons and doves, have a direct flight pattern, which consists of steady flight with rapid wing beats.
General Techniques for Analysing Motion Features
Various techniques have been used to measure periodic and cyclical motion, based on metrics derived from bounding boxes (Ghaderian et al., 2011), similarity matrices (Cutler and Davis, 2000; Plotnik and Rock, 2002; Lazarevic et al., 2008), object pose (Breslav et al., 2012), motion patterns (Ren et al., 2011) and point correspondence (Laptev et al., 2005). These mainly use autocorrelation and the Fast Fourier Transform (FFT) to estimate object motion periods. In the case of human activities, Ayyildiz and Conrad (2011) used motion moments to classify videos, calculating image moments and fitting a 1D time-domain function to the centroids and pixel variances. These functions were transformed into the frequency domain using the FFT, achieving a classification accuracy of between 59% and 70% for 10 home activities based on 200 videos. The disadvantage of this method is that it uses a radius-based classifier, whose results are highly dependent on the radius used. Ren et al. (2011) used motion templates to perform motion pattern analysis and extract periodicity information from sports videos, classifying two sports activities (weight lifting and dumbbells) as qualified or unqualified. This work achieved 93.5% and 97.7% correct classification respectively. Another technique used in periodicity estimation is point correspondence with a RANSAC procedure, as in Laptev et al. (2005), where it was used to match image sequences over a period of time in an attempt to detect periodicity. Ghaderian et al. (2011) detected periodicity in human activities by extracting image silhouettes and fitting bounding boxes around them. The distances from the target silhouette to the sides of the bounding box were obtained in four directions (top, left, right and bottom). For example, the horizontal distances from the edge of the bounding box to the contour of the object silhouette were computed and summed to form a 1D time-domain function. This work was tested using 50 periodic and 50 non-periodic videos, achieving a correct recognition rate of 97%. The technique is prone to recognition errors, especially when segmentation error is high, because bounding box metrics depend on the segmentation of the object from one frame to another.
The most cited paper in motion analysis is by Cutler and Davis (2000), which used object similarity to detect the periodicity of humans and dogs in videos. The object periodicity was estimated by extracting foreground images and resizing them to 9x15 pixels using a Mitchell filter (Schumacher, 1992). Image similarity matrices were then formed from these images using absolute correlation. To account for tracking error, the similarity matrices were minimised using a local search radius to form recurrence matrices. A Hanning filter was then applied to the recurrence matrices, and each column was transformed to the frequency domain using the FFT. The power spectra were then averaged and Equation 2.1 applied to select the ideal frequency, namely one whose power P(f_i) exceeds the mean power by more than k standard deviations. Chapter 3 develops algorithms for measuring wing beat frequencies using the FFT, which are evaluated initially on bat species and compared with the state-of-the-art technique of Cutler and Davis (2000).
$$P(f_i) > \mu_P + k\,\sigma_P \qquad (2.1)$$
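The peak test in Equation 2.1 can be sketched for a single 1D motion signal as follows, reading P(f_i) as the power at frequency f_i and the two terms on the right as the mean and standard deviation of the power spectrum. This is a simplified illustration rather than the procedure of Cutler and Davis (2000): it processes one signal instead of averaging the spectra of all recurrence-matrix columns, it uses a naive DFT where an FFT library would normally be used, and the function names, the default k = 3 and the synthetic signal are assumptions.

```python
import cmath
import math
from statistics import fmean, pstdev


def power_spectrum(signal):
    """Power of a mean-removed, Hann-windowed signal at DFT bins 1..N/2."""
    n = len(signal)
    mean = fmean(signal)
    # Hann (Hanning) window reduces spectral leakage at the signal ends.
    windowed = [
        (x - mean) * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))
        for t, x in enumerate(signal)
    ]
    power = []
    for k in range(1, n // 2 + 1):
        # Naive O(N^2) DFT coefficient for bin k; fine for short signals.
        coeff = sum(
            w * cmath.exp(-2j * math.pi * k * t / n)
            for t, w in enumerate(windowed)
        )
        power.append(abs(coeff) ** 2)
    return power


def dominant_frequency(signal, k=3.0):
    """Return the peak frequency in cycles per frame, or None if no bin
    satisfies P(f_i) > mean(P) + k * std(P)."""
    power = power_spectrum(signal)
    threshold = fmean(power) + k * pstdev(power)
    peak = max(range(len(power)), key=power.__getitem__)
    if power[peak] <= threshold:
        return None  # no significant periodicity detected
    return (peak + 1) / len(signal)  # list index 0 corresponds to bin 1


# Synthetic signal: 64 frames of a motion that repeats every 8 frames.
signal = [math.sin(2 * math.pi * t / 8) for t in range(64)]
freq = dominant_frequency(signal)
flat = dominant_frequency([1.0] * 64)
```

Multiplying the returned value by the camera frame rate converts it to cycles per second; a constant signal such as the second example yields no significant peak.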
The techniques of Cutler and Davis (2000) have also been used by Plotnik and Rock (2002) and Lazarevic et al. (2008), but predominantly for estimating periodicity in animals. Plotnik and Rock (2002) applied them to quantify the motion behaviour of jellyfish and detect changes in their motion mode. The only difference from Cutler and Davis (2000) is that they used the normalised sum of squared differences to form their similarity matrices. Lazarevic et al. (2008) used similarity matrices based on absolute correlation to differentiate airborne targets (birds and bats). To determine periodicity, they used the patterns produced by the similarity matrix plots. These works were more challenging than that of Cutler and Davis (2000) in two ways: filming bats and birds usually yields lower-resolution images than filming humans, and these animals behave more erratically, so measuring periodicity in these species requires a very robust technique. Recently, Breslav et al. (2012) proposed a method of computing the wing beat of bats by comparing every shape in the input shape time signal to a prototype template shape using the shape context descriptor and the Hungarian algorithm. The resulting shape pose scores were used to estimate the wing beats. The disadvantage of this method is that it assumes bats fly horizontally across the field of view, which is not realistic, as bat flight is erratic in nature.
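The two similarity measures mentioned above, absolute correlation and the normalised sum of squared differences, can be sketched as follows to show how a similarity matrix exposes periodicity. This is only an illustration under stated assumptions: frames are represented as flat lists of pixel intensities, the function names are hypothetical, and the normalisation shown (dividing by the square root of the product of the two frame energies) is one common choice, not necessarily the one used by Plotnik and Rock (2002).

```python
import math


def sum_abs_diff(a, b):
    """Absolute-difference measure between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b))


def normalised_ssd(a, b):
    """Sum of squared differences divided by sqrt(energy(a) * energy(b)).

    The normalisation is an assumption of this sketch.
    """
    ssd = sum((x - y) ** 2 for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if denom == 0:
        # At least one frame is all zeros: identical frames give 0.
        return 0.0 if ssd == 0 else math.inf
    return ssd / denom


def similarity_matrix(frames, distance=sum_abs_diff):
    """S[i][j] is the distance between frame i and frame j (0 = identical)."""
    n = len(frames)
    return [[distance(frames[i], frames[j]) for j in range(n)] for i in range(n)]


# Toy sequence: three 4-pixel "poses" repeated, so the period is 3 frames.
pattern = [[0, 1, 2, 3], [3, 2, 1, 0], [1, 1, 1, 1]]
frames = [pattern[t % 3] for t in range(9)]
S = similarity_matrix(frames)
N = similarity_matrix(frames, normalised_ssd)
```

For a periodic sequence, entries whose frame indices differ by a multiple of the period are zero (here `S[0][3]` and `S[0][6]`), which produces the diagonal line patterns in similarity matrix plots that are used to judge periodicity.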
Motion patterns have also been studied in fish, mainly to classify trajectories as either normal or abnormal, but only for a few species. Tian et al. (2014) differentiated normal and transgenic zebrafish using a histogram of bending along the zebrafish body, motion displacement vectors between two consecutive frames, speed and acceleration, and motion between consecutive frames in three-dimensional space over time. This work achieved a 73.99% recognition rate based on hybrid features. The Fish4Knowledge dataset introduced by Beyan and Fisher (2013b) was used in Beyan (2015) to classify fish trajectories by extracting 776 features, which were reduced to 140 features using PCA. These were used to cluster fish tracks as normal or abnormal. Fouad et al. (2013) used SIFT and SURF features with an SVM classifier to identify fish as tilapia or non-tilapia and achieved a classification accuracy of between 56.6% and 100%. The good results achieved by this work were due to the small number of classes involved. Recently, Spampinato et al. (2014) used video texture analysis and an SVM to study differences in fish behaviour when disruptive events such as typhoons happen, while Wang et al. (2015) analysed the behaviour of fish species using similarity-based periodicity detection combined with a k-nearest neighbours classifier. The method of Wang et al. (2015) was, however, not very robust to noise in the video, as it uses a similarity-based approach.