
Chapter 2 gives a comprehensive overview and comparison of existing work on automated systems for species classification. This includes not only bird species but also other animal species that have been studied using computer vision techniques. Literature on species identification and classification using bioacoustics, appearance and motion features is also evaluated. Finally, literature on other relevant work in feature selection, evaluation using imbalanced datasets, and machine learning and classification techniques is presented.

Chapter 3 presents initial work on the extraction of motion features. This first study used video data of bats collected at bat roosts rather than video of birds, partly because the data was readily available but also because bat motion is of a higher frequency and so is more challenging to analyse. The chapter also discusses low-level image processing techniques for dealing with low-light video data of bats and bird species. A number of techniques were introduced for the analysis of bat wing beat frequency, and the results from these methods were

1.7. ORGANISATION OF THE THESIS 11

evaluated against the state-of-the-art method of Cutler and Davis (2000).
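As a rough illustration of what estimating a wing beat frequency can involve, the sketch below finds the strongest periodic component of a one-dimensional signal using a plain discrete Fourier transform. It is not the method developed in Chapter 3, nor that of Cutler and Davis (2000). The function name `dominant_frequency`, the idea of one measurement per video frame, and the 10 Hz and 100 frames-per-second values are all assumptions made for the example.

```python
import cmath
import math


def dominant_frequency(signal, fps):
    """Return the frequency (Hz) of the strongest non-DC component of a signal.

    `signal` is assumed to hold one measurement per video frame (for example
    a silhouette width), and `fps` is the frame rate. Both are hypothetical.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove the constant (DC) offset
    best_k, best_power = 0, 0.0
    # For a real-valued signal only bins 1..n//2 carry distinct information.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    # Convert the winning bin index to a frequency in Hz.
    return best_k * fps / n


# Synthetic example: a 10 Hz oscillation sampled at 100 frames per second.
fps = 100
samples = [math.sin(2 * math.pi * 10 * t / fps) for t in range(200)]
estimate = dominant_frequency(samples, fps)
```

The direct transform is quadratic in the signal length, which is acceptable for short clips; a fast Fourier transform from a numerical library would be the usual choice for longer sequences.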

Chapter 4 presents work on the classification of bird species in flight using only appearance features. The chapter details the datasets used in all evaluations and introduces the statistical features that represent the feature sets used in this research. The work uses a rich set of appearance features with standard classifiers (Naive Bayes, Support Vector Machine, Random Decision Trees and Random Forest) to classify species. Three datasets were used for evaluation: the seven-species dataset (Dataset #1), the thirteen-class dataset (Dataset #2) and the Caltech-UCSD Birds-200-2011 dataset. The results obtained with these appearance features compare favourably with the existing state-of-the-art image-based classifier of Marini et al. (2013), outperforming it on all three datasets: by 9% on Caltech-UCSD Birds-200-2011, 6% on Dataset #1 and 9% on Dataset #2.

Chapter 5 presents the work on fusing appearance and motion features. A rich set of motion features was used together with the appearance features from Chapter 4 and the standard classifiers (Naive Bayes, Support Vector Machine, Random Decision Trees and Random Forest) to classify bird species. Dataset #2 was used for evaluation. Using only motion features, a classification accuracy of 38% was achieved; fusing them with the appearance features gave a classification accuracy of 85%. Compared with using appearance features alone, this was a slight reduction in classification accuracy of approximately 1%, which motivates the work in the following chapter.

Chapter 6 presents the results of work on feature selection. Combining appearance and motion features in Chapter 5 resulted in a large set of features, which slightly degraded performance. To improve classification, feature selection techniques were applied to remove redundant features. The most important features were identified using correlation-based and classifier-based feature selection techniques. These were used to classify species in Dataset #2 with the four standard classifiers (Naive Bayes, Support Vector Machine, Random Decision Trees and Random Forest), and a correct classification rate of 89% was attained, approximately 4% better than in Chapter 5. A further experiment was performed to determine the contribution of the selected motion features to overall performance. The conclusion was that using the motion features together with the appearance features improves classification by approximately 4% across all four standard classifiers.
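To make the idea of removing redundant features concrete, the following is a minimal sketch of one simple correlation-based filter: a feature is kept only if its absolute Pearson correlation with every feature already kept stays below a threshold. This is not necessarily the selection technique used in Chapter 6. The helper names, the 0.95 threshold and the toy feature values are assumptions made purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / (sx * sy)


def drop_redundant(columns, threshold=0.95):
    """Greedily keep the indices of features not highly correlated with kept ones.

    `columns` is a list of feature columns (one list of values per feature).
    The threshold value is a hypothetical choice.
    """
    kept = []
    for idx, col in enumerate(columns):
        if all(abs(pearson(col, columns[j])) < threshold for j in kept):
            kept.append(idx)
    return kept


# Toy data: feature 1 is an exact multiple of feature 0, so it is redundant.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 1.0, 3.0, 2.0],
]
selected = drop_redundant(features)
```

A filter of this kind only looks at redundancy between features. Classifier-based selection instead scores candidate subsets by the accuracy a classifier achieves with them, which is more expensive but takes the class labels into account.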

In Chapter 7, further improvement in the correct classification rates was attempted. The work in Chapters 4 to 6 presents results based on classification using single frames and subsets of video. This was extended to combine the results of several frames from a sequence using majority voting. It has been established that combining the outputs of several classifications results in better overall accuracy (Bhattacharya and Chaudhuri, 2003). This is because different classifications can capture different aspects of the input data, whereas a single one will not usually represent them all. The work presented here uses the results of Chapter 6 with majority voting and the four standard classifiers (Naive Bayes, Support Vector Machine, Random Decision Trees and Random Forest) to classify bird species. The datasets used were the seven-species dataset and the extended thirteen-class dataset.
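The majority voting step can be sketched as follows, assuming each frame of a sequence has already been given a label by some classifier. The function name `majority_vote` and the species labels are hypothetical, and the tie-breaking rule shown is only one possible choice.

```python
from collections import Counter


def majority_vote(frame_labels):
    """Return the label predicted for the most frames of a sequence.

    Ties are resolved in favour of the label that appears first in the
    sequence, which is how Counter.most_common orders equal counts.
    """
    counts = Counter(frame_labels)
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical per-frame predictions for one video sequence.
per_frame = ["gull", "gull", "crow", "gull", "pigeon"]
sequence_label = majority_vote(per_frame)
```

Here two of the five frames are misclassified, yet the sequence as a whole still receives the label assigned to the majority of its frames.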

Finally, Chapter 8 discusses the results, contributions and limitations of this thesis and outlines future work.

Chapter 2

Literature Review

The previous chapter discussed the importance of monitoring bird species. In particular, it noted that ecologists monitor them to determine the factors causing population fluctuation and to help conserve and manage threatened and endangered species. The various surveys used for counting bird species, including data collection techniques, were briefly reviewed. It was established that a small but growing number of researchers have studied the use of computer vision for monitoring species, particularly for counting bat species.

This chapter evaluates studies reported in the literature that relate to the monitoring and classification of species. In particular, it reviews work on birds, fish and bats: the techniques used for these species are often similar, and motion features, which this research investigates for the classification of birds, have been used for both bats and fish. First, techniques that perform classification using single images are explored. These are reviewed separately as those used for the classification of bird species and those used for other species, specifically bats and fish.

The work on bird species is then split into two sections: part-free and part-based models. Feature selection and machine learning techniques relevant to the classification of these species are reviewed separately. Finally, a brief overview of a video classification system using computer vision techniques is presented. This chapter is structured into the following sections:

• Section 2.2 reviews the classification of species using computer vision techniques, first looking at literature related to bat and fish species classification and monitoring, and then at literature related to bird species.

• Sections 2.4 and 2.5 review feature selection and reduction methods and techniques for imbalanced datasets, respectively.

• Finally, Section 2.7 reviews machine learning algorithms for classification, and Section 2.8 gives an overview of the video classification system.

2.1 Classification of Bird Species using Bio-acoustics

A number of existing attempts to automate the identification of birds have used audio rather than visual signals, such as Briggs et al. (2012), Neal et al. (2011) and Bardeli et al. (2010). In particular, Briggs et al. classified 413 bird songs, each 30 seconds long, using the Fast Fourier Transform (FFT) and a Nearest Neighbour (NN) classifier. This achieved a remarkable result (92%), which was due to the FFT filtering out most of the noise in the signal. Neal et al. proposed a supervised time-frequency audio segmentation to extract syllables of bird calls. They then applied Random Forest (RF) to classify 625 bird songs and achieved a correct classification rate of 83.6%. Even though the number of species used in Neal et al. was larger than that of Briggs et al., the improvement in classification was attributed to the state-of-the-art Random Forest classifier used. Random Forest builds an ensemble of randomised decision trees, each trained on a random sample of the data, and combines their outputs by majority voting, which helps improve classification rates.

All of the above studies used FFT to improve the quality of the signals, as audio recordings taken in the field are usually buried in noise. Lopes et al. (2011) demonstrated that the classification rate falls as the number of species increases, even with FFT. They performed experiments using 3, 5, 8, 12 and 20 species, first applying FFT to the signals to reduce the noise and then using the Sound Ruler software with a Naive Bayes (NB) classifier to classify species by their vocalisations. Lopes et al. showed that, when 3 species were chosen at random and the results averaged, the classification rate was 61.5%, whereas it was 25.4% for 20 species. Other work using vocals did not attempt

2.2. APPLICATION OF COMPUTER VISION TECHNIQUES TO SPECIES CLASSIFICATION 15