3.4 Conclusions
4.1.4 The Statistical Features
Almost all extracted features were represented as computed statistical features, which were used to form a feature vector for the classification of species. This keeps the feature vector compact so that the classification model runs faster, as larger feature sets increase the processing time and may also reduce the classification rate (Yu and Liu, 2003). For clarity, these statistical features are introduced briefly in the following sections.
The First-order Histogram
The first-order histogram probability provides information about the distribution of the intensity levels of an image (Sergyan, 2008) and can be used to represent an image. Given an image I of size N by M, the first-order probability is defined as:
$$p_i = \frac{n_i}{NM} \tag{4.1}$$
4.1. DATASETS ANDMETHODS 89
where $i$ is the intensity value at a point in the image, $p_i$ is the probability of intensity value $i$ and $n_i$ is the number of occurrences of intensity value $i$ in the image I; $0 \le p_i \le 1$ and $\sum_{i=0}^{NM} p_i = 1$. From the probability distribution function, other statistical features are extracted as introduced in the following sections.
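As an illustration, the first-order probabilities could be computed as in the following minimal sketch. The function name `histogramProbabilities`, the flattened single-channel `pixels` array and the `levels` parameter are assumptions made for this example; they are not taken from the implementation used in this thesis.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns p[i] = n_i / (number of pixels) for
// i = 0 .. levels - 1. `pixels` is assumed to hold one channel of the
// image, flattened to a 1-D array of N * M values.
std::vector<float> histogramProbabilities(const std::vector<std::uint8_t>& pixels,
                                          std::size_t levels) {
    // n_i: how many pixels take each intensity value.
    std::vector<std::size_t> counts(levels, 0);
    for (std::uint8_t value : pixels) {
        const std::size_t bin = static_cast<std::size_t>(value);
        if (bin < levels) {  // assumption: out-of-range values are ignored
            ++counts[bin];
        }
    }

    std::vector<float> probs(levels, 0.0f);
    if (pixels.empty()) {
        return probs;  // no pixels: every probability stays at zero
    }

    // Divide by the pixel count N * M so that the probabilities sum to 1.
    const double total = static_cast<double>(pixels.size());
    for (std::size_t i = 0; i < levels; ++i) {
        probs[i] = static_cast<float>(counts[i] / total);
    }
    return probs;
}
```

For a four-pixel channel with values 0, 0, 1 and 3 and `levels = 4`, the sketch yields the probabilities 0.5, 0.25, 0 and 0.25.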
The Mean of the First-order Histogram
The mean of the first-order histogram probability describes the general brightness of the image. A high mean implies a bright image and a low mean implies a dark image. The mean is defined as:
$$\bar{p} = \sum_{i=0}^{k} i \, p_i \tag{4.2}$$
where $k = 256$ for the saturation and value colour channels and $k = 180$ for the hue colour channel (channel values are normalised). Where a grayscale image is used, $k = 256$.
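The mean in Equation 4.2 is a probability-weighted sum of the intensity values, which can be sketched as below. The function name `histogramMean` is hypothetical, and the sketch assumes that the probabilities are stored in a vector whose index is the intensity value, so the sum runs over the bins that are actually present.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of the first-order histogram (Equation 4.2).
// Assumes probs[i] is the probability of intensity value i.
double histogramMean(const std::vector<float>& probs) {
    double mean = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        mean += static_cast<double>(i) * probs[i];  // i * p_i
    }
    return mean;
}
```

With the probabilities 0.5, 0.25, 0 and 0.25 for intensities 0 to 3, the mean is 0(0.5) + 1(0.25) + 2(0) + 3(0.25) = 1.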
The Standard Deviation of the First-order Histogram
The standard deviation of the first-order histogram probability describes the contrast of the image. A high variance implies a high contrast and a low variance implies low contrast. The standard deviation is defined as:
$$\sigma = \sqrt{\sum_{i=0}^{k} (i - \bar{p})^2 \, p_i} \tag{4.3}$$
where $k = 256$ for saturation and value and $k = 180$ for hue (channel values are normalised), and $\bar{p}$ is the mean of the first-order histogram probability. Where a grayscale image is used, $k = 256$.
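A minimal sketch of Equation 4.3 follows. It first computes the mean and then accumulates the probability-weighted squared deviations. The function name `histogramStdDev` is an assumption for this example, as is the convention that the vector index is the intensity value.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: standard deviation of the first-order histogram
// (Equation 4.3). Assumes probs[i] is the probability of intensity i.
double histogramStdDev(const std::vector<float>& probs) {
    // Mean of the histogram (Equation 4.2).
    double mean = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        mean += static_cast<double>(i) * probs[i];
    }
    // Probability-weighted squared deviation from the mean.
    double variance = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        const double deviation = static_cast<double>(i) - mean;
        variance += deviation * deviation * probs[i];
    }
    return std::sqrt(variance);
}
```

For the probabilities 0.5, 0 and 0.5 the mean is 1 and each occupied level lies one step away from it, so the standard deviation is 1.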
The Skewness of the First-order Histogram
The skewness of the first-order histogram probability measures the asymmetry of the intensity level distribution about the mean. The skewness is defined as:
$$\text{skewness} = \frac{1}{\sigma^3} \sum_{i=0}^{k} (i - \bar{p})^3 \, p_i \tag{4.4}$$
where $k = 256$ for saturation and value, $k = 180$ for hue (channel values are normalised) and $\sigma \neq 0$. $\bar{p}$ is the mean of the first-order histogram probability. Where a grayscale image is used, $k = 256$.
The Energy of the First-order Histogram
The energy describes how the intensity levels are distributed in the image. A high energy indicates that the image contains few intensity levels, meaning that the distribution is concentrated in only a small number of different intensity levels. The energy is defined as:
$$\text{energy} = \sum_{i=0}^{k} |p_i|^2 \tag{4.5}$$
where $k = 256$ for saturation and value and $k = 180$ for hue (channel values are normalised). Where a grayscale image is used, $k = 256$.
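The energy is the sum of the squared probabilities, as in the following sketch. The function name `histogramEnergy` is hypothetical.

```cpp
#include <vector>

// Hypothetical helper: energy of the first-order histogram (Equation 4.5),
// the sum of the squared probabilities.
double histogramEnergy(const std::vector<float>& probs) {
    double energy = 0.0;
    for (float p : probs) {
        energy += static_cast<double>(p) * p;
    }
    return energy;
}
```

The two extremes illustrate the interpretation given above: a histogram concentrated in a single intensity level has an energy of 1, whereas a histogram spread evenly over four levels has an energy of 4(0.25)(0.25) = 0.25.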
The Entropy of the First-order Histogram
The entropy of the first-order histogram probability measures how many bits are needed to code the image data. The entropy increases as the pixel values in the image are distributed among more intensity levels. This measure tends to vary inversely with the energy. The entropy is defined as:
$$\text{entropy} = -\sum_{i=0}^{k} p_i \log_2(p_i) \tag{4.6}$$
where $k = 256$ for saturation and value and $k = 180$ for hue (channel values are normalised). Where a grayscale image is used, $k = 256$.
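A minimal sketch of the entropy computation is given below. The function name `histogramEntropy` is hypothetical. The sketch also assumes the usual convention that an empty bin contributes nothing to the sum, since the logarithm of zero is undefined; this convention is an assumption of the example and is not stated in the definition above.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: entropy of the first-order histogram in bits
// (Equation 4.6).
double histogramEntropy(const std::vector<float>& probs) {
    double entropy = 0.0;
    for (float p : probs) {
        if (p > 0.0f) {  // assumption: empty bins are skipped (0 * log 0 = 0)
            entropy -= p * std::log2(static_cast<double>(p));
        }
    }
    return entropy;
}
```

A histogram spread evenly over four intensity levels gives an entropy of 2 bits, while a histogram concentrated in a single level gives 0 bits.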
The Kurtosis of the First-order Histogram
The kurtosis measures the peakedness of the distribution of the intensity values around the mean (Malik and Baharudin, 2013). Distributions with a high kurtosis usually have a distinct peak near the mean that declines rather rapidly and have heavy tails, while those with a low kurtosis have a flat top near the mean. A uniform distribution is the extreme case. The kurtosis can be computed as:
$$\text{kurtosis} = \frac{1}{\sigma^4} \sum_{i=0}^{k} (i - \bar{p})^4 \, p_i \tag{4.7}$$
where $k = 256$ for saturation and value, $k = 180$ for hue (channel values are normalised), $\sigma$ is the standard deviation of the first-order histogram probability and $\sigma \neq 0$. $\bar{p}$ is the mean of the first-order histogram probability. Where a grayscale image is used, $k = 256$.
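The skewness (Equation 4.4) and the kurtosis (Equation 4.7) differ only in the power applied to the deviation from the mean, so both can be sketched with one function. The function name `standardisedMoment` and its `order` parameter are hypothetical, and returning 0 when the standard deviation is zero is an assumption of this example, since both measures are undefined in that case.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: (1 / sigma^order) * sum_i (i - mean)^order * p_i.
// order = 3 gives the skewness (4.4); order = 4 gives the kurtosis (4.7).
double standardisedMoment(const std::vector<float>& probs, int order) {
    double mean = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        mean += static_cast<double>(i) * probs[i];
    }
    double variance = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        const double deviation = static_cast<double>(i) - mean;
        variance += deviation * deviation * probs[i];
    }
    const double sigma = std::sqrt(variance);
    if (sigma == 0.0) {
        return 0.0;  // assumption: undefined case is reported as 0
    }
    double moment = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        moment += std::pow(static_cast<double>(i) - mean, order) * probs[i];
    }
    return moment / std::pow(sigma, order);
}
```

For the symmetric probabilities 0.25, 0.5 and 0.25 the skewness is 0, and the kurtosis is 0.5 / (0.5)(0.5) = 2.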
The Local Maxima and Local Minima of the First-order Histogram
Local maxima and minima occur at critical points where the derivative of the first-order probability function is zero. These are usually the peaks and valleys of the distribution, and they provide information on the sets of dominant and less dominant intensity values. In an image, peaks represent areas of high intensity and valleys represent areas of low intensity; both may be relevant features because they mark important image objects. In the first-order probability function there may be multiple regional maxima and/or minima, but there can only be a single global maximum or minimum. These features are used by counting the number of minima and maxima in the first-order function, and the counts form part of the feature set. Algorithms 2 and 3 are used to count the minima and maxima of the first-order histogram function respectively.
Algorithm 2: Find the number of local minima in the first-order histogram
Input: probs, the vector of first-order histogram probabilities
LocalMin ← 0
for i = 0 to probs.size() − 3 do
    if probs(i) − probs(i + 1) ≥ 0 and probs(i + 1) − probs(i + 2) ≤ 0 then
        LocalMin ← LocalMin + 1
    end
end
return LocalMin
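The counting rule for local minima, together with the mirrored rule for local maxima, can be sketched in C++ as follows. The function names `countLocalMinima` and `countLocalMaxima` are hypothetical. The loop condition `i + 2 < probs.size()` is a choice made for this sketch so that the index `i + 2` always refers to an existing bin.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts bins that are no larger than both neighbours.
int countLocalMinima(const std::vector<float>& probs) {
    int count = 0;
    // The middle bin of each triple (i, i + 1, i + 2) is tested.
    for (std::size_t i = 0; i + 2 < probs.size(); ++i) {
        if (probs[i] >= probs[i + 1] && probs[i + 1] <= probs[i + 2]) {
            ++count;
        }
    }
    return count;
}

// Hypothetical helper: counts bins that are no smaller than both neighbours.
int countLocalMaxima(const std::vector<float>& probs) {
    int count = 0;
    for (std::size_t i = 0; i + 2 < probs.size(); ++i) {
        if (probs[i] <= probs[i + 1] && probs[i + 1] >= probs[i + 2]) {
            ++count;
        }
    }
    return count;
}
```

Because the comparisons are not strict, a bin inside a flat run of equal probabilities satisfies both tests and is therefore counted both as a minimum and as a maximum.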
Algorithm 3: Find the number of local maxima in the first-order histogram
Input: probs, the vector of first-order histogram probabilities
LocalMax ← 0
for i = 0 to probs.size() − 3 do
    if probs(i) − probs(i + 1) ≤ 0 and probs(i + 1) − probs(i + 2) ≥ 0 then
        LocalMax ← LocalMax + 1
    end
end
return LocalMax
The Maximum and Minimum of the First-order Histogram
The maximum and minimum provide information on the dominant and less dominant intensity values of the image. To compute the minimum and maximum, the probability distribution of intensity values is passed to Algorithms 4 and 5 respectively.
Algorithm 4: Find the minimum intensity from the first-order histogram
Input: probs, the vector of first-order histogram probabilities
Min ← ∞
MinIntensity ← 0
for i = 0 to probs.size() − 1 do
    if probs(i) ≤ Min then
        Min ← probs(i)
        MinIntensity ← i
    end
end
return MinIntensity
Algorithm 5: Find the Maximum intensity from the first-order histogram
Input: probs, the vector of first-order histogram probabilities
Max ← 0
MaxIntensity ← 0
for i = 0 to probs.size() − 1 do
    if probs(i) ≥ Max then
        Max ← probs(i)
        MaxIntensity ← i
    end
end
return MaxIntensity
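The search for the least and most dominant intensity values can be sketched as below. The function names `minIntensity` and `maxIntensity` are hypothetical. As an assumption of this sketch, the running minimum and maximum start from the first bin rather than from a constant, and an empty histogram returns 0.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: intensity value with the smallest probability.
// With "<=", ties are resolved in favour of the highest intensity.
std::size_t minIntensity(const std::vector<float>& probs) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < probs.size(); ++i) {
        if (probs[i] <= probs[best]) {
            best = i;
        }
    }
    return best;
}

// Hypothetical helper: intensity value with the largest probability.
// With ">=", ties are resolved in favour of the highest intensity.
std::size_t maxIntensity(const std::vector<float>& probs) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < probs.size(); ++i) {
        if (probs[i] >= probs[best]) {
            best = i;
        }
    }
    return best;
}
```

For the probabilities 0.2, 0.1, 0.4 and 0.3, the sketch returns intensity 1 as the minimum and intensity 2 as the maximum.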
The Zero Crossings of the First-order Histogram
This is the rate of sign changes along the first-order histogram probability distribution. Gouyon et al. (2000) used zero crossings as a key feature to successfully classify percussive sounds. In this thesis, the number of zero crossings in the first-order probability distribution function was counted (see Algorithm 6) and used as one of the features.
4.2. APPEARANCEFEATURESEXTRACTED 93
Algorithm 6: Find the number of zero crossings in the first-order histogram
Input: probs, the vector of first-order histogram probabilities
zerCross ← 0
probsMean ← Mean(probs)
for i = 0 to probs.size() − 1 do
    probs(i) ← probs(i) − probsMean
end
for i = 0 to probs.size() − 2 do
    Sign1 ← sign of probs(i)
    Sign2 ← sign of probs(i + 1)
    if Sign1 ≠ Sign2 then
        zerCross ← zerCross + 1
    end
end
return zerCross
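A minimal C++ sketch of the zero-crossing count follows. The function name `countZeroCrossings` is hypothetical. The sketch compares each probability with the mean directly instead of modifying the input, and it treats a value exactly equal to the mean as non-negative; this handling of zero is an assumption of the example.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: counts sign changes between neighbouring bins
// after the mean probability has been subtracted.
int countZeroCrossings(const std::vector<float>& probs) {
    if (probs.size() < 2) {
        return 0;  // fewer than two bins: no neighbouring pair to compare
    }
    // Mean of the probabilities, accumulated in double precision.
    const double mean =
        std::accumulate(probs.begin(), probs.end(), 0.0) / probs.size();

    int crossings = 0;
    for (std::size_t i = 0; i + 1 < probs.size(); ++i) {
        // Assumption: a centred value of exactly zero counts as non-negative.
        const bool currentNonNegative = (probs[i] - mean) >= 0.0;
        const bool nextNonNegative = (probs[i + 1] - mean) >= 0.0;
        if (currentNonNegative != nextNonNegative) {
            ++crossings;
        }
    }
    return crossings;
}
```

For the probabilities 0.4, 0.1, 0.4 and 0.1 the centred values alternate in sign, giving three zero crossings, while a perfectly uniform histogram gives none.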