4.2 Experimental Data for Head Turn Analysis
4.3.3 Evaluation of the Extraction Algorithm
Accuracy at finding the start and end times of the head turn
To evaluate the effectiveness of the extraction algorithm, all raw data from all age groups (see Table 4.2 for numbers) were judged by hand by the author to find the start and end of each head turn. This was done by visually inspecting the motion tracking data trial by trial and marking Ts and Te, which took several seconds per trial. As and Ae were obtained by finding the corresponding head angle at Ts and Te respectively. This data was then used as a comparison for the extraction algorithm. The differences between the human and algorithm judgements for the start (Ts,ha) and the end (Te,ha) of the head turn were calculated as shown in Equation 4.2 and Equation 4.1 respectively. It is important to get the extraction algorithm as close to the human judged data as possible so that accurate measurements of the head angle can be made.
Te,ha = Te,h − Te,a (4.1)
Ts,ha = Ts,h − Ts,a (4.2)
Where Te,h and Ts,h are the end and start of the head turn as picked by a human observer, and Te,a and Ts,a are the end and start of the head turn as picked by the algorithm. The differences are not converted to absolute values, because the signs of Te,ha and Ts,ha indicate whether the algorithm is under- or overshooting the hand picked location.
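As an illustration of Equations 4.1 and 4.2, the signed differences for a single trial could be computed as in the following minimal sketch. The `TurnTimes` container, the function name and the example values are assumptions made for illustration and are not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class TurnTimes:
    """Start and end of one head turn in milliseconds (hypothetical container)."""
    start_ms: float
    end_ms: float


def time_differences(human: TurnTimes, algorithm: TurnTimes) -> tuple[float, float]:
    """Return (Ts_ha, Te_ha): human-picked time minus algorithm-picked time."""
    ts_ha = human.start_ms - algorithm.start_ms  # Equation 4.2
    te_ha = human.end_ms - algorithm.end_ms      # Equation 4.1
    return ts_ha, te_ha


# Made-up values for illustration only, not taken from the experimental data.
human = TurnTimes(start_ms=1200.0, end_ms=2100.0)
algorithm = TurnTimes(start_ms=1500.0, end_ms=2000.0)
ts_ha, te_ha = time_differences(human, algorithm)
print(ts_ha, te_ha)  # -300.0 100.0
```

The sign is deliberately kept: in this made-up trial the algorithm marks the start 300 ms later than the human observer and the end 100 ms earlier.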
Age Group          Start (ms) M (SD)    End (ms) M (SD)
1.0 to 2.20 yrs    -647.2 (+644.4)      -131.2 (+540.9)
2.20 to 4.0 yrs    -703.0 (+684.9)      -189.5 (+508.3)
4.0 to 5.0 yrs     -659.1 (+632.9)      -116.0 (+484.8)
Adult              -140.6 (+173.5)      +225.4 (+135.2)
Table 4.3: The time differences between the hand picked start and end of head turns and those extracted using the algorithm. The results show large means and standard deviations for the start of head turn extraction. This error is large across all of the child age groups but lower in the adult data. The mean error for the end of the head turn is relatively small for both the adults and the children; however, the standard deviation is large for the children and lower for the adults.
Table 4.3 shows the mean and standard deviation of the time difference between the hand judged responses and the extraction algorithm. The algorithm is required to be as close to the hand judged responses as possible. The data shows that, on average, the algorithm finds the end of the head turn for the children to within 200 ms; however, the standard deviations are large, at around 500 ms. This suggests that the extraction algorithm is not consistent in finding the end of the head turn. For the start of the head turn the accuracy is very poor, with the mean error for every child age group exceeding 600 ms in magnitude (between -647.2 ms and -703.0 ms).
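The per-group summary reported in Table 4.3 amounts to taking the mean and standard deviation of the signed per-trial differences. A minimal sketch is given below; the use of the sample standard deviation (n - 1 denominator) is an assumption, as is the function name, and the trial values are made up.

```python
import statistics


def summarise(differences_ms: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of signed differences in ms."""
    mean = statistics.mean(differences_ms)
    sd = statistics.stdev(differences_ms)  # n - 1 denominator (assumed)
    return mean, sd


# Made-up per-trial Ts,ha values for one hypothetical age group.
ts_ha_trials = [-100.0, -300.0, -500.0, -700.0]
mean, sd = summarise(ts_ha_trials)
print(f"{mean:.1f} ({sd:.1f})")  # -400.0 (258.2)
```

A mean close to zero with a large standard deviation would indicate an algorithm that is unbiased but inconsistent, whereas a large mean with a small standard deviation would indicate a consistent offset.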
For the adult data, the accuracy in detecting the end of the head turn is poor (+225.4 ms). The adult data does, however, show a lower standard deviation, showing that the judgement of the extraction algorithm is more consistent. Ts,a
noise, unlike the child data.
By looking at the signs of Ts,ha and Te,ha, the algorithm can be seen to be either under- or overestimating the point where the human observer judged the response. Because the differences are computed as human minus algorithm, a negative value means that the algorithm placed the point later than the human observer (an overestimate) and a positive value means that it placed the point earlier (an underestimate). For all the child age groups, the algorithm is overestimating the end of the head turn. However, for the adults the algorithm is underestimating it. For the start of a head turn, all groups tested show an overestimation. The adult data shows that the algorithm is consistent in misclassification (high errors and low standard deviations), suggesting that it is not good at finding the start and end of the head turn.
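The direction of the error follows directly from the sign of the human-minus-algorithm difference defined in Equations 4.1 and 4.2. The sketch below applies this to the mean end of head turn differences from Table 4.3; the function name and the optional `tolerance_ms` parameter are hypothetical.

```python
def direction(difference_ms: float, tolerance_ms: float = 0.0) -> str:
    """Classify a signed human-minus-algorithm time difference."""
    if difference_ms < -tolerance_ms:
        return "overestimate"   # algorithm picked a later time than the human
    if difference_ms > tolerance_ms:
        return "underestimate"  # algorithm picked an earlier time than the human
    return "match"


# Mean end of head turn differences (Te,ha) as reported in Table 4.3.
mean_te_ha = {
    "1.0 to 2.20 yrs": -131.2,
    "2.20 to 4.0 yrs": -189.5,
    "4.0 to 5.0 yrs": -116.0,
    "Adult": 225.4,
}
for group, value in mean_te_ha.items():
    print(group, direction(value))
```

With these values the three child groups are classified as overestimates and the adult group as an underestimate.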
Accuracy at finding the start and end angles of the head turn
To find the angular difference between the human judged and algorithm judged responses, Equation 4.3 and Equation 4.4 are used.
Ae,ha = Ae,h − Ae,a (4.3)
As,ha = As,h − As,a (4.4)
The main reason the extraction algorithm was developed was so that it could automatically evaluate the localisation ability of the child. The point at which the sound is localised, Ae, is taken as the head angle at time Te. Using Equation 4.3, the difference between the hand picked and the extracted end of head turn angle can be computed. The differences are plotted as a histogram showing Ae,ha against the number of occurrences.
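The steps described here (reading the head angle at Te, forming Ae,ha and binning the results) could be sketched as follows. The nearest-sample lookup, the 10° bin width and all names are assumptions made for illustration; the original analysis may have used a different lookup or binning.

```python
import bisect
from collections import Counter


def angle_at(times_ms: list[float], angles_deg: list[float], t_ms: float) -> float:
    """Head angle at the sample nearest to t_ms (times_ms must be sorted)."""
    i = bisect.bisect_left(times_ms, t_ms)
    if i == 0:
        return angles_deg[0]
    if i == len(times_ms):
        return angles_deg[-1]
    # Pick whichever neighbouring sample is closer in time.
    before, after = times_ms[i - 1], times_ms[i]
    return angles_deg[i] if after - t_ms < t_ms - before else angles_deg[i - 1]


def end_angle_error(times_ms, angles_deg, te_human_ms, te_algorithm_ms) -> float:
    """Ae,ha = Ae,h - Ae,a (Equation 4.3)."""
    return angle_at(times_ms, angles_deg, te_human_ms) - angle_at(
        times_ms, angles_deg, te_algorithm_ms
    )


def histogram_counts(errors_deg: list[float], bin_width_deg: float = 10.0) -> Counter:
    """Count errors per bin; each key is the lower edge of its bin in degrees."""
    return Counter((error // bin_width_deg) * bin_width_deg for error in errors_deg)


# Made-up trial: a head angle sampled every 100 ms.
times = [0.0, 100.0, 200.0, 300.0, 400.0]
angles = [0.0, 10.0, 25.0, 40.0, 45.0]
error = end_angle_error(times, angles, te_human_ms=390.0, te_algorithm_ms=310.0)
print(error)  # 5.0
print(histogram_counts([error, 12.0, -3.0, 7.5]))
```

In this made-up trial the human picked end time falls nearest the 45° sample and the algorithm picked end time nearest the 40° sample, giving Ae,ha = 5°.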
Figure 4.4 shows histograms of the difference between the two methods for each age group, indicating how closely they agree and whether the algorithm is under- or overshooting the hand picked locations.
[Figure 4.4: four histogram panels (1.0−2.2 years, 2.2−4.0 years, 4.0−5.0 years, Adult). x-axis: Error between human and algorithm (°), −100 to 100; y-axis: Number of responses. Panel annotations, in extracted order: mean = 9.64°, standard dev = 19.26, skewness = 0.80; mean = 12.52°, standard dev = 17.50, skewness = 1.60; mean = 4.23°, standard dev = 3.77, skewness = 2.02; mean = 9.10°, standard dev = 15.54, skewness = 1.24.]
Figure 4.4: Distribution of the difference between the manual and algorithm estimation. Results show large differences (mean and standard deviation) across all of the age groups. The positive skewness of the histograms about zero shows that the algorithm is underestimating the point at
years (mean = 12.5°). The other age groups show an error of just under 10° but with large standard deviations (19.3° and 15.5° for the 2.2-4.0 years and 1.0-2.2 years groups respectively). The adults show the lowest mean error and also the lowest standard deviation, 4.2° and 3.8° respectively. One way of understanding why the algorithm is going wrong is to look at the skewness of the data, i.e. whether the algorithm is consistently over- or undershooting the hand picked values (skewness was calculated using the skewness.m function in MATLAB). All the data shown in Figure 4.4 had positive skew, i.e. the algorithm is consistently undershooting the hand picked location.
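For reference, the skewness statistic can be sketched as the third central moment divided by the second central moment raised to the power 1.5. This moment-based form without bias correction is thought to correspond to the default behaviour of MATLAB's skewness function, although that correspondence is an assumption here; the example values are made up.

```python
def skewness(values: list[float]) -> float:
    """Moment-based sample skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up angular errors in degrees with a long tail of large positive values.
errors_deg = [1.0, 2.0, 2.0, 3.0, 12.0]
print(skewness(errors_deg) > 0)  # True
```

A positive value indicates a longer tail towards large positive errors, while a symmetric set of errors gives a value of zero.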