
4.3 DARSIS system

4.5.2 Error model evaluation

This section benchmarks the error of the DARSIS system against the theoretical error derived from the analytical model described in Section 4.4. The purpose of this evaluation is to compare the error of the two approaches, show the validity of the error model and provide a better understanding of the error distribution of the DARSIS system.

4.5.2.1 Methodology

The error derived from the theoretical model was compared with the DARSIS error in order to evaluate the error model and, based on the error, to demonstrate its applicability in real-world environments. Both the DARSIS system and the error model are evaluated as coherent systems that include the interpersonal distance and relative orientation estimations.

The two approaches were evaluated on the datasets acquired in the previous experiments. These datasets come from three different indoor environments and involve five participants in real-world social interactions (see Section 4.5.1.1). The outcome of this evaluation is the error of both approaches in real-world situations. To measure the error, each approach was treated as a coherent system that includes both the interpersonal distance and the relative orientation estimation.

4.5.2.2 Performance metrics

The key performance metric of this evaluation is the error of each approach. The error rate is defined as the percentage of faulty estimations of each approach with respect to the existence or absence of a social interaction, given the ground truth. The error is calculated for each user in each experiment.
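As an illustration of this definition, the following minimal Python sketch computes the error rate for one user in one experiment. The function name `error_rate` and the boolean encoding of the estimates (True for an interaction, False for no interaction) are assumptions made for this example and are not taken from the DARSIS implementation.

```python
from typing import Sequence


def error_rate(estimated: Sequence[bool], ground_truth: Sequence[bool]) -> float:
    """Percentage of samples where the estimate disagrees with the ground truth.

    Each element states whether a social interaction is present (True)
    or absent (False) at one sampling instant (hypothetical encoding).
    """
    if len(estimated) != len(ground_truth):
        raise ValueError("sequences must have the same length")
    if not ground_truth:
        raise ValueError("at least one sample is required")
    # Count the samples where estimate and ground truth differ.
    faulty = sum(e != g for e, g in zip(estimated, ground_truth))
    return 100.0 * faulty / len(ground_truth)


# Example: 1 faulty estimate out of 4 samples.
rate = error_rate([True, True, False, False], [True, False, False, False])
print(rate)  # 25.0
```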

To compare the error of the DARSIS system with that of the theoretical model, several statistics were used. Table 4.2 presents the mean error of each model and the standard deviation (SD) of the error, which measures the variation of the error. The standard error of the mean (SEM) estimates the standard deviation of the distribution of the mean. Figure 4.4 depicts a box-and-whisker plot of the error of the DARSIS system and the error model. The box plot summarises the error distribution of each approach through five statistics: minimum, 1st quartile, median, 3rd quartile and maximum. The red horizontal line in the middle of each box marks the median. The blue horizontal edges of the box mark the 1st and 3rd quartiles. The black horizontal lines connected to the box through dotted lines mark the minimum and maximum values of the error distribution, excluding outliers. The red crosses above each box plot represent the outliers of the error distribution, i.e. values that lie more than 1.5 times the interquartile range above the 3rd quartile.
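The statistics listed above can be computed along the following lines. This is a minimal sketch using only the Python standard library; the function name `summarise` and the sample values are hypothetical and are not the thesis data. Quartile conventions differ between tools, so the quartiles returned here may differ slightly from those drawn by the plotting tool used for Figure 4.4.

```python
import math
import statistics
from typing import Dict, Sequence


def summarise(errors: Sequence[float]) -> Dict[str, float]:
    """Mean, sample SD, SEM and five-number summary of error percentages."""
    n = len(errors)
    if n < 2:
        raise ValueError("at least two values are required")
    sd = statistics.stdev(errors)  # sample SD (n - 1 denominator)
    # Quartiles with linear interpolation that includes both end points.
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "mean": statistics.mean(errors),
        "sd": sd,
        "sem": sd / math.sqrt(n),  # standard error of the mean
        "min": min(errors),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(errors),
    }


# Hypothetical per-user error percentages, for illustration only.
stats = summarise([0.0, 0.0, 10.0, 20.0, 20.0])
print(stats["mean"], stats["median"])  # 10.0 10.0
```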

4.5.2.3 Evaluation results

The results of the evaluation of the DARSIS system and the error model, based on the aforementioned performance metrics, are shown in Table 4.2 and Figure 4.4. The aim of this evaluation is to show that the error model performs similarly to the DARSIS system and to analyse the error distribution of both approaches.

Table 4.2: Comparison between DARSIS and the error model regarding the percentage (%) of error introduced in real-world environments.

              Mean Error   StdDev Error   Standard Error of Mean
DARSIS        13.83        18.92          3.45
Error Model   15.67        19.43          3.55

[Box plot titled "Comparison of error distribution between DARSIS and Error model"; categories: DARSIS Error, Error Model; axis: Percentage of error (%), 0 to 70]

Figure 4.4: Comparison of error distribution of DARSIS and Error model through box plots.

Table 4.2 compares DARSIS and the error model in terms of the mean, SD and SEM of the error of each approach. Both models perform similarly, with a mean error of 13-16% and an SD of 18-19.5%. The mean error of DARSIS is 1.84 percentage points lower than that of the error model. An even smaller reduction of 0.51 percentage points is observed in the SD of the error, which shows a lower error variation for DARSIS with respect to the error model. The smallest difference, 0.1 percentage points, is observed in the SEM of the two approaches. In all three statistics, the difference between the two approaches does not exceed 2 percentage points.
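The SEM values of Table 4.2 can be related to the SD values through the standard definition of the standard error of the mean. The check below assumes that the SEM was computed as the SD divided by the square root of the number of error samples n. The value of n is not stated in this section and is inferred here only from the rounded table entries, so it should be read as a rough consistency check rather than a reported figure.

```latex
\mathrm{SEM} = \frac{\mathrm{SD}}{\sqrt{n}}
\quad\Longrightarrow\quad
n = \left(\frac{\mathrm{SD}}{\mathrm{SEM}}\right)^{2}

n_{\text{DARSIS}} \approx \left(\frac{18.92}{3.45}\right)^{2} \approx 30.1,
\qquad
n_{\text{Error Model}} \approx \left(\frac{19.43}{3.55}\right)^{2} \approx 30.0
```

Under this assumption, both rows are consistent with roughly 30 error samples per approach, which suggests that the three statistics in Table 4.2 are mutually coherent.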

Figure 4.4 shows the box plots of the error distributions of DARSIS and the error model, which are quite similar. The medians of the two box plots differ by about 3 percentage points. In both cases the minimum coincides with the 1st quartile at 0%, i.e. in the 25% of cases with the lowest error, neither model produces any error. The 3rd quartile is 25% for both models, i.e. in 75% of the cases the error does not exceed 25%. The maximum error in the experiments, excluding outliers, is 42% for both DARSIS and the error model. The error model produced one outlier at 68% error, whereas DARSIS generated two outliers at similar levels. Although the medians differ by around 3 percentage points, both plots present similar error distributions with respect to the other four statistics provided by the box plots.
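The outlier markers in Figure 4.4 can be cross-checked against the quartiles quoted in the text. The sketch below assumes the common box-plot convention in which points lying more than 1.5 times the interquartile range (IQR) beyond the box are drawn as outliers. The plotting tool and its settings are not stated in the text, so this convention is an assumption, and the function name `tukey_fences` is illustrative.

```python
from typing import Tuple


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> Tuple[float, float]:
    """Lower and upper outlier fences for the k * IQR box-plot convention."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles as quoted in the text for both approaches (1st: 0%, 3rd: 25%).
lower, upper = tukey_fences(q1=0.0, q3=25.0)
print(upper)         # 62.5
print(68.0 > upper)  # True: a 68% error lies beyond the upper fence
print(42.0 > upper)  # False: a 42% error stays within the whisker range
```

Under this assumed convention, the 68% value is classified as an outlier while the 42% maximum is not, which agrees with the description of the box plots.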

Overall, DARSIS and the error model performed similarly regarding the error distributions. The differences in the mean, SD and SEM of the error did not exceed 2 percentage points between the two models. The box plots revealed a difference of around 3 percentage points between the median errors of the two models. The minimum, 1st quartile, 3rd quartile and maximum error of the two distributions were very close to each other. In the error statistics and the box plots, a small increase in the error is observed, which is attributed to errors induced by the models derived from psychology. The outliers are attributed to a decalibration of the facing direction mechanism in one of the devices, which affects the relative orientation computation. Finally, both models achieved a mean error below 16% with an SD below 19.5%, and the mean error was consistent, with an SEM of about 3.5%.