3.3 Methodology
3.3.2 Training the models
The interpersonal distance estimation technique first generates a generic training set and then extracts a bank of features from it. A feature selection process then chooses the most informative and least redundant feature set. Each classifier at each level of the hierarchy is evaluated in order to find the most appropriate choice.
A data collection campaign was performed in an indoor office environment using HTC One S smartphones, in order to construct the training set for the classifier. The Bluetooth interface of one smartphone was configured as discoverable while the other device performed the discovery process. Bluetooth RSSI data were collected at eight different distances and three different relative device orientations: a) Screen-to-Screen, b) Screen-to-Back and c) Back-to-Back. Empirical evaluation showed that these vertical relative orientations are representative of the effect of the facing direction variation. For reasons of statistical significance, a large number of Bluetooth RSSI samples was collected, i.e. 2000 samples for each distance and orientation combination, resulting in a dataset of 48000 Bluetooth RSSI samples. As the data collection process was extremely lengthy, the humans were replaced with water-filled cylinders to which the devices were attached, in order to simulate human body absorption [198]. The devices with the bottles were placed at a height of 0.8m from the floor to simulate the most common wearing position (i.e. trousers pocket) [107].
Figure 3.4 (diagram contents):
Layer 1 classifier: Public. Features: MovMin6, MidRange6, MinMode6, Q16, IQR6, Skewness6-Skewness4m, MovDev6.
Layer 1 classifier: Social. Features: MovMedian6, MovMin6, MovMax6, MidRange6, MaxMode6, Kurtosis6-Kurtosis4m, MovDev6_Mean1.5m.
Layer 1 classifier: Personal. Features: MovAvg6, MovMin6, Kurtosis6-Kurtosis4m, MovDev6_Mean4m.
Layer 2 classifier. Inputs: Public_Confidence, Social_Confidence, Personal_Confidence.
Figure 3.4: Features for 2-Layer DHC.
As Figure 3.2 indicates, the interaction zones considered were Public, Social and Personal+Intimate, so the 48000 collected samples were split into three interaction zones according to the corresponding distances. As a result, the Public and Personal+Intimate zones contained 12000 samples each and the Social zone contained 36000 samples, because the Social zone takes into account the samples at 2m, 2.5m and 3m. The number of samples for each distance and orientation in the Social zone was therefore reduced so that this zone also contained 12000 samples. This was done in order to avoid biasing the training process towards one target class of the final classifier.
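The balancing step can be sketched as follows. This is a minimal illustration and not the procedure used in the study: the tuple layout, the function name `balance_zone` and the use of uniform random subsampling are assumptions, since the text only states that the per-distance and per-orientation counts of the Social zone were reduced.

```python
import random
from collections import defaultdict


def balance_zone(samples, target_total, seed=0):
    """Downsample one zone to `target_total` samples.

    `samples` is a list of (distance_m, orientation, rssi) tuples
    (hypothetical layout). Every (distance, orientation) group keeps
    the same number of samples so that no combination dominates.
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample[0], sample[1])].append(sample)
    per_group = target_total // len(groups)
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    balanced = []
    for key in sorted(groups):
        balanced.extend(rng.sample(groups[key], per_group))
    return balanced
```

Calling `balance_zone(social_samples, 12000)` on the Social zone would then give it the same size as the other two zones, provided each group holds at least `12000 // number_of_groups` samples.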
The literature has mainly focused on extracting at most 3 features in order to develop a machine-learning model that maps the RSSI of either a Bluetooth or a WiFi signal to the distance between the emitter and the receiver device. In Comm2Sense [51] the authors selected only the maximum and the average value of a 20-sample window, and trained a machine-learning model based on these two features. To further improve on the state-of-the-art distance estimation technique, there was a need to create a large bank of features, from which the most informative features would be selected. As Comm2Sense [51] relied on only two window statistics as its feature set, the proposed process included all the basic statistics such as min, max, average, standard deviation etc. In addition, similar to the approach followed in the K-means algorithm, the distances between the window statistics and the basic statistics of the target class were included. The deviation and z-score, which have been used in the literature to derive social signals [25], were also included in the feature bank.
A large feature set of 3050 features, including several statistics, was extracted from this dataset, considering a maximum window of 6 Bluetooth RSSI samples. Table 3.2 shows the basic features that were extracted, which refer to various statistics. Table 3.3 shows the relative features, which are produced by combining basic features with various statistics of the target class. Table 3.4 shows the features that combine the basic and relative features through a statistical metric such as deviation and z-score. Such a large number of features was generated in order to be confident that the feature reduction techniques would produce the most informative features. A subset of features was chosen according to the level of consistency [199] of each feature with respect to the target class. A wrapper subset evaluation [200] followed to retrieve an optimised feature set for the given classifiers.
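As an illustration of how the basic window statistics could be computed, the sketch below slides a window of 6 samples over an RSSI trace and derives a subset of the statistics of Table 3.2. The function names, the dictionary keys, the population (biased) form of the moments and the inclusive quantile method are assumptions made for this example; the text does not specify these details.

```python
import statistics
from collections import Counter


def sliding_windows(rssi, size=6):
    """Return every run of `size` consecutive RSSI samples."""
    return [rssi[i:i + size] for i in range(len(rssi) - size + 1)]


def basic_features(window):
    """Compute a subset of the Table 3.2 statistics for one window."""
    n = len(window)
    mean = statistics.fmean(window)
    median = statistics.median(window)
    counts = Counter(window)
    top = max(counts.values())
    modes = [v for v, c in counts.items() if c == top]  # all tied modes
    q1, _, q3 = statistics.quantiles(window, n=4, method="inclusive")
    # Central moments (population form) for STD, skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in window) / n
    m3 = sum((x - mean) ** 3 for x in window) / n
    m4 = sum((x - mean) ** 4 for x in window) / n
    return {
        "Mean": mean,
        "Median": median,
        "Min": min(window),
        "Max": max(window),
        "MidRange": (min(window) + max(window)) / 2,
        "MinMode": min(modes),
        "MaxMode": max(modes),
        "IQR": q3 - q1,
        "MAD": statistics.median(abs(x - median) for x in window),
        "STD": m2 ** 0.5,
        "Skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "Kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }
```

A full trace would be processed with `[basic_features(w) for w in sliding_windows(rssi)]`; smaller window sizes can be covered by calling `sliding_windows` with a different `size`.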
To decide on the appropriate feature selection technique, the methods presented in Table 3.1 were evaluated on the dataset. Each method was evaluated with a particular ranker in order to understand the importance of the feature set that it selected. Only the feature selection techniques based on the information gain and the level of consistency were able to provide a highly ranked feature set. These two feature sets were then used to train two classifiers with the same classification technique (J48 [201]). The technique that evaluates the worth of a subset of attributes by the level of consistency in the class values when the training instances are projected onto the subset of attributes achieved the highest accuracy, and for that reason the particular subset was selected.
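The consistency criterion described above can be sketched as follows. The sketch assumes discretised feature values and the inconsistency-rate formulation that is commonly used for this measure; whether this matches the exact evaluator cited as [199] is not confirmed by the text, and the function name and data layout are hypothetical.

```python
from collections import Counter, defaultdict


def consistency(rows, labels, subset):
    """Consistency of the class values when rows are projected onto `subset`.

    `rows` is a list of tuples of discretised feature values and
    `subset` is a list of column indices. Instances that share the same
    projected pattern but not the majority class of that pattern count
    as inconsistent.
    """
    patterns = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        patterns[key][label] += 1
    inconsistent = sum(
        sum(counter.values()) - max(counter.values())
        for counter in patterns.values()
    )
    return 1.0 - inconsistent / len(rows)
```

A subset that separates the classes perfectly scores 1.0, while a subset that carries no class information scores close to the proportion of the majority class.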
The next step is the creation of the machine-learning based model for understanding the interpersonal distance among the people. Towards the fulfilment of that goal,
Table 3.1: Feature selection techniques
Name Description
[202] Evaluates the worth of a subset of features by considering the individual predictive ability of each feature along with the degree of redundancy between them.
Information Gain Ratio Evaluates the worth of a feature by measuring the gain ratio with respect to the class.
Correlation Evaluates the worth of a feature by measuring the correlation (Pearson’s) between it and the class.
Information Gain Evaluates the worth of a feature by measuring the information gain with respect to the class.
Chi squared Evaluates the worth of a feature by computing the value of the chi-squared statistic with respect to the class.
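Two of the rankers in Table 3.1 can be illustrated with a short sketch. It computes the information gain and the gain ratio of a single discretised feature with respect to the class; the function names are hypothetical and the discretisation of the RSSI features is assumed to have been done beforehand.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def information_gain(values, labels):
    """Reduction in class entropy after splitting on a discrete feature."""
    n = len(labels)
    by_value = {}
    for value, label in zip(values, labels):
        by_value.setdefault(value, []).append(label)
    conditional = sum(len(ls) / n * entropy(ls) for ls in by_value.values())
    return entropy(labels) - conditional


def gain_ratio(values, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(values)
    if split_info <= 0:  # constant feature carries no information
        return 0.0
    return information_gain(values, labels) / split_info
```

Ranking then amounts to sorting the features by either score in descending order.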
Table 3.2: Basic Features
Name Description
1 Mean The mean value of a specific window of samples.
2 Median The median value of a specific window of samples.
3 Min The minimum value of a specific window of samples.
4 Max The maximum value of a specific window of samples.
5 MidRange The mean of the minimum and maximum values of a specific window of samples.
6 MinMode The most frequent value of a specific window of samples. If there are multiple, it returns the minimum of them.
7 MaxMode The most frequent value of a specific window of samples. If there are multiple, it returns the maximum of them.
8 Percentile The 25th, 75th and 90th percentiles of a specific window of samples.
9 IQR The inter-quartile range of a specific window of samples.
10 MAD The median absolute deviation of a specific window of samples.
11 STD The standard deviation of a specific window of samples.
12 Kurtosis The kurtosis of a specific window of samples.
13 Skewness The skewness of a specific window of samples.
Table 3.3: Relative Features
Name Description
1 Interaction Zone The difference between the basic feature of a window of samples and each of the interaction zones (Public, Social, Personal+Intimate).
2 Distance The difference between the basic feature of a window of samples and each of the distances (0.5m, 1m, 1.5m, . . . , 4m).
3 Orientation The difference between the basic feature of a window of samples and each of the distances and orientations (0.5m Screen-to-Screen, 0.5m Back-to-Screen, 0.5m Back-to-Back, etc.).
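A relative feature of Table 3.3 could be derived as in the sketch below, which subtracts a reference statistic of a class (an interaction zone, a distance, or a distance and orientation) from the same statistic of the current window. The function name, the data layout and the naming pattern of the output keys (modelled on names such as Kurtosis6-Kurtosis4m) are assumptions.

```python
def relative_features(basic, references, window_size=6):
    """Difference between window statistics and per-class reference statistics.

    `basic` maps a statistic name to its value for the current window.
    `references` maps a class label (e.g. "4m") to the same statistics
    computed over the training samples of that class.
    """
    features = {}
    for label, reference in references.items():
        for name, value in basic.items():
            if name in reference:
                key = f"{name}{window_size}-{name}{label}"
                features[key] = value - reference[name]
    return features
```

For example, `relative_features({"Kurtosis": 2.5}, {"4m": {"Kurtosis": 2.0}})` yields a single feature named `Kurtosis6-Kurtosis4m`.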
Table 3.4: Combined Features
Name Description
1 MovDev The deviation of a specific window of samples with respect to the mean of a class (interaction zone, distance, distance & orientation).
2 ZScore The z-score of a specific window of samples with respect to the mean and standard deviation of a target class (interaction zone, distance, distance & orientation).
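The two combined features of Table 3.4 can be sketched as below. The exact definitions are not given in the text, so both functions are assumptions: MovDev is taken here as the root-mean-square deviation of the window samples about the class mean, and ZScore as the standardised distance of the window mean from the class mean.

```python
import math


def mov_dev(window, class_mean):
    """RMS deviation of the window samples about a class mean (assumed form)."""
    return math.sqrt(sum((x - class_mean) ** 2 for x in window) / len(window))


def z_score(window, class_mean, class_std):
    """Z-score of the window mean with respect to a class (assumed form)."""
    if class_std == 0:
        return 0.0  # degenerate class: no spread to standardise against
    window_mean = sum(window) / len(window)
    return (window_mean - class_mean) / class_std
```

Under these assumptions a feature such as MovDev6_Mean4m would correspond to `mov_dev(window, class_mean)` with a 6-sample window and the mean RSSI of the 4m training samples as `class_mean`.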
various evaluations were performed. Machine-learning models were trained using the feature set described in the previous paragraph and defined in Table 3.5. Classification algorithms such as decision trees, naive Bayes, AdaBoost etc. were used to train the machine-learning models. The models were evaluated on the training set with 10-fold cross validation and a 25% split. The algorithms that achieved the highest accuracy were the decision tree and MultiBoostAB [203] with decision tree J48 [201]. The MultiBoostAB approach achieved even higher accuracy than the decision tree, and was therefore selected. The model was then evaluated in short 10-minute experiments in indoor office environments at different distances, where two users were placed at the centres of different interaction zones. The inferences of the model were logged in order to measure its accuracy. The evaluation presented in PhoneMonitor [49] was also carried out, in which two users walked next to each other for a particular distance in an indoor environment, in order to understand whether the people are interacting. In all experiments, the model achieved an accuracy higher than 80%.
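The 10-fold cross validation used to compare the algorithms can be sketched as follows. The learner in the sketch is a deliberately simple nearest-class-mean stand-in on a single feature, not MultiBoostAB or J48; the function names and the `fit` interface (a function that returns a prediction callable) are assumptions for this example.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, fit, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    correct = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(X[i]) == y[i] for i in held_out)
    return correct / len(X)


def fit_nearest_mean(X, y):
    """Stand-in learner: assign the class whose mean feature value is closest."""
    totals = {}
    for x, label in zip(X, y):
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + x, count + 1)
    means = {label: total / count for label, (total, count) in totals.items()}
    return lambda x: min(means, key=lambda label: abs(x - means[label]))
```

Each candidate algorithm would be passed as `fit` to `cross_validate` on the same data, and the one with the highest accuracy retained.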
It should be noted that the machine-learning models were trained and evaluated on the same dataset. The candidate algorithms were first trained and evaluated with 10-fold cross validation and a 25% split. Once the final machine-learning algorithm was selected, the model was trained on the entire dataset. This process improved the accuracy of the algorithm but tuned it towards the particular dataset. As detailed in the evaluation in Section 3.4, the state-of-the-art machine-learning models against which the approach was evaluated were tuned with the same process, and the PLMs were configured for the particular indoor environment based on the RSSI measurements of the training dataset. This process reduces the generality of the model; however, as mentioned in the previous paragraph, the additional evaluation still achieved an accuracy of over 80% in different indoor settings, e.g. office environment, standing, walking etc.
Initially, a DARSIS Single Classifier (DSC) was trained for all three interaction zones using several algorithms. Various evaluations showed that MultiBoostAB [203] with decision tree J48 [201] performed best when combined with the features shown in Table 3.5. In detail, it robustly achieved the highest accuracy because of its native capability for variance and bias reduction. To further improve the performance, the hierarchical classifier DHC depicted in Figure 3.4 was introduced. The accuracy and robustness of MultiBoostAB in the inference process led to the models of each layer of the DHC being developed with the same algorithm. As a result, the DHC achieved an even higher accuracy than the DSC.
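The two-layer structure of the DHC can be sketched as below: each Layer 1 classifier outputs a confidence for one interaction zone, and the Layer 2 classifier maps the three confidences to the final zone. The callables stand in for trained models; the function names and the arg-max rule used as a Layer 2 placeholder are assumptions, since in the text Layer 2 is itself a trained MultiBoostAB model.

```python
def dhc_predict(features, layer1, layer2):
    """Run the two-layer hierarchy on one feature vector.

    `layer1` maps a zone name to a callable returning a confidence;
    `layer2` is a callable that maps the confidences to a zone.
    """
    confidences = {
        f"{zone}_Confidence": classifier(features)
        for zone, classifier in layer1.items()
    }
    return layer2(confidences)


def argmax_layer2(confidences):
    """Placeholder for the trained Layer 2 model: pick the most confident zone."""
    best = max(confidences, key=confidences.get)
    return best[: -len("_Confidence")]
```

With three stand-in Layer 1 models, for example `{"Public": lambda f: 0.1, "Social": lambda f: 0.7, "Personal": lambda f: 0.2}`, the call `dhc_predict(features, layer1, argmax_layer2)` returns `"Social"`.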
Table 3.5: DARSIS Single Classifier
Features
MovMedian6, MovDev6_Mean3m, MovDev6_Mean1mF2F, Kurtosis6-Kurtosis4mB2F,
MovMax3, MovDev6_Mean4mF2F, MovDev4_Mean4mB2F, STD6
According to psychology, and as depicted in Figure 3.2, people interact when they are within the social or personal zone. Following that, a DPC was developed that detects whether users are in proximity or not. Given that the DHC infers the interaction zone in which the users are, it was also used to infer whether users are in proximity, by considering the social and personal zones as proximity and the public zone as no-proximity.
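The mapping from the inferred interaction zone to a proximity decision is simple enough to state directly. The zone labels and the function name below are hypothetical.

```python
# Zones treated as proximity, following the rule stated in the text.
PROXIMITY_ZONES = {"Social", "Personal"}


def in_proximity(zone):
    """Return True when the inferred zone counts as proximity."""
    return zone in PROXIMITY_ZONES
```

Applied to the output of the DHC, `in_proximity("Public")` is `False` while `in_proximity("Social")` and `in_proximity("Personal")` are `True`.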