
7.1 Introduction

7.2.1 Static Relevance Scores

Recall that GSR, HR and HF levels increase and ST levels decrease as arousal levels increase (described in Chapter 2.3.1). We seek to understand the importance of an item to the individual within the collection as a whole. In these initial experiments into the utility of biometric response associated with past experience of items as a static score, we therefore intuit that the maximum biometric response observed across all past accesses to an item indicates the importance of that item in the collection (or the importance of the item to the user in the collection)2. Thus each item retrieved by content+context retrieval was annotated with the maximum observed GSR, maximum observed HR, maximum observed HF and minimum observed ST across all accesses to the item3.
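The annotation step described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name `annotate_items` and the representation of an access as an `(item_id, reading)` pair are assumptions made for the example.

```python
def annotate_items(accesses):
    """Tag each item with its extreme biometric readings across all accesses.

    `accesses` is an iterable of (item_id, reading) pairs, where `reading`
    is a dict with the keys "GSR", "HR", "HF" and "ST" (hypothetical layout).
    """
    tags = {}
    for item_id, reading in accesses:
        tag = tags.setdefault(
            item_id,
            {"GSR": float("-inf"), "HR": float("-inf"),
             "HF": float("-inf"), "ST": float("inf")},
        )
        # Higher GSR, HR and HF indicate higher arousal: keep the maximum.
        for signal in ("GSR", "HR", "HF"):
            tag[signal] = max(tag[signal], reading[signal])
        # Lower ST indicates higher arousal: keep the minimum.
        tag["ST"] = min(tag["ST"], reading["ST"])
    return tags
```

Items that were never accessed while the devices were worn simply do not appear in the returned dictionary.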

As described in Chapter 2.3.1, increases in physical activity (detected through increases in energy expenditure) cause GSR, HR and HF levels to increase and ST levels to decrease. To distinguish changes in GSR, HR, ST and HF caused by changes in arousal level from those caused by changes in physical activity, we also tagged items with the maximum observed GSR, HF and HR with energy expenditure factored4 and with the minimum observed ST with energy expenditure factored5 across all accesses to the item. To factor energy expenditure into the biometric readings we divided GSR, HF and HR levels by their associated energy expenditure readings (i.e., GSR/engGSR, HR/engHR and HF/engHF) and we multiplied ST levels by their associated energy expenditure readings (i.e., ST · engST). As explained in Chapter 2.3.1, the lower the ST level the greater the arousal level, hence items were also tagged with the inverse of ST and the inverse

2 This is the same premise as we took in Chapter 6 to explore the relationship between computer item importance and biometric response associated with previous item interaction.

3 We acknowledge that biometric response when accessing files outside the biometric month, had it been recorded, may have resulted in different biometric levels being assigned to items, and that lack of this information may have negatively impacted on the results we will present in this chapter.

4 Energy expenditure readings associated with GSR, HF and HR are referred to as engGSR, engHF and engHR respectively.

5 Energy expenditure readings associated with ST are referred to as engST.

Tag Type       Subject 1   Subject 2   Subject 3
GSR, ST, HF    35%         67%         60%
HR             53%         83%         85%

Table 7.2: Percentage of retrieved items missing galvanic skin response (GSR), skin temperature (ST), heat flux (HF) and heart rate (HR) biometric tags, across each subject’s queries.

Table 7.3: Default normalised biometric tags assigned to items with missing biometric tags.

of ST · engST.
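As an illustration of how energy expenditure is factored into a single set of readings, the sketch below applies the divisions and the multiplication described in the text. The function name `factor_energy` and the dictionary keys are hypothetical, and the sketch assumes all ST and energy expenditure readings are strictly positive so that no division by zero occurs.

```python
def factor_energy(reading):
    """Factor energy expenditure into one set of biometric readings.

    `reading` holds the raw signals ("GSR", "HR", "HF", "ST") and their
    associated energy expenditure readings ("engGSR", "engHR", "engHF",
    "engST"). All values are assumed to be strictly positive.
    """
    # GSR, HR and HF rise with physical activity: divide by energy expenditure.
    factored = {s: reading[s] / reading["eng" + s] for s in ("GSR", "HR", "HF")}
    # ST falls with physical activity: multiply by energy expenditure.
    factored["ST"] = reading["ST"] * reading["engST"]
    # Inverses, so that larger values correspond to greater arousal.
    factored["invST"] = 1.0 / reading["ST"]
    factored["invSTeng"] = 1.0 / factored["ST"]
    return factored
```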

Some items in the ‘biometric month’ test set had no associated biometric readings. This occurred because the biometric recording devices were removed for data downloading, because subjects needed a mental break from wearing the devices, or, in the case of the heart rate monitor, because of errors in the recorded readings, etc. (as described in Chapter 3.3.4.4). These items were assigned default biometric tags. The default value used was the median of the biometric tag associated with retrieved items. Examining the items retrieved for each subject by BM25F mod2 retrieval reveals that, for Subject 1, 35% of retrieved items had no GSR, HF and ST tags and 53% had no HR tags; for Subject 2, 67% had no GSR, HF and ST tags and 83% had no HR tags; and for Subject 3, 60% had no GSR, HF and ST tags and 85% had no HR tags. These results are shown in Table 7.2. Table 7.3 provides the normalised default tags assigned to items with missing biometric tags for each subject.
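A minimal sketch of how a normalised default tag of the kind reported in Table 7.3 might be derived is given below. It makes two assumptions that the text does not state explicitly: the median is taken over the retrieved items that do have a recorded tag, and the tags are then scaled to [0, 1] with the usual min-max formula (x - min)/(max - min). The function name `fill_and_normalise` is hypothetical.

```python
from statistics import median


def fill_and_normalise(tags):
    """Replace missing tags with the median, then min-max normalise.

    `tags` is a list of floats, with None marking a missing tag.
    Assumption: the median is computed over the observed tags only.
    """
    observed = [t for t in tags if t is not None]
    default = median(observed)
    filled = [default if t is None else t for t in tags]
    lo, hi = min(filled), max(filled)
    if hi == lo:
        # Degenerate case (all tags equal): no spread to normalise over.
        return [0.0 for _ in filled]
    return [(t - lo) / (hi - lo) for t in filled]
```

For example, `fill_and_normalise([2.0, None, 4.0, 10.0])` fills the gap with the median 4.0 and returns `[0.0, 0.25, 0.25, 1.0]`.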

All biometric tags associated with retrieved items were normalised using min-max normalisation. For example, to normalise each GSR tag associated with retrieved items we use: (GSRtag - minGSRtag)/(maxGSRtag - minGSRtag). The following approaches for calculating static relevance scores using the normalised biometric data tags were investigated:

$$\mathrm{BIObase} = w \cdot \mathit{se} \tag{7.1}$$

$$\mathrm{logBIO} = w \cdot \log(s) \tag{7.2}$$

$$\mathrm{logBIOeng} = w \cdot \log(\mathit{se}) \tag{7.3}$$

$$\mathrm{sigmBIO} = w \cdot \frac{s^{a}}{k^{a} + s^{a}} \tag{7.4}$$

$$\mathrm{sigmBIOeng} = w \cdot \frac{\mathit{se}^{a}}{k^{a} + \mathit{se}^{a}} \tag{7.5}$$

$$\mathrm{sigmIncST} = w \cdot \frac{k^{a}}{k^{a} + \mathit{st}^{a}}, \quad \text{where } \mathit{st} = \mathrm{ST} \tag{7.6}$$

$$\mathrm{sigmIncSTeng} = w \cdot \frac{k^{a}}{k^{a} + \mathit{st}^{a}}, \quad \text{where } \mathit{st} = \mathrm{ST} \times \mathrm{engST} \tag{7.7}$$

In the above equations s = 1/ST, GSR, HR or HF, and se = 1/(ST × engST) (i.e., (1/ST)/engST), GSR/engGSR, HR/engHR or HF/engHF. For the remainder of this chapter we use STbase to refer to the use of ST data in the BIObase equation, logGSR to refer to the use of GSR data in the logBIO equation, etc.

Following parameter tuning using the full set of the three subjects’ biometric month test cases, the static score’s weight of importance (w) and the parameters k and a (where applicable) were set for each equation. Our parameter tuning approach was the same as that used for tuning the retrieval algorithms’ parameters, described in Chapter 5.2. That is, for each static scoring approach we first manually tuned the weight (w) to give the best overall retrieval performance, with the k and a parameters (where applicable) set to 1 during this process. For the static scoring approaches that contain k and a parameters, we then manually tuned the k parameter to give the best overall retrieval performance, using the tuned weight (w) and leaving the a parameter set to 1. Finally, using the tuned weight (w) and k parameter, we manually tuned the a parameter to give the best overall retrieval performance. Table 7.4 provides these tuned parameter values.

w k a

Table 7.4: Parameter tunings for static biometric functions.
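The staged tuning order (w first, then k, then a) can be sketched as a one-parameter-at-a-time search. The tuning in this chapter was carried out manually, so the automated search below is only a stand-in for that process; the function `tune_staged`, the candidate grids and the `evaluate` callback (assumed to return a retrieval performance figure, higher being better) are all hypothetical.

```python
def tune_staged(evaluate, w_grid, k_grid, a_grid):
    """Tune w, then k, then a, holding untuned parameters at 1.

    `evaluate(w, k, a)` is a caller-supplied function returning a retrieval
    performance figure for the given parameters (higher is better).
    """
    # Stage 1: tune the weight with k and a fixed at 1.
    best_w = max(w_grid, key=lambda w: evaluate(w, 1.0, 1.0))
    # Stage 2: tune k using the tuned weight, with a still fixed at 1.
    best_k = max(k_grid, key=lambda k: evaluate(best_w, k, 1.0))
    # Stage 3: tune a using the tuned weight and k.
    best_a = max(a_grid, key=lambda a: evaluate(best_w, best_k, a))
    return best_w, best_k, best_a
```

Because each stage fixes the parameters chosen earlier, this search does not explore all combinations of w, k and a, and may therefore miss a jointly better setting.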

Equation 7.1 is our baseline static scoring approach, used to examine the effect on re-ranking result lists of the raw ST, HR, HF and GSR values with energy expenditure factored in. The remaining equations investigate the use of non-linear transformations of the biometric score. Equations 7.2 and 7.3 examine the effect of using logs of ST, HR, HF and GSR. The performance of our biometric scores using the transformation approach from [Craswell et al., 2005a], described in Chapter 4.2.3, is examined with Equations 7.4 and 7.5. This approach is used to generate static relevance scores for features where higher values indicate greater importance. An approach for calculating static relevance scores for features where lower values indicate greater importance is also provided in [Craswell et al., 2005a] and described in Chapter 4.2.3. This technique’s performance using our ST data is investigated with Equations 7.6 and 7.7. The effect of accounting for energy expenditure is investigated in Equations 7.1, 7.3, 5 and 7.7.
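The four functional forms behind Equations 7.1 to 7.7 can be written compactly as below. This is a sketch under stated assumptions: the function names are hypothetical, the natural logarithm is assumed (the base is not specified in the text), and inputs are assumed to be normalised tags with k > 0. The energy-factored variants use the same functions with `se` (or ST × engST) passed in place of `s` (or ST).

```python
import math


def bio_base(se, w):
    """Linear baseline: w * se (Equation 7.1)."""
    return w * se


def log_bio(s, w):
    """Log transformation: w * log(s) (Equations 7.2 and 7.3).

    Assumes the natural log; s must be strictly positive, so a normalised
    tag of exactly 0 needs separate handling that is not specified here.
    """
    return w * math.log(s)


def sigm_bio(s, w, k, a):
    """Sigmoid for features where higher values indicate greater importance
    (Equations 7.4 and 7.5): w * s^a / (k^a + s^a)."""
    return w * s ** a / (k ** a + s ** a)


def sigm_inc_st(st, w, k, a):
    """Sigmoid for features where lower values indicate greater importance
    (Equations 7.6 and 7.7): w * k^a / (k^a + st^a)."""
    return w * k ** a / (k ** a + st ** a)
```

Note that for the same input and parameters the two sigmoid forms sum to w, so the second is the mirror image of the first: it approaches w as the input approaches 0 and decays towards 0 as the input grows.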

The static scoring techniques presented in this section are added to content+context relevance scores generated using the BM25F mod2 model, described in Chapter 5.5.
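To make the combination step concrete, the sketch below adds a static score to each item's content+context score and re-sorts the result list. The function name `rerank` and the data layout are assumptions for illustration; every retrieved item is assumed to have a static score, since items without readings receive default tags.

```python
def rerank(results, static_scores):
    """Add static scores to query-dependent scores and re-sort.

    `results` is a list of (item_id, retrieval_score) pairs and
    `static_scores` maps each item_id to its static biometric score.
    """
    combined = [
        (item_id, score + static_scores[item_id]) for item_id, score in results
    ]
    # Highest combined score first.
    return sorted(combined, key=lambda pair: pair[1], reverse=True)
```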

The next section discusses results obtained using these approaches.

7.3 Results and Analysis

Retrieval effectiveness is measured here using average precision (AveP), P@5 and P@10. P@5 and P@10 show how effective our techniques were at moving relevant items towards the top of the result lists. Table 7.5 shows the retrieval scores for BM25F mod2+static score retrieval, averaged across the three experiment subjects, and the percentage improvement over the BM25F mod2 baseline that these scores correspond to. Table 7.6 presents the individual breakdown of results for each subject and Table 7.7 provides the percentage improvement over the BM25F mod2 baseline that these scores correspond to. Table 7.9 presents the results obtained for each subject when we consider only items with ‘real’ biometric tags in retrieval (i.e., excluding items assigned default biometric tags). Table 7.10 presents the percentage improvement over the BM25F mod2 baseline that these scores correspond to. The results presented in these tables suggest that adding biometric static scores to content+context IR scores is useful for ranking PL text-based collections. In this section we analyse these results.
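For reference, the measures named above can be computed as in the following sketch. It uses the standard textbook definitions (precision at a cut-off n, and average precision normalised by the number of relevant items); whether the evaluation in this chapter used exactly these variants is an assumption, and the function names are hypothetical.

```python
def precision_at(ranked, relevant, n):
    """Fraction of the top n ranked items that are relevant (P@n)."""
    return sum(1 for item in ranked[:n] if item in relevant) / n


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item
    appears, divided by the total number of relevant items."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def percent_improvement(score, baseline):
    """Percentage improvement of `score` over a non-zero `baseline`."""
    return 100.0 * (score - baseline) / baseline
```

A static score that moves a relevant item from rank 6 to rank 5 raises P@5 but leaves P@10 unchanged, which is why the two cut-offs are reported together with AveP.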