Exploring response time - DOIG model versus black-box labelling

3.4 DOIG model versus black-box labelling

3.4.1 Exploring response time

In collecting the ImprompDo dataset, an assumption was made that if a user were to respond immediately as a result of being interrupted, they would do so within 30 seconds. As this assumption could impact the above comparison of the DOIG model and the black-box convention, the suitability of this threshold value is explored. Figure 3.8 visualises the response times of notifications that were either consumed or dismissed (as a complete response represents the longest response time a user would need).

[Figure 3.8 plot: x-axis, Response time (ms), 0 to 30000; y-axis, Number of responses, 0 to 120]

Figure 3.8: Histogram of response times for notifications that were either consumed or dismissed, using a bin size of 1000 milliseconds.

The results support that the threshold of 30 seconds is suitable to allow a user to consume the notification, with the majority of responses occurring between 3 and 17 seconds (millisecond statistics: M = 12,188.8; Min = 481; Max = 29,958; SD = 6513.9; N = 1571). When splitting the data between cases where the device was in use and cases where it was not, there were only minor changes to the distribution. The primary difference is a quicker mean response time when the device was in use (M = 897.4; Min = 481; Max = 29,504; SD = 6039.1; N = 575), which is expected given that the user does not have to go through the process of unlocking the device. The mean response time for cases where the device was not in use was longer, but still less than half of the 30 s timeout (M = 14,046.7; Min = 3,798; Max = 29,958; SD = 6036.5; N = 996).
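The binning and summary statistics reported above can be reproduced along the following lines. This is a minimal sketch, assuming response times are available as a plain list of milliseconds; the function name `summarise` and the sample values are hypothetical and are not taken from the ImprompDo dataset.

```python
from collections import Counter
from statistics import mean, stdev

THRESHOLD_MS = 30_000  # the 30 second timeout discussed in the text
BIN_MS = 1_000         # the bin size used in Figure 3.8


def summarise(response_times_ms):
    """Summarise response times that fall inside the timeout window.

    Assumes at least two in-window values, since the sample standard
    deviation is undefined for fewer.
    """
    kept = [t for t in response_times_ms if 0 <= t < THRESHOLD_MS]
    # Map each response to the lower edge of its 1000 ms bin.
    bins = Counter((t // BIN_MS) * BIN_MS for t in kept)
    return {
        "n": len(kept),
        "mean": mean(kept),
        "min": min(kept),
        "max": max(kept),
        "sd": stdev(kept),
        "bins": dict(sorted(bins.items())),
    }


# Made-up values for illustration only.
sample = [481, 3_798, 9_200, 12_500, 12_900, 29_958, 31_000]
stats = summarise(sample)
```

Running the same function separately on the in-use and not-in-use subsets would give the kind of split comparison described above.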


Summary: Observing decision making behaviour in interruption responses is worthwhile

The above results can be summarised by the following primary findings:

• Individuals are often not immediately interruptible for notifications (i.e., within 30 seconds);

• When they are, users respond to notifications to different extents (i.e., partial as well as complete responses);

• The DOIG model offers improvement over the black-box convention for capturing these cases and labelling interruptibility.

Overall, the high number of null responses emphasises the need for interruptibility-aware notification systems, with the number of partial responses suggesting that this behaviour should be observed and considered. The DOIG model therefore offers a useful extension to the existing black-box convention that can help applications in labelling interruptibility behaviour. However, the model is arguably only useful if the labels it produces are predictable, which forms a focus of the following chapters (Chapters 4 and 5).
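The distinction between null, partial and complete responses, and how it differs from a black-box label, can be illustrated with a short sketch. The event names (`screen_on`, `unlocked`, `consumed`, `dismissed`) and both function names are hypothetical, chosen only for illustration. The sketch assumes that a complete response is one where the notification is consumed or dismissed within the 30 second window, and that any other observed interaction in that window counts as partial.

```python
THRESHOLD_MS = 30_000  # the 30 second window used in the text

# Hypothetical event names; the observable steps depend on the platform.
COMPLETE_EVENTS = {"consumed", "dismissed"}


def classify_response(events):
    """Classify a trace of (event_name, ms_since_notification) tuples.

    Returns "complete", "partial" or "null".
    """
    in_window = [name for name, t in events if 0 <= t <= THRESHOLD_MS]
    if any(name in COMPLETE_EVENTS for name in in_window):
        return "complete"
    if in_window:
        return "partial"
    return "null"


def black_box_label(events):
    """Single-decision label: was the content consumed inside the window?"""
    return any(
        name == "consumed" and 0 <= t <= THRESHOLD_MS for name, t in events
    )


# A trace where the user unlocks the device but goes no further.
trace = [("screen_on", 4_000), ("unlocked", 6_000)]
label = classify_response(trace)    # "partial"
consumed = black_box_label(trace)   # False
```

In this example the black-box label records only that the content was not consumed, whereas the finer classification retains the fact that the user began to respond.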

3.5 Conclusions

While considerable progress has been made concerning capturing and predicting interruptibility across the literature, the research area is fragmented with specific solutions for specific interruptions and environments [43, 122]. Despite this, a broad convention has been to treat the response process towards an interruption as a single decision, with interruptibility labels determined around whether the user consumes the interruption content or not - creating a number of limitations as outlined in Section 3.1.


the response process that result in the user needing to make decisions during the response itself. For example, in environments where information surrounding the interruption is delivered in stages that require further interaction to access (as is the case with Android mobile notifications). Subsequently, partial responses can occur that can indicate some degree of interruptibility, even if the interruption content is not physically consumed.

In this chapter, a new approach to labelling is proposed (that extends the existing convention) to expose and capture this behaviour where possible, called the decision-on-information-gain model. Through an in-the-wild case study of Android notifications, partial responses were shown to feature prominently in comparison to complete responses (Section 3.4), demonstrating the utility of the model for labelling interruptibility. This forms the following contribution to this thesis:

C2 A flexible model for labelling interruptibility for different definitions, the Decision-On-Information-Gain (DOIG) model, that deconstructs the observable behavioural trace in a response to a notification.

The model improves upon the limitations of the existing convention outlined in Section 3.1. Limitations L3.1 and L3.2 are improved upon by the model deconstructing how a response is made into the conscious or subconscious decision steps that a user makes, with the worst-case performance being the same as the existing convention (e.g., in cases where technical constraints exist in observing decision behaviour). L3.3 is improved upon by enabling individual applications to infer what decision behaviour (i.e., the extent to which a user responded) makes the user interruptible and their interruption successful. Finally, L3.1 and L3.4 are improved upon by utilising passive collection over human annotation, with the ImprompDo case study serving as a demonstration that the model is feasible for real-world applications without a fundamental change to notification design or user experience.

As the DOIG model does not alter the response process to notifications, the finer granularity of observation that it enables produces a second contribution, using the ImprompDo dataset, that is related to, but independent of, C2:


C3 Analysis into the natural decision behaviour underpinning interactions with notifications, using data collected in-the-wild.

This analysis has examined what kinds of behaviour can occur inside the “black-box” (Figure 3.1) of how a response is made, and while this motivates the DOIG model for labelling, the variety of responses is interesting in its own right.

Subsequent chapters of this thesis build upon this by investigating the practical feasibility of the DOIG model for implementation on Android devices further, by exploring whether contextual factors are reliable to sample from and correlate with the labels produced by the DOIG model (Chapter 4), towards building predictive models (in Chapter 5). Finally, the flexibility of the model is demonstrated for other, more customised Android notification designs and the variability that can exist in device preferences (Chapter 6), as well as whether the decision making behaviour observed in this chapter can be seen for other notifications across the device (Chapter 7).

Chapter 4 Sampling context and decision data

The decision-on-information-gain model has been shown to be a useful aid in labelling interruptibility from response behaviour (Chapter 3). However, its utility is ultimately tied to the ability to predict the labels it helps to produce. In order to motivate the building of predictive models of the reachability, engageability, and receptivity labels produced using the DOIG model, the purpose of this chapter is to examine the feasibility of passively sampling relevant data on Android devices that can be used for either creating the labels, or for forming contextual features to use as the basis for prediction. This also extends the observations of potentially influential contextual data seen in previous studies (discussed in Chapter 2 and in [121]) by considering both multiple labels of interruptibility and the technical feasibility of retrieving the data together.

Firstly, the rationale behind this analysis is discussed further in Section 4.1. The data sampling strategy for the ImprompDo application introduced in Chapter 3 is detailed in Section 4.2. From this, the practical feasibility of collecting sensor and software API data from Android devices is examined in Section 4.2.1. Finally, correlations between features extracted from this data and different interruptibility labels are examined in Section 4.3, resulting in preliminary suggestions that the labels produced by the model are likely to be predictable.
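As a rough illustration of the kind of correlation check outlined above, the sketch below computes the phi coefficient between a binary contextual feature and each of several binary labels. The choice of the phi coefficient is an assumption made for this example and is not necessarily the measure used in Section 4.3; the function name and the data are hypothetical.

```python
from math import sqrt


def phi_coefficient(feature, label):
    """Phi coefficient between two equal-length sequences of booleans."""
    if len(feature) != len(label):
        raise ValueError("feature and label must have the same length")
    n11 = sum(1 for f, y in zip(feature, label) if f and y)
    n10 = sum(1 for f, y in zip(feature, label) if f and not y)
    n01 = sum(1 for f, y in zip(feature, label) if not f and y)
    n00 = sum(1 for f, y in zip(feature, label) if not f and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when a row or column of the 2x2 table is empty;
        # returning 0.0 is a convention adopted for this sketch.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Made-up observations: one contextual feature against several labels.
charging = [True, True, False, False]
labels = {
    "reachable": [True, True, False, False],
    "engageable": [True, False, True, False],
    "receptive": [False, False, True, True],
}
correlations = {name: phi_coefficient(charging, y) for name, y in labels.items()}
```

Computing the coefficient per label, rather than for a single definition of interruptibility, shows how the same feature can relate differently to each label.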


4.1 Investigating sampling stability and usefulness

The primary rationale for this analysis is to determine the feasibility of using machine sources (rather than explicit human annotation) to passively implement the DOIG model on Android devices, in terms of helping to produce labels of interruptibility (as shown in Chapter 3), and in gathering potentially influential contextual data for building predictive models. Towards this, two primary limitations in the literature are used to frame the analysis. Firstly, machine sources have frequently been used across the literature; however, this analysis often stops short of exploring the practical feasibility of retrieving the data, in favour of moving forward towards prediction. This creates the following common limitation:

L4.1 The ability for data to be made available when asked for from machine data sources is often assumed.

As the successful operation of the DOIG model is tied to this being a reasonable assumption, the sampling stability of the ImprompDo application is explored first, along with strategies to mitigate issues that arise.
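One simple way to quantify sampling stability is the proportion of sampling attempts for which each source actually returned a value. The sketch below is illustrative only: it assumes each attempt is recorded as a dictionary mapping a source name to a value, with `None` marking data that was not available, and the source names and function name are hypothetical.

```python
from collections import defaultdict


def availability_rates(samples):
    """Return, per source, the fraction of attempts that yielded a value.

    `samples` is an iterable of dicts mapping source name -> value,
    where None marks a source that returned nothing when asked.
    """
    asked = defaultdict(int)
    answered = defaultdict(int)
    for sample in samples:
        for source, value in sample.items():
            asked[source] += 1
            if value is not None:
                answered[source] += 1
    return {source: answered[source] / asked[source] for source in asked}


# Made-up sampling log for illustration only.
log = [
    {"battery_charging": True, "location": None},
    {"battery_charging": False, "location": (51.48, -3.18)},
    {"battery_charging": True, "location": None},
    {"battery_charging": True, "location": None},
]
rates = availability_rates(log)
```

A source with a low rate would be a candidate for the mitigation strategies mentioned above, or for exclusion from later feature extraction.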

Before pursuing the predictability of the interruptibility labels produced by the DOIG model using this dataset, the second part of this chapter explores whether this would be worthwhile. This is achieved by determining whether contextual features extracted from the dataset are correlated with the labels (e.g., whether the battery charging state correlates with being reachable/not reachable). Empirical studies have previously commented on the potential predictive power of different contextual values; however, they commonly have the following limitation:

L4.2 Analysis of correlating contextual factors is performed from the perspective of a single definition of interruptibility (e.g., just receptivity), similar to limitation L3.3 in Chapter 3.

This results in conclusions that are confined to this definition, creating challenges in determining the wider applicability of the results (i.e., for other definitions, such as

In document Decomposing responses to mobile notifications (Page 92-99)