1 Introduction and Background
1.3 Information Presentation and Interactions
1.3.5 Text Entry
An easy way to assess a core aspect of a system's usability is to investigate how well certain types of tasks can be performed with it. For this reason text entry, as measured by something like typing speed, became an early and robust correlate of usability. A great many studies investigate text entry on computing devices, and most of these are beyond the purview of this review. Since our focus is on mobile systems, our interest is in how text entry speeds on mobile devices compare with those in traditional computing environments.
Mackenzie and Ishii (2007) detail some of the critical reasons for evaluating and testing text entry techniques. According to them, great ideas too often remain untested or inadequately tested, due in part to an unfortunate reluctance of researchers to engage the user community. The point driven home in their chapter is the need for comparative analysis of text entry systems, which in turn requires adherence to common standards and methods. Following the mores of experimental psychology, Mackenzie and Ishii (2007) argue, questions should be “repeatable, observable and testable” (p. 78).
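To make the idea of standardized, comparable measures concrete, the following is a minimal sketch of two conventions that are widely used in text entry research: entry speed in words per minute (with a "word" taken to be five characters) and an error rate based on the minimum string distance between the presented and transcribed text. The function names are illustrative assumptions, and the sketch is not meant to reproduce the exact measures used in any of the studies reviewed here.

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Entry speed in words per minute, with a 'word' defined as 5 characters."""
    if not transcribed:
        raise ValueError("transcribed text must not be empty")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Timing conventionally starts at the first character, so only
    # len(transcribed) - 1 characters fall inside the timed interval.
    return (len(transcribed) - 1) / seconds * 60.0 / 5.0


def min_string_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def msd_error_rate(presented: str, transcribed: str) -> float:
    """Error rate (percent) relative to the longer of the two strings."""
    longest = max(len(presented), len(transcribed))
    if longest == 0:
        return 0.0
    return 100.0 * min_string_distance(presented, transcribed) / longest
```

Reporting both numbers together matters when comparing devices, because a fast method that produces many uncorrected errors is not directly comparable to a slower, more accurate one.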
Curran, Woods and Riordan (2004) conducted a study of novice, intermediate and expert mobile phone users, asking each group to use a keypad-based phone, a non-standard keypad-based phone and a stylus-based PDA, on which both a mini soft keyboard and handwriting recognition were tested. The predictive text (T9) function of the keypad phones was tested both on and off. In addition to these devices, a full-size QWERTY keyboard and a mini-QWERTY keyboard-based device were included in the testing. Their results showed that the full-size QWERTY computer keyboard was the preferred and fastest means of text input, followed by the mini-QWERTY keyboard and then the soft QWERTY keyboard. The predictive text entry method was generally quicker than non-predictive entry, though prior experience with predictive text entry might have been important. Their results provide some information stratified by gender and age and include a detailed treatment of error. This study is particularly valuable because it included a wide variety of devices as well as the baseline, or gold standard, device: a full-size QWERTY keyboard. Nevertheless, its generalizability is limited by the small sample size, especially once the data are stratified.
Myung (2004) looked at mobile phone text entry among Koreans. Noting that no keyboard layout for the Korean alphabet had yet been adopted (culturally and/or nationally), Myung aimed part of the study at determining whether a predictive model could serve as an alternative to empirical analysis in identifying the best layout options. KLM-GOMS was used to predict the usability of new keyboard/keypad layouts for the Korean alphabet, and this was determined to be as effective as empirical validation of the new layout.
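The kind of prediction a Keystroke-Level Model (KLM) makes can be sketched as follows. This is only an illustration under stated assumptions: it uses the classic desktop operator times from Card, Moran and Newell (K = 0.28 s per keystroke for an average non-secretarial typist, M = 1.35 s of mental preparation), applies one M per word, and models the familiar English multi-tap keypad rather than any Korean layout. It ignores the pause needed between consecutive letters on the same key. None of these choices is taken from Myung (2004), and the names below are hypothetical.

```python
# Letter groups on keys 2-9 of a conventional English phone keypad.
KEY_GROUPS = ["abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"]

# Multi-tap cost: the n-th letter on a key needs n presses.
MULTITAP = {ch: i + 1 for group in KEY_GROUPS for i, ch in enumerate(group)}
MULTITAP[" "] = 1  # assumption: space is a single press on a dedicated key


def count_presses(text: str, layout: dict) -> int:
    """Total key presses needed to enter text with the given layout."""
    try:
        return sum(layout[ch] for ch in text.lower())
    except KeyError as missing:
        raise ValueError(f"character {missing} is not in the layout") from None


def predict_entry_time(text: str, layout: dict,
                       k: float = 0.28, m: float = 1.35) -> float:
    """KLM-style estimate in seconds: one M per word plus one K per press."""
    words = len(text.split())
    return words * m + count_presses(text, layout) * k


if __name__ == "__main__":
    # Compare the standard layout with a hypothetical one-press-per-letter layout.
    one_press = {ch: 1 for ch in MULTITAP}
    for name, layout in (("multi-tap", MULTITAP), ("one-press", one_press)):
        print(name, round(predict_entry_time("hello world", layout), 2))
```

Because the estimate depends only on the layout table and the operator times, alternative layouts can be ranked before any users are recruited, which is the appeal of the predictive approach; the operator values would need recalibration for thumb input on a phone keypad.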
In 2001, Isokoski and Raisamo introduced their Minimal Device Independent Text Input Method (MDITIM). Intended as a device-independent model of text input, this proof of concept was designed around simplicity. To validate MDITIM, a study was conducted in which text entry was compared across a variety of devices, including stylus on touchpad, mouse, trackball, joystick and keyboard. Though this approach was (and still is) somewhat contrary to the trend toward task-specific interaction devices and techniques, it was a new way to measure the same technique across different devices. It served to highlight the fact that, though the stroke might in theory be the same for MDITIM, in practice it was executed differently on each input device.
Kamvar (2008) and Cox, Cairns, Walton and Lee (2008) both investigated cases in which voice recognition is used as an alternative to keyboard-based text entry. Kamvar investigated the usage profiles of users of the Google Mobile Application when the voice search function was invoked. The aim was to understand when and why users chose to speak their queries. Results suggested, contrary to initial expectations, that longer queries were not the focus of voice searching; shorter queries were.
Cox et al. (2008) compared voice-based text entry with multi-tap and predictive text entry to validate KLM predictions. They then investigated these text entry methods under limited visual feedback to determine the value of voice-based text entry in conditions such as walking or driving. Their model predicted that a combination of keypress and voice recognition would yield the best task completion time, which was in fact the case. For more on this modality, see the voice section that follows.
Das and Stuerzlinger (2008) investigated an important area of text entry: learning effects. Their work resulted in a predictive model that could be tailored to user experience level, helping to quantify learning effects between novice and expert users. This predictive model was tested against simulated users and was found to be highly accurate. Though empirical testing should be used to validate these results, the adjustments made to the model are informative for testing text entry among mobile phone users.