Current Usability Metrics for Mobile Computing Evaluation

Azham Hussain and Maria Kutar, Informatics Research Institute, University of Salford

Abstract

Usability metrics are important elements for measuring whether an application is usable or unusable. Literature on how to measure usability is limited in the area of mobile computing. In this paper, we review, categorise, and discuss usability metrics obtained from 26 studies published in core human-computer interaction (HCI) journals. The analysis produces usability metrics for evaluating mobile computing (as distinguished from the usability metrics used in desktop computing). We also explain the methods used by researchers to create usability metrics and describe several issues and challenges in developing usability metrics in mobile computing.

1. Introduction

Usability metrics are essential elements in software measurement; they help ensure that an application is accurate and fast and that its users are protected from strain injury (Ahmed, Mohammad, Rex, & Harkirat, 2006). Mobile device applications such as news alerts, weather forecasting, and entertainment have become increasingly popular and well accepted. The fast growth of, and high demand for, mobile applications have encouraged researchers to undertake studies on potential developments in this field.

Usability testing offers many advantages, such as driving down production costs, improving sales, enhancing brand loyalty, and providing access for all in the public sector (Abrahams, 2008). However, usability testing is difficult to conduct without usability metrics, which provide guidelines for the usability evaluation process. Hence, it is beneficial to have usability metrics, whether for desktop computing or mobile computing, before conducting usability testing.

Mobile computing can be described as software systems operating on mobile devices (Zhang & Adipat, 2005). Mobile applications can be divided into two types: horizontal applications and vertical applications (Thomas & Mark, 2003). Horizontal applications are general and adaptable to a wide range of users and uses, such as music, e-mail, web browsers, and file transfer. Vertical applications, on the other hand, are aimed at a specific type of user or use, such as financial, marketing and advertising, education, and emergency applications. This wide range of mobile applications creates many research opportunities.

Clear guidance on how to measure usability is limited (Hornbæk & Law, 2007), and even more so in the area of mobile computing. The novelty of mobile computing and the unique features of mobile devices are the main challenges in usability measurement. Recent advances in technology, such as global positioning system (GPS) receivers embedded into mobile phones, create new challenges in the human-computer interaction (HCI) area. Many traditional usability metrics were created specifically for desktop applications; however, these metrics may not be directly applicable to mobile applications (Zhang & Adipat, 2005).

This paper aims to review the literature on usability evaluation methods and current practice in modelling usability measures. We also identify the current usability metrics for mobile computing and explain the challenges in measuring usability. The next section provides a review of usability metrics, followed by an outline of the research strategy employed in this study. Finally, we explain and discuss the results obtained from this research and draw conclusions.

2. Related studies

Mobile computing imposes different usability challenges compared with stationary technology, for example in designing usable interfaces (Ballard, 2005); in how usability is measured (Kjeldskov & Stage, 2004); and in the relationships between the technology, work tasks, and the context of work (Steinar & Fredrik, 1999). The usability of mobile computing has been widely discussed, for example in the navigation of complex information on small screens (Björk, Redström, Ljungstrand, & Holmquist, 2000); tactile feedback (Brewster, Chohan, & Brown, 2007); techniques for assessing mobile usability (Kjeldskov & Stage, 2004; Barnard et al., 2005; Minna et al., 2007); texting (MacKenzie & Soukoreff, 2003); and how to conceptualise mobile usability (Dong-Han et al., 2006). Usability evaluation methods are the techniques employed to carry out usability evaluation, such as usability testing, focus groups, and interviews. All of these methods have been used by many researchers to evaluate usability, and each has advantages and disadvantages, depending on the objective of the study. Different evaluation methods have emerged and contributed to the evolution of usability evaluation, giving software development organisations a wide collection of techniques that fit specific development projects (Scapin & Law, 2007).

A number of models for usability measurement are available for reference, for instance Quality in Use Integrated Measurement (QUIM), developed by Ahmed et al. (2006). QUIM is a consolidated model for usability measurements and metrics; it is also appropriate for users who have little or no knowledge of usability. The model consists of 10 factors that are subdivided into 26 criteria. For the measurement of these criteria, the model provides 127 metrics. The model is used to measure the actual use of working software and identify problems; however, it is not yet optimal and needs to be validated.

Metrics for Usability Standards in Computing (MUSiC), developed by Bevan and MacLeod (1994), is another project concerned with defining measures of software usability. MUSiC is integrated into the original ISO 9241 standard, which defines usability in terms of effectiveness, efficiency, and satisfaction. Examples of specific usability metrics in the MUSiC framework include user performance measures, such as task effectiveness, temporal efficiency, and length or proportion of productive period. However, a strictly performance-based view of usability cannot reflect other aspects of usability, such as user satisfaction or learnability. The Software Usability Measurement Inventory (SUMI), developed by Kirakowski and Corbett (1993), is a part of the MUSiC project. SUMI was developed to provide measures of global satisfaction for five additional usability areas: effectiveness, efficiency, helpfulness, control, and learnability. Another MUSiC project, related to software tool development, is the Diagnostic Recorder for Usability Measurement (DRUM), developed by Macleod and Rengger (1993). This project focuses on the analysis of user-based evaluations and the delivery of these data to the appropriate party, such as a usability engineer. The Log Processor component of DRUM is the tool concerned with metrics. It calculates several different performance-based usability metrics, including task time; snag, help, and search times; effectiveness; efficiency; relative efficiency; and productive period.
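The performance-based metrics listed above can be illustrated with a short Python sketch. The formulas are assumptions based on the commonly cited MUSiC definitions (effectiveness as quantity times quality, efficiency as effectiveness per unit of task time, productive period as the share of task time not spent on help, search, or snags); they are not taken from the DRUM Log Processor itself, and the `TaskSession` record and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskSession:
    """Hypothetical log record for one user performing one task."""
    task_time: float    # total time on task, in seconds
    help_time: float    # time spent consulting help
    search_time: float  # time spent searching without progress
    snag_time: float    # time spent on actions that were later undone or had no effect
    quantity: float     # percentage of task goals attempted (0-100)
    quality: float      # percentage degree to which attempted goals were met (0-100)


def task_effectiveness(s: TaskSession) -> float:
    """Assumed definition: quantity x quality, expressed as a percentage."""
    return s.quantity * s.quality / 100.0


def user_efficiency(s: TaskSession) -> float:
    """Assumed definition: effectiveness achieved per second of task time."""
    return task_effectiveness(s) / s.task_time


def relative_efficiency(user: TaskSession, expert: TaskSession) -> float:
    """User efficiency as a percentage of an expert's efficiency on the same task."""
    return 100.0 * user_efficiency(user) / user_efficiency(expert)


def productive_period(s: TaskSession) -> float:
    """Percentage of task time not spent on help, search, or snag actions."""
    unproductive = s.help_time + s.search_time + s.snag_time
    return 100.0 * (s.task_time - unproductive) / s.task_time
```

For example, a user who takes 200 seconds, spends 40 of them on help, search, and snags, and attempts 80% of the goals at 50% quality would score 40% task effectiveness and an 80% productive period under these assumed definitions.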

Subsequently, another model dealing with the analysis of the quality of use for interactive devices was introduced by Macleod & Rengger (1993)—namely, the Skill Acquisition Network (SANe). This approach assumes a user interaction model that defines user tasks, the dynamics of the device, and procedures for executing user tasks. Specifically, a task model and a device model are simultaneously developed and subsequently linked. Afterwards, user procedures are simulated within the linked task-device model. A total of 60 different metrics are described in this framework, 24 of which are concerned with the quality measures. Scores from the latter are then combined to form a total of five composite quality measures, including efficiency, learning, adaptiveness, cognitive workload, complexity, and effort for error correction.

The International Organization for Standardization (ISO) is an international standard-setting body composed of representatives from various national standards organisations. The ISO has developed over 17,000 international standards for a variety of subjects, and 1,100 new ISO standards are published every year (ISO website, www.iso.org/iso/home.htm). Most literature in HCI employs ISO 9241-11 for usability measurement (Hornbæk & Law, 2007). Table 1 lists the ISO standards related to HCI. ISO 9241-11 specifically addresses the definition of usability measurement and has thus been chosen as the foundation for the framework in this study. In another study of usability measurement dimensions, Constantinos and Dan (2007) found that the most common characteristics in usability evaluation are effectiveness (62%), efficiency (33%), and satisfaction (20%). These three characteristics reflect the ISO 9241-11 standard, as they are the attributes that the standard measures.

Table 1. ISO standards related to measurement.

Usability in ISO

Standard                 Description
ISO 9241-11 (1998)       Identifies efficiency, effectiveness, and satisfaction as major attributes of usability.
ISO/IEC 9126-1 (2001)    Defines usability as a software quality attribute that can be subdivided into five factors: understandability, learnability, operability, attractiveness, and usability compliance.
ISO/IEC 9126-4 (2001)    Defines the related concept of quality in use as a kind of higher-order software quality attribute.
ISO/IEC 14598-1 (1999)   A model for measuring quality in use from the perspective of internal software quality attributes.

3. Research Approach

This section explains the steps taken to review the literature. A systematic literature review (SLR) method was used to ensure that previous studies on usability evaluation relevant to this study were considered. The journals selected were the top journals in HCI, covering 2006 to 2008. A total of 409 journal papers were reviewed based on the keywords “usability”, “evaluation”, and “metric”. Only 26 of the 409 papers were selected for further review to obtain guidelines and usability metrics for mobile application development. Table 2 describes the journal papers that were reviewed.

Table 2. Journal papers reviewed.

Journal                                               Year        Candidate   Selected
ACM Transactions on Computer-Human Interaction        2006–2008   54          8
Human-Computer Interaction                            2006–2008   36          2
International Journal of Human-Computer Interaction   2006–2008   97          5
International Journal of Human-Computer Studies       2006–2008   222         11
Total                                                             409         26
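As a quick arithmetic check on Table 2, the following Python sketch recomputes the totals and the share of candidate papers selected from each journal. The counts are copied from the table; the variable and function names are illustrative only.

```python
# (journal, candidate papers, selected papers), copied from Table 2
REVIEWED = [
    ("ACM Transactions on Computer-Human Interaction", 54, 8),
    ("Human-Computer Interaction", 36, 2),
    ("International Journal of Human-Computer Interaction", 97, 5),
    ("International Journal of Human-Computer Studies", 222, 11),
]


def selection_summary(rows):
    """Return total candidates, total selected, and per-journal selection rates (%)."""
    total_candidates = sum(candidates for _, candidates, _ in rows)
    total_selected = sum(selected for _, _, selected in rows)
    rates = {name: 100.0 * selected / candidates for name, candidates, selected in rows}
    return total_candidates, total_selected, rates


total_candidates, total_selected, rates = selection_summary(REVIEWED)
overall_rate = 100.0 * total_selected / total_candidates

print(f"{total_selected} of {total_candidates} papers selected ({overall_rate:.1f}%)")
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

The totals agree with the table (409 candidates, 26 selected), which corresponds to roughly 6% of the candidate papers overall.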

The review is based on a conception of usability that is similar to ISO 9241-11 (1998) and Bevan and MacLeod (1994). Under this conception, we consider only studies related to usability evaluation rather than the broad concept of usability. We analyse the quality characteristics of each measure to ensure that no duplication exists. Interestingly, we found that most of the studies employed effectiveness, efficiency, and satisfaction as quality characteristics, which appear in ISO 9241 as well. Thus, we decided to make these three characteristics base guidelines and to designate others as sub-guidelines. Table 3 describes the most popular usability guidelines obtained from the literature. In the following section, we review the approaches that previous studies utilised to identify the issues that arise in creating usability metrics.

Table 3. Usability guidelines from the literature.

No   Guidelines                       Description
1    Completeness                     The extent or completeness of users’ solutions to tasks.
2    Accurate                         The accuracy with which users complete tasks.
3    Fewer errors or no errors        Errors made by the user during the process of completing a task.
4    Ease of data input               The data input process should be simple.
5    Ease of output use               The output should be very simple and accurate.
6    Ease to install                  Application installation should be user friendly.
7    Response time                    The system must respond in an appropriate time.
8    Simple                           The application should be straightforward.
9    Time                             The duration of tasks or parts of tasks.
10   Ease to learn                    The user interface must be designed for the user to learn easily.
11   Application size                 The space used by the application should be appropriate.
12   Battery power used               The battery power used by the application.
13   Wireless connectivity            The application should easily connect to a network.
14   Features available               Appropriate features available on an application.
15   Satisfaction with the interface  Measures satisfaction as the interface that users prefer using.
16   Provide support/help             The help information given by the application should be useful.
17   Safety                           User should be safe and secure while using the application.

4. Results and discussion

With reference to Table 3, we simplify the usability metrics by removing the guidelines that are not relevant to interaction. Duplicate guidelines are omitted as well, to ensure that they have no effect on other guidelines. Since this study focuses on mobile computing, we create several new guidelines related to mobility that do not appear in existing literature, for instance “touch screen facilities,” “safety while driving,” and “automatic update.” In Table 4, we present the results by categorising quality characteristics and their corresponding usability guidelines and metrics.

Table 4. Usability metrics for mobile computing.

Quality Characteristic   Usability Guidelines   Usability Metrics
Effectiveness            Simplicity             • Ease of data input
                                                • Ease of output use
                                                • Ease of installation
                                                • Ease of learning
                         Accuracy               • Accuracy
                                                • Absence of errors
                                                • Success
Efficiency               Time taken             • To respond
                                                • To complete a task
                         Features               • Support/help
                                                • Touch-screen facilities
                                                • Voice guidance
                                                • System resource information
                                                • Automatic updating
Satisfaction             Safety                 • While using the application
                                                • While driving
                         Attractiveness         • User interface
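To show how the hierarchy in Table 4 might be used in practice, the sketch below encodes it as a nested Python dictionary and flattens it into an evaluation checklist. The grouping assumes that Table 4 reads as effectiveness covering simplicity and accuracy, efficiency covering time taken and features, and satisfaction covering safety and attractiveness; the names `USABILITY_METRICS` and `checklist` are hypothetical and not part of the study.

```python
# Quality characteristic -> usability guideline -> usability metrics (assumed reading of Table 4)
USABILITY_METRICS = {
    "Effectiveness": {
        "Simplicity": [
            "Ease of data input",
            "Ease of output use",
            "Ease of installation",
            "Ease of learning",
        ],
        "Accuracy": ["Accuracy", "Absence of errors", "Success"],
    },
    "Efficiency": {
        "Time taken": ["To respond", "To complete a task"],
        "Features": [
            "Support/help",
            "Touch-screen facilities",
            "Voice guidance",
            "System resource information",
            "Automatic updating",
        ],
    },
    "Satisfaction": {
        "Safety": ["While using the application", "While driving"],
        "Attractiveness": ["User interface"],
    },
}


def checklist(model):
    """Flatten the hierarchy into (characteristic, guideline, metric) rows."""
    return [
        (characteristic, guideline, metric)
        for characteristic, guidelines in model.items()
        for guideline, metrics in guidelines.items()
        for metric in metrics
    ]


for characteristic, guideline, metric in checklist(USABILITY_METRICS):
    print(f"{characteristic:<14} {guideline:<15} {metric}")
```

Keeping each metric as its own checklist row, rather than collapsing rows into one score per characteristic, leaves objective and subjective measures separate when results are recorded.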

In the current literature, several issues arise with respect to creating usability metrics. For example, few studies employ experts to assess quality. Expert assessments are needed to evaluate subjective data and usability defects (Bevan & MacLeod, 1994). Hence, we would suggest employing experts to create usability metrics, as expert judgement is one of the essential methods in usability inspection. We also note that measures on how to use interfaces are rarely employed. Current measurements for interfaces focus more on the use of colours; layout and information structuring; consistency of the terminology; consistency of the interaction mechanisms; and how to fit in small-screen devices. Metrics for interface use will be different for mobile applications due to the novelty and size of mobile devices.

The measure of satisfaction is another important element in measuring usability, and questionnaires are one of the techniques employed to measure user satisfaction. However, many studies do not use validated questionnaires to assess satisfaction. A large number of questionnaires have been developed to assess the user’s subjective satisfaction with the system and related issues. These include the Questionnaire for User Interface Satisfaction (QUIS), developed by Chin et al. (1988), and SUMI, developed by Kirakowski & Corbett (1993). Finally, we found that many studies combine objective and subjective measures, for example “learnability of an interface” and “time needed to master an interface,” into a single metric. We believe that such measurements should be kept separate rather than combined in a single metric.

5. Conclusion

We have reviewed usability metrics employed in 26 studies published in core HCI journals, drawing attention to several issues relating to the production of these measurements. The metrics created in this paper could serve as alternatives to current usability metrics in mobile computing. In addition to the issues raised in this study, we intend, in future research, to explore other challenges in creating usability metrics, particularly in the area of interaction between humans and mobile applications.

References

Abrahams, P. (2008) 'The future of usability testing,' Bloor Research; available at: http://www.it-director.com/business/compliance/content.php?cid=10495.

Ahmed, S., Mohammad, D., Rex, B.K., and Harkirat, K.P. (2006) 'Usability measurement and metrics: a consolidated model,' Software Quality Control 14 (2), 159–178.

Ballard, B. (2005) User Interface Design Guidelines for J2ME MIDP 2.0. Lawrence, KS: Little Springs Design, Inc.

Barnard, L., Yi, J.S., Jacko, J.A., and Sears, A. (2005) 'An empirical comparison of use-in-motion evaluation scenarios for mobile computing devices,' International Journal of Human-Computer Studies 62 (4), 487–520.

Bevan, N. and MacLeod, M. (1994) 'Usability measurement in context,' Behaviour and Information Technology 13 (1), 132–145.

Björk, S., Redström, J., Ljungstrand, P., and Holmquist, L.E. (2000) 'POWERVIEW: using information links and information views to navigate and visualize information on small displays,' Proceedings of the 2nd International Symposium on Handheld and Ubiquitous Computing, Bristol, UK.

Brewster, S., Chohan, F., and Brown, L. (2007) 'Tactile feedback for mobile interactions,' Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Washington, D.C., pp. 159–162.
