User Interface Design

(1)

Winter term 2005/2006

Thursdays, 14-16 c.t., Room 228

Prof. Dr. Antonio Krüger

(2)

Testing & modeling users

(3)

The aims

· Describe how to do user testing.

· Discuss the differences between user testing, usability testing and research experiments.

· Discuss the role of user testing in usability testing.

· Discuss how to design simple experiments.

· Describe GOMS, the keystroke level model, Fitts’ law and discuss when these techniques are useful.

· Describe how to do a keystroke level analysis.

(4)

Experiments, user testing & usability testing

 Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more things, i.e., variables.

 User testing is applied experimentation in which developers check that the system being developed is usable by the intended user population for their tasks.

 Usability testing uses a combination of techniques, including user testing & user satisfaction questionnaires.

(5)

User testing is not research

User testing

 Aim: improve products

 Few participants

 Results inform design

 Not perfectly replicable

 Controlled conditions

 Procedure planned

 Results reported to developers

Research experiments

 Aim: discover knowledge

 Many participants

 Results validated statistically

 Replicable

 Strongly controlled conditions

 Experimental design

 Scientific paper reports results to community

(6)

User testing

 Goals & questions focus on how well users perform tasks with the product

 Comparison of products or prototypes common

 Major part of usability testing

 Focus is on time to complete task & number & type of errors

 Informed by video & interaction logging

 User satisfaction questionnaires provide data about users’ opinions

(7)

Testing conditions

 Usability lab or other controlled space

 Major emphasis on

• selecting representative users

• developing representative tasks

 5-10 users typically selected

 Tasks usually last no more than 30 minutes

 The test conditions should be the same for every participant

 Informed consent form explains ethical issues

(8)

Types of data (Wilson & Wixon, ’97)

· Time to complete a task

· Time to complete a task after a specified time away from the product

· Number and type of errors per task

· Number of errors per unit of time

· Number of navigations to online help or manuals

· Number of users making a particular error

· Number of users completing task successfully
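
As a rough sketch, several of these measures can be computed from logged test sessions. The record layout and all numbers below are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical session records: one dict per participant for a single task.
sessions = [
    {"user": "P1", "seconds": 95.0, "errors": 2, "help_visits": 1, "completed": True},
    {"user": "P2", "seconds": 130.0, "errors": 5, "help_visits": 3, "completed": False},
    {"user": "P3", "seconds": 80.0, "errors": 0, "help_visits": 0, "completed": True},
    {"user": "P4", "seconds": 115.0, "errors": 1, "help_visits": 0, "completed": True},
]


def summarize(records):
    """Aggregate the per-task measures listed on this slide."""
    total_time = sum(r["seconds"] for r in records)
    total_errors = sum(r["errors"] for r in records)
    return {
        "mean_time_s": mean(r["seconds"] for r in records),
        "errors_per_task": total_errors / len(records),
        "errors_per_minute": total_errors / (total_time / 60),
        "help_visits": sum(r["help_visits"] for r in records),
        "users_with_errors": sum(1 for r in records if r["errors"] > 0),
        "users_completed": sum(1 for r in records if r["completed"]),
    }


summary = summarize(sessions)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The same summary could be produced once for the current product and once for a prototype, which gives the comparison that user testing typically aims for.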

(9)

Usability engineering orientation

· Current level of performance

· Minimum acceptable level of performance

· Target level of performance

(10)

How many participants is enough for user testing?

 The number is largely a practical issue

 Depends on:

• schedule for testing

• availability of participants

• cost of running tests

 Typically 5-10 participants

 Some experts argue that testing should continue until no new insights are gained

(11)

Experiments

 Predict the relationship between two or more variables

 Independent variable is manipulated by the researcher

 Dependent variable depends on the independent variable

 Typical experimental designs have one or two independent variables

(12)

Experimental designs

 Different participants - single group of participants is allocated randomly to the experimental conditions

 Same participants - all participants appear in every condition

 Matched participants - participants are matched in pairs, e.g., based on expertise or gender
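
A minimal sketch of how participants might be assigned in the first two designs. The function names are hypothetical. Rotating the condition order in the same-participants case is an added assumption: it is a standard way to spread order effects, not something stated on this slide.

```python
import random


def allocate_between_subjects(participants, conditions, seed=None):
    """Different-participants design: shuffle, then deal out round-robin
    so every participant lands in exactly one condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(participant)
    return groups


def counterbalance_within_subjects(participants, conditions):
    """Same-participants design: everyone sees every condition, with the
    starting condition rotated from one participant to the next."""
    orders = {}
    for index, participant in enumerate(participants):
        shift = index % len(conditions)
        orders[participant] = conditions[shift:] + conditions[:shift]
    return orders


participants = ["P1", "P2", "P3", "P4", "P5", "P6"]
conditions = ["Helvetica", "Times Roman"]

groups = allocate_between_subjects(participants, conditions, seed=1)
orders = counterbalance_within_subjects(participants, conditions)
print(groups)
print(orders)
```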

(13)

Example

 Hypothesis: “Will the time to read a screen of text be different if 12-point Helvetica is used instead of 12-point Times Roman?”

 Condition 1: users read text with Helvetica

 Condition 2: users read text with Times Roman

 Control condition: read text on paper

 Extend design with variable user-expertise (additional conditions: expert/beginner)

 What are the independent and dependent variables?

(14)

Advantages and disadvantages

(15)

Evaluation of results / significance

 The larger the sample, the less likely that the difference is due to sampling errors or chance.

 The larger the difference between the two means, the less likely the difference is due to sampling errors.

 The smaller the variance among the participants, the less likely that the difference is due to sampling errors or chance.

(16)

Variability

 Are the results statistically significant?

 Use the t-test to analyze the ratio of the difference between the group means to the variability within the groups

(17)

T-test

Use a standard table of significance to determine whether t is large enough.
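
A small worked sketch of the calculation, assuming two independent groups (a different-participants design) and using Welch's form of the t-test, which does not require equal variances. The slides do not say which variant was intended, and the reading times are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    # Difference between means divided by its standard error.
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical reading times in seconds for the two font conditions.
helvetica = [52, 48, 55, 50, 45]
times_roman = [58, 60, 54, 62, 56]

t, df = welch_t(helvetica, times_roman)
print(f"t = {t:.3f}, df = {df:.2f}")
```

With these made-up numbers, t is about -3.6 with roughly 7.7 degrees of freedom. A two-tailed table at the 0.05 level gives a critical value of 2.365 for 7 degrees of freedom, so this difference would count as significant.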

(18)

Predictive models

 Provide a way of evaluating products or designs without directly involving users

 Psychological models of users are used to test designs

 Less expensive than user testing

 Usefulness limited to systems with predictable tasks - e.g., telephone answering systems, mobiles, etc.

 Based on expert behavior

(19)

GOMS (Card et al., 1983)

 Goals - the state the user wants to achieve e.g., find a website

 Operators - the cognitive processes & physical actions performed to attain those goals, e.g., decide which search engine to use

 Methods - the procedures for accomplishing the goals, e.g., drag mouse over field, type in keywords, press the go button

 Selection rules - determine which method to select when there is more than one available
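
One way to picture how the four GOMS parts fit together is as a small data structure. This is only an illustrative sketch: the class, the "type the address directly" method and the `knows_address` flag are hypothetical and not part of GOMS itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Method:
    name: str
    operators: List[str]              # steps carried out in order
    applies: Callable[[Dict], bool]   # selection-rule condition


GOAL = "find a website"

# Methods are listed in order of preference; the last one always applies.
METHODS = [
    Method(
        name="type the address directly",
        operators=["recall address", "type address", "press Enter"],
        applies=lambda context: context.get("knows_address", False),
    ),
    Method(
        name="use a search engine",
        operators=[
            "decide which search engine to use",
            "drag mouse over field",
            "type in keywords",
            "press the go button",
        ],
        applies=lambda context: True,
    ),
]


def select_method(context):
    """Selection rule: pick the first method whose condition holds."""
    for method in METHODS:
        if method.applies(context):
            return method
    raise LookupError(f"no method available for goal: {GOAL}")


chosen = select_method({"knows_address": False})
print(GOAL, "->", chosen.name, chosen.operators)
```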

(20)

Keystroke level model

 GOMS has also been developed further into a quantitative model - the keystroke level model.

 This model allows predictions to be made about how long it takes an expert user to perform a task.

(21)

Response times for keystroke level operators
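
As a sketch of how a keystroke-level prediction is assembled: the operator times below are commonly cited values from Card, Moran & Newell, assumed here rather than taken from this slide, and the task sequence is hypothetical.

```python
# Assumed operator times in seconds (commonly cited keystroke-level values).
OPERATOR_SECONDS = {
    "K": 0.28,  # press a key or button (average non-secretarial typist)
    "P": 1.10,  # point with the mouse at a target
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mentally prepare for the next step
}


def predict_seconds(sequence, system_response=0.0):
    """Sum the operator times for a sequence such as 'H M P K'.

    Spaces are ignored; system_response adds any waiting time.
    """
    operators = sequence.replace(" ", "")
    return sum(OPERATOR_SECONDS[op] for op in operators) + system_response


# Hypothetical task: reach for the mouse, think, point at a search field,
# click, return to the keyboard, think, type four letters, press Enter.
task = "H M P K H M K K K K K"
estimate = predict_seconds(task)
print(f"Predicted expert time: {estimate:.2f} s")
```

For this sequence the sum is 2 x 0.40 + 2 x 1.35 + 1.10 + 6 x 0.28 = 6.28 seconds, which is an estimate for error-free expert performance only.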

(22)

Problems of GOMS/Keystroke model

 Doesn’t take into account slack times and critical situations that may slow down certain strokes.

 Example: Usage of system while talking to a person in parallel.

 Further influences that are not taken into account: fatigue, learning effects, workload, etc.

 Models are only good for providing an estimate; they cannot substitute for user testing

(23)

Fitts’ Law (Paul Fitts 1954)

 The law predicts that the time to point at an object using a device is a function of the distance from the target object & the object’s size.

 The further away & the smaller the object, the longer the time to locate it and point.

 Useful for evaluating systems for which the time to locate an object is important, such as handheld devices like mobile phones

 Why are labeled toolbars easier to access?
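
A minimal numeric sketch, assuming the Shannon formulation widely used in HCI, T = a + b * log2(D/W + 1), where D is the distance to the target and W is its width. The constants a and b are device-dependent and must be fitted from data; the values below are placeholders.

```python
from math import log2


def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted pointing time in seconds (Shannon formulation).

    a and b are placeholder constants, not measured values.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return a + b * log2(distance / width + 1)


# Same distance, two target widths (pixels): icon only vs. icon plus label.
icon_only = fitts_time(distance=300, width=20)
icon_with_label = fitts_time(distance=300, width=100)
print(f"icon only:       {icon_only:.2f} s")
print(f"icon with label: {icon_with_label:.2f} s")
```

One reading of the toolbar question: a label makes the clickable target wider, so at the same distance the predicted pointing time drops.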

(24)

Key points

· User testing is a central part of usability testing

· Testing is done in controlled conditions

· User testing is an adapted form of experimentation

· Experiments aim to test hypotheses by manipulating certain variables while keeping others constant

· The experimenter controls the independent variable(s) but not the dependent variable(s)

· There are three types of experimental design: different-participants, same-participants, & matched-participants

· GOMS, Keystroke level model, & Fitts’ Law predict expert, error-free performance

· Predictive models are used to evaluate systems with predictable tasks such as telephones

(25)

Design & evaluation in the real world

(26)

The aims

 Show how design & evaluation are brought together in the development of interactive products.

 Show how different combinations of design & evaluation methods are used in practice.

 Describe the various design trade-offs & decisions that have to be made in the real world.

(27)

Key issues: From requirements to design

 which design cycle to use

 which combination of methods to use when designing & evaluating a product

 what happens when the product being developed is confidential and there are no users available to test it?

 how many users should be involved in tests?

 what to do with the evaluation findings

 how much to expect from users

(28)

Case study: designing mobile communicators

Two examples, for very different audiences:

 Nokia’s mobile communicator

 Philips communicator for children

(29)

Designing Nokia’s mobile communicator

 design cycle: iterative user-centered approach

 which methods:

• ethnographic research

• scenarios and task models

 confidential product issues:

• first in the market is key

• evaluation must be very limited and no real

(30)

Designing Nokia’s mobile communicator (contd)

 physical aspects:

• screen size

• number of buttons versus functionality

 consistency issues

• internal consistency (within mobile software)

• external consistency (with desktop software)

 user testing

• none before release

• summative testing & questionnaires after

(31)

Designing telephones for special user groups (Royal National Institute for the Blind)

 Guarded or recessed keys

 Sidetone reduction to reduce noise level

 Adjustable key pressure

 Audio and tactile feedback

 Larger key size

(32)

Consistency of the Design

 Internal consistency

• Nokia style guide

 External consistency difficulties

• No pointing device

• Slow connection and download times

• Default homepage

• Transcoding webpages (focus on text)

(33)

Philips Communicator for children

(Oosterholt ’96)

(34)

Designing Philips’ communicator for children

 design cycle: iterative and evolutionary

 which methods:

• low-fidelity prototyping

• participatory design

• interface metaphors

 physical aspects:

• color, shape, size, robustness

• pen input

• bags to protect screen

(35)

Designing Philips’ communicator for children

 user involvement:

• children involved throughout

• prototypes evaluated constantly

• invaluable insights for the designers

 lessons learned:

• agree on assumptions in requirements

• think of follow-on projects early on

• users are not designers

(36)

Case study 2: A telephone response information system (TRIS)

 Interactive voice response systems are common in government offices and large companies. Do you know of examples that you have used?

 Why are these systems often so frustrating to use?

• Forming a mental model is difficult because there is no visual feedback and the user must remember the menu structure

 Many menus and deep menus are particularly difficult

(37)

Why was TRIS difficult to use?

 Having to remember the menu structure.

 The programmers traded usability for computational elegance, e.g., the system asked for both a social security number and an employee identification number, confusing users who did not have both.

 TRIS comprised different systems, each with its own interaction style. Users were not told this, so when they moved between the systems they experienced sudden, unexplained changes.

(38)

How was TRIS evaluated?

 A combination of techniques was used:

• a review of the literature provided information about problems with interactive voice response systems

• expert reviews

• GOMS analysis of the proposed redesign

 The redesign was implemented

• usability tests confirmed that the redesigned system offered better usability than the original design

(39)

Why was using different methods valuable?

 The evaluators were able to build up a broad picture of usability problems.

 Using GOMS and heuristic evaluation they could explore the potential benefits of the redesigned system.

 User testing enabled them to confirm that the redesigned system offered better usability.

 User satisfaction questionnaires confirmed that users preferred the redesigned system.

(40)

Key points

 Design involves trade-offs

 Design space for making changes when upgrading a product is limited

 Cycles of rapid prototyping and evaluation allow designers to examine alternatives

 Piecing together evidence from a variety of sources can be valuable