Academic year: 2021
(1)

Evaluation in Data and Information

Visualization

Universidade de Aveiro

Departamento de Electrónica, Telecomunicações e Informática

(2)

• Visualization is the process of exploring, transforming, and representing data as images (or other sensorial forms) to gain insight into phenomena

• There are several expressions used to designate different areas of Visualization:

– Scientific Visualization

– Data Visualization

– Information Visualization

• The differences among these areas are not completely clear

(3)

Framework (Brodlie et al., 1992):

Data acquisition → Data → Computing → Results → User (Hypothesis → Understanding)

• Visualization includes not only image production from the data, but also their transformation and manipulation (and, if possible, their acquisition)

(4)

Data Visualization Reference model

Measured data: CT, MRI, ultrasound, lasers, satellite imaging, …

Simulated data: Finite Element Analysis, numerical models, …

Pipeline: Data → Transform → Map → Display

(adapted from Schroeder et al., 2006)

(5)

• In general:

Data Visualization (DV) - Data having an inherent spatial structure (e.g., CAT, MR, geophysical, meteorological, fluid dynamics data)

Information Visualization (IV) – Data not having an inherent spatial structure (e.g., stock exchange, S/W, Web usage patterns, text)

• These designations may be misleading, since both DV and IV start with (raw) data and allow extracting information

• Borders between these areas are not well defined, nor is it clear whether there is any advantage in separating them (Rhyne, 2003)

(6)

Information Visualization Reference Model

Raw data → (Data Transformations) → Data tables → (Visual Mappings) → Visual structures → (View Transformations) → Views; human interaction, driven by the task, feeds back into every stage

Visualization can be described as the mapping of data to visual form that supports human interaction in a workspace for visual sense making

(Card et al., 1999)

• In Information Visualization, interaction generally receives more attention
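The stages of the reference model above can be sketched as a chain of small mapping functions. A minimal illustration in Python — all function names and the toy bar-chart mapping are hypothetical, not taken from any visualization library:

```python
# Sketch of the Card et al. (1999) reference model stages.
# Names and the toy bar-chart mapping are illustrative only.

def data_transformation(raw_records):
    """Raw data -> data tables: impose a tabular structure."""
    return [{"name": name, "value": value} for name, value in raw_records]

def visual_mapping(table):
    """Data tables -> visual structures: map each value to a mark
    (here, a text 'bar' whose length encodes the value)."""
    return [{"label": row["name"], "bar": "#" * row["value"]} for row in table]

def view_transformation(structures, top=2):
    """Visual structures -> views: e.g. focus on the largest items,
    a stand-in for interactive view transformations."""
    return sorted(structures, key=lambda s: len(s["bar"]), reverse=True)[:top]

raw = [("A", 3), ("B", 5), ("C", 1)]
view = view_transformation(visual_mapping(data_transformation(raw)))
for item in view:
    print(item["label"], item["bar"])
```

Human interaction would then adjust the parameters of any of the three mappings (e.g. the `top` argument) in response to the user's task.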

(7)

• A correct definition of the goal is fundamental

How can we evaluate a Visualization?

(8)

Answering two questions:

“How well does the final visualization:

- represent the underlying phenomenon?
- help the user understand it?”

Which implies:

A) “Low level” – evaluating the representation of the phenomenon

B) “High level” – evaluating the users’ performance in their tasks (involving understanding the phenomenon) while using the visualization

(9)

• Evaluating a visualization technique should involve the evaluation of all phases, e.g.:

low level: accuracy, repeatability of methods (errors, artifacts, …)
high level: efficacy and efficiency in supporting users’ tasks; learnability, memorability, …

• Not forgetting the interaction (not only visual) aspects!

Visualization technique: (Measured / Simulated) Data → Transform → Map → Display

(10)

Main issues for evaluation planning:

• Motivation/goal (why? / what for?)

• Test data (which data sets? how many?)

• Evaluation methods (which?)

• Collected data (which measures? which observations?)

• Data analysis (which methods?)

The collected data and their analysis are much related with the chosen methods

(11)

• Motivation and goal are the starting point of an evaluation. For example:

- Which is the best representation of specific data to support specific users while performing specific tasks?

- Which is the best segmentation algorithm?

• Along with other constraints, they influence the choice of

– methods

– data sets

(12)

• Test data can be real, synthetic, or in between

• For instance, in Medical Data Visualization it is common to use:

– Synthetic data
– “Phantoms”
– Cadavers
– In vivo data

Synthetic data allow better knowledge of the “ground truth”

• Data should :

– Be sufficient in quantity

– Be representative

– Include especially difficult cases


(13)

• Collected data have a fundamental impact on the information we can get from the evaluation

• The analysis of the collected data has an impact on the credibility of the results

• Selecting methods should take into consideration:

– Nature, level of representation and scale of the collected data

– Size of the sample

– Statistical distribution
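As a toy illustration of matching the analysis to the scale of the collected data (the `summarize` helper and its scale labels are assumptions for this sketch, not a prescribed procedure): ordinal data such as Likert-type ratings are better summarized by a median, while interval/ratio data such as task times can use a mean.

```python
import statistics

def summarize(responses, scale):
    """Choose a central-tendency statistic appropriate to the
    measurement scale of the collected data (illustrative rule)."""
    if scale == "ordinal":               # e.g. Likert-type satisfaction ratings
        return statistics.median(responses)
    if scale in ("interval", "ratio"):   # e.g. task completion times
        return statistics.mean(responses)
    raise ValueError(f"unsupported scale: {scale}")

print(summarize([4, 5, 2, 5, 3], "ordinal"))        # median rating -> 4
print(summarize([12.1, 9.8, 15.3, 11.0], "ratio"))  # mean time (s) -> 12.05
```

Sample size and distribution would similarly steer the choice between parametric and non-parametric tests.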

(14)

• Methods from other disciplines can be adapted, e.g.

– methods used in Human-Computer-Interaction (Dix, 2004):

– Controlled experiments with users
– Observation
– Query methods (questionnaires, interviews)
– Inspection methods (heuristic evaluation)

• Specific methods are appearing (e.g. insight based methods)

Methods can be empirical (involving users) or analytical

(15)

Controlled experiments

• “workhorse” of experimental science (Carpendale, 2008)

• with benchmark tasks, the primary method for rigorously evaluating visualizations (North, 2006)

• Involve:

– Hypothesis

– Independent (input) variables (what is controlled)

– Dependent (output) variables (what is measured)

– Secondary variables (what more could influence results)

– Experimental design (between groups / within groups)

– Protocol (sequence and characteristics of actions)
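For a within-groups comparison of two conditions, the dependent variable (e.g. task time) can be analysed with a paired test. A minimal sketch — the data are invented, and the sign-flip permutation test is just one standard option, not necessarily what any study described here used:

```python
import random
import statistics

def paired_permutation_test(times_a, times_b, n_perm=5000, seed=1):
    """Two-sided sign-flip permutation test on paired differences.
    Returns (mean difference, estimated p-value)."""
    diffs = [a - b for a, b in zip(times_a, times_b)]
    observed = statistics.mean(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis, each paired difference is
        # equally likely to have either sign.
        flipped = [d * rng.choice((-1, 1)) for d in diffs]
        if abs(statistics.mean(flipped)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm

# Hypothetical task times (seconds) for the same users in two conditions
cond_a = [41.0, 35.5, 50.2, 38.7, 44.1, 47.3]   # e.g. baseline tool
cond_b = [30.2, 28.9, 39.5, 31.0, 35.8, 36.4]   # e.g. new tool
diff, p = paired_permutation_test(cond_a, cond_b)
print(f"mean difference = {diff:.2f} s, p = {p:.4f}")
```

Here the independent variable is the tool, the dependent variable is the time, and the paired design controls for the secondary variable of individual user speed.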

(16)

Observation

• Is a very useful method widely used in usability evaluation

• Can be done in different ways:

– Very simple (e.g. just observing the user doing some tasks) …

– Very sophisticated (e.g. using a usability Lab, logging, video, …)

– Think aloud

• Usability testing includes observation and query techniques (engineering approach)

(17)

Query methods

• Also very useful and widely used in usability evaluation

• Two types:

– Questionnaires – easier to apply to more people; less flexible

– Interviews – more flexible; reach fewer people

• Must be carefully designed (types of questions, scale of responses, …)

(18)

Heuristic evaluation

• Widely used in usability evaluation

• Application in Visualization evaluation is not as common (few visualization-specific heuristics exist)

• It is a structured analysis assessing whether a set of heuristics is followed

• It should be performed by expert analysts

• Has the advantage of not involving users

(19)

Evaluating Visualizations: examples

Universidade de Aveiro

Departamento de Electrónica, Telecomunicações e Informática

CardioAnalyser: Left Ventricle (LV) Visualization from Angio Computed Tomography (CT) data

(20)

CardioAnalyser: Visualizing the Left Ventricle (LV) and

quantifying its performance from Angio Computed Tomography data

Goal:

Help users to better understand the

performance of the Left Ventricle from AngioCT through interactive visualization methods/tools

(21)

CardioAnalyser: Visualizing the Left Ventricle (LV) and

quantifying its performance from Angio Computed Tomography data

- CT exam: ~12 phases × (512×512×256) volume
- segment endocardium and epicardium in every phase
- edit the segmentations (if necessary)
- visualize – quantify

(22)

How should we evaluate?

1 – the segmentation method/tool
2 – the functional analysis method/tool
3 – the perfusion analysis method/tool

CardioAnalyser: Visualizing the Left Ventricle (LV) and

quantifying its performance from Angio Computed Tomography data

(23)

How should we evaluate?

1 – the segmentation method/tool
2 – the functional analysis tool

CardioAnalyser: Visualizing the Left Ventricle (LV) and

(24)

“Low level evaluation”:

• Preliminary evaluation of the segmentation method – observer study, query

“High level evaluation”:

• Evaluating a 3D segmentation editing tool - user study, observation, query

The team:

At the University:
- Samuel Silva, PhD student
- Joaquim Madeira, PhD
- Carlos Ferreira, PhD (Math)

At Gaia Hospital:

(25)

Is the CardioAnalyser LV segmentation tool adequate to support radiographers in their segmentation tasks?

1 – qualitative evaluation of the segmentation method
2 – qualitative evaluation of the 3D editing tool
3 – selection of a measure to compare segmentations
4 – quantitative evaluation of the LV segmentation tool
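One common choice for step 3 — comparing two segmentations — is the Dice similarity coefficient over the sets of segmented voxels. It is used here purely as an illustration; the slides do not say which measure was actually selected.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity between two segmentations given as collections
    of voxel coordinates: 2|A∩B| / (|A| + |B|); 1.0 means identical."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty segmentations are trivially identical
    return 2 * len(a & b) / (len(a) + len(b))

# Toy example: two slightly different 2D "segmentations"
seg_manual = {(0, 0), (0, 1), (1, 0), (1, 1)}
seg_auto   = {(0, 1), (1, 0), (1, 1), (2, 1)}
print(dice_coefficient(seg_manual, seg_auto))  # 2*3 / (4+4) = 0.75
```

An overlap measure like this would let the quantitative evaluation (step 4) score the automatic segmentations against expert-validated ones.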

(26)

Constraints during evaluation

• A lot of data for each exam

• High patient/image variability

• Very busy domain experts

• Distant hospital

implied:

- a careful choice of test data sets and methods
- the development of specific applications

(27)

1- LV segmentation method


• Accurate segmentations are needed to:

- compare structures
- perform quantitative measurements

• In medical applications segmentations must be validated by the expert

• A segmentation method was developed that starts with one phase (60%) and uses that first segmentation to help segment the other phases …

(28)

Qualitative Evaluation of the segmentation method

• Preliminary qualitative evaluation after developing the first prototype

• Meant to:

– detect serious segmentation problems

– inform further fine tuning of the method

• 3 radiographers

• 7 exams, 3 phases/exam (ED, ES, 60%), epicardium and endocardium

(29)

• Using a Regional classification:

• Endocardium: four anatomical regions:

– apex

– mid-ventricle

– mitral valve

– outflow

• Epicardium: five anatomical regions:

– apex

– mid-ventricle lateral and septal regions

– basal lateral and septal regions

Scale:

- OK (optimum segmentation)
- EXCESS (3 levels): + / ++ / +++
- SHORTAGE (3 levels): − / −− / −−−

• Radiographers classified the segmentations (without any editing) as if they were final (i.e., usable for diagnosis purposes)

(30)

Segmentation classification:

1 – low significance: very good; could include/exclude a very small region

(31)

Results of preliminary evaluation of the segmentation method

• Endocardium segmentation:

– apex and midventricular slices well segmented

• Epicardium segmentation clearly needed further improvements

– Most problems in the septal sections of midventricular and basal regions

(Figure: example of an epicardium segmentation problem in the septal section)

(32)

2- A tool to edit LV segmentations in 3D

• Even robust segmentation methods cannot deal with the wide range of variation of anatomical structures, e.g. in:

- shape
- orientation
- texture

• Tools to ease segmentation editing/correction by experts are needed

• Performing segmentation in volume data by editing several slices may be a tiresome task

(33)

• Should be:

- Intuitive

- Easy to use by radiographers

to correct the most common segmentation problems

• Two alternatives:

– Voxel mask (ADD/REMOVE)
– 3D surface (deform)

(34)

• Three radiographers

• Explanation and practice

• Two typical tasks:

– task 1 - adjusting the segmentation to the mitral valves (removing)

– task 2 - adjusting the segmentation to the LV wall (adding)

• Time to perform the tasks using:

– voxel mask (3DV)

– surface editing (3DS)

– the 2D editing tool (from MITK)

• Preferences, comments

(35)

Results of the 3D editing tool evaluation

Time (s) to complete an editing task using:

2D tool; 3DV - voxel editing; 3DS - surface editing

• Average task times for both 3D editing modes much smaller than for the 2D tool

• Users preferred voxel editing simplicity

- but surface editing does not occlude the image

(36)

Comparing a modified pedigree tree visualization method

with the original method

H-Tree method (Tuttle et al., 2010)

João Miguel Santos: MSc Student Paulo Dias, PhD

(37)

Comparing a modified pedigree tree visualization method with the

original method

• Visualization techniques capable of representing large pedigree trees are useful

• An H-Tree Layout has recently been proposed to overcome some of the limitations of traditional representations

(38)

Traditional representations of pedigree trees

(used in commercial S/W)

• Binary trees with several layouts (horizontal, vertical, bow):

- Generations easily understandable

- Space needs grow fast with generations

• Fan trees

- Generations still understandable
- Space needs attenuated

(39)

Pedigree H-layout representation

• To overcome space limitations, Tuttle et al. (2010) proposed a method based on the H-Tree Layout:

- It allows the representation of a greater number of generations simultaneously

However:

- It is more difficult to identify relations among individuals


(40)

Enhancing the Pedigree H-layout

• Objectives:

- simplify the understanding of the family structure inherent in the pedigree
- allow downward interactive navigation

(41)

Enhancing the Pedigree H-layout

• New functionality proposed:

- complementary information on the tooltip with the relation to the central individual

- "generation emphasis" that highlights individuals belonging to generation n in relation to the individual under the cursor

- contextual menu allowing downward navigation to direct descendants

(42)

Evaluating the Enhanced Pedigree H-Tree

• Does the enhanced method better support understanding the family structure? (comparative evaluation)

• How good is the enhanced method (for specific tasks/users)? (outright evaluation)

• Two types:

– Analytical

– Empirical

(43)

Empirical evaluation characterization

• Data: public real data

• Users:

– InfoVis/HCI students

– Experts (MDs, animal breeders)

• Methods:
– Observation
– Logging
– Questionnaire
– Interview
– Controlled experiment

• Tasks:
– Simple
– Complex
– Interaction
– Visual

• Measures:
– User performance (efficiency, efficacy)
– Satisfaction

(44)

• Measures/methods:

– Task completion: observation, logging
– Difficulty, disorientation: questionnaire, observation
– Times: observation/logging
– Satisfaction: questionnaire, interview

(45)

Evaluation: four/five phases

• Pilot usability test

– A few users

• Usability test

– 6 InfoVis students

• Pilot test for the controlled experiment:

– 6 InfoVis students

• Controlled experiment:

– 60 HCI students

• Evaluation with domain experts

For academic purposes: - further improvement - formal comparison - guidelines

- No logging

- Only comparative

- Informally confirmed usefulness of enhancements

- Allowed improving: - application

(46)

Usability test

(including pilot)

• General explanation concerning the

application and the test

• Practice until each user feels ready

• Users performed 6 tasks

An observer registered:

• Task completion

• Correct answers

• Times

• Difficulty

• If the user asked for help/ seemed lost

• Users answered a questionnaire

(47)

Documents involved in the protocol

• List of tasks

• Observer notes

(48)

Results of the usability test

• Efficacy - more correct answers with:

– tooltips

– generation emphasis

• Efficiency - times were difficult to register manually (tasks too simple?)

• Tooltips were considered the most helpful feature to understand the family structure

• Specific suggestions (e.g. increase arrows size)

(49)

Another test: is the test application “Colorblind friendly?”

(50)

Other tested alternatives

(51)

Design of the controlled experiment

• Question:

Do users understand better the family structure while using the enhanced method (compared with the original method)?

• Can be divided into the following two hypotheses:

Hypothesis 1 – Tooltips improve users’ performance in understanding the family structure, when compared with the original method

Hypothesis 2 – “Generation emphasis” improves users’ performance in understanding the family structure, when compared with the original method

(52)

Variables:

• Input (independent) variables:

– Method – 3 levels:

• original
• original + tooltips
• original + “generation emphasis”

• Output (dependent) variables:

– times

– task completion rate; success rate

– disorientation, difficulty

– satisfaction

• Secondary variables:

(53)

Experimental design

• Within-groups:

all users perform the same tasks in all experimental conditions

(i.e., with all methods )

• Advantages over between-groups design:

– More data with the same users

– Less user profile variation

• Caution: possible learning and fatigue effects – counterbalance the order of conditions
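One standard way to address the within-groups caveat of ordering/learning effects is to counterbalance condition order across participant groups, e.g. with a cyclic Latin square. This is a generic sketch, not necessarily the scheme used in this experiment:

```python
def latin_square(conditions):
    """Cyclic Latin square: row i gives the condition order for
    participant group i; each condition appears exactly once per
    row and once per column (i.e., once in each ordinal position)."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]

methods = ["original", "original + tooltips", "original + generation emphasis"]
for group, order in enumerate(latin_square(methods), start=1):
    print(f"group {group}: {' -> '.join(order)}")
```

Assigning equal numbers of users to each row means every method is seen first, second, and third equally often, so learning effects average out across conditions.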

(54)

Protocol of the controlled experiment

• General explanation concerning the

application and the test

• Practice until each user feels ready

• Users perform 10 tasks

– An observer registers:

• Task completion

• Difficulty

• Errors

• If the user asked for help/ felt lost

– The application logs times

• Users answer a questionnaire

(55)

In these examples (but more generally):

• Formative evaluation came first, then summative evaluation (they are not totally disjoint)

• It was important to:

– Start thinking about evaluation as soon as possible

– Do several evaluation “rounds”

– Use more than one method

– Carefully choose the methods, data, users, tasks, measures, data analysis methods

– Learn as much as possible from each evaluation round, to:
- Improve the methods/applications

(56)

About evaluating Visualization methods/applications:

• Evaluating Visualizations is challenging

• It will become more challenging as Visualization evolves to be more interactive, collaborative, distributed, multi-sensorial, mobile …

• It is fundamental to:

- evaluate solutions to specific cases
- develop new visualization methods/systems
- establish guidelines

to make Visualization more useful, more usable, and more used

(57)

Bibliography - books

• Brodlie, K., L. Carpenter, R. Earnshaw, J. Gallop, R. Hubbold, A. Mumford, C. Osland, P. Quarendon, Scientific Visualization: Techniques and Applications, Springer Verlag, 1992

• Card, S., J. Mackinlay, B. Shneiderman (eds.), Readings in Information Visualization: Using Vision to Think, Morgan Kaufmann, 1999

• Carpendale, S., "Evaluating Information Visualization", in Information Visualization: Human-Centered Issues and Perspectives, Kerren, A., Stasko, J., Fekete, J.D., North, C. (eds.), LNCS vol. 4950, pp. 19-45, Springer, 2008

• Dix, A., Finlay, J., Abowd, G., Beale, R., Human-Computer Interaction, 3rd ed., Prentice Hall, 2004

• Hansen, C., C. Johnson (eds.), The Visualization Handbook, Elsevier, 2005

• Johnson, C., R. Moorhead, T. Munzner, H. Pfister, P. Rheingans, T. Yoo, Visualization Research Challenges, NIH/NSF, January 2006

• Keller, P., M. Keller, Visual Cues, IEEE Computer Society Press, 1993

• Schroeder, W., K. Martin, B. Lorensen, The Visualization Toolkit: An Object-Oriented Approach to 3D Graphics, 4th ed., Prentice Hall, 2006

• Spence, R., Information Visualization: Design for Interaction, 2nd ed., Addison Wesley, 2006

• Ware, C., Information Visualization: Perception for Design, 2nd ed., Academic Press, 2004

(58)

Bibliography – papers

• Rhyne, T. M., "Does the Difference between Information and Scientific Visualization Really Matter?", IEEE Computer Graphics and Applications, May/June 2003, pp. 6-8

• Rhyne, T. M., "Scientific Visualization in the Next Millennium", IEEE Computer Graphics and Applications, Jan./Feb. 2002, pp. 20-21

• Hibbard, B., "Top Ten Visualization Problems", SIGGRAPH Computer Graphics Newsletter, VisFiles, May 1999, vol. 33, no. 2

• Johnson, C., "Top Scientific Visualization Research Problems", IEEE Computer Graphics and Applications: Visualization Viewpoints, July/August 2004, pp. 13-17

• Eick, S., "Information Visualization at 10", IEEE Computer Graphics and Applications, vol. 25, no. 1, Jan./Feb. 2005, pp. 12-14

• Keefe, D., "Integrating Visualization and Interaction Research to Improve Scientific Workflows", IEEE Computer Graphics and Applications, vol. 30, no. 2, Mar./April 2010, pp. 8-13

• Globus, A., E. Raible, "Fourteen Ways to Say Nothing With Scientific Visualization",
