
Visualising Variables – Validly!

Damien Jolley

Monash Institute of Health Services Research, Monash University

AHMRC Posters

May 2006

(2)

Download slides from:

http://www.jolley.com.au

Average daily retail petrol price, Melbourne, 15 April-14 May 2006

[Figure: daily price series; points annotated with day of week (Mon, Wed, Thu)]

Source: http://www.accc.gov.au, 29 May 2005

(3)

Average daily retail petrol price, Melbourne, Oct-Nov 2002

[Figure: daily price series; points annotated with day of week (e.g. Thu, Fri, Sat)]

Source: http://www.accc.gov.au, 21 Nov 2002

(4)

Price and pattern have changed…

[Figure: petrol price in cents per litre, Melbourne, 2004-2006; price axis 90-130]

(5)

Daily average petrol prices (c/litre) in selected Australian cities, May 2005

[Figure: price against date, one panel per city (Melbourne, Sydney, Adelaide, Brisbane, Perth); price axis 90-110, date axis 01/05 to 22/05]

Source: http://www.accc.gov.au

(6)

Obvious fact #1:

Graphs can communicate data:

quickly

accurately

powerfully

efficiently

(7)

“Only 50% of American 17-year-olds can identify information in a graph”*

* US National Assessment of Educational Progress, June 1990

Source: Wainer H. Understanding graphs and tables. Educational Researcher 1992; 21:14-23

(8)

Whose fault?

Source: Wainer H. Understanding graphs and tables. Educational Researcher 1992; 21:14-23

“Like characterising someone’s ability to read by asking questions about a passage full of spelling and grammatical errors. What are we really testing?”

Drawn using MS Excel ‘XY-chart’

(9)

Obvious fact #2:

Bad graphs can hinder communication

(10)
(11)

Less obvious facts #3, #4, #5:

What characterises a “good” graph?

What are the characteristics of a “bad” graph?

What software to use? How to use it?

(12)

Howie’s Helpful Hints for bad graph displays

Ten useful pointers to help you create uninformative, difficult-to-read scientific graphs

Adapted from: Wainer H. (1997) Visual Revelations. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers

(13)

Steps for better graphs

1. Identify direction of effect

In almost all cases, the cause or predictor variable should be horizontal (X)

Effect or outcome variable is best vertical (Y)

2. Identify the levels of measurement

Nominal, ordinal or quantitative are different!

3. Think of visual perception guides

Columns or dots? Lines or scatterplot?

4. Minimise guides and non-data

Grid lines, tick marks, legends are non-data

(14)

Cause (X) and effect (Y)

Figure 16

Standard deviation of batting averages for all full-time players by year for the first 100 years of professional baseball. Note the regular decline.*

[Figure, shown twice: axes labelled Standard deviation and Time]

Source: Gould, Stephen Jay. Full House: The Spread of Excellence from Plato to Darwin. Random House, 1997.

Cited: http://www.math.yorku.ca/SCS/Gallery/, 24 Nov 2002

* My emphasis

(15)

Killias M. International correlations between gun ownership and rates of homicide and suicide. Can Med Assoc J 1993; 148: 1721-5

(16)

[Figure: rate of homicide with a gun (per million per year; axis ticks 1, 5, 10, 50) against % of households owning guns (axis ticks 10, 20, 30, 40); points labelled USA, Norway, Canada, France, Finland, Belgium, Australia, Spain, Switzerland, Netherlands, West Germany, Scotland, England & Wales]

Drawn using S-plus

(17)

Levels of Measurement

The right display for a variable depends on its level of measurement

For univariate graphs,
qualitative → barplot
ordinal → column chart
quantitative → boxplot or histogram

For bivariate graphs,
X ordinal, Y binary → connected percents
X & Y both quantitative → scatterplot
X categorical, Y quantitative → box plots

Binary: eg gender, death, pregnant

Categorical
Qualitative: eg race, political party, religion
Diverging: eg change (-ve to +ve)
Ordinal: eg rating scale, skin type, colour

Quantitative
Interval: only differences matter, eg BP, IQ
Ratio: absolute zero, ratios matter, eg weight, height, volume
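The pairings on this slide can be written down as a small lookup. The sketch below is illustrative only: the names `univariate_display` and `bivariate_display` and the string labels are assumptions introduced here, not part of the original slides. Combinations the slide does not list raise an error rather than guess.

```python
# Lookup tables transcribed from the slide; all names are hypothetical.
UNIVARIATE = {
    "qualitative": "barplot",
    "ordinal": "column chart",
    "quantitative": "boxplot or histogram",
}

# Keys are (level of X, level of Y).
BIVARIATE = {
    ("ordinal", "binary"): "connected percents",
    ("quantitative", "quantitative"): "scatterplot",
    ("categorical", "quantitative"): "box plots",
}


def univariate_display(level):
    """Return the display the slide suggests for a single variable."""
    if level not in UNIVARIATE:
        raise ValueError(f"no display listed for level {level!r}")
    return UNIVARIATE[level]


def bivariate_display(x_level, y_level):
    """Return the display the slide suggests for an X-Y pair."""
    key = (x_level, y_level)
    if key not in BIVARIATE:
        raise ValueError(f"no display listed for X={x_level!r}, Y={y_level!r}")
    return BIVARIATE[key]


print(univariate_display("ordinal"))
print(bivariate_display("categorical", "quantitative"))
```

Keeping the rule as data rather than as a chain of conditions makes it easy to extend when a new combination is agreed on.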

(18)

Source: Lewis S, Mason C, Srna J. Carbon monoxide exposure in blast furnace workers. Aust J Public Health. 1992 Sep;16(3):262-8.

Ordinal variable, but categories mixed

Outcome is COHb%, but drawn on X

(19)

An alternative display . . .

Area of circles proportional to n

[Figure: bubble plot with axes labelled Predictor variable and Outcome variable]

Drawn using MS Excel ‘bubble plot’
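Making the area, rather than the radius, proportional to n means the radius must grow with the square root of n. The following is a minimal sketch of that arithmetic; the function name `bubble_radius` and the `area_per_unit` scale factor are assumptions for illustration and do not describe how any particular charting program sizes its bubbles.

```python
import math


def bubble_radius(n, area_per_unit=1.0):
    """Radius of a circle whose area equals area_per_unit * n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # area = pi * r^2, so r = sqrt(area / pi)
    return math.sqrt(area_per_unit * n / math.pi)


# Quadrupling n doubles the radius, so the area is four times larger.
small = bubble_radius(10)
large = bubble_radius(40)
print(large / small)
```

If the radius itself were scaled by n, a group four times larger would be drawn with sixteen times the area, which overstates the difference.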

(20)

Principles of visual perception

WS Cleveland: much work in the psychophysics of human visual understanding

Tells us:

hierarchy of visual quantitative perception

patterns and shade can cause vibration

graphs can shrink with almost no loss of information

Source: Cleveland WS. The Elements of Graphing Data. Monterey: Wadsworth, 1985.

(21)

Ubiquitous column charts

Source: Jamrozik K, Spencer CA, et al. Does the Mediterranean paradox extend to abdominal aortic aneurysm? Int J Epidemiol 2001; 30(5): 1071

(22)

A dotchart version…

[Figure: dot chart with one panel each for Full fat milk, Adds salt, Meat 3+ weekly and Fish 1+ weekly; rows Mediterranean, Netherlands, All other, Other N Europe, Australia, Scotland; common axis Percent, 50-80]

Drawn using S-plus “Trellis” graphics
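The idea behind a dot chart, one position on a common scale per category, can be sketched in plain text. The function name `dot_chart`, its parameters and the example values below are all made up for illustration; they are not the study's data and not the S-plus code used for the slide.

```python
def dot_chart(values, lo, hi, width=40):
    """Return text lines with one dot per category on a common scale."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    label_width = max(len(label) for label in values)
    lines = []
    # Sort by value so that the row order itself carries information.
    for label, value in sorted(values.items(), key=lambda kv: kv[1]):
        if not lo <= value <= hi:
            raise ValueError(f"{label}: {value} is outside [{lo}, {hi}]")
        # Map the value onto a column between 0 and width - 1.
        pos = round((value - lo) / (hi - lo) * (width - 1))
        row = "." * pos + "o" + "." * (width - 1 - pos)
        lines.append(f"{label:<{label_width}} |{row}|")
    return lines


# Made-up percentages for illustration only.
example = {"Group A": 62.0, "Group B": 75.0, "Group C": 50.0, "Group D": 80.0}
for line in dot_chart(example, lo=50, hi=80, width=31):
    print(line)
```

Each value is read from a position along a shared axis, the kind of judgement the slides on visual perception favour over comparing shaded columns.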

(23)
(24)

Moiré vibration is easy with a computer!

(25)

Moiré vibration

Vibration is maximised with lines of equal separation

This is common in scientific column charts

cited in Tufte E. The Visual Display of Quantitative Information.

(26)

Minimise non-data ink

Non-data ink includes tick marks, grid lines, background, legend

Explanation of error bars, P-values can be included in caption or in text

[Figure: relative mortality rate (all causes), axis ticks 0.10, 0.25, 0.50, 0.75, 1.00, for Greeks in Australia, Swedes in Sweden, Japanese in Japan, Anglo-Celts in Australia and Greeks in Greece]

Note the exception for X-Y orientation: because predictor is qualitative (unordered)
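The orientation rule and its exception can be stated as a short function. This is a sketch under one assumption: that the exception means the axes are swapped when the predictor is qualitative, so the category labels run down the vertical axis and the outcome runs horizontally. The name `choose_axes` and the level labels are hypothetical.

```python
def choose_axes(predictor, outcome, predictor_level):
    """Return (x_variable, y_variable) following the rule of thumb."""
    if predictor_level == "qualitative":
        # Assumed reading of the exception: unordered categories are
        # listed vertically and the outcome is drawn on the X axis.
        return outcome, predictor
    # Usual case: cause or predictor on X, effect or outcome on Y.
    return predictor, outcome


print(choose_axes("year", "standard deviation", "quantitative"))
print(choose_axes("population", "relative mortality rate", "qualitative"))
```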

(27)

Software for scientific graphics

Dedicated programs – thousands!

DeltaGraph (SPSS)

Prism

ViSta

Business graphics

MS Excel

many other spreadsheet programs

Graphics in statistical packages

Stata: simple, powerful
S-Plus, R: powerful, difficult
SPSS interactive graphics: easy, expensive
Systat: good reputation
SAS GRAPH language: expensive, powerful

Advice: Avoid “default” choice in all programs (almost always wrong).

Avoid programs with “Chart Type” menus – wrong approach.

(28)

Graph formats

Object-oriented

lines, shapes, etc can be identified within graph

each object has attributes (eg size, colour, font)

editable using selection and “grouping”

Common formats:

Postscript (ps,eps)

Windows metafile (wmf,emf)

Bit-mapped

image exists as a collection of pixels

each pixel is light or dark, coloured

can edit only pixels not objects

often “compressed” to save disk space, bandwidth

Common formats

graphics interchange (gif)

Windows bitmap (bmp)

JPEG interchange (jpg)

Advice: Use WMF format where possible. Paste WMF into PowerPoint, “ungroup”, then edit objects for publication quality.
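The difference between the two families of format can be illustrated with a toy model. The sketch below is not any real file format: the dictionaries standing in for graph objects, the `rasterise` function and the colour names are all assumptions chosen to show why an object can be edited through its attributes while a bitmap can only be repainted pixel by pixel.

```python
# Object-oriented form: each shape is an object with its own attributes.
vector_graph = [
    {"type": "rect", "x": 1, "y": 1, "w": 3, "h": 2, "colour": "grey"},
    {"type": "rect", "x": 5, "y": 0, "w": 2, "h": 3, "colour": "grey"},
]


def rasterise(objects, width, height):
    """Render rectangles into a grid of pixels; object identity is lost."""
    grid = [["white"] * width for _ in range(height)]
    for obj in objects:
        for row in range(obj["y"], obj["y"] + obj["h"]):
            for col in range(obj["x"], obj["x"] + obj["w"]):
                grid[row][col] = obj["colour"]
    return grid


# Editing the object form: change one attribute of one object.
vector_graph[0]["colour"] = "black"

# The bitmap holds only pixels, so nothing in it says which cells
# belonged to the first rectangle.
bitmap = rasterise(vector_graph, width=8, height=4)
print(sum(row.count("black") for row in bitmap))
```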

(29)

References, further reading

Tufte ER. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 2001. www.edwardtufte.com

Cleveland WS. Visualizing Data. Summit, NJ: Hobart Press, 1993.

Wainer H. Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers, 1997. www.erlbaum.com

Wilkinson L. The Grammar of Graphics. New York: Springer Verlag, 1999.

(30)

Summary

Howie’s Helpful Hints for bad graphs:

Don’t show the data

Show the data inaccurately

Obfuscate the data

Steps for better graphs:

Identify direction of cause & effect

Exploit levels of measurement

Accommodate visual perception principles

Minimise non-data ink

Don’t use Excel unless you have to

And if you have to, don’t use the default chart!
