Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 4, Slide 1
Chapter 4
Understanding and Comparing Distributions
Comparing Groups
It is almost always more
interesting to compare groups.
With histograms, note the
shapes, centers, and spreads of the two distributions.
What does this graphical display tell you?
Comparing Groups (cont.)
Boxplots offer an ideal balance of information and
simplicity, hiding the details while displaying the overall summary information.
We often plot them side by side for groups or categories
we wish to compare.
What About Outliers?
If there are any clear outliers and you are
reporting the mean and standard deviation, report
them with the outliers present and with the
outliers removed. The differences may be quite
revealing.
Note: The median and IQR are not likely to be affected by the outliers.
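The advice above can be checked numerically. The following is a minimal sketch, assuming a small hypothetical sample (not from the slides) in which 60 is a clear outlier, and assuming quartiles are taken as the medians of the lower and upper halves of the sorted data. The helper names `iqr` and `summarize` are illustrative only.

```python
import statistics

def iqr(values):
    """IQR, with quartiles taken as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2          # the middle value is left out when n is odd
    return statistics.median(data[-half:]) - statistics.median(data[:half])

def summarize(values):
    """Return the four summaries discussed on the slide."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": iqr(values),
    }

# Hypothetical sample (an assumption for illustration); 60 is the clear outlier.
sample = [12, 14, 15, 15, 16, 17, 18, 60]
with_outlier = summarize(sample)
without_outlier = summarize([v for v in sample if v != 60])

for label, stats in (("with outlier", with_outlier),
                     ("without outlier", without_outlier)):
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

In this sample the mean and standard deviation shift substantially when 60 is dropped, while the median moves only slightly and the IQR does not move at all.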
Example - Creating Stem-and-Leaf Plots
A comparative stem-and-leaf plot is used when two groups of data are to be analyzed
together. One group will extend to the left of the stem and the other group will extend to the right.
The UNICEF report “Progress for Children” (April, 2005) included the accompanying data on the percentage of primary-school-age children who were enrolled in school for 19 countries in Northern Africa and for 23 countries in Central Africa.
Northern Africa
54.6 34.3 48.9 77.8 59.6 88.5 97.4 92.5 83.9 96.9 88.9 91.6 97.8 96.1 92.2 94.9 98.6 86.6
Central Africa
58.3 34.6 35.5 45.4 38.6 63.8 53.9 61.9 69.9 43.0 85.0 63.4 58.4 61.9 40.9 73.9 34.8 74.4 97.4 61.0 66.7 79.6 98.9
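One possible way to build the comparative display for these data is sketched below. It assumes the tens digit is the stem and the units digit is the leaf, with decimals truncated (other conventions, such as rounding, are equally defensible). The function names are illustrative, and the lists contain exactly the values printed above.

```python
from collections import defaultdict

northern = [54.6, 34.3, 48.9, 77.8, 59.6, 88.5, 97.4, 92.5, 83.9,
            96.9, 88.9, 91.6, 97.8, 96.1, 92.2, 94.9, 98.6, 86.6]
central = [58.3, 34.6, 35.5, 45.4, 38.6, 63.8, 53.9, 61.9, 69.9, 43.0,
           85.0, 63.4, 58.4, 61.9, 40.9, 73.9, 34.8, 74.4, 97.4, 61.0,
           66.7, 79.6, 98.9]

def leaves_by_stem(values):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    table = defaultdict(list)
    for value in values:
        whole = int(value)                      # truncate: 54.6 -> 54
        table[whole // 10].append(whole % 10)
    return {stem: sorted(leaves) for stem, leaves in table.items()}

def back_to_back(left, right, left_label, right_label):
    """Return the lines of a comparative stem-and-leaf display."""
    left_table, right_table = leaves_by_stem(left), leaves_by_stem(right)
    all_stems = set(left_table) | set(right_table)
    width = max(len(left_label),
                max(len(leaves) for leaves in left_table.values()))
    lines = [f"{left_label:>{width}} |   | {right_label}"]
    for stem in range(min(all_stems), max(all_stems) + 1):
        # Left-hand leaves grow outward from the stem, so reverse them.
        left_leaves = "".join(map(str, reversed(left_table.get(stem, []))))
        right_leaves = "".join(map(str, right_table.get(stem, [])))
        lines.append(f"{left_leaves:>{width}} | {stem} | {right_leaves}")
    return lines

display = back_to_back(northern, central, "Northern Africa", "Central Africa")
print("\n".join(display))
```

Reading the two sides against the shared stems makes it easy to compare where each group's values concentrate.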
Evidence suggests that a high indoor radon concentration might be linked to the development of childhood cancers. The data that follow are the radon concentrations in two different samples of houses. The first sample consisted of houses in which a child was diagnosed with cancer. Houses in the second sample had no recorded cases of childhood cancer.
Cancer
10 21 5 23 15 11 9 13 27 13 39 22 7 20 45 12 15 3 8 11 18 16 23 16 9 57 16 21 18 38 37 10 15 11 18 21 22 11 16 17 33 10
No Cancer
9 38 11 12 29 5 7 6 8 29 24 12 17 11 11 3 9 33 17 55 11 29 13 24 7 11 21 6 39 29 7 8 55 9 21 9 3 85 11 14
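To compare the two radon samples the way side-by-side boxplots would, one can compute each group's five-number summary and flag points beyond the usual 1.5 × IQR fences. This is a sketch under stated assumptions: quartiles are taken as the medians of the lower and upper halves of the sorted data (software packages often use slightly different rules), and the function names are illustrative.

```python
import statistics

cancer = [10, 21, 5, 23, 15, 11, 9, 13, 27, 13, 39, 22, 7, 20, 45, 12,
          15, 3, 8, 11, 18, 16, 23, 16, 9, 57, 16, 21, 18, 38, 37, 10,
          15, 11, 18, 21, 22, 11, 16, 17, 33, 10]
no_cancer = [9, 38, 11, 12, 29, 5, 7, 6, 8, 29, 24, 12, 17, 11, 11, 3,
             9, 33, 17, 55, 11, 29, 13, 24, 7, 11, 21, 6, 39, 29, 7, 8,
             55, 9, 21, 9, 3, 85, 11, 14]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2          # the middle value is left out when n is odd
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[-half:])
    return data[0], q1, statistics.median(data), q3, data[-1]

def outliers(values):
    """Values farther than 1.5 IQRs below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    step = 1.5 * (q3 - q1)
    return sorted(v for v in values if v < q1 - step or v > q3 + step)

for label, group in (("Cancer", cancer), ("No Cancer", no_cancer)):
    print(label, five_number_summary(group), "outliers:", outliers(group))
```

Under these assumptions the Cancer sample has the higher median (16 versus 11.5), while the No Cancer sample has the wider IQR (18 versus 11); both samples have high outliers.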
Timeplots: Order, Please!
For some data sets, we are interested in how the data behave over time.
*Re-expressing Skewed Data to
Improve Symmetry
When the data are skewed, it can be hard to summarize
them simply with a center and spread, and hard to decide whether the most extreme values are outliers or just part of a stretched-out tail.
*Re-expressing Skewed Data to
Improve Symmetry (cont.)
One way to make a
skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function).
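The effect of a logarithmic re-expression can be illustrated with a short sketch. The sample below is hypothetical (it is not the data behind the slide's histograms), and the moment-based skewness coefficient is just one convenient way to quantify asymmetry; the function name `skewness` is illustrative.

```python
import math
import statistics

def skewness(values):
    """Moment-based skewness: the average cubed z-score (0 means symmetric)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical right-skewed sample (an assumption for illustration).
raw = [1, 2, 2, 3, 4, 5, 8, 12, 20, 45, 110, 300]
logged = [math.log10(v) for v in raw]      # re-express with base-10 logs

print("raw:    mean", round(statistics.fmean(raw), 2),
      "median", statistics.median(raw),
      "skewness", round(skewness(raw), 2))
print("logged: mean", round(statistics.fmean(logged), 2),
      "median", round(statistics.median(logged), 2),
      "skewness", round(skewness(logged), 2))
```

For this sample the logged values have a mean and median that sit much closer together and a skewness much nearer zero, so a center and spread describe them more faithfully.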
Note the change in skewness from the raw data to the transformed data.
What Can Go Wrong? (cont.)
Avoid inconsistent scales, either within the display or
when comparing two displays.
Label clearly so a reader knows what the plot displays.
What Can Go Wrong? (cont.)
Beware of outliers
Be careful when comparing groups that have very different spreads.
Consider these side-by-side boxplots of cotinine levels:
What have we learned?
We’ve learned the value of comparing data
groups and looking for patterns among groups
and over time.
We’ve seen that boxplots are very effective for
comparing groups graphically.
We’ve experienced the value of identifying and
investigating outliers.
We’ve graphed data that have been measured
over time against a time axis and looked for
long-term trends both by eye and with a data
smoothing technique.
AP Tips
When comparing distributions, make sure to
compare the center, shape, and spread. All three
are required for full credit.
When comparing center and spread, use
comparison words. Examples:
“The mean was 34 mpg in group A, while the
group B mean was 30 mpg.” (NOT full credit)