Chapter 4
Displaying Quantitative Data
What I will know and be able to do:
Summarize quantitative data using an appropriate display
and be able to interpret the shape, center, and spread of
the data.
Assignment:
Read Chapter 4
Dealing With a Lot of Numbers…
• Summarizing the data will help us when we look at large sets of quantitative data.
• Without summaries of the data, it’s hard to grasp what the data tell us.
• The best thing to do is to make a picture…
• We can’t use bar charts or pie charts for quantitative data, since those displays are for categorical variables.
Histograms: Displaying the Distribution of Price Changes
• The chapter example discusses the changes in Enron’s stock price from 1997 to 2001.
• First, slice up the entire span of values covered by the quantitative variable into equal-width piles called bins.
Histograms: Displaying the Distribution of Price Changes (cont.)
• A histogram plots the bin counts as the heights of bars (like a bar chart).
• Here is a histogram of the monthly price changes.
Histograms: Displaying the Distribution of Price Changes (cont.)
• A relative frequency histogram displays the percentage of cases in each bin instead of the count.
• In this way, relative frequency histograms are faithful to the area principle.
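As a rough sketch of the idea, the Python snippet below converts bin counts into percentages. The bin labels and counts are made up for illustration; they are not the Enron data.

```python
# Hypothetical bin counts (made up for illustration, not the Enron data).
bin_counts = {"-10 to -5": 4, "-5 to 0": 10, "0 to 5": 18, "5 to 10": 8}

total = sum(bin_counts.values())  # total number of cases

# Relative frequency = percentage of all cases that fall in each bin.
rel_freq = {label: 100 * count / total for label, count in bin_counts.items()}

for label, pct in rel_freq.items():
    print(f"{label:>9}: {pct:5.1f}%")
```

The bars keep the same relative heights as in the count histogram; only the vertical scale changes from counts to percentages.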
Creating Histograms
• Used with numerical data
• Bars touch on histograms
• Two types:
  • Discrete: bars are centered over discrete values
  • Continuous: bars cover a class (interval) of values
• For comparative histograms, use two separate graphs with the same scale on the horizontal axis.
Would a histogram be a good graph for the number of pieces of gum chewed per day by AP Stat students? Why or why not?
Would a histogram be a good graph for the fastest speed driven by AP Stat students?
Making a Histogram
For an agility test, fourth grade children jump from side to side across a set of parallel lines, counting the number of lines they clear in 30 seconds. Here are their scores:
22, 17, 18, 29, 22, 22, 23, 24, 23, 17, 21, 25, 20, 12, 19, 28, 24, 22, 21, 25, 26, 25, 16, 27, 22
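One possible way to build this histogram by hand is sketched below in Python. The starting value of 10 and the bin width of 5 are assumptions chosen for illustration; other choices are equally valid.

```python
from collections import Counter

scores = [22, 17, 18, 29, 22, 22, 23, 24, 23, 17, 21, 25, 20,
          12, 19, 28, 24, 22, 21, 25, 26, 25, 16, 27, 22]

BIN_START = 10  # assumed left edge of the first bin
BIN_WIDTH = 5   # assumed bin width

# Bin index 0 covers 10-14, index 1 covers 15-19, and so on.
counts = Counter((s - BIN_START) // BIN_WIDTH for s in scores)

# Print one row per bin, with one '#' per child in that bin.
for i in range(max(counts) + 1):
    low = BIN_START + i * BIN_WIDTH
    high = low + BIN_WIDTH - 1
    print(f"{low}-{high} | {'#' * counts[i]}")
```

With these choices the bins 10-14, 15-19, 20-24, and 25-29 hold 1, 5, 12, and 7 scores, so the display shows a single hump in the low 20s.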
Calculator tip
• Your calculator will make a fine histogram and will choose a bin width for you.
• Use 9:ZoomStat most of the time.
• But you should be able to go into the Window settings and adjust the bin width.
• Experimenting with different bin widths on your calculator will give you a good feel for how the same data can be presented differently.
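The same experiment can be tried away from the calculator. The sketch below rebins the agility scores with three bin widths; the widths 2, 5, and 10 and the helper name `bin_counts` are illustrative choices, not calculator settings.

```python
from collections import Counter

scores = [22, 17, 18, 29, 22, 22, 23, 24, 23, 17, 21, 25, 20,
          12, 19, 28, 24, 22, 21, 25, 26, 25, 16, 27, 22]

def bin_counts(data, width):
    """Count values per bin; each bin is labeled by its left edge."""
    counts = Counter((x // width) * width for x in data)
    first, last = min(counts), max(counts)
    # Keep empty bins too, so gaps in the data stay visible.
    return {left: counts[left] for left in range(first, last + width, width)}

for width in (2, 5, 10):
    print(f"bin width {width}:")
    for left, n in bin_counts(scores, width).items():
        print(f"  {left:2d}-{left + width - 1:2d} | {'#' * n}")
```

Narrow bins show more detail (including the empty 14-15 bin), while a width of 10 collapses everything into two bars.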
Stem-and-Leaf Displays
• Stem-and-leaf displays show the distribution of a quantitative variable, like histograms do, while preserving the individual values.
• Stem-and-leaf displays contain all the information found in a histogram and, when carefully drawn, satisfy the area principle and show the distribution.
Stem-and-Leaf Example
• Compare the histogram and stem-and-leaf display for the pulse rates of 24 women at a health clinic. Here are the data:
56, 88, 60, 72, 80, 64, 80, 80, 80, 72, 64, 64, 84, 68, 84, 72, 68, 76, 68, 76, 68, 76, 76, 72
• Which graphical display do you prefer?
Constructing a Stem-and-Leaf Display
• First, cut each data value into leading digits (“stems”) and trailing digits (“leaves”).
• Use the stems to label the bins.
• Use only one digit for each leaf: either round or truncate the data values to one decimal place after the stem.
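The steps above can be sketched in Python using the pulse rates from the Stem-and-Leaf Example. The function name `stem_and_leaf` is a hypothetical helper, and the sketch assumes two-digit whole numbers, so the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

pulse = [56, 88, 60, 72, 80, 64, 80, 80, 80, 72, 64, 64, 84, 68, 84, 72,
         68, 76, 68, 76, 68, 76, 76, 72]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (ones digits)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # List every stem from smallest to largest, even one with no leaves.
    return {s: leaves.get(s, []) for s in range(min(leaves), max(leaves) + 1)}

plot = stem_and_leaf(pulse)
for stem, row in plot.items():
    print(f"{stem} | {''.join(str(d) for d in row)}")
```

Each row is as long as its bin count, so the display reads like a histogram turned on its side while still showing every value.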
Creating Stem-and-Leaf Plots, pg. 51 #12
The Cornell Lab of Ornithology holds an annual Christmas Bird Count, in which birdwatchers at various locations around the country see how many different species of birds they can spot. Here are some of the counts reported from sites in Texas during the 1999 event.
Example - Creating Stem-and-Leaf Plots
• Select one or more leading digits for the stem values. The trailing digits become the leaves.
• List possible stem values in a vertical column.
• Record the leaf for every observation beside the corresponding stem value (separate with commas if leaves are more than one digit)
Example Creating Stem-and-Leaf Plots (cont)
Stem | Leaf
15 | 2 3 3 6 7
16 | 0 0 2 2 3 6 7
17 | 1 7 8
18 | 1 3 6
19 |
20 | 6 6
21 |
22 | 8
Example 2 - Creating Stem-and-Leaf Plots
A comparative stem-and-leaf plot is used when two groups of data are to be analyzed together. One group will extend to the left of the stem and the other group will extend to the right.
The UNICEF report “Progress for Children” (April, 2005) included the accompanying data on the percentage of primary-school-age children who were enrolled in school for 19 countries in Northern Africa and for 23 countries in Central Africa.
Northern Africa
54.6 34.3 48.9 77.8 59.6 88.5 97.4 92.5 83.9 96.9 88.9 91.6 97.8 96.1 92.2 94.9 98.6 86.6
Central Africa
58.3 34.6 35.5 45.4 38.6 63.8 53.9 61.9 69.9 43.0 85.0 63.4 58.4 61.9 40.9 73.9 34.8 74.4 97.4 61.0 66.7 79.6 98.9
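A comparative (back-to-back) display for these data might be built as in the sketch below. It uses the values exactly as listed (the Northern Africa row shows 18 values), takes the tens digit as the stem, and truncates each percentage to a whole number to get a one-digit leaf. The helper name `leaves_by_stem` is hypothetical.

```python
north = [54.6, 34.3, 48.9, 77.8, 59.6, 88.5, 97.4, 92.5, 83.9, 96.9,
         88.9, 91.6, 97.8, 96.1, 92.2, 94.9, 98.6, 86.6]
central = [58.3, 34.6, 35.5, 45.4, 38.6, 63.8, 53.9, 61.9, 69.9, 43.0,
           85.0, 63.4, 58.4, 61.9, 40.9, 73.9, 34.8, 74.4, 97.4, 61.0,
           66.7, 79.6, 98.9]

def leaves_by_stem(values):
    """Stem = tens digit, leaf = ones digit; the decimal part is truncated."""
    table = {stem: [] for stem in range(3, 10)}  # stems 3 to 9 cover both groups
    for v in sorted(values):
        whole = int(v)  # truncate, e.g. 54.6 -> 54
        table[whole // 10].append(whole % 10)
    return table

n_rows = leaves_by_stem(north)
c_rows = leaves_by_stem(central)

print(f"{'Northern Africa':>15} |   | Central Africa")
for stem in n_rows:
    # Left-hand leaves are reversed so they grow outward from the stem.
    left = "".join(str(d) for d in reversed(n_rows[stem]))
    right = "".join(str(d) for d in c_rows[stem])
    print(f"{left:>15} | {stem} | {right}")
```

Sharing one column of stems puts both groups on the same scale, which makes it easy to see that the Northern Africa values pile up in the 80s and 90s while the Central Africa values are spread more widely.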
Dotplots
• A dotplot is a simple display. It just places a dot along an axis for each case in the data.
• The dotplot to the right shows Kentucky Derby winning times, plotting each race as its own dot.
• You might see a dotplot displayed horizontally or vertically.
Think Before You Draw, Again
• Remember the “Make a picture” rule?
• Now that we have options for data displays, you need to Think carefully about which type of display to make.
• Before making a stem-and-leaf display, a histogram, or a dotplot, check the Quantitative Data Condition: The data are values of a quantitative variable whose units are known.
Shape, Center, and Spread
• When describing a distribution, make sure to always tell about three things: shape, center, and spread, plus anything unusual you see…
What is the Shape of the Distribution?
1. Does the histogram have a single, central hump or several separated humps?
2. Is the histogram symmetric?
3. Do any unusual features stick out?
Humps
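1. Does the histogram have a single, central hump or several separated bumps?
• Humps in a histogram are called modes.
• A histogram with one main peak is dubbed unimodal; histograms with two peaks are bimodal, and histograms with three or more peaks are called multimodal.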
Humps (cont.)
• A histogram that doesn’t appear to have any mode and in which all the bars are approximately the same height is called uniform.
• For example, we would expect a 6-sided die to produce a uniform distribution.
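A quick simulation can illustrate the die example. The sketch below rolls a fair six-sided die in Python; the seed and the number of rolls are arbitrary assumptions, and the exact counts will differ for other seeds.

```python
import random
from collections import Counter

random.seed(4)   # arbitrary seed so the run is repeatable
N_ROLLS = 6000   # assumed number of rolls; about 1000 expected per face

rolls = [random.randint(1, 6) for _ in range(N_ROLLS)]
counts = Counter(rolls)

for face in range(1, 7):
    # One '#' per 50 rolls keeps the bars short enough to print.
    print(f"{face} | {'#' * (counts[face] // 50)} {counts[face]}")
```

The six bars should come out roughly the same height, with no face standing out as a mode.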
Symmetry
2. Is the histogram symmetric?
• If you can fold the histogram along a vertical line through the middle and have the edges match pretty closely, the histogram is symmetric.
Symmetry (cont.)
• The (usually) thinner ends of a distribution are called the tails. If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail.
• In the figure below, the histogram on the left is said to be skewed left, while the histogram on the right is said to be skewed right.
Anything Unusual?
3. Do any unusual features stick out?
• Sometimes it’s the unusual features that tell us something interesting or exciting about the data.
• You should always mention any stragglers, or outliers, that stand off away from the body of the distribution.
• Are there any gaps in the distribution? If so, we might have data from more than one group.
Anything Unusual? (cont.)
• The following histogram has outliers (there are three cities in the leftmost bar):
Timeplots: Order, Please!
• For some data sets, we are interested in how the data behave over time.
What Can Go Wrong?
• Don’t make a histogram of a categorical variable; bar charts or pie charts should be used for categorical data.
• Don’t look for shape, center, and spread of a bar chart.
What Can Go Wrong? (cont.)
• Don’t use bars in every display—save them for histograms and bar charts.
• Below is a badly drawn plot and the proper histogram for the number of juvenile bald eagles sighted in a collection of weeks: