Variance & standard deviation

(1)

Level ALevel

ExamBoard Edexcel

Subject Mathematics

Module Mechanics andStatistics

Topic Location andspread

Sub-Topic Variance & standard deviation

Booklet Model Answers 1

Variance & Standard Deviation

Model Answers1

40minutes / 32 /100 TimeAllowed: Score: Percentage: A* A B C D E U >85% 77.5% 70% 62.5% 57.5% 45% <45% Grade Boundaries:

(2)

Estimate the mean and the standard deviation of the mass of the contents of a

packet of Kaff coffee to 1 decimal place. (3)

1. Kaff coffee is sold in packets. A seller measures the masses of the contents of a random sample of 90 packets of Kaff coffee from her stock. The results are shown in the table below.

The mean can be worked out by using the following formula: μ = ∑ "#_$

where f is the frequency, n = ∑ % and y is the mid-point for each interval. μ = &'&.) * +,&'-.) * .),&)/ * 0),&)0.) * &0,&)1.) * 2_2/

μ = &&)0).)_2/

μ = 250.4

σ2₌∑ "#3

$ – μ2

where f is the frequency, n = ∑ % and y is the mid-point for each interval. σ2₌)-''.1..1)

2/ – 250.42

(3)

The following grouped frequency distribution summarises the number of minutes, to the nearest minute, that a random sample of 200 motorists were delayed by roadworks on a stretch of motorway.

Delay (mins) Number of motorists

4—6 15 7—8 28 9 49 10 53 11—12 30 13—15 15 16—20 10

(a) Use interpolation to estimate the median of this distribution.

2.

(2 )

We need to keep in mind that the time is shown to the nearest minute. Therefore, the interval 4-6 minutes is considered the interval 3.5-6.5 minutes. Similarly, we represent all intervals.

The total frequency is 200.

The median value would be 200/2 = 100th

Using cumulative frequency, we work out that the 100th_{value corresponds to}

(4)

(b) Calculate an estimate of the mean and an estimate of the standard deviation

of these data. (6 )

We use linear interpolation:

Median = lower class boundary + !"#$%& '( )*%#+ ", *' *-% #%.)/!_{!& '( )*%#+ )! *-% 01/++} x class width In our case, the lower class boundary of the interval 9.5 ≤ x < 10.5 is 9.5. The class width is 1 and the number of items in the class is 53.

The number of items up to the median is: 100 – 15 – 28 – 49 = 8 Q2= 9.5 + ₄₅3 x 1

Q2 = 9.65

To calculate an estimate of the mean we first need to consider the mid-point of each interval. This way, we estimate that each motorist in an interval was delayed by the same time, the mid-point.

(5)

(Total 8 marks)

The mean can be worked out by using the following formula: μ = ∑ "#_$

where f is the frequency, n = ∑ % and x is the mid-point of each interval. μ = & ' (&)*.& ' ,- ). ' /.)(0 ' &1)((.& ' 10)(/ ' (&)(- ' (0_,00

μ = (..(_,00

μ = 9.955 (9 minutes and 57 seconds)

σ2₌∑ "#2

$ – μ2

where f is the frequency, n = ∑ % and x is the mid-point of each interval. σ2₌,(133.&

,00 – 9.9552

(6)

Number leaving 3 2 means 32 Totals 2 7 9 9 (3) 3 2 2 3 5 6 (5) 4 0 1 4 8 9 (5) 5 2 3 3 6 6 6 8 (7) 6 0 1 4 5 (4) 7 2 3 (2) 8 1 (1)

For these data,

(a) write down the mode, (1)

(b) find the values of the three quartiles. (3)

3. Over a period of time, the number of people x leaving a hotel each morning was recorded. These data are summarised in the stem and leaf diagram below.

The mode represents the value in the dataset which appears most often. Using the key we see that the number of people 56 is seen 3 times.

The mode is 56.

The total frequency is 3 + 5 + 5 + 7 + 4 + 2 + 1 = 27

The median value would be (27 + 1)/2 = 14th_{position (27 is an odd number so we}

add 1 before dividing to work out the median value).

Using cumulative frequency, we work out that the 14th_{value corresponds to the 4}th

row in the diagram.

Median = 1st_{number on the row +}!"#$%& '( )*%#+ ", *' *-% #%.)/!

!& '( )*%#+ '! *-% &'0 x number of items on the row.

(7)

In our case, the 4th_{row in the diagram has the 1}st_{value 52. The number of values on}

the 4th_{row is 7 and the class width here represents also the number of items in the}

row.

The number of items up to the median is: 14 – 3 – 5 – 5 = 1 Q2= 52 + !_"x 7

Q2 = 52

The lower quartile separates the first 25% of the dataset from the upper 75% of the dataset.

This will be the mark in the position: 27/4 + 0.5 = 7th_{(the quarters are rounded off)}

Looking at the diagram, we work out that at the 7th_{position is 35.}

Q1 = 35

The upper quartile separates the first 75% of the dataset from the upper 25% of the dataset.

This will be the mark in the position: 27 x 3/4 + 0.5 = 21th

Looking at the diagram, we work out that at the 7th_{position is 60.}

Q3 = 60

(8)

(4) Given that Δx = 1335 and Δx2 _{= 71 801, find}

(c) the mean and the standard deviation of these data.

(Total 8 marks)

The mean can be worked out by using the following formula: μ = ∑ "_#

where f is the frequency, n = ∑ $ and " represents the total number of people obtained.

μ = %&''₍₎

μ = 49.4

σ2₌∑ "*

# – μ2

where f is the frequency, n = ∑ $ and " represents the total number of people obtained.

σ2₌)%+,%

() – 49.42

(9)

Summarised below are the distances, to the nearest mile, travelled to work by a random sample of 120 commuters.

(2) Distance

(to the nearest mile) Number of commuters

0–9 10 10–19 19 20–29 43 30–39 25 40–49 8 50–59 6 60–69 5 70–79 3 80–89 1

For this distribution,

(a) use linear interpolation to estimate its median. 4.

We need to keep in mind that the distance is shown to the nearest mile. Therefore, the interval 0-9 miles is considered the interval 0.5-9.5 miles. Similarly, we represent all intervals.

(10)

(3) The mid-point of each class was represented by x and its corresponding frequency by f giving

Σfx = 3550 and Σfx 2 _{= 138020}

(b) Estimate the mean and the standard deviation of this distribution. We use linear interpolation:

Median = lower class boundary + !"#$%& '( )*%#+ ", *' *-% #%.)/!_{!& '( )*%#+ )! *-% 01/++} x class width

In our case, the lower class boundary of the interval 19.5-29.5 is 19.5. The class width is 10 and the number of items in the class is 49.

The number of items up to the median is: 60 – 10 – 19 = 31 Q2 = 19.5 + 23₄₂x 10

Q2 = 26.7

To calculate an estimate of the mean we first need to consider the mid-point of each interval. This way, we estimate that each motorist in an interval was delayed by the same time, the mid-point.

(11)

(Total 5 marks) The mean can be worked out by using the following formula:

μ = ∑ "#_$

where f is the frequency, n = ∑ % and x is the mid-point of each interval. μ = &''(_)*(

μ = 29.583

σ2₌∑ "#+

$ – μ2

where f is the frequency, n = ∑ % and x is the mid-point of each interval. σ2₌)&,(*(

)*( – 29.5832

σ = 16.58

Going back to point a), for a positive (right) skewed distribution the rule of thumb is that the mean is to the right of the median, fact which was proved at points b) and c)

(12)

The birth weights, in kg, of 1500 babies are summarised in the table below.

Weight (kg) Midpoint, xkg Frequency, f

0.0 – 1.0 0.50 1 1.0 – 2.0 1.50 6 2.0 – 2.5 2.25 60 2.5 – 3.0 280 3.0 – 3.5 3.25 820 3.5 – 4.0 3.75 320 4.0 – 5.0 4.50 10 5.0 – 6.0 3

[You may use

∑

fx = 4841 and

∑

fx2 _{= 15889.5]}

(3)

Calculate an estimate of the standard deviation of the birth weight.

5.

The formula for the standard deviation is:

σ2₌∑ "#$

% – μ2

where f is the frequency, n = ∑ &, x is the mid-point of each interval and μ is the mean.

Therefore, we need to work out the mean first.

The mean can be worked out by using the following formula: μ = ∑ "#_%

where f is the frequency, n = ∑ & and x is the mid-point of each interval. μ = '(')_)*++

(13)

Time, t Number of students 10 – 19 2 20 – 29 4 30 – 39 8 40 – 49 11 50 – 69 5 70 – 79 2

(a) Use interpolation to estimate the value of the median. (2) Given that

and

6. On a randomly chosen day, each of the 32 students in a class recorded the time, t minutes to the nearest minute, they spent on their homework. The data for the class is summarised in the following table.

∑

t = 1414

∑

t2 _{= 69378} The time is shown to the nearest minute.

Therefore, the interval 10-19 minutes is considered the interval 9.5-19.5 minutes. Similarly, we represent all intervals.

(14)

Using cumulative frequency, we work out that the 16th_{value corresponds to the}

interval 39.5-49.5.

Median = lower class boundary + !"#$%& '( )*%#+ ", *' *-% #%.)/!_{!& '( )*%#+ )! *-% 01/++} x class width In our case, the lower class boundary of the interval 39.5-49.5 is 39.5. The class width is 10 and the number of items in the class is 11.

The number of items up to the median is: 16 – 2 – 4 – 8 = 2 Q2 = 39.5 + ₃₃2 x 10

Q2 = 41.3

(b) Find the mean and the standard deviation of the times spent by the

students on their homework. (3)

The mean can be worked out by using the following formula: μ = ∑ 5₆

where f is the frequency, n = ∑ 7 and ∑ 5 represents the total time obtained. μ = 3838₉₂

μ = 44.1875

σ2₌∑ 5:

6 – μ2