The raw data for a sample of 12 students from our population of 95 are given below in dollars.
25 30 35 38 34 22 30 40 35 25 32 40
Since there are not many responses, a stem plot is an appropriate way of displaying the data.
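The construction of a stem plot for these data can be sketched in Python. This is only an illustration, not the CAS procedure used in this topic; the function name `split_stem_plot` is hypothetical, and the sketch assumes that each stem is split in two, with leaves 0 to 4 on the plain stem and leaves 5 to 9 on the starred stem.

```python
from collections import defaultdict

# Raw data from the sample of 12 students (dollars).
data = [25, 30, 35, 38, 34, 22, 30, 40, 35, 25, 32, 40]


def split_stem_plot(values):
    """Return the rows of a stem plot whose stems are split into halves.

    Leaves 0-4 go on the plain stem (e.g. '3') and leaves 5-9 go on the
    starred stem (e.g. '3*').
    """
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf < 5 else 1
        rows[(stem, half)].append(leaf)

    # Walk from the first occupied row to the last, keeping empty rows between.
    first, last = min(rows), max(rows)
    lines = []
    key = first
    while key <= last:
        stem, half = key
        label = f"{stem}{'*' if half else ''}"
        leaves = " ".join(str(leaf) for leaf in rows.get(key, []))
        lines.append(f"{label:<3}| {leaves}".rstrip())
        key = (stem, 1) if half == 0 else (stem + 1, 0)
    return lines


for line in split_stem_plot(data):
    print(line)
print("Key: 2 | 2 = 22 dollars")
```

Sorting the values first means the leaves in each row are already in ascending order, which is the usual convention for an ordered stem plot.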
To summarise and comment further on the sample, it is useful to use some of the summary statistics covered earlier in this topic. The most efficient way to calculate these is to use CAS. Using the steps outlined in the previous sections, we obtain a list of summary statistics for these data: x̄ = 32.2, s = 6, Q1 = 27.5, median = 33, Q3 = 36.5.
To measure the centre of the distribution, the median and the mean are used. Since there are no outliers and the distribution is approximately symmetric, the mean is quite a good measure of the centre of the distribution. Also, the mean and the median are quite close in value.
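As a cross-check on the CAS output, the summary statistics can be reproduced with a short Python sketch. The helper `quartiles` is a hypothetical name, and it assumes that Q1 and Q3 are the medians of the lower and upper halves of the ordered data; that assumption reproduces the values quoted for this sample, but other software may use a different quartile rule and give slightly different results.

```python
import statistics

# Raw data from the sample of 12 students (dollars).
data = [25, 30, 35, 38, 34, 22, 30, 40, 35, 25, 32, 40]


def quartiles(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the data."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]          # values below the median position
    upper = ordered[(n + 1) // 2 :]    # values above the median position
    return statistics.median(lower), statistics.median(upper)


mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
median = statistics.median(data)
q1, q3 = quartiles(data)

print(f"mean = {mean:.1f}, s = {s:.1f}")
print(f"Q1 = {q1}, median = {median}, Q3 = {q3}")
```

The mean and the median differ by less than one dollar, which supports the comment that either is a reasonable measure of centre for this sample.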
Stem | Leaf
2    | 2
2*   | 5 5
3    | 0 0 2 4
3*   | 5 5 8
4    | 0 0
Key: 2 | 2 = 22 dollars

WORKED EXAMPLE
Generate 5 random numbers (integers) between 1 and 50.
THINK | WRITE
1 Find the appropriate menu in CAS to generate random integers.
2 Generate 5 random numbers between 1 and 50.
3 An example of a set of numbers is displayed. | {48, 46, 8, 26, 21}
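The same task can be sketched in Python for readers without CAS. Both function names below are hypothetical. The first assumes that repeated numbers are allowed; the second assumes that each student on a numbered list may be chosen only once, which is what a simple random sample of 12 students from the population of 95 requires.

```python
import random


def random_integers(count, low, high):
    """Return `count` random integers from low to high inclusive (repeats allowed)."""
    return [random.randint(low, high) for _ in range(count)]


def simple_random_sample(population_size, sample_size):
    """Pick `sample_size` distinct student numbers from 1 to population_size."""
    return sorted(random.sample(range(1, population_size + 1), sample_size))


# Five random integers between 1 and 50, as in the worked example.
print(random_integers(5, 1, 50))

# Twelve distinct student numbers from a numbered list of 95 students.
print(simple_random_sample(95, 12))
```

The output changes on every run, so it will not match the set shown in the worked example.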
To measure the spread of the distribution, the standard deviation and the interquartile range are used. Since s = 6, and since the distribution is approximately bell-shaped, we would expect that approximately 95% of the data lie between 32.2 + 12 = 44.2 and 32.2 − 12 = 20.2 (as shown in section 1.11). It is perhaps a little surprising to think that 95% of students spend between $20.20 and $44.20 on family presents. One might have expected there to be greater variation in what students spend. The data, in that sense, are quite bunched.
The interquartile range is equal to 36.5 − 27.5 = 9. This means that the middle 50% of those in the sample spent within $9 of each other on family presents. Again, one might have expected a greater variation in what students spent. It would be interesting to know whether students confer about what they spend and therefore whether they tended to allocate about the same amount of money to spend.
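Both measures of spread can be checked against the raw data with a minimal Python sketch. It uses the rounded summary statistics quoted earlier in this section rather than recalculating them, and the variable names are illustrative only.

```python
# Raw data from the sample of 12 students (dollars).
data = [25, 30, 35, 38, 34, 22, 30, 40, 35, 25, 32, 40]

# Rounded summary statistics quoted in the text.
mean, s = 32.2, 6
q1, q3 = 27.5, 36.5

# Interval of two standard deviations either side of the mean.
lower, upper = mean - 2 * s, mean + 2 * s
within_2s = sum(lower <= x <= upper for x in data)

# Interquartile range and the number of values between the quartiles.
iqr = q3 - q1
within_quartiles = sum(q1 <= x <= q3 for x in data)

print(f"mean ± 2s: {lower:.1f} to {upper:.1f}")
print(f"{within_2s} of {len(data)} values lie in that interval")
print(f"IQR = {iqr}")
print(f"{within_quartiles} of {len(data)} values lie between Q1 and Q3")
```

With only 12 values, the proportion inside the two-standard-deviation interval need not be exactly 95%; the rule describes what is expected for an approximately bell-shaped distribution.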
At another school, the same investigation was undertaken and the results are shown in the following stem plot. The summary statistics for these data are as follows:
x̄ = 47.5, s = 16.3, Q1 = 35, median = 50, Q3 = 60.
The distribution is approximately symmetric, albeit very spread out. The mean and the median are therefore reasonably close and give us an indication of the centre of the distribution. The mean value for this set of data is higher than for the data obtained at the other school. This indicates that students at this school in this year level, in general, spend more than their counterparts at the other school. Reasons for this might be that this school is in a higher socio-economic area and students receive greater allowances, or perhaps at this school there is a higher proportion of students from cultures where spending more money on family presents is usual.
The range of money spent on family presents at this school and at this particular year level is $55. This is certainly much higher than at the other school, where the range is $40 − $22 = $18. The interquartile range at this school is $60 − $35 = $25. That is, the middle 50% of students spent within $25 of each other, a wider spread than the $9 interquartile range at the other school.
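The comparison between the two schools can be laid out side by side with a short Python sketch. It uses only the summary statistics quoted in this section; the dictionary layout and the helper name `iqr` are illustrative choices.

```python
# Summary statistics quoted in the text for each school (dollars).
first_school = {"mean": 32.2, "s": 6, "q1": 27.5, "median": 33, "q3": 36.5}
second_school = {"mean": 47.5, "s": 16.3, "q1": 35, "median": 50, "q3": 60}


def iqr(summary):
    """Interquartile range: Q3 minus Q1."""
    return summary["q3"] - summary["q1"]


for name, summary in [("First school", first_school), ("Second school", second_school)]:
    print(f"{name}: mean = {summary['mean']}, median = {summary['median']}, "
          f"s = {summary['s']}, IQR = {iqr(summary)}")

# How much more, on average, students at the second school spent.
mean_gap = second_school["mean"] - first_school["mean"]
print(f"Difference in means: {mean_gap:.1f}")
```

Both the standard deviation and the interquartile range are larger at the second school, so the two measures of spread agree with each other.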
[Stem plot: money spent on family presents at the second school. Stems 2, 2*, 3, 3*, 4, 4*, 5, 5*, 6, 6*, 7, 7*; leaves 0 5 5 5 5 5 0 0 5 5 0 0 5 5. Key: 2 | 2 = 22 dollars]

EXERCISE 1.10 Populations and simple random samples
PRACTISE
1 WE24 Generate 5 random numbers (integers) between 1 and 100.
2 Generate 10 random numbers (integers) between 1 and 250.
3 Students are selecting a sample of students at their school to complete an investigation. Which of the following are examples of choosing this sample randomly?
A Choosing students queuing at the tuckshop
B Assigning numbers to a list of student names and using a random number table to select random numbers
C Calling for volunteers
D Choosing the girls in an all-girls science class
E Choosing students in a bus on the way home
4 Generate 10 random numbers (integers) between 1 and 100.
5 Generate 20 random numbers (integers) between 1 and 500.
6 Which is larger: A population or a sample? Explain why.
7 When selecting students for a simple random sample of a year level, the students selected should be:
A of similar age
B a group of mates
C independent
D female
E the tallest students
8 The students selected for a simple random sample of a year level should be selected by:
A a group of mates
B a group who all dance
C a selection of males
D the students with the best test results
E using random numbers
9 Would the mean be a good measure of the centre of the distribution shown below? Explain.
Stem | Leaf
0    | 7
1    | 0 2 3 6 7 8 8
2    | 2 4 5 7 9 9
3    | 2 3 4 7 8
4    | 1 3 7
5    | 2
6    | 8
Key: 1 | 2 = 12
10 The mean is a good measure of the centre of a distribution if the data is:
A skewed left
B symmetric
C skewed right
D has outliers
E bimodal
11 The interquartile range is 12, since Q1 = 24 and Q3 = 36. The percentage of data that fit between 24 and 36 is:
A 12%
B 30%
C 50%
D 68%
E 95%
CONSOLIDATE
12 Conduct an investigation into how much money students in your year level earn per week (this might be an allowance or a wage). Write a report on your findings, ensuring you include:
a an explanation of the population for your investigation
b the manner in which your sample was selected
c the number in your sample
d your results as raw data
e your results in a stem plot or histogram
f the summary statistics for your data.
Comment on your results based on the summary statistics.
13 Repeat question 12, but this time investigate the following for students in your year level:
a the number of hours spent on homework each week
b the number of hours spent working in part-time jobs.
14 Conduct a similar investigation to that which you completed in questions 12 and 13; however, this time sample students in another year group. Compare these data with those obtained for your year level.