INFERENTIAL STATISTICS - THINKING MATHEMATICS

THE NULL HYPOTHESIS

Here’s something “fun” …

PUZZLE: A disease is spreading across the country at an alarming rate.

Fifty percent of the people who get it, get better on their own. The remaining fifty percent die.

Two serums, A and B, have been developed hurriedly, and little time has been given to testing them. The only information available right now is:

• 3 patients with the disease who were given serum A all survived

• 7 out of 8 patients who were given serum B survived.

You have just learned that you have the disease. Which serum should you take?

Some comments …

“Three-out-of-three” is a 100% success rate, but only three test patients isn’t much to go on.

“Seven-out-of-eight” is not perfect, but it is a larger sample.

Which seems more promising?

KEY IDEA:

• Assume that serum A has no effect and ask: How likely is it that 3/3 people would naturally survive on their own?

• Assume that serum B has no effect. How likely is it that 7/8 people would survive on their own?

Answers:

If serum A has no effect, then the chance that three people would all naturally survive is:

$$\frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8} = 12.5\%.$$

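If serum B has no effect, then the chance that exactly 7 of 8 people survive is as follows. (There are 8 choices for which patient does not survive, and each particular pattern of eight outcomes has probability $\left(\frac{1}{2}\right)^{8}$.)

$$8 \times \left(\frac{1}{2}\right)^{8} = \frac{8}{256} = \frac{1}{32} = 3.125\%.$$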

It is quite unlikely that we’d see 7/8 people surviving if serum B had no effect.

(Far more unlikely than seeing 3/3 people survive.) We conclude then: there is a good chance that serum B is having an effect.

I would take serum B.

□

The act of assuming that there is no effect at play and working to see where that assumption leads is called testing the null hypothesis.
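The serum arithmetic above can be checked with a few lines of Python. This is only a sketch of the null-hypothesis calculation: the function name `prob_exact` and its parameters are made up for illustration, and it assumes, as the puzzle does, that each untreated patient independently survives with probability 1/2.

```python
from math import comb

def prob_exact(survivors, patients, p_survive=0.5):
    """Chance that exactly `survivors` out of `patients` recover on their own."""
    deaths = patients - survivors
    # Number of ways to choose who survives, times the chance of each such pattern.
    return comb(patients, survivors) * p_survive**survivors * (1 - p_survive)**deaths

p_serum_a = prob_exact(3, 3)  # (1/2)^3 = 1/8
p_serum_b = prob_exact(7, 8)  # 8 * (1/2)^8 = 1/32

print(f"Serum A, 3 of 3 surviving by chance: {p_serum_a:.3%}")
print(f"Serum B, 7 of 8 surviving by chance: {p_serum_b:.3%}")
```

The smaller the probability under the "no effect" assumption, the harder it is to explain the observed result as luck alone.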

EXAMPLE: You have a suspect coin in hand.

a) You toss the coin 10 times and get ten HEADS in a row. Would you likely conclude that the coin is biased?

b) Suppose instead when you tossed the coin you got nine HEADS out of ten.

Are you still likely to conclude that the coin is biased?

c) What if you got 8 HEADS out of ten? Just seven?

Answer: Let’s test the null hypothesis and assume for the moment that the coin is fair (that is, that nothing suspect is going on).

a) What are the chances of naturally getting 10/10 heads with a fair coin?

$$\left(\frac{1}{2}\right)^{10} = \frac{1}{1024} \approx 0.1\%$$

With 99.9% confidence I would say that the coin is biased.

(Note: There is a 0.1% chance that I am wrong.)

b) What are the chances of receiving 9/10 heads with a fair coin?

$$\frac{10!}{1!\,9!} \cdot \left(\frac{1}{2}\right)^{10} \approx 1.0\%$$

In this case I would say, with 99.0% confidence, that the coin is biased.

c) What are the chances of receiving 8/10 heads naturally?

$$\frac{10!}{2!\,8!} \cdot \left(\frac{1}{2}\right)^{10} \approx 4.4\%$$

With 95.6% confidence I would say that the coin is biased.

COMMENT: There is an issue of wording here. The chance of seeing eight or more heads, as one might phrase a test for bias, is greater than 4.4%: adding the chances of 8, 9 and 10 heads gives 56/1024 ≈ 5.5%.

In fact, let’s make a table:

With 7/10 heads I am less confident to conclude that the coin is biased.

Even less so for 6/10 heads.

□
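For ten tosses of a fair coin, the likelihood of every possible number of heads can be generated with a short script. This is a sketch only: the names `prob_heads` and `prob_at_least` are invented for illustration, and the second column ("at least") is added here to address the wording issue raised in the comment.

```python
from math import comb

TOSSES = 10

def prob_heads(k, n=TOSSES):
    """Chance of exactly k heads in n tosses of a fair coin."""
    return comb(n, k) / 2**n

def prob_at_least(k, n=TOSSES):
    """Chance of k or more heads in n tosses of a fair coin."""
    return sum(prob_heads(j, n) for j in range(k, n + 1))

print("heads  exactly   at least")
for k in range(TOSSES, -1, -1):
    print(f"{k:>5}  {prob_heads(k):>7.1%}  {prob_at_least(k):>9.1%}")
```

The "exactly" column sums to 1, as the likelihoods of all possible outcomes must.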

NOTICE: We’ve presented here a table displaying the likelihood of each and every possible outcome. This is an example of a distribution.

Here we have also talked about confidence in making some kind of inference about the meaning of a result.

We’re now in the thick of inferential statistics.

[Comment: The distribution in the above table is called the binomial distribution.]

DISTRIBUTIONS

Loose Definition: A distribution is a table or a diagram that illustrates the frequency of measurements or counts from an experiment or study.

Histograms lead to distributions.

For example, consider the following histogram displaying the heights of 1000 people:

If the heights of the bars are percentages, not actual counts, then this diagram has total area 1.

We can make the information displayed on the diagram more precise by choosing small category intervals.

and smaller and smaller …

In the limit we get a smooth curve of area 1, the “height distribution curve.”

Of course, one cannot do this in practice, but we do like to think that human heights follow some kind of smooth distribution curve of area 1.

Then we like to say … the probability that someone chosen at random has height $72'' \le x \le 78''$ is given by …

$$P(72 \le x \le 78) = \text{the fraction of area above the interval } 72 \le x \le 78.$$

Since the area under the whole curve is 1, this fraction of area matches the actual area above the interval $72 \le x \le 78$.

Better Definition: A distribution for a quantity $X$ (such as height or foot length) is a curve with area 1 such that the probability that a randomly chosen value for $X$ lies between $a$ and $b$ is:

$$P(a \le X \le b) = \text{the area under the curve from } a \text{ to } b.$$

This is abstract!

Usually in practice, one doesn’t actually know any formula for the distribution curve a quantity X seems to follow.

ARTIFICIAL EXERCISE: Suppose people’s ages among the world’s population are distributed as follows:

a) Verify that the area under this distribution is indeed 1.

b) A person is chosen at random. According to this model, what are the chances that this person is between 20 and 30 years old?
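A calculation of this kind can be sketched numerically. The density below is a made-up straight line (0.02 at age 0, falling to 0 at age 100), chosen only for illustration; it is an assumption and not the distribution given in the exercise. The helper names `density` and `area` are likewise hypothetical.

```python
def density(age):
    """Hypothetical age density: a straight line from 0.02 at age 0 down to 0 at age 100."""
    return (100 - age) / 5000 if 0 <= age <= 100 else 0.0

def area(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

total = area(density, 0, 100)        # should be 1 for a valid distribution
p_20_to_30 = area(density, 20, 30)   # P(20 <= X <= 30) under this made-up model

print(f"Total area:       {total:.4f}")
print(f"P(20 <= X <= 30): {p_20_to_30:.4f}")
```

For a straight-line density the same areas can be found by hand with triangles and trapezoids; the numerical approach is shown because it carries over to curves with no convenient geometry.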

Comment: People like to use the following adjectives for distributions:

BIG QUESTION …

How does one find or estimate the distribution for a quantity?

Answer: Take some samples and make a guess based on what you observe.

But there is something deeper going on …

ACTIVITY: What is the distribution of heights of the people in this class?

What are the mean and the standard deviation?

Here’s a curious idea:

What if we took another class somewhere in the nation with the same number of participants and worked out their mean height? And say we did this for a third class as well. Actually, for 1000 other classes!

We’d expect all the means we calculate to be close to each other.

The means have their own distribution.
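A small simulation can make this idea concrete. The sketch below is built entirely on assumptions that are not in the text: a class size of 30, and individual heights drawn uniformly between 60 and 76 inches (deliberately a flat shape rather than a bell). It collects the mean height of 1000 such classes and prints a crude text histogram of those means.

```python
import random
import statistics
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable

CLASS_SIZE = 30     # assumed number of people per class
NUM_CLASSES = 1000  # number of classes sampled

def one_class_mean():
    """Mean height of one simulated class (heights uniform on 60..76 inches)."""
    heights = [random.uniform(60, 76) for _ in range(CLASS_SIZE)]
    return statistics.mean(heights)

class_means = [one_class_mean() for _ in range(NUM_CLASSES)]

print(f"Mean of the class means:   {statistics.mean(class_means):.2f}")
print(f"Spread of the class means: {statistics.stdev(class_means):.2f}")

# Crude histogram: group the means to the nearest half inch, one '#' per 10 classes.
bins = Counter(round(m * 2) / 2 for m in class_means)
for value in sorted(bins):
    print(f"{value:5.1f} {'#' * (bins[value] // 10)}")
```

The class means cluster tightly around 68 inches, far more tightly than individual heights do, and the histogram of means is humped in the middle even though the individual heights were drawn from a flat distribution.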

Here’s the curious thing …

In the 1700s, when scientists conducted experiments multiple times and computed the average result or aggregate result over many runs of the same experiment, they noticed that the means always seemed to follow a bell-shaped curve, no matter what type of experiment was being conducted:

They thought this odd.

Human height comes in a bell-shaped curve. (The height of each human is the aggregate effect of growth rates of a collection of cells. Thus each human is the mean result of a “collection of experiments.”)

The lengths of carrots come in a bell-shaped curve. (Each carrot is the aggregate result of cell growth.)

Scholars began work on identifying this curve and finding a formula for it.

[These scholars include Gauss (~ 1820), Laplace (1818) and Lyapunov (1901).]

This special curve is today called the normal distribution.

These scholars managed to prove the famous “central limit theorem,” which we shall discuss next.

For those interested …

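The normal distribution follows the formula

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where $\mu$ is the mean of the original experiment and $\sigma$ is the standard deviation of the original experiment.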
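As a quick check that the normal curve really is a distribution in the sense defined earlier, the sketch below evaluates the standard normal density, exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)), and approximates areas under it. The values of `MU` and `SIGMA` are hypothetical (a mean height of 68 inches with standard deviation 3 inches), and the function names are invented for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def area(a, b, mu, sigma, steps=20_000):
    """Midpoint-rule approximation of the area under the normal curve from a to b."""
    width = (b - a) / steps
    return sum(normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps)) * width

MU, SIGMA = 68.0, 3.0  # hypothetical mean and standard deviation, in inches

whole = area(MU - 8 * SIGMA, MU + 8 * SIGMA, MU, SIGMA)   # essentially the entire curve
within_one_sigma = area(MU - SIGMA, MU + SIGMA, MU, SIGMA)

print(f"Area under the whole curve: {whole:.4f}")
print(f"Area within one sigma:      {within_one_sigma:.4f}")
```

The total area comes out as 1 to within rounding, and roughly 68% of the area lies within one standard deviation of the mean, whatever values are chosen for `MU` and `SIGMA`.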
