• No results found

swm02_25.ppt

N/A
N/A
Protected

Academic year: 2020

Share "swm02_25.ppt"

Copied!
22
0
0

Loading.... (view fulltext now)

Full text

(1)
(2)

Chapter 25

(3)

Paired Data

 Data are paired when the observations are collected in

pairs or the observations in one group are naturally related to observations in the other group.

 Paired data arise in a number of ways. Perhaps the most

common is to compare subjects with themselves before and after a treatment.

 When pairs arise from an experiment, the pairing is a

type of blocking.

 When they arise from an observational study, it is a

(4)

Paired Data (cont.)

 If you know the data are paired, you can (and must!) take

advantage of it.

 To decide if the data are paired, consider how they

were collected and what they mean (check the W’s).

 There is no test to determine whether the data are

paired.

 Once we know the data are paired, we can examine the

pairwise differences.

 Because it is the differences we care about, we treat

(5)

Paired Data (cont.)

Now that we have only one set of data to

consider, we can return to the simple one-sample

t-test.

Mechanically, a

paired

t

-test

is just a one-sample

t-test for the mean of the pairwise differences.

(6)

Assumptions and Conditions

Paired Data Assumption:

The data must be paired.

Independence Assumption:

The differences must

be independent of each other. Check the:

Randomization Condition

Normal Population Assumption:

We need to

assume that the population of

differences

follows a

Normal model.

Nearly Normal Condition:

Check this with a

(7)

The Paired

t

-Test

When the conditions are met, we are ready to test

whether the paired differences differ significantly

from zero.

We test the hypothesis H

0

:

d

=

0

, where the d’s

(8)

The Paired

t

-Test (cont.)

 We use the statistic

where n is the number of pairs.

 is the ordinary standard error for the mean

applied to the differences.

 When the conditions are met and the null hypothesis is true, this

statistic follows a Student’s t-model on n – 1 degrees of freedom, so we can use that model to obtain a P-value.

 

0

1 n

d

t

SE d

 

 

sd

SE d

n

(9)

Confidence Intervals for Matched Pairs

 When the conditions are met, we are ready to find the confidence

interval for the mean of the paired differences.

 The confidence interval is

where the standard error of the mean difference is

The critical value t* depends on the particular confidence level, C, that you specify and on the degrees of freedom, n – 1, which is based on the number of pairs, n.

 

s

d

SE d

n

 

1

n

(10)

Example 1: Pulse Rates Once Again

One indicator of physical fitness is resting pulse

rate. Ten men volunteered to test an exercise

device advertised on television by using it three

times a week for 20 minutes. Their resting pulse

rates were measured before the test began, and

then again after six weeks. Results are shown in

the table. Is there evidence that this kind of

(11)

Example 1: Table of Data

Pulse Rates(bpm)

Subject Before After

Allen 73 73

Brandon 83 79

Carlos 85 81

David 87 86

Edwin 91 87

Franco 99 91

Graeme 87 84

Hans 85 83

(12)

Example 1: continued

The before and after measurements are not

independent – we CAN’T use a two sample t-test.

When this happens we will look at the paired

(13)

Example 1: continued

Hypothesis:

The null hypothesis is that there will be no difference in the pulse rates before and after using the exercise device. The alternative hypothesis is that the exercise device will cause a decrease the resting pulse rate. The difference will be final pulse rate – beginning pulse rate.

(14)

Example 1: continued

Model:

1.

The men are not selected randomly, but we assume

that they are a representative sample of adult men.

2.

Independence: The pulse rate of one man does not

influence the pulse rate of another.

3.

10 is certainly less than 10% of all men.

4.

Since n<30, we will do a boxplot. Since the graph of

the paired differences is unimodal and

(15)
(16)

Example 1: continued

The P value of .0034 is much lower than our level of

significance of .05. Therefore, using the matched

pairs t-test, we reject the null hypothesis. There

is strong evidence that this form of exercise can

reduce resting pulse rates.

90% confidence interval: (-4.266,-1.334)

I am 90% confident that mean decrease in the

(17)

Blocking

 Consider estimating the

mean difference in age between husbands and wives.

 The following display is

worthless. It does no good to compare all the wives as a group with all the

(18)

Blocking (cont.)

 In this case, we have paired data—each husband is

paired with his respective wife. The display we are interested in is the difference in ages:

(19)

Blocking (cont.)

Pairing removes the extra variation that we saw in

the side-by-side boxplots and allows us to

concentrate on the variation associated with the

difference in age for each pair.

(20)

*The Sign Test Again?

 Because we have paired data, we’ve been using a simple

t-test for the paired differences.

 This suggests that if we want a distribution-free method, a

sign test on the paired differences testing whether the

median of the differences is 0 is appropriate.

 The advantage of the sign test for matched pairs is that

we don’t require the Nearly Normal Condition for the paired differences.

 However, if the assumptions of a paired t-test are met,

(21)

What Can Go Wrong?

Don’t use a two-sample

t-test for paired data.

Don’t use a paired-t method when the samples

aren’t paired.

Don’t forget outliers—the outliers we care about

now are in the differences.

Don’t look for the difference in side-by-side

(22)

What have we learned?

 Pairing can be a very effective strategy.

 Because pairing can help control variability between

individual subjects, paired methods are usually more powerful than methods that compare individual groups.

 Analyzing data from matched pairs requires different

inference procedures.

 Paired t-methods look at pairwise differences.

 We test hypotheses and generate confidence intervals based

on these differences.

 We learned to Think about the design of the study that

References

Related documents

Massively multiplayer online role-playing games (MMORPGs) are a fast-growing computer game genre; they are multiplayer role-playing games (RPGs) played online over the internet in

Real Internet Malware Sample Configuration Analysis Results Internal logs Traffic Log Emulated Internet Filtering Rules Sandbox Access Controller OS Image Config Analysis

On June 9, 2003, the Tax Discovery Bureau (Bureau) of the Idaho State Tax Commission issued a Notice of Deficiency Determination (NODD) to [Redacted] (taxpayers), proposing income

a) sympathetic nerve stimulation of the kidney. b) atrial natriuretic hormone secretion. b) the rate of blood flow through the medulla decreases. c) the rate of flow through the

Key words: bile duct proliferation, chlordanes, dichlorodiphenyltrichloroethane, dieldrin, East Greenland, HCB, hexacyclohexanes, Ito cells, lipid granulomas, liver, mononuclear

With all four financing mechanisms, price control results in a price decrease in both the monopolistic and the competitive region due to the incumbent ’s strong market power.. There

A failure mode, effects and criticality analysis of rolling stock critical systems is conducted in (8) and the outcome is used to further proposed a generic

How Many Breeding Females are Needed to Produce 40 Male Homozygotes per Week Using a Heterozygous Female x Heterozygous Male Breeding Scheme With 15% Non-Productive Breeders.