Rank-Based Non-Parametric Tests
Non-Parametric Tests
Reminder: Student Instructional Rating Surveys
You have until May 8th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs
The survey should be available on any device with a full-featured web browser. Please take the time to fill it out. Your answers:
• Will be anonymous
• Will help me to improve my teaching strategies and the structure of the course
• Will help the department in planning and designing future courses
• Will be used by the university in promotion, tenure, and reappointment decisions
Parametric and Nonparametric Tests
• Most of the statistical tests that we have used throughout the semester have relied on specific assumptions about the distribution of the involved variables and/or their means, and have been set up to test hypotheses about specific population parameters
• Such tests (z-tests, t-tests, ANOVAs, Pearson’s correlation) are called parametric tests
• Though these parametric tests are robust to minor violations of their assumptions, they can lead to gross systematic errors when the data strongly violate the underlying assumptions, and they can even be undefined for certain types of data (e.g., nominal or non-numerical data).
• Certain tests do not rely on specific distributional assumptions or test hypotheses about particular population parameters. These tests are generally called nonparametric tests
• The chi-square tests introduced in the last lecture and Spearman’s rank correlation coefficient test were examples of nonparametric tests.
Parametric versus Nonparametric Tests
• The advantages of parametric tests are that
– They are more powerful (i.e., you can detect smaller effect sizes with smaller samples) than comparable non-parametric tests when the parametric assumptions are correct (or approximately correct).
– The hypothesis tests are more specific and easier to interpret.
• The advantages of nonparametric tests are that
– They can be used when the distribution of the population is completely unknown
– They tend to be more robust to ill-behaved (e.g., non-normal, heteroscedastic, & multi-modal) data
– They are less sensitive to outliers
Nonparametric tests
• In the last 70 years or so, statisticians have developed many different nonparametric tests.
• Those most widely used in the behavioral sciences tend to be rank randomization tests
• Rank randomization (or rank-permutation) tests are hypothesis tests based on the theoretical distribution of randomly assigned ranks. As a first step, they all require the conversion of raw scores to ordinal ranks
– This makes them obvious candidates for ordinal data, though these data usually still need to be modified
Rank Randomization Tests
Advantages of rank-based tests:
1. Ranks are simpler, and rank-based tests are easier to compute
2. They are largely insensitive to the particular form of the population distributions and differences between the distributions underlying the scores in different samples
3. They tend to minimize the effects of large sample variances
4. They are insensitive to outlier scores and make it easier to deal with undetermined scores (e.g., time to task completion)
5. The distribution of randomly assigned ranks can be computed
720 permutations

Sample 1 Ranks   R1 = Σr   Sample 2 Ranks   R2 = Σr   Ws = min(R1, R2)
1 2 3            6         4 5 6            15        6
1 2 3            6         4 6 5            15        6
1 2 3            6         5 4 6            15        6
1 2 3            6         5 6 4            15        6
1 2 3            6         6 4 5            15        6
1 2 3            6         6 5 4            15        6
1 2 4            7         3 5 6            14        7
1 2 4            7         3 6 5            14        7
1 2 4            7         5 3 6            14        7
1 2 4            7         5 6 3            14        7
...              ...       ...              ...       ...
6 5 3            14        4 1 2            7         7
6 5 3            14        4 2 1            7         7
6 5 4            15        1 2 3            6         6
6 5 4            15        1 3 2            6         6
6 5 4            15        2 1 3            6         6
6 5 4            15        2 3 1            6         6
6 5 4            15        3 1 2            6         6
6 5 4            15        3 2 1            6         6
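The distribution of the smaller rank sum can be reproduced by brute force. The following is a minimal sketch, assuming six untied scores split three and three as in the table; the variable names are illustrative and do not come from the lecture.

```python
from collections import Counter
from itertools import permutations

# Ranks 1..6: the first three positions go to sample 1, the last three to sample 2.
ranks = range(1, 7)

counts = Counter()
for perm in permutations(ranks):
    r1 = sum(perm[:3])            # rank sum for sample 1
    r2 = sum(perm[3:])            # rank sum for sample 2
    counts[min(r1, r2)] += 1      # Ws = min(R1, R2)

total = sum(counts.values())      # number of permutations enumerated
for ws in sorted(counts):
    print(f"Ws = {ws}: {counts[ws]} permutations, p = {counts[ws] / total:.3f}")
```

Under the null hypothesis every permutation is equally likely, so the relative frequency of each Ws value is its probability under random assignment of ranks.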
Rank Randomization Tests
Popular rank-based tests:
1. The Mann-Whitney U (or Wilcoxon rank-sum) test
– Nonparametric analogue to the independent-samples t-test
2. The Wilcoxon signed-rank test
– Nonparametric analogue to the matched-samples t-test
3. The Kruskal-Wallis test
– Nonparametric analogue to the one-way ANOVA (independent meas.)
4. The Friedman test
– Nonparametric analogue to the repeated-measures ANOVA
A Note about Computing Ranks
• All of the rank-based tests will require that you compute ranks over the total set of scores (all samples pooled), from lowest to highest
– E.g., if you have 3 samples with 5 scores each, the lowest overall score should be assigned the rank 1 and the highest overall score should be assigned the rank 15
• In case of ties, each tied score should be assigned the mean of the tied ranks
– E.g., if the 3rd and 4th lowest scores have the same value, then you should assign them each a rank of 3.5, with the next highest value receiving a rank of 5.
– If the 7th, 8th, and 9th scores are all tied, then you should assign them each a rank of 8, with the next highest value receiving a rank of 10.
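The tie rule can be sketched in a few lines. This is an illustrative helper (the name `average_ranks` and the sample scores are assumptions, not part of the lecture) that assigns each score the mean of the rank positions it occupies.

```python
def average_ranks(values):
    """Rank from lowest (1) to highest; tied scores share the mean of their ranks."""
    ordered = sorted(values)
    # ordered.index(v) counts the scores strictly below v; a tied block of size c
    # occupies ranks index+1 .. index+c, whose mean is index + (c + 1) / 2.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


scores = [10, 20, 30, 30, 40]     # hypothetical scores: the 3rd and 4th lowest are tied
ranks = average_ranks(scores)
print(ranks)
```

The quadratic cost of this version is fine for classroom-sized samples. SciPy's `scipy.stats.rankdata` uses the same average-rank convention by default.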
Rank Randomization Tests
Aside from converting the raw scores to ranks, the logical steps are similar to those for the parametric hypothesis tests:
1. State the null and alternative hypotheses about the population.
2. Use the null hypothesis to predict the characteristics that the sample ranks should have.
3. Use the samples to compute the test statistic
4. Compare the test statistic with the hypothesis prediction
The Mann-Whitney Test
• This is a test for independent-measures (between subjects) research designs with two groups and is thus an alternative to the independent-measures t-test
• The basic intuition behind the test is that:
– A real difference between the two treatments should cause the scores in one sample to be generally larger than the scores in the other sample
– If all the scores are ranked, the larger ranks should be concentrated in one sample and the smaller ranks should be concentrated in the other sample.
• The null and alternative hypotheses are a bit more vague than for the t-test, but still test for some sort of difference in central tendency
– H0: There is no difference between treatments. Therefore, there is no tendency for ranks in one sample to be systematically higher or lower than in the other sample
– H1: There is a difference between treatments. Therefore, the ranks in one sample should be systematically higher or lower than in the other sample
• When comparing two samples, the Mann-Whitney U statistic for sample 1 is the number of scores in sample 2 outranked by each score in sample 1, summed over all scores in sample 1
• The smaller of the two U values is compared against the critical value in the table
Treatment A               Treatment B
Raw Scores   Ranks        Raw Scores   Ranks
27           7            71           11
2            1            63           9
9            4            18           6
48           8            68           10
6            2            94           12
15           5            8            3
Example
Treatment A                        Treatment B
Raw Scores   Ranks   Points        Raw Scores   Ranks   Points
27           7       2             71           11      6
2            1       0             63           9       6
9            4       1             18           6       4
48           8       2             68           10      6
6            2       0             94           12      6
15           5       1             8            3       2
Total                6             Total                30
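The points in this example can be checked by counting directly. A minimal sketch using the example's raw scores; the function name `u_by_counting` is illustrative, and counting a tie as half a point is a common convention assumed here (these data contain no ties).

```python
treatment_a = [27, 2, 9, 48, 6, 15]
treatment_b = [71, 63, 18, 68, 94, 8]


def u_by_counting(sample, other):
    """Count how many scores in `other` each score in `sample` outranks."""
    # Assumption: a tied pair contributes half a point (no ties occur in this example).
    return sum((x > y) + 0.5 * (x == y) for x in sample for y in other)


u_a = u_by_counting(treatment_a, treatment_b)   # total points for Treatment A
u_b = u_by_counting(treatment_b, treatment_a)   # total points for Treatment B
u = min(u_a, u_b)                               # the smaller value is the test statistic
print(u_a, u_b, u)
```

With no ties, the two totals always add up to n1 × n2 (here 6 × 6 = 36), which is a quick check on the counting.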
The Mann-Whitney Test: Steps
In practice, the steps for computing the test statistic (U) are:
1. Rank all the observations from smallest to largest
2. Compute the sum of the ranks (R) in each sample, and use the following formulas to compute a U statistic for each sample:
$U_1 = R_1 - \frac{n_1(n_1 + 1)}{2}$
$U_2 = R_2 - \frac{n_2(n_2 + 1)}{2}$
3. U is the smaller of $U_1$ and $U_2$
Example
Treatment A               Treatment B
Raw Scores   Ranks        Raw Scores   Ranks
27           7            71           11
2            1            63           9
9            4            18           6
48           8            68           10
6            2            94           12
15           5            8            3
ΣR1 = 27                  ΣR2 = 51

$U_1 = R_1 - \frac{n_1(n_1 + 1)}{2} = 27 - \frac{6(7)}{2} = 27 - 21 = 6$
$n_1 = 6$, $n_2 = 6$, $U = 6$
In this case, $U_{crit} = 5$. Because U = 6 is larger than the critical value, we retain the null hypothesis
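The rank-sum route to U can be sketched with the example's scores. The helper name `rank_of` is illustrative, and the shortcut ranking assumes no tied scores, which holds for these data; the critical value is the one quoted in the example.

```python
treatment_a = [27, 2, 9, 48, 6, 15]
treatment_b = [71, 63, 18, 68, 94, 8]

# Rank the pooled scores from lowest (1) to highest (12); valid only without ties.
pooled = sorted(treatment_a + treatment_b)
rank_of = {score: position + 1 for position, score in enumerate(pooled)}

r_a = sum(rank_of[x] for x in treatment_a)      # rank sum for Treatment A
r_b = sum(rank_of[x] for x in treatment_b)      # rank sum for Treatment B
n_a, n_b = len(treatment_a), len(treatment_b)

u_a = r_a - n_a * (n_a + 1) / 2                 # U = R - n(n + 1)/2
u_b = r_b - n_b * (n_b + 1) / 2
u = min(u_a, u_b)

u_crit = 5                                      # critical value quoted in the example
reject_null = u <= u_crit                       # reject only if U is at or below it
print(r_a, r_b, u, reject_null)
```

Subtracting n(n + 1)/2 removes the smallest rank sum a sample of that size could have, which is why the result matches the count of outranked scores.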
The Mann-Whitney Test: Normal Approximation
When $n_1$ and $n_2$ are sufficiently large (e.g., $n_1, n_2 \geq 10$), the distribution of rank sums becomes roughly normal and you can use a normal approximation, evaluated against a critical z value, to test for significance.
$\mu_U = \frac{n_1 n_2}{2}$
$\sigma_U = \sqrt{\frac{n_1 n_2 (N + 1)}{12}}$, where $N = n_1 + n_2$
$z = \frac{U - \mu_U}{\sigma_U}$
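A short sketch of the normal approximation for U follows. The function name `mann_whitney_z` and the input values (U = 27 with 12 scores per group) are hypothetical, chosen only so that both samples meet the size guideline; no tie correction is applied.

```python
import math


def mann_whitney_z(u, n1, n2):
    """z score for U under the normal approximation (no tie correction)."""
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mu_u) / sigma_u


# Hypothetical values, not taken from the lecture.
z = mann_whitney_z(u=27, n1=12, n2=12)
print(round(z, 3))
```

Because U is the smaller of the two U values, z comes out negative; its magnitude is what gets compared with the critical z value (e.g., 1.96 for a two-tailed test at α = .05).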
Wilcoxon’s Signed Ranks Test
• This is a test for repeated-measures (within-subjects) research designs with two treatment conditions and is thus an alternative to the repeated-measures t-test
• The basic intuition behind the test is that:
– A real difference between the two treatments should cause the difference scores to be generally positive or negative
– If all the difference scores are ranked and signed (according to whether they represent increases + or decreases -), the ranks should be concentrated in either the positive or negative set.
• Again, the null and alternative hypotheses are a bit more vague than for the repeated-measures t-test, but test for some sort of difference in central tendency
– H0: There is no difference between treatments. Therefore, there is no tendency for the ranks of difference scores to be generally positive or negative
– H1: There is a difference between treatments. Therefore, the ranks of the difference scores should be systematically positive or negative
Wilcoxon’s Signed Rank Test: Steps
The steps for computing the test statistic (T) are:
1. Compute the difference scores
2. Rank all the difference scores from smallest to largest absolute value and assign them positive or negative signs based on whether they represent an increment or decrement
3. Compute separate sums for the positively and negatively signed sets
4. T is the smaller of the resulting signed-rank-sums
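The steps above can be sketched as follows. The paired scores are hypothetical (they are not the lecture's example data), the helper names are illustrative, and dropping zero differences before ranking is a common convention that is assumed here rather than taken from the slides.

```python
def average_ranks(values):
    """Rank from lowest (1) to highest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def wilcoxon_t(first, second):
    """Return (T, n): the smaller signed-rank sum and the number of ranked pairs."""
    # Step 1: difference scores (assumption: zero differences are discarded).
    diffs = [b - a for a, b in zip(first, second) if b != a]
    # Step 2: rank by absolute value; the sign of each difference is kept in `diffs`.
    ranks = average_ranks([abs(d) for d in diffs])
    # Step 3: separate sums for the positive and negative sets.
    positive_sum = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)
    # Step 4: T is the smaller of the two sums.
    return min(positive_sum, negative_sum), len(diffs)


# Hypothetical before/after scores for eight participants.
before = [18, 9, 21, 30, 14, 12, 25, 17]
after = [43, 14, 20, 48, 21, 4, 26, 30]
t_stat, n_pairs = wilcoxon_t(before, after)
print(t_stat, n_pairs)
```

The two signed-rank sums always add up to n(n + 1)/2 (here 36), so a small T means nearly all of the rank weight sits on one sign.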
Example
T = 9, N = 15
$T_{crit}$ is 25, which is greater than our test statistic, so we would reject the null hypothesis
Wilcoxon’s Signed Ranks Test: Normal Approximation
Again, when n is sufficiently large (e.g., n ≥ 20), the distribution of T becomes roughly normal and you can use a normal approximation, evaluated against a critical z value, to test for significance.
$\mu_T = \frac{n(n + 1)}{4}$
$\sigma_T = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}$
$z = \frac{T - \mu_T}{\sigma_T} = \frac{T - \frac{n(n + 1)}{4}}{\sqrt{\frac{n(n + 1)(2n + 1)}{24}}}$
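The approximation for T can be sketched the same way as for U. The function name `wilcoxon_z` and the inputs (T = 50 with n = 20 pairs) are hypothetical values chosen to meet the size guideline; no tie correction is applied.

```python
import math


def wilcoxon_z(t, n):
    """z score for the signed-rank statistic T under the normal approximation."""
    mu_t = n * (n + 1) / 4
    sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mu_t) / sigma_t


# Hypothetical values, not taken from the lecture.
z = wilcoxon_z(t=50, n=20)
print(round(z, 3))
```

As with U, T is the smaller sum, so z is negative and its absolute value is compared with the critical z value.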
The Kruskal-Wallis One Way ANOVA
• This is a test for independent-measures research designs with more than two groups. As its name suggests, it is a nonparametric alternative to the parametric one-way ANOVA
• The basic intuition behind the test is analogous to that for the parametric one-way ANOVA:
– A real difference among treatments should cause the variability of scores between groups to be greater than the variability of scores within groups
– If all the scores are ranked the variability of rank-sums between groups
• The null and alternative hypotheses are very similar to those in the parametric one-way ANOVA.
– H0: There is no difference between treatments. There is no tendency for ranks in any sample to be systematically higher or lower than in any other condition.
– H1: There are differences between treatments. The ranks in at least one condition are systematically higher or lower than in another treatment condition
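Parametric ANOVA:
$F = C \cdot \frac{\sum_{i=1}^{k} n_i (M_i - M_T)^2}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - M_i)^2}$
Kruskal-Wallis:
$H = C \cdot \frac{\sum_{i=1}^{k} n_i (\bar{r}_i - \bar{r}_T)^2}{\sum_{i=1}^{k} \sum_{j=1}^{n_i} (r_{ij} - \bar{r}_T)^2}$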
within
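A minimal sketch of the Kruskal-Wallis statistic follows, written as the ratio of between-group rank variability to total rank variability. It assumes the scaling constant is N − 1 (the standard form of H; the slide does not spell its constant out), and the data and helper names are hypothetical.

```python
def average_ranks(values):
    """Rank from lowest (1) to highest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def kruskal_wallis_h(groups):
    """H = (N - 1) * between-group rank variability / total rank variability."""
    pooled = [score for group in groups for score in group]
    ranks = average_ranks(pooled)            # ranks over all scores combined
    n_total = len(pooled)
    grand_mean = (n_total + 1) / 2           # mean of the ranks 1..N

    between = 0.0
    start = 0
    for group in groups:
        group_ranks = ranks[start:start + len(group)]
        start += len(group)
        mean_rank = sum(group_ranks) / len(group)
        between += len(group) * (mean_rank - grand_mean) ** 2

    total = sum((r - grand_mean) ** 2 for r in ranks)
    return (n_total - 1) * between / total   # assumption: C = N - 1


# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(groups)
print(round(h, 3))
```

For reasonably large groups, H is usually evaluated against a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups.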