The size of a correlation is often described in words as well as in numbers. Correlations of .80 or above are usually talked of as being ‘large’, ‘strong’ or ‘high’. This size of correlation may be obtained when we measure the same variable, such as depression, on two separate occasions two weeks apart. In such a case, we may say there was a strong correlation between the first and second test of depression. Correlations of .30 or less are usually spoken of as being ‘small’, ‘weak’ or ‘low’. Correlations of this size are typically found when we measure different variables, such as depression and social support, on the same or different occasions. Correlations between .30 and .80 are commonly said to be ‘moderate’ or ‘modest’. They are usually found when we assess measures that are very similar but not the same, such as (1) how supportive one sees a partner as being and (2) how satisfied one is with that relationship.
These labels may be misleading in that they may seem to be underestimating the strength of a correlation. The meaning of the size of a correlation is better understood if we square the correlation value – this gives us something called the coefficient of determination. So a correlation of .20 when squared gives a coefficient of determination of .04. This value represents the proportion of the variation in a variable that is shared with the variation in another variable. Technically this variation is measured in terms of a concept or formula called variance. The way to calculate variance can be found in a statistics textbook such as the companion book Introduction to Statistics in Psychology (Howitt and Cramer, 2011a).
A correlation of 1.00 gives a coefficient of determination of 1.00, which means that the two variables are perfectly related. A correlation of zero produces a coefficient of determination of zero, which indicates that the variables are either totally separate or they do not have a straight-line relationship such as the relationship between work and leisure satisfaction in Figure 4.3. These proportions may be expressed as a percentage, which may be easier to understand. We simply multiply the proportion by 100 so .04 becomes 4 (.04 × 100). The percentage of the variance shared by the correlations in Table 4.1 is shown in the fourth column of that table.
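The arithmetic described above is easy to check directly. The following is a minimal sketch in Python; the function name `shared_variance_percent` is our own illustrative choice and does not come from the text or from any statistics package.

```python
def shared_variance_percent(r):
    """Return the percentage of variance shared by two variables correlated r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    # Squaring the correlation gives the coefficient of determination.
    coefficient_of_determination = r ** 2
    # Multiplying the proportion by 100 expresses it as a percentage.
    return coefficient_of_determination * 100


for r in (0.20, 0.30, 0.40, 1.00):
    print(f"r = {r:.2f} -> {shared_variance_percent(r):.0f} per cent shared variance")
```

Run as written, the loop prints 4, 9, 16 and 100 per cent for the four correlations.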
If we plot the percentage of variance shared against the size of the correlation, as shown in Figure 4.4, we can see that there is not a straight-line or linear relationship between the two but a curved one. It is sometimes loosely described as exponential, although strictly it is quadratic, since the shared variance is the square of the correlation. The percentage of variance increases at a faster rate at higher than at lower correlations. As the size of a correlation doubles, the corresponding size of the percentage of shared variance quadruples. To give an example of this, a correlation of .40 is twice as big as one of .20. If we express these correlations as the percentage of shared variance we can see that a percentage of 16 is four times as big as one of 4. This should tell you that it is helpful to consider the amount of variation explained by a correlation and not simply the numerical size.
FIGURE 4.4 Relationship between correlation and percentage of shared variance
A correlation of .40 is not twice as good as a correlation of .20: in terms of the amount of variation (variance) explained, the larger correlation accounts for four times as much. Table 4.1 gives the figures for the amounts of variation explained.
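The quadrupling follows directly from squaring. As a short worked line, writing r for the correlation (our notation, not the text's):

\[
(2r)^2 = 4r^2, \qquad \text{for example } (.40)^2 = .16 = 4 \times (.20)^2 = 4 \times .04 .
\]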
The verbal labels generally used to describe different sizes of the shared variance have tended to differ in the research literature from those given to the correlations that correspond to them. Where the percentage of shared variance is about 1, the size of the effect or the association has been called ‘small’ (Cohen, 1988, pp. 24–7). Where it is about 5 it has been described as being ‘medium’. Where it is more than about 10 it has been referred to as being ‘large’. These judgements are obviously subjective or personal to some extent. What one psychologist considers to be a large effect, another might think of as being small. We are inclined to think that 10 may be considered medium and above 20 as large. However, such a judgement does depend on a great many factors such as what is being measured and how accurately or reliably it can be measured. One would expect lower values for the coefficient of determination if it is based on variables which cannot be measured accurately.
Justification for the use of these labels might come from considering just how many variables or factors may be expected to explain a particular kind of behaviour. Racial prejudice is a good example of such behaviour. It is reasonable to assume that racial prejudice is determined by a number of factors rather than just a single factor. The tendency towards authoritarianism has a correlation of .30 with a measure of racial prejudice (Billig and Cramer, 1990). This means that authoritarianism shares 9 per cent of its variance with racial prejudice. On the face of things, this is not a big percentage of the variance. What if we had, say, another ten variables that individually and independently explained (accounted for) a similar proportion of the variance? Then we could claim a complete account of racial prejudice. The problem in psychology is finding out what these other ten variables are – or whether they exist. Actually a correlation of .30 is not unusual in psychological research and many other variables will explain considerably less of the variance than this.
There is another way of looking at this issue. That is to ask what the value is of a correlation of .30 – a question which is meaningless in absolute terms. In the above example, the purpose of the research was basically associated with an attempt to theorise about the nature of racial prejudice. In this context, the correlation of .30 would seem to imply that one’s resources would be better applied to finding more effective explanations of racial prejudice than can be offered on the basis of authoritarianism. On the other hand, what if the researcher was interested in using cognitive behaviour therapy in suicide prevention? A correlation of .30 between the use of cognitive behaviour therapy and decline in the risk of suicide is a much more important matter – it amounts to an improvement in the probability of suicide prevention from .35 to .65 (Rosenthal, 1991). This is in no sense even a moderate finding: it is of major importance. In other words, there is a case against the routine use of labels when assessing the importance of a correlation coefficient.
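The figures of .35 and .65 appear to follow from what Rosenthal calls the binomial effect size display, in which a correlation r is re-expressed as two 'success' rates of .50 minus r/2 and .50 plus r/2. Assuming that this is the calculation intended here, it can be sketched as follows; the function name is purely illustrative.

```python
def binomial_effect_size_display(r):
    """Re-express a correlation r as two success rates centred on .50.

    Assumes the binomial effect size display: the rates are .50 - r/2
    and .50 + r/2, so the difference between them equals r.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    lower_rate = 0.50 - r / 2
    upper_rate = 0.50 + r / 2
    return lower_rate, upper_rate


without_therapy, with_therapy = binomial_effect_size_display(0.30)
print(f"{without_therapy:.2f} to {with_therapy:.2f}")  # 0.35 to 0.65
```

Seen this way, a correlation that shares only 9 per cent of the variance corresponds to a difference of 30 percentage points between the two rates.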
There is another reason why we should be cautious about the routine application of labels to correlations or any other research result. Our measures are not perfectly reliable or valid measures of what they are measuring (see Chapter 15 for a detailed discussion of reliability and validity). Because they are often relatively poor measures of what they are intended to measure, they tend usually to underestimate the true or real size of the association. There is a simple statistical procedure for taking into account the unreliability of the measures called the correction for attenuation (see the companion book Introduction to Statistics in Psychology, Howitt and Cramer, 2011a, Chapter 36).
Basically it gives us an idealised version of the correlation between two variables as if they were perfect measures. The formula for the corrected correlation is:
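\[
\text{corrected correlation} = \frac{\text{correlation between measure 1 and measure 2}}{\sqrt{\text{measure 1 reliability} \times \text{measure 2 reliability}}}
\]
If the correlation between the two measures is .30 and their reliability is .75 and .60, respectively, the corrected correlation is .45:
\[
\text{corrected correlation} = \frac{.30}{\sqrt{.75 \times .60}} = \frac{.30}{\sqrt{.45}} = \frac{.30}{.67} = .45
\]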
This means that these two variables share about 20 per cent of their variance. If this is generally true, we would only need another four variables to explain what we are interested in. (Though this is a common view in the theory of psychological measurement, the adjustment actually redefines each of the concepts as the stable component of the variables. That is, it statistically makes the variables completely stable [reliable]. This obviously is to ignore the aspects of a variable which are unstable, for example, why depression varies over time, which may be as interesting and important to explain as the stable aspects of the variable.)
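The correction for attenuation divides the observed correlation by the square root of the product of the two reliabilities. A minimal sketch of that calculation, using the worked figures from the text, might look like this; the function and parameter names are our own.

```python
import math


def correct_for_attenuation(r, reliability_1, reliability_2):
    """Estimate the correlation that perfectly reliable measures would show."""
    for reliability in (reliability_1, reliability_2):
        if not 0 < reliability <= 1:
            raise ValueError("each reliability must be above 0 and at most 1")
    # Observed correlation divided by the square root of the product
    # of the two reliabilities.
    return r / math.sqrt(reliability_1 * reliability_2)


corrected = correct_for_attenuation(0.30, 0.75, 0.60)
print(round(corrected, 2))            # 0.45
print(round(corrected ** 2 * 100))    # 20 (per cent of variance shared)
```

With both reliabilities equal to 1 the function returns the observed correlation unchanged, which is a quick check that it behaves sensibly.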
How do we know what size of effect or association to expect if we are just setting out on doing our research?
• Psychologists often work in areas where there has already been considerable research. While what they propose to do may never have been done before, there may be similar research. It should be possible from this research to estimate or guesstimate how big the effect is likely to be.
• One may consider collecting data on a small sample to see what size of relationship may be expected and then to collect a sample of the appropriate size to ensure that statistical significance is achieved if the trend in the main study is equal to that found in the pilot study. So if the pilot study shows a correlation of .40 between the two variables we are interested in, then we would need a minimum of about 24 cases in our main study. This is because by checking tables of the significance of the correlation coefficient, we find that .40 is statistically significant at the 5 per cent level (two-tailed test) with a sample size of 24 (or more). These tables are to be found in many statistics textbooks – our companion statistics text, Introduction to Statistics in Psychology (Howitt and Cramer, 2011a), has all you will need.
• Another approach is to decide just what size of relationship or effect is big enough to be of interest. Remember that very small relationships and effects are significant with very large samples. If one is not interested in small trends in the data then there is little point in depleting resources by collecting data from very large samples. The difficulty is deciding what size of relationship or effect is sufficient for your purposes. Since these purposes vary widely no simple prescription may be offered. It is partly a matter of assessing the value of the relationship or effect under consideration. Then the consequences of getting things wrong need to be evaluated. (The risk of getting things wrong is higher with smaller relationships or effects, all other things being equal.) It is important not simply to operate as if statistical significance is the only basis for drawing conclusions from research.
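The link between the size of a correlation and the sample needed for it to reach significance can also be approximated without tables. The sketch below is not the table lookup described in the text. It uses the normal approximation based on Fisher's z transformation, so its answer may differ by a case or so from printed tables. The function name and the starting value of 4 cases are our own choices.

```python
import math
from statistics import NormalDist


def min_sample_size(r, alpha=0.05):
    """Approximate the smallest n at which r is significant (two-tailed).

    Uses Fisher's z: atanh(r) * sqrt(n - 3) is compared with the
    critical value of the standard normal distribution.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    fisher_z = math.atanh(abs(r))
    n = 4  # smallest n for which sqrt(n - 3) is defined and non-zero
    while fisher_z * math.sqrt(n - 3) < z_critical:
        n += 1
    return n


print(min_sample_size(0.40))  # 25
print(min_sample_size(0.20))  # 97
```

For a correlation of .40 the approximation gives 25 cases, close to the figure of about 24 quoted above. Halving the correlation to .20 roughly quadruples the sample needed.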
It is probably abundantly clear by now that purely statistical approaches to generalisation of research findings are something of an impossibility. Alongside the numbers on the computer output is a variety of issues or questions that modify what we get out of the statistical analysis alone. These largely require thought about one’s research findings and the need not to simply regard any aspect of research as routine or mechanical.
4.8 Conclusion
Psychologists are often interested in making generalisations about human behaviour that they believe to be true of, or apply to, people in general, though they will vary in the extent to which they believe that their generalisations apply universally. If they believe that the generalisation they are testing is specific to a particular group of people they will state what that group of people is. Because all people do not behave in exactly the same way in a situation, many psychologists believe that it is necessary to determine the extent to which the generalisation they are examining holds for a number, or sample, of people.
If they believe that the generalisation applies by and large to most people and not to a particular population, they will usually test this generalisation on a sample of people that is convenient for them to use.
The data they collect to test this generalisation will be either consistent or not consistent with it. If the data are consistent with the generalisation, the extent to which they are consistent will vary. The more consistent the data are, the stronger the evidence will be for the generalisation. The process of generalisation is not based solely on simple criteria about statistical significance. Instead it involves considerations such as the nature of the sampling, the adequacy of each of the measures taken, and an assessment of the value or worth of the findings for the purpose for which they were intended.
Key points
• Psychologists are often concerned with testing generalisations about human behaviour that are thought to apply to all human beings. This is known as universalism since it assumes that psychological processes are likely to apply similarly to all people no matter their geographical location, culture or gender.
• The ability of a researcher to generalise from their research findings is limited by a range of factors and amounts to a complex decision-making process. These factors include the statistical significance of the findings, the representativeness of the sample used, participation and dropout rates, and the strength of the findings.
• Participants are usually chosen for their convenience to the researcher, for example, they are easily accessible. A case can be made for the use of convenience samples on the basis that these people are thought for theoretical purposes to be similar to people in general. Nonetheless, researchers are often expected to acknowledge this limitation of their sample.
• The data collected to test a generalisation or hypothesis will be either consistent with it or not consistent with it. The probability of accepting that the results or findings are consistent with the generalisation is set at .05 or 5 per cent. This means that these results are likely to be due to chance 5 times out of 100 or less. Findings that meet this criterion or critical value are called statistically significant. Those that do not match this criterion are called statistically non-significant.
ACTIVITIES
1. Choose a recent quantitative study that has been referred to either in a textbook you are reading or in a lecture that you have attended. What was the size of the sample used? Was a one- or a two-tailed significance level used and do you think that this tailedness was appropriate? What could the minimum size of the sample have been to meet the critical level of significance adopted in this study? What was the size of the effect or association, and do you think that this shows that the predictor or independent variable may play a reasonable role in explaining the criterion or dependent variable? Are there other variables that you think may have shown a stronger effect or association?
2. Choose a finding from just about any psychological study that you feel is important. Do you think that the principle of universalism applies to this finding? For example, does it apply to both genders, all age groups and all cultures? If not, then to which groups would you be willing to generalise the finding?