Most of the studies reported in this thesis follow experimental and quasi-experimental designs, and the current and following sections will elaborate on each of these methods respectively.

The experimental method is used to test hypothesised causal relationships by systematically manipulating one or more variables and measuring the effect on other variables. This can occur in a highly controlled laboratory setting or in a less controlled field setting. The variables that are manipulated are referred to as independent variables, and those that are measured are dependent variables. In an experiment, the independent variable is split into two or more conditions, which could be something like a drug condition and a placebo condition, or stimuli display time conditions of 250 ms, 500 ms, and 750 ms, for example. The experimenter then measures and compares the dependent variable under each condition, e.g. patients’ reports of their symptoms following treatment.

          Complete                   Latin Square
Order     1   2   3   4   5   6      1   2   3
Task 1    A   A   B   B   C   C      A   C   B
Task 2    B   C   A   C   A   B      B   A   C
Task 3    C   B   C   A   B   A      C   B   A

Table 3.1: Demonstration of complete and Latin Square counterbalancing for a set of three tasks. A = administered first, B = administered second, and C = administered third.

One way in which experiments can differ is in whether the conditions are administered between- or within-participants. In a between-participants design, the participants are randomly assigned to different conditions. Random assignment is very important because it means that any differences between groups, other than the experimental manipulation, are due to random variation rather than systematic variation. Any differences between groups found in the dependent measure can therefore be said to be due to the manipulation.
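As an illustration of random assignment, the following is a minimal sketch rather than a procedure used in the studies reported here. The function name randomly_assign, the condition labels, and the participant numbers are all hypothetical. It shuffles the participant list and deals participants to conditions in turn, so that group sizes differ by at most one.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Randomly assign each participant to exactly one condition.

    Hypothetical helper for illustration only. Participants are shuffled
    and then dealt to conditions in turn, which keeps group sizes equal
    (or within one of each other).
    """
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, participant in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(participant)
    return groups


# Example: 20 hypothetical participant IDs split across two conditions.
groups = randomly_assign(range(1, 21), ["drug", "placebo"], seed=42)
```

Because allocation depends only on the shuffle, no characteristic of a participant can systematically influence which condition they receive.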

In a within-participants design, all participants experience all conditions, and the order in which they are administered must be counterbalanced. Complete counterbalancing means that all possible orders of tasks are administered (with participants randomly assigned to one order each). Latin square counterbalancing means that tasks are presented in a set but rotating order, so that each task appears in each position equally often. Table 3.1 demonstrates complete and Latin square counterbalancing for a set of three tasks. The reason for counterbalancing is to prevent order effects from biasing the comparison between tasks. For example, if participants become tired towards the end of a study, counterbalancing ensures that this does not affect performance on one task alone but balances the effect between all tasks.
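The two counterbalancing schemes can be sketched in a few lines of code. This is an illustrative sketch only: the function names and task labels are hypothetical, and the Latin square is built by simple rotation, which is one of several ways to construct one. For three tasks it produces the same sets of orders as Table 3.1, although the complete orders are not necessarily listed in the same sequence.

```python
from itertools import permutations


def complete_counterbalancing(tasks):
    """Return every possible order of the tasks (n! orders for n tasks)."""
    return [list(order) for order in permutations(tasks)]


def latin_square_counterbalancing(tasks):
    """Return n rotated orders, so each task occupies each position once."""
    n = len(tasks)
    return [[tasks[(start + i) % n] for i in range(n)] for start in range(n)]


tasks = ["Task 1", "Task 2", "Task 3"]  # hypothetical task labels
complete_orders = complete_counterbalancing(tasks)    # 6 orders
latin_orders = latin_square_counterbalancing(tasks)   # 3 orders
```

Complete counterbalancing grows factorially with the number of tasks (six orders for three tasks, 24 for four), whereas the rotated Latin square needs only as many orders as there are tasks.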

Christensen (2000) stated that there are three conditions that must be met for a good experimental research design. The first criterion is that the design must allow the research question to be answered. The second criterion is that extraneous variables (variables that are not of interest but that affect the dependent variable) are controlled for (also known as internal validity, see Section 3.6.2). The third is that the findings are generalisable.

The first criterion, that the design must allow the research question to be answered, seems so fundamental as to not require stating. However, it is not impossible for a researcher to get as far as trying to interpret data before realising that this criterion has not been met. For example, take the research question ‘Is intervention X effective in helping dyslexic children improve their reading speed?’. A flawed approach to answering this question would be to take a sample of dyslexic children and measure their reading speed before and after administering intervention X. Suppose the findings showed that the children’s reading speed was faster after the intervention. Does this answer the research question? No, it merely shows that the children’s reading speed became faster over time; whether or not that has anything to do with the intervention is impossible to say in the absence of a control group. Recall the study by Lehmann (1963) which showed that undergraduate students’ critical thinking skills improved throughout their university education. It was mentioned in the literature review in Chapter 2 that this could have been a change that occurs in all college-aged people. Due to the lack of a control group, it was not reasonable to assume that the change had any relation to the participants’ educational experiences. To adequately answer such questions, the design needs an experimental group that receives the intervention and a control group that does not, and the participants need to be randomly assigned to one group or the other to allow any differences to be attributable to the manipulation.
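The logic of the control group can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the function names are hypothetical and the reading speeds are invented numbers, not data from any study. The point is only that a pre-to-post gain in the intervention group mixes the effect of the intervention with change over time, and that subtracting the control group's gain removes the latter. A real analysis would also require an inferential test.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average pre-to-post change across participants in one group."""
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


def intervention_effect(treat_pre, treat_post, control_pre, control_post):
    """Gain in the intervention group beyond the gain in the control group."""
    return mean_gain(treat_pre, treat_post) - mean_gain(control_pre, control_post)


# Invented reading speeds (words per minute), purely for illustration.
treat_pre, treat_post = [60, 70, 80], [75, 85, 95]
control_pre, control_post = [62, 68, 78], [72, 78, 88]

gain_without_control = mean_gain(treat_pre, treat_post)
effect_with_control = intervention_effect(
    treat_pre, treat_post, control_pre, control_post
)
```

With these invented numbers the intervention group gains 15 words per minute, but the control group gains 10 without any intervention, so only the remaining 5 could be attributed to intervention X.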

The second criterion, that extraneous variables are controlled for, is necessary to be able to eliminate rival hypotheses. An extraneous variable is something other than the independent variable that influences the dependent variable. It would be no good if you concluded that participants who read a happy story recalled more details than those who read a sad story if the participants in the happy story group were more intelligent, for example. The best way to control for extraneous variables is to include a control group and to randomly assign participants to groups. The control condition should be identical to the experimental condition in all aspects except that it does not receive the experimental manipulation. In this way, the independent variable is isolated as the difference between conditions and a research question about that variable can be answered. By randomly assigning participants to conditions, there should be no known or unknown variables that affect one group more than another, except for the independent variable.

The third and final criterion is generalisability (also known as external validity, see Section 3.6.2) – the extent to which the results can be applied beyond the study itself. Generalisability is restricted by having a non-representative sample or an artificial experimental situation. For example, if your sample is entirely made up of females the findings may not generalise to males. Similarly, if you study factors affecting attractiveness of faces using photographs in a lab setting, your results may not apply to attractiveness of faces as seen in natural public settings. It is likely that participant samples will always be restricted in some dimension, but the important thing is to be aware of the boundaries of the population to which your sample belongs and thus how far your results can generalise when you draw your conclusions. Artificiality of study environments, however, is a more interesting issue.

A complaint often levelled at lab-based and experimental research is that the artificiality of the setting makes the results non-generalisable to other settings (Mook, 1983). This complaint often reflects a fundamental misunderstanding of the aims of the research. On the whole, such research is not conducted with the aim of generalising the results to other situations; it is done with the aim of testing a theory. In psychological research, theories describe and explain real world behaviour and the role of experiments is to test and refine those theories. In testing a theory, a researcher derives a hypothesis that can be tested in a controlled situation. The results of the experiment are only used to accept or reject that hypothesis, thus informing the researcher about the accuracy of the theory from which it was derived. This means that it is irrelevant whether or not the experiment resembles real life. What is important is that the experiment is highly controlled so that the hypothesis is tested accurately, that is, in the absence of confounding variables. Mook (1983) provided a very detailed discussion of this issue, and it is also discussed further in Section 3.6.2 below on external validity.
