In document Advanced mathematics and deductive reasoning skills: testing the Theory of Formal Discipline (Page 102-105)

5.2 Pilot Studies

5.2.1 Pilot 1: Splitting the syllogisms task

In the Belief Bias Syllogisms task, 24 syllogisms are presented in a contextualised format (Sá et al., 1999; see also Chapter 4). In eight of the syllogisms, prior belief and validity are in accordance (four believable-valid, four unbelievable-invalid), in another eight they are in conflict (four believable-invalid, four unbelievable-valid), and in the final eight the context is neutral (four neutral-valid, four neutral-invalid). This gives four problems for each of six believability-logic combinations (see Figure 5.1 for an example of each type). Within each set of four, two problems make positive (P, Q) statements and two make negative (not-P, not-Q) statements, so there are twelve combinations of believability, validity and valence, with two problems per combination making the total of 24 items. Because there are two problems of each form, the test can be split in half while still covering all twelve combinations, so each half provides a full measure of belief bias. The test could therefore be split so that half of the problems were given to participants at Time 1 and the other half at Time 2 (with the order counterbalanced by school), reducing repeat-testing effects and testing time. However, any difference in believability between the two halves could produce misleading gains or losses in belief bias between the two time points, so a pilot study was required to determine whether any such differences existed.

Unbelievable, valid:

Premises: All things with four legs are dangerous. Poodles are not dangerous.
Conclusion: Poodles do not have four legs.

Unbelievable, invalid:

Premises: All guns are dangerous. Rattlesnakes are dangerous.
Conclusion: Rattlesnakes are guns.

Believable, valid:

Premises: All fish can swim. Tuna are fish.
Conclusion: Tuna can swim.

Believable, invalid:

Premises: All living things need water. Roses need water.
Conclusion: Roses are living things.

Neutral, valid:

Premises: All ramadions taste delicious. Gumthorps are ramadions.
Conclusion: Gumthorps taste delicious.

Neutral, invalid:

Premises: All lapitars wear clothes. Podips wear clothes.
Conclusion: Podips are lapitars.

Figure 5.1: An example problem of each believability-logic type.

In the pilot study, the conclusions from each problem, which are where the believability/validity conflicts lie, were rated by participants in terms of how believable they were. Problems of the same format from each half of the test (e.g. the believable problems from Half 1 and Half 2) were then compared on their believability ratings.
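The counting argument behind the split can be sketched in code. This is only an illustration of the design logic; the category labels are descriptive and not the wording of the test items themselves.

```python
from itertools import product

believability = ["believable", "unbelievable", "neutral"]
validity = ["valid", "invalid"]
valence = ["positive", "negative"]

# 3 believability levels x 2 validity levels x 2 valences
# = 12 unique combinations
combos = list(product(believability, validity, valence))
assert len(combos) == 12

# Two problems per combination give the full 24-item test. Assigning one
# problem of each pair to each half means both halves cover all 12
# combinations, so each half is a full measure of belief bias.
half1 = [(b, v, s, "problem A") for (b, v, s) in combos]
half2 = [(b, v, s, "problem B") for (b, v, s) in combos]
assert len(half1) == len(half2) == 12
assert {c[:3] for c in half1} == {c[:3] for c in half2} == set(combos)
```

Because each half covers every combination, the split only threatens validity if the two halves differ in perceived believability, which is exactly what the pilot tested.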


Participants Fifty-eight participants (38 male; aged 19-23 years, M = 20.12) were recruited by email through a mathematics module tutor and took part, unpaid, as part of a larger online study. All participants were undergraduate mathematics and engineering students at Loughborough University.

Procedure Participants took part during an unrelated online study about their degree course study choices (see Inglis, Palipana, Trenholm & Ward, 2011). After completing all sections relevant to their study choices, they were asked to complete a section on the believability of sentences. The instructions read: “Below is a list of sentences. Some of the sentences will be completely believable, some will be completely unbelievable, some will be roughly in the middle, and some will be meaningless. Your task is to decide which is which”.

Below the instructions, on the same page, the 24 conclusions from the Belief Bias Syllogisms task (see Appendix C) were presented in a set order, alternating between Half 1 and Half 2 conclusions. Next to each sentence was a 5-point scale with the options ‘Very unbelievable’, ‘Moderately unbelievable’, ‘Neither believable nor unbelievable’, ‘Moderately believable’ and ‘Very believable’. Participants rated each statement on the scale before submitting their answers.
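The source does not state the numeric coding of the response options, but the reported means (e.g. M = 4.14 for believable items, M = 1.50 for unbelievable items) are consistent with a 1-5 coding. A minimal sketch under that assumption:

```python
# Assumed 1-5 coding of the 5-point believability scale (the coding is
# inferred from the reported means, not stated in the source).
scale = {
    "Very unbelievable": 1,
    "Moderately unbelievable": 2,
    "Neither believable nor unbelievable": 3,
    "Moderately believable": 4,
    "Very believable": 5,
}

# A participant's rating for an item is the coded value of the chosen
# option; a category mean averages these across items and participants.
responses = ["Very believable", "Moderately believable", "Very believable"]
mean_rating = sum(scale[r] for r in responses) / len(responses)
```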


The 24 syllogism conclusions fell into six categories for the analysis: Half 1 unbelievable (H1U), Half 1 neutral (H1N), Half 1 believable (H1B), Half 2 unbelievable (H2U), Half 2 neutral (H2N) and Half 2 believable (H2B). The aim of the analysis was to test for differences between the test halves in the believability ratings of each item type, i.e., whether H1U conclusions were rated differently from H2U conclusions, and so on.

Participants’ mean responses are shown in Figure 5.2. A 2 (test half: 1 or 2) × 3 (intended believability: unbelievable, neutral, believable) repeated-measures analysis of variance (ANOVA) was conducted with believability ratings as the dependent variable. There was a significant main effect of intended believability on believability ratings, F(2, 114) = 323.2, p < .001, ηp² = .85.
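The reported effect size can be recovered from the F statistic and its degrees of freedom via the standard formula ηp² = F·df_effect / (F·df_effect + df_error), a quick consistency check on reported statistics:

```python
# Partial eta squared from a reported F statistic and its degrees of
# freedom: eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
def partial_eta_squared(f, df_effect, df_error):
    return (f * df_effect) / (f * df_effect + df_error)

# Reported main effect of intended believability: F(2, 114) = 323.2
eta = partial_eta_squared(323.2, 2, 114)
print(round(eta, 2))  # 0.85, matching the reported eta_p^2 = .85
```

The error degrees of freedom (114 = 2 × 57) are also consistent with the 58 participants in the pilot.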

Believable items were rated as significantly more believable (M = 4.14, SD = 0.50) than neutral items (M = 2.90, SD = 0.57), which in turn were rated as significantly more believable than unbelievable items (M = 1.50, SD = 0.51). There was no significant main effect of test half, meaning that there was no evidence of the two halves of the test differing in believability. Importantly, there was also no significant interaction between test half and intended believability (p = .20, ηp² = .03). H1U (M = 1.49, SD = 0.57), H1B (M = 4.05, SD = 0.69) and H1N (M = 2.91, SD = 0.59) conclusions were rated similarly to H2U (M = 1.54, SD = 0.58), H2B (M = 4.22, SD = 0.55) and H2N (M = 2.90, SD = 0.58) conclusions, respectively, so there was no evidence that conclusions with the same intended believability status differed in rated believability between the test halves (see Figure 5.2).

Figure 5.2: Believability ratings for each problem type by test half (error bars represent ±1 SE of the mean).

Discussion and Implications

There are three findings from this study: 1) the two halves of the test can be assumed to be equally believable (at least, there was no evidence of a difference), 2) problems of the same intended believability in each half can be assumed to be equally believable (again, insofar as there was no evidence of a difference), and 3) the intended believability of problems across the test as a whole accords with participants’ perceptions of believability. All three outcomes support using the two halves of the test at different time points. The results suggest that a participant would show a similar extent of belief bias on each half of the test, so any change found between time points in the longitudinal study should reflect a genuine difference in participants’ susceptibility to belief bias rather than a difference in the items used.
