Web-based Tools to Measure Inductive Reasoning Ability

1 1 . 1 Introduction

Inductive reasoning ability (IRA) is the general ability to see patterns among examples (Rescher, 1 980; Harvery et aI., 2000). Literature on different theories about inductive reasoning ability has been reviewed in chapter 7 and is not repeated here. However, it has not been an easy task to find an inductive reasoning test that meets:

1 . financial constraint: most of the test batteries have prices above what we can afford;

2. task specificity: inductive reasoning test is usually just a subset of a larger test about intelligence (e.g. Hall, 1 985 ; van de Vijver, 2002); and

3. deployment capability: existing inductive reasoning tests usually are not capable of being deployed through the Internet (e.g. Brown, Glass, and Park, 2002; Laumann, 1 999).

Therefore, a web-based test (task) of inductive reasoning ability is developed as part of this research. It is called Web-IRA. Web-IRA is developed with references to available infonnation, mostly through literature, about other inductive reasoning tests (e.g. van de Vijver, 2002; Bailenson et aI. , 2002; Ekstrom et aI., 1 976; Laumann, 1 999). The nature of the different tests is discussed in more detail in Section 1 1 .2 to provide a reference framework on what the existing inductive reasoning tests are like. The tool we developed in this study is then discussed in Section 1 1 .4 .

When examining a representative inductive reasoning task called Wason's 2-4-6 task (Was on, 1 960) and subsequent and relevant discussions about it (i.e. Whetheric, 1 962; Gonnan and Gonnan, 1 984; Gonnan et aI., 1 984; Tukey, 1 986), a theme, although implicit, consistently surfaced to our attention. We have noticed that these discussions about Wason' s 2-4-6 task are primary focused on external behaviours. In these discussions, although there are inferences about cognitive activities, such as eliminative strategy, the cognitive activities are mainly used to explain observed behaviours. This causes insufficient understanding of the cognitive activities underlying the 2-4-6 rule finding task. We believe that the observed behaviours should be used to understand the cognitive activities which can then be used as models/lens to interpret behaviours. We therefore propose a hypothesised construct called systematic thinking ability that could potentially account for the cognitive activities in Wason's 2-4-6 task.

Our aim in this study is not to obtain a comprehensive and extensive knowledge on Wason's 2-4-6 task in the cognitive level. We believe that the systematic thinking ability is also highly relevant to inductive reasoning ability. Besides, we have already

Web-IRA, Web-S ys, and Web-RF

listed systematic thinking ability as one of the manifestations of trait of inductive reasoning ability in chapter 7. It is worth exploring this relationship further empirically. Section 1 1 .3 presents more discussion on systematic thinking ability. A

tool to measure systematic thinking ability is also developed. This tool is called Web Sys and is discussed in Section 1 1 .4.2 .

Another tool, called Web-RF (short for Rule Einding), is also developed. Web-RF follows Wason's 2-4-6 task but sets it up in a computerised environment. The reason to develop this tool is to further our understanding in the rule finding task, and to test our hypotheses on the relationships between inductive reasoning ability and rule finding ability and between systematic thinking ability and rule finding ability. Details of Web-RF are discussed in Section 1 1 .4.3 .

Empirical evaluations were carried out to study Web-IRA, Web-Sys and Web-RF. For the purpose of discussion, the evaluation is presented in three experiments. Experiment 1 is an evaluation of Web-IRA and is discussed in Section 1 1 .5 . Experiment 2 is an evaluation of Web-Sys and the relationship of inductive reasoning ability and systematic thinking ability. Experiment 2 is presented in more details in Section 1 1 .6 . Experiment 3 examines the relationships between the rule finding ability to inductive reasoning ability and systematic thinking ability. Experiment 3 is discussed in Section 1 1 .7 . Section 1 1 .8 summarises our findings and gives conclusion of this chapter.

1 1 , 2 Tasks in Inductive Reasoning Test

Tasks involved in measuring IRA can often be categorised into groups depending on what characteristics of IRA that particular task is trying to explore. Among the categories, series extrapolation, analogical reasoning, and exclusion tasks are the ones that are quite often used by researchers (van de Vijver, 2002).

In the series extrapolation task, the subject is shown a series of stimulus items which were placed in an order governed by a rule. The rule is unknown to the subject and is hereafter called the Target-Rule (T-Rule) of the task. The subject is asked to find out what is the next item that will be generated by the T-Rule. The stimuli can be numbers, string of characters or symbols. An example of a series extrapolation task is "What should come after 2-4-6-8- 1 0?", and the correct answer is 1 2.

The series extrapolation task involves the ability to generate hypothesis which in term requires subjects to find out what are the relevant attributes that are common in the stimuli by using comparison (e.g. odd or even number, ascending or descending). The result of the comparison can then be used to classify those common attributes into a group that belong to a hypothesis or group of hypotheses that the T -Rule is consisted of. The "sorting task" (Bailenson et aI., 2002) that requires subjects to sort pictures of birds that go together by nature is a more specific example of testing the classification ability.

In the analogical reasoning task, the subject is often given a relationship and is asked to use this as an analogy to find the next relationship. An example would be "Road to

Web-IRA, Web-S ys, and W eb-RF

Car as Rail to?" , and the answer should be "Train". The analogical reasoning task requires one to be able to see what the relevant attributes of the given relationship are, and map those relevant attribute from one context to the other..

In the exclusion task, the subjects are give a set of stimuli and are asked to detect irregularity among the stimuli. An example would be "In 2, 3, 4, 6, 8, which is the least similar with others?", and 3 is the answer because it is the only odd number. The exclusion task again requires the subjects to be able to classify attributes into groups and form hypothesis. The "Letter set task" (Ekstrom et aI., 1 976; Laumann, 1 999) that requires subjects to mark the odd one out from five sets of four letters is a type of exclusion task.

1 1 . 3 Systematic Thinking Ability

Rule finding is one of the early definitions of inductive reasoning by Thurston in 1 938, whereas later researchers saw inductive reasoning more as generalisation (Shye,

1 988). Wason' s 2-4-6 task is a typical rule finding task.

By using the 2-4-6 task, Wason (1 960) found that the utilisation of eliminative strategy had higher correlation to the success rate. Eliminative strategy refers to one's purposeful employment of negative instance to falsify hypothesis. Wason ( 1 960) found that those, who performed better in the 2-4-6 task, tended to both eliminate more possibilities and to generate more negative instances. In other words, they gave more tries to falsify potential but incorrect rules. In Wason's study, these subjects was said to employ eliminative strategy. As oppose to the eliminative strategy, some subjects employed the enumerative strategy with the common mistaken assumption that confirming evidences alone could j ustify their conclusions. Wason's ( 1960) results showed that the eliminative strategy yielded better performance than the enumerative strategy. Falsification (eliminative strategy) can lead to a smaller search space in which candidate hypothesis in principle has higher potential to be correct. Falsification therefore is a more efficient strategy than mere verification (enumerative strategy) from both theoretical and empirical point of view.

In Wason's ( 1 960) version of the 2-4-6 task, the subjects were instructed to discover the rule that governs a triplet or "triad" of numbers. The triad, 2-4-6, was offered as an exemplary triad that followed the rule. The target rule (T-Rule) to be found is "three numbers in increasing number of magnitude". The subjects started by writing down a triad (e.g. 1 -2-3) and a hypothesis (e.g. three number less than 1 0). The experimenter then told the subject whether the triad conform to the target rule or not. Whetheric ( 1962) criticised Wason's ( 1960) work by stating that ( 1 ) the example of 2- 4-6, a conforming triad, given to the subjects as part of the instructions was misleading the subjects to use enumerative strategy, (2) the methodology Wason used did not provide any evidence of subjects actively using eliminative strategy.

Whetheric proposed another scheme to classify subjects' behaviour. The classification scheme included four cases: confirming-positive, confirming-negative, disconfirming positive, and disconfirming-negative. In Whetheric's scheme, if a subj ect's hypothesis

Web-IRA, Web-Sys, and Web-RF is "three number less than 10", and then the subject proposes the triad 1 -2-3, this is classified as confirming-positive - the triad is confirming to the subject's current hypothesis (confirming), and the experimenter also gives a positive answer (positive). If the subject, holding the same hypothesis, but writes down the triad 3-2- 1 , then this is classified as confirming-negative - the triad confirms to the subject's current hypothesis, but the experimenter gives a negative answer. The similar principle is applied for disconfirming-positive and disconfirming-negative. Under his classification scheme, Whetheric pointed out that only conforming-negative and disconfirming-positive can be use to eliminate hypothesis and only they can be said to be eliminative strategy.

The number of confirming or disconfirming hypothesis can in fact depend on the nature of the task. In the experiment 1 of Gorman and Gorman ( 1 984), the T-Rule was

"at least one even number", and in experiment 2, the T-Rule was "no two numbers the same". The number of disconfirming-negative hypothesis in the group instructed to follow the eliminative strategy was higher in experiment 2 than that of in experiment 1 . In general, disconfirming-negatives are not useful to determine whether the hypothesis should be given up or not. As a direct consequence, the group performed worse in experiment 2 than in experiment 1 . Gorman and Gorman ( 1 984) speculated that this phenomenon was caused by the fact that it was more difficult to create confirming hypothesis to the "no two numbers the same" rule in experiment 2, i.e. the subjects could not obtain enough number of falsifying triads in order to successfully complete the task.

Also, falsifying triads or verifying triads alone could not yield optimal performance. In other study by Gorman et al. (1984), it was demonstrated that the proportion of falsifying triads to verifying triads was a key predictor of success. It is interesting to note that the choice of different T-Rule has such great impact on subj ect's behaviour. The fact that only a minority of subjects employed the eliminative strategy may lead to the possibility to think that the eliminative strategy is not easy to be employed in the 2-4-6 task. This speculation is further supported by Mahoney and DeMonbreun's (cited in Gorman and Gorman, 1 984) experiment which showed that neither university students nor working scientists tend to disconfirm hypothesis on the 2-4-6 task unless instructed or hinted to do so.

These aforementioned researchers (i.e. Wason, 1 960; Whetheric, 1 962; Gorman and Gorman 1 984; and Gorman et aI., 1 984) limited their research to behaviour output only. What are the cognitive processes involved in the eliminative strategy? What happens mentally when someone is said to use eliminative or enumerative strategy? In Whetheric's scheme, disconfirming-positive can in fact be used to indicate subject's intention to employ the eliminative strategy, and therefore be a starting point to answer the above questions. However, Whetheric did not orient his article to answer our question.

In order to obtain further insights about cognitive processes involved in eliminative strategy, an example is used. What is required to find out the target rule (the T-Rule) "three numbers in increasing order of magnitude"? In terms of magnitude of the triad, the T-Rule can be represented as "S-M-B" (short for Small-Medium-Big). "S-M-B" represents a class of triads that confirm to the T -Rule. " 1 -3-5", " 1 -5-7" , " 1 00- 1 000- 1 0000", and "2-7- 1 5" are all members of the "S-M-B" class.

Web-IRA, Web-S ys, and Web-RF

In this task, the key action to success is to falsify "three numbers not in increasing order of magnitude" (hereafter called the falsifying rule, the F-Rule). Unlike T-Rule, which has only one class of triads, F-Rule has five classes of triads -"B-M-S", "M-B S", "M-S-B", "S-B-M", and "B-S-M". Logically speaking, one needs to falsify all these five classes in order to falsify the F-Rule and hence work out the T -Rule otherwise, it can only be called guess. Although one can argue that sometime humans make decisions based on probabilities instead of complete knowledge, but pursuit of this argument is beyond the scope of this study. What we wish to stress is that when it is difficult to find the F-Rule, such as the experiment 2 of Gorman and Gorman (1 984), it becomes difficult to find the final answer using eliminative strategy (falsification).

The inference about the need to falsify the F-Rule "three numbers not in increasing order of magnitude", and the enumeration of the five classes of triads in the F-Rule, are in fact deductive reasoning. This deductive reasoning is what we called the deduction-in-induction - the deduction needed for the completion of an inductive reasoning task (Wason's 2-4-6 task). The ability to systematically enumerate and examine all possibilities (in this example, the five classes of triads in the F-Rule) is what we called the systematic thinking in this study. "Induction may lead to incorrect conclusions if antecedent information is not accurate or representative, and systematic validity checks should be conducted to maintain confidence" (Marchionini, 1 997). Systematic thinking is not only important to the hypothesis testing stage, but equally or more important in the hypothesis generation stage of inductive reasoning. What are the possible categories that the triads could be classified in, their magnitude, their order of sequence, are they odd or even number, or multiple of a constant? If the subjects could have those categories in mind, they could then systematically test whether the T -Rule belongs to any one of the categories, instead of trial-and-error or wasting time on repeated tests of same category.

The questions now are about systematic thinking. What is systematic thinking and how is it related to inductive reasoning ability. Let's try to use an example to elucidate

. those questions. For example, when a subject is required to find the T-Rule "3 numbers in ascending order" (i.e. the T-Rule in Wason, 1 960), the key is to falsify "3 numbers not in ascending order" (falsifying rule, F-Rule). The focus here is the "sequence of the order of magnitude" (target focus, T-Focus) of the 3 numbers. For another example, when the T-Rule is "no two numbers the same" (T-Rule used in Gorman and Gorman, 1984), the F-Rules become "a =b or a =c or b =c", where a, b, c represents the first, second, and third number in the triads. In the second example, one needs to focus on the "equality of2 of the numbers" (T-Focus).

Harverty et al. (2000) distinguished "experiment space", which is a mental mode that one is gathering data in order to form hypothesis, from "hypothesis space", which is a mental mode where one has acquired a hypothesis to be tested. For example, in the 2- 4-6 task, when one is first trying some triads with any expectation, then one is working in the experiment space; and when one sees the 3 pairs of triads (2-4-6), (4- 6-8), ( 1 0-20-30) with positive confirmations, one may form the hypothesis "3 even numbers" in the hypothesis space. One needs to come back to the experiment space to try out some more triads to test whether the hypothesis is correct or not.

Web-IRA, Web-Sys, and Web-RF

Harverty et al. (2000) further distinguished "local hypothesis" from "global hypothesis". A local hypothesis is formed from some observed pattern of the data in the experiment space. When a local hypothesis gathers its creditability, it can be promoted to a global hypothesis. It is only when one has found the T-Focus, it is possible to develop a local hypothesis into a global hypothesis. Figure 1 1 - 1 illustrates different stages of inductive reasoning.

Global Hypothesis Local Hypothesis Hypothesis Space find T -Focus eliminate hypothesises systematic enumeration generate hypothesises

E xperiment Space find pattern in data

. _Figure_{1 1 - 1 :}_{Transition of different stages in inductive reasoning}

In order to find the T-Focus among all the candidates, researchers have found the employment of eliminative strategy to be an efficient one (Wason, 1 960; Whetheric,

1 962).

What is the individual difference that determines whether eliminative strategy is used or not? An important ability called systematic thinking can help to investigate this question in more detail.

In their study of children's growth of logical thinking from a developmental perspective, Inhelder and Piaget ( 1 958) hypothesized the existence of the mental construct called combinatorial schema which is the ability to make systematic combinations of given factors in order to obtain a desired effect. Inhelder and Piaget used chemical problem solving as the experiment and therefore stressed on the ability to systematically process given factors. In our study however, focus is placed on systematic thinking itself, which is the ability to systematically list and test factors, because of the assumption that the knowledge required to solve the problem has already been acquired.

Two stages precede the elimination of a candidate solution, namely the formation stage of the candidate solution and the test stage of candidate solution. In the formation stage, the systematic thinking is important if not essential. Using Wason' s (1 960) 2-4-6 task as an example, a subject may start by considering the following aspects:

• _{whether there is a threshold (e.g. all 3 numbers less than 20),}

Web-IRA, Web-Sys, and Web-RF

• whether there is a common difference between adjacent numbers (e.g. a, a+2, a+2+2),

• odd or even numbers, for all or for at least a certain amount or for at a certain

order/position, and • more others.

Even though systematically listing all possibilities is impossible, the identification of those broad categories allows the starting point of elimination. That is, without taking any of these broad aspects into consideration, no elimination strategy is possible in a logical sense because of not knowing what to eliminate.

In the test stage, systematic thinking has even a more critical role. Using an example of testing the candidate hypothesis "no 2 numbers are the same", one needs to negate the following three rules (F-Rule).

1 . a=b

2. a=c

3 . b=c

when the triad is represented by (a-b-c).

The act of negating the three rules is actually what was called the eliminative strategy by Wason ( 1 960), Whetheric ( 1 962) and Gorman and Gorman ( 1 984). This leads to our hypothesis that there exists a close relationship between systematic thinking and eliminative strategy - maybe eliminative strategy is in fact of a function of systematic thinking. This is also one of the questions we would like to find out. Answering this

In document Cognitive trait model for adaptive learning environments : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information System [i e Systems], Massey University, Palmerston North, New Zealand (Page 148-172)