
Chapter 3 Research Design

3.2 Procedure/Research process

Research work on this project was divided into several phases, beginning with a relatively large preliminary phase that preceded the analysis of the relationship between problem solving and spatial ability, which was the focus of the subsequent phases. An outline of the work conducted in these phases is provided in Table 3-2, along with a summary of the outcomes and the predominant methods of analysis used in each phase.

Phase | Work conducted in this phase | Outcomes relevant to next phase | Methods | Chapter
Prelim | The role of spatial thinking in the engineering curriculum was examined by (i) measuring SA in different years of study, (ii) comparing SA with math skills, (iii) comparing SA with electric circuits knowledge | Justification to pursue the role of spatial ability in electric circuits problem solving and/or math problem solving | Quantitative | 4
1 | Pilot a set of math problems | Significant relationship between math problem solving and spatial ability | Quantitative | 5
2 | Administer two math tests, one consisting of a set of simple problems and a second consisting of a set of core competency questions | Significant relationship between spatial ability and the test of problem solving only, therefore spatial ability relevant to problem representation step and not problem solution step | Quantitative | 5
3 | Analyse verbal and written responses to each individual math problem based on the Mayer framework – an interpretive process | Greater understanding of how weak and strong visualizers can differ in problem representation approach in one to the others to search for consistency, common | |

Table 3-2. Summary of the phases and outcomes of each phase in the research work.

Preliminary work examined the role of spatial ability and spatial skills in the engineering curriculum in three ways: (i) with electrical engineering in DIT as a case study, variation in spatial ability across students in different years of the programme was measured, as was its relationship with grades in all modules/courses on the programme; (ii) scores on US college entrance tests, which include measures of math ability, were compared with spatial ability using a sample of freshman engineering students at OSU; and (iii) understanding of electric circuits concepts was compared with spatial ability for samples of first year engineering students at DIT. This work involved administering different tests to samples of students, or seeking permission to access data from previously administered tests, followed by statistical analysis of these quantitative data. All three aspects are described in detail in Chapter 4. The preliminary work phase also provided an opportunity to learn how to conduct research in this area and to reflect on what research questions related to problem solving and spatial ability could be addressed in greater detail.

Following the preliminary work, research on the major theme of the study, the spatial-problem representation relationship, began at Ohio State University (OSU) with a pilot study using a small sample of undergraduate teaching assistants in the Department of Engineering Education (EED). This resulted in a set of problems that were administered to a larger sample of first year engineering students at both DIT and OSU. Funding for this phase of the study came from an NSF grant to examine the relationship between spatial ability and problem solving in engineering, which included the use of electroencephalography (EEG). Only the OSU students participated in this aspect of the project. The objectives of this research project and those of the NSF funded project overlapped sufficiently to allow data to be collected for both projects concurrently. Hence, data were collected from participants at OSU while they solved problems using both a think aloud and an EEG protocol. The EEG data were analysed by others, while this thesis is concerned with the analysis of the written and audio data.

Spatial ability vs. Problem representation, quantitative work

Phases 1 and 2 of the main study were designed to evaluate the statistical significance of the relationship between spatial ability and problem representation. Both the literature review and some preliminary work established that spatial ability correlated with assessments of routine reasoning and problem solving tasks in several STEM subjects. Assuming the non-routine component of problem solving is contained in problem representation and not in the solution phase, it was hypothesised that if problem representation could be isolated from problem solution, one would find a correlation between spatial ability and representation but not between spatial ability and problem solution. Isolation was achieved indirectly by creating two math tests, one consisting of a set of problems and the other of a set of core competencies corresponding to the solution phase of each of the problems. Logically, problem representation is the difference between spatial-problem solving and spatial-problem solution. This argument is elaborated further in Chapter 5, where both the hypothesis and the corresponding null hypothesis are presented.

Instruments

Between the preliminary work and phases 1 and 2, data from six different types of test were collected in this study, as listed in Table 3-3. With regard to spatial ability, four spatial tests (MCT, PSVT:R, Revised PSVT:R and MRT-A) were selected based on their relevance to the spatial visualization and rotation factors, their ability to discriminate among engineering students and their satisfying reliability criteria. The standard protocols for administering all four tests are outlined in Chapter 2 and were followed in this study. In preliminary work, data were collected using an electric circuits concept test called the Determining and Interpreting Resistive Electric Circuit Concepts Test (DIRECT; Engelhardt & Beichner, 2004), which was administered using the protocol recommended by the authors and described in Chapter 4. Course grade data were also collected as part of preliminary work from both DIT and OSU and, for the latter only, college entrance test data were obtained. Finally, to examine the relationship between math problem solving and spatial ability, two math tests were developed for this study. The samples of participants are described in Chapters 4 and 5.

Test | Source | Reliability/Validity
MCT | (CEEB, 1939) | Medium to high reliability, KR-20 = .57 to .64 (Sorby & Baartmans, 2000), KR-20 = .81 (Kelly Jr, 2012)
PSVT:R | (Guay, 1976) | High reliability, internal consistency KR-20 = .74 to .83 (Branoff, 1998)
PSVT:R Revised | (Yoon, 2011) | High reliability, Cronbach’s α = .84 (Maeda et al., 2013)
MRT-A | (Peters, Laeng, et al., 1995) | High reliability, split half reliability = .72 to .80 (Geiser et al., 2006)
DIRECT | (Engelhardt & Beichner, 2004) | High reliability, internal consistency KR-20 = .70 (Engelhardt & Beichner, 2004)
MPT | OSU Dept. of Mathematics, unpublished | High reliability, Cronbach’s α = .79 (Chapter 4)
SAT Math | www.collegeboard.org | Test is valid (Camara & Echternacht, 2000; Shaw et al., 2016)
ACT Math | www.act.org | Test is reliable (Powers, Li, Suh, & Harris, 2016)
ACT SCIRE | www.act.org | Test is reliable (Powers et al., 2016)
DIT course grades | DIT academic grade book | Not available
OSU GPA | OSU academic grade book | Not available
Math pilot problem | Appendix A (created for this study) | Not available
Math problems | Appendix B (created for this study) | Low reliability, Cronbach’s α = .49 (6 items)
Math questions | Appendix B (created for this study) | Medium reliability, Cronbach’s α = .61 (6 items)

Table 3-3. List of measurement instruments used in this study.

Quantitative methods

Working in the post-positivist epistemology, one must conform to standards of validity and reliability in data collection in order to support the generalizability of the findings. Validity and/or reliability data, obtained from the literature where possible or measured as part of this work, are presented in Table 3-3 and discussed below. Also discussed is the nature of the distribution of the data collected from the samples, since statistical methods depend, for example, on the data being normally distributed. When conducting statistical analysis of quantitative data, one must adhere to the rules and principles that accompany these methods.

With regard to the validity of spatial tests, i.e. the extent to which they measure what they claim to measure, this is a matter of debate that is tied to the ontology of spatial ability and how many factors it contains. As discussed in the literature review, a pragmatic assumption was made that spatial skills that are relevant to STEM education can be measured by tests that fall under the intrinsic-dynamic grouping (Davis, 2015; Uttal et al., 2013) and that, based on their widespread use in the literature, tests from this group such as the MRT, PSVT:R and MCT are valid measurements of spatial ability. Tests of reliability assess the extent to which the test agrees with itself and include internal consistency measures such as Cronbach’s alpha, Kuder-Richardson 20 and split-half reliability. A test is considered reliable if performance on any one part of the test is predictive of performance on another. If the result of a reliability test is low, it indicates that performance is based on several factors and, if high, it is more likely the test assesses a single factor. Threshold values above which reliability is accepted are 0.7 for Cronbach’s alpha (Field, 2013) and 0.8 for Kuder-Richardson 20 (Sorby & Baartmans, 2000), and all of the spatial tests used in this study have been shown to meet one of these criteria (see Table 3-3). The MCT is arguably the most difficult of these tests, with average scores in samples of engineering students lower for the MCT than for other spatial tests (e.g., Farrell et al., 2015). Test difficulty can lead to guessing and, therefore, a reduction in reliability (Sorby & Baartmans, 2000). For example, when the test was administered to younger, middle school students, for whom it is very challenging, reliability was found to be poor (Hungwe, Sorby, Molzon, Charlesworth, & Wang, 2014). Among STEM students in higher education, the spatial tests used in this study have been measured to be reliable and are accepted as valid tests of spatial ability factors.
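The internal consistency measures named above can be illustrated with a short calculation. The following is a minimal sketch in Python, not the procedure used in this study (SPSS was used); the function name and the example scores are hypothetical. It computes Cronbach’s alpha from a participants-by-items score matrix; when every item is scored 0/1, the same formula gives Kuder-Richardson 20.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per participant).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance of total scores)
    Population variances are used throughout; using sample variances for both
    terms would give the same ratio.
    """
    k = len(scores[0])  # number of items on the test
    if k < 2:
        raise ValueError("at least two items are needed")
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four participants, three items scored correct (1) or incorrect (0).
example_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(example_scores), 2))
```

A value close to 1 indicates that the items rise and fall together across participants, which is the sense in which performance on one part of a test predicts performance on another.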

As discussed by Engelhardt and Beichner (2004), DIRECT meets the threshold value of reliability and is also, they argue, a valid instrument for measuring student understanding of direct current resistive electric circuits. They evaluated content validity through feedback on the content of the test from experts who teach the subject, and evaluated construct validity using factor analysis and qualitative analysis of interviews with students as they answered the questions.

Reliability data are provided for the ACT, and measurements of validity are provided for both the SAT and ACT, which are based on correlations between scores on these tests and college grades (Powers et al., 2016; Shaw et al., 2016). Indeed, these tests are regarded as benchmarks in the US education system as they are used to determine eligibility for college entry and are deemed to be accurate predictors of performance in college. Data collected from course grades in DIT and OSU were analysed in the preliminary work, but these measurements come without any indication of reliability or validity, which undermines the potential to generalise findings. However, they are what matter to students, whose final level of award depends on them, and issues of validity and reliability are a moot point since there is no alternative measure of academic performance that can be used. The OSU Math Placement Test, also used in the preliminary study, is an unpublished test administered to freshman students during orientation. Since reliability data for this test were not available, Cronbach’s alpha was calculated as part of this work and indicated that the test had a high level of reliability.

Finally, both the math problems test and the math questions test used in the main body of the study were found to have a Cronbach’s alpha of less than .7. Both fall below the accepted threshold, the math problems test markedly so, which implies the 6 items on the test are measuring several factors rather than one, i.e. performance on one part of the test does not predict performance on another. The implications this has for the results and the generalizability of the findings are discussed in Chapter 8.

Before statistical methods such as correlation and regression were applied to the data, the normality of the data distributions was evaluated by plotting a frequency distribution and checking it visually, by calculating skewness, kurtosis and their standard errors, and by conducting tests of normality. The Pearson correlation was used as standard unless the assumption of normality was violated, in which case a Spearman rho correlation was used instead. Simple regression and analysis of variance were calculated using sums of squares, which is the standard approach (Field, 2013). All statistical procedures were performed using IBM SPSS version 21.
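The decision rule described above (Pearson by default, Spearman when normality is in doubt) can be sketched as follows. This is an illustration in Python under stated assumptions, not a reproduction of the SPSS procedures used in this work: only skewness is checked, the cut-off of 1.96 for the ratio of skewness to its standard error is an assumed convention, and all function names are hypothetical. Kurtosis, visual inspection and formal tests of normality are omitted for brevity.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the ranked data."""
    return pearson_r(average_ranks(x), average_ranks(y))


def skewness_z(values):
    """Sample skewness divided by its standard error (needs at least 3 values)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    skew = n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)
    std_error = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / std_error


def choose_correlation(x, y, z_limit=1.96):
    """Return (method, coefficient); z_limit is an assumed cut-off, not from the thesis."""
    if abs(skewness_z(x)) > z_limit or abs(skewness_z(y)) > z_limit:
        return "spearman", spearman_rho(x, y)
    return "pearson", pearson_r(x, y)
```

For example, `choose_correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])` keeps the Pearson correlation because both variables are symmetric, whereas a variable with one extreme score would trigger the rank-based alternative.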

Some of the key findings from the quantitative work are correlation values, which indicate the magnitude or strength of the relationship between two variables, and the significance level, or probability that these particular data distributions occurred by chance. Although they are statistical results, correlation magnitudes are subject to interpretation and debate as to what is or is not large. Cohen (1988) suggested a rule of thumb, shown in Table 3-4, for assigning a qualitative description to a correlation in behavioural science studies, which can be referred to when judging correlations reported in this study.

Correlation, r | Description | Variance, r²
.10 | Small effect | 1 %
.30 | Medium effect | 9 %
.50 | Large effect | 25 %

Table 3-4. Heuristic for judging correlation magnitudes (Cohen, 1988)
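The heuristic in Table 3-4 can be expressed as a small lookup. The sketch below, in Python, treats Cohen’s three values as lower bounds for each description and labels anything below .10 as "negligible"; both choices are assumptions made for illustration, as is the function name.

```python
def describe_correlation(r):
    """Return (description, percentage of variance explained) for a correlation r."""
    size = abs(r)  # the sign gives direction, not strength
    if size >= 0.50:
        label = "large"
    elif size >= 0.30:
        label = "medium"
    elif size >= 0.10:
        label = "small"
    else:
        label = "negligible"  # assumed label; Table 3-4 does not name this range
    return label, round(100 * r * r, 1)


print(describe_correlation(0.30))
```

The second value returned is r² as a percentage, which matches the variance column of the table.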

Spatial ability vs. Problem representation - qualitative work

Selecting a quantitative analysis procedure is more straightforward than choosing a qualitative method from the many that are available. A sensible approach is to select a procedure that is a good match with the research question so that the two are compatible. Having established the existence of a significant relationship between spatial ability and problem solving among engineering students in phases 1 and 2, new research questions about the spatial-problem representation relationship were asked in phases 3 and 4. The approach changed from verifying deduced hypotheses to a mode of discovery, in order to learn about the relationship between two cognitive factors, spatial ability and problem representation.

This required a qualitative approach that not only matched the new objectives but could also be applied to the data collected and the context in which they were collected. Data consisted primarily of written responses to the six problems, collected from all 115 participants in the study, and audio recordings of thinking aloud, collected from the OSU sample for three of the six problems. A qualitative method was required that could work with these data and address the research questions.

Dominant qualitative methodologies include ethnography, narrative, phenomenology, grounded theory, action research, discourse analysis, phenomenography and basic qualitative inquiry. Of these, ethnography and narrative research require the researcher to spend extensive amounts of time with the sample, repeatedly observing participants and developing an intimate knowledge of the phenomenon as it exists in their lives, approaches not matched with the research problem in this case. Phenomenology and grounded theory were also not a good match, as the assumption of a relationship between spatial ability and problem representation would undermine these methods. Bracketing or epoché in phenomenology requires the setting aside of prior assumptions, such as the role of spatial ability in problem representation, in order to allow freedom for the truth to emerge about what it is like to experience solving word story problems in maths. Likewise, grounded theory emerges from the views of the participants who, unaware of what their factors of intelligence are, cannot describe them. Discourse analysis was also ruled out for the same reasons. Action research is used to improve the researcher’s ability to perform a task and was not relevant. Of the several dominant modes of qualitative inquiry, all apart from phenomenography and basic qualitative inquiry were ruled out as they were not suited to the research question and/or the type of data collected.

Phenomenography has been widely used to examine students’ approaches to problem solving (e.g., Irving, 2010; Walsh, 2009; Zoltowski, Oakes, & Cardella, 2012) and is worth considering in this case. While criticised for ignoring the noetic side of experiencing a phenomenon, i.e. the attitude we bring to bear on the experience (Ashworth & Greasley, 2009), it is a method that results in an outcome space which is a hierarchical list of the (assumed to be) limited set of ways of experiencing the phenomenon (Walsh, 2009). Since we learn through variation, we can learn about the phenomenon by observing the various ways in which it is manifest. In this case, there are two phenomena of interest - ‘math problem representation’ and ‘spatial ability testing’ - and a phenomenographic method would require the collection of rich descriptions of experiencing these phenomena, i.e. interviewing students about performing each task. Two outcome spaces would then be created from the same set of participants which would allow a relationship between them to be examined using relational phenomenography (Walsh, 2009).

Taking this approach to spatial ability could be problematic as interviewing students about taking spatial tests is likely to lead to a verbatim description of their moves (e.g., ‘I looked at that corner on the first image and then rotated it and then checked the other part’, and so on) which is unlikely to result in anything other than the already established analytic/holistic dichotomy (e.g. Geiser, Lehmann, Corth, & Eid, 2008). Information on the cognitive processes that are asserted by cognitive psychologists to be occurring in our minds as we undertake these tasks has evaded capture by think aloud protocols. Phenomenography, while a useful method for examining approaches to problem solving, was deemed to be unsuitable in this case given the presence of spatial ability in the research question.

It was decided that the most appropriate match to the question and the data was basic qualitative inquiry, implemented by coding the participants’ solutions to the problems in a quest to understand why there were differences in success rates in the problems, what mistakes were being made, what types of representation were created and how these varied between weak and strong visualizers. Two approaches to coding were available: the tabula rasa, open out approach and the prior assumptions, focus down approach.

Rather than make prior assumptions about how participants in my sample approach problem solving, the open out approach would start with a blank slate and, through reading the data, generate a set of categories to describe the approaches to problem solving. A benefit of the open out approach is the truthfulness of the codes, as they come directly from the data. A drawback is the risk of failing to see something meaningful that is hidden or obscure and which others have seen before when analysing similar data sets. The focus down approach begins with what has been reported in the literature with regard to problem solving approaches. Assuming the model or models provided in the literature are relevant to this context, a set of codes is developed first and then used to analyse the data. While the focus down approach offers the potential to save time, which can be made available to do other work, the drawback is that assumptions may be inappropriately forced on the data.

I employed both a focus down and an open out approach in the data analysis, which I summarise here and illustrate in more detail in Chapter 6. The Mayer framework (1992) outlined in the literature review was used to develop a set of initial codes for analysing problem solutions. Each transcript from each problem (no. of participants = 115, no. of problems = 6) was examined and checked against these codes, which were updated as needed to best match the data. These were scored in a binary form – present/absent = 1/0. Analysis first required interpretation of the text and, if available, the audio recording, in order to judge whether a participant employed a particular representation or not. In many cases this was evident and in some cases unclear. I worked alone on this analysis and did not have anyone else available to independently judge the participants’ solutions. Hence, no inter-rater
