Exploring ‘gradeness’: the quantitative analysis

In assuming comparability, examining groups continue to use some of the methods described in Chapter 3 to correct for deviations or to indicate, for example, that standards are falling in respect of school teaching, students’ skills or subject difficulty. My interest is in the meaning of ‘gradeness’ - that students’ achieved grades have common currency. I intended to use some of the methods used by examining group personnel but with certain of their aspects strengthened, for example sampling, and with a particular interpretation that takes account of the caveats I raised about the methods in Chapter 3. I do not assume comparability, as examining groups do. I explored quantitative relationships to see what ‘gradeness’ meant for my particular populations and, in so doing, wished to expose the dynamic nature of assessment which, as I have indicated in Chapter 3, cannot be controlled or monitored by technical means. As I am using the methods of assessment technicists in examining groups, I also tend to use their discourse, for example ‘severity of grading’, when in fact, as I have argued in Chapter 3, differences in groups of students’ examination performances may have nothing to do with severity of grading; the examinations may just be different.

4.1 Comparing the study’s Welsh Joint Examining Consortium (WJEC) populations

4.1.1 The populations’ examination centres

The examination centre profile (see Appendix 3) revealed that the number of centres increased from 53 in 1993 to 64 in 1994 with the addition of new centres. The student population also grew (1993 = 631; 1994 = 792). This might be explained by the 1994 examinations being the first in which students were required to be examined in all three science subjects, whether by Double Award, Single Award or Triple Award (separate sciences) GCSE examinations. In these respects the 1994 population differs from that of 1993 in the science education background of its students and the geographical distribution of its centres and, in turn, in whatever influences these might have on students’ performances. The 1994 and 1995 profiles showed the study’s populations increasing in student numbers from 1994 to 1995. These profiles showed a greater degree of similarity than dissimilarity in centre geographical distribution, although there was a small decrease in centres based in South and West Glamorgan (centres beginning with 687) from 1994 to 1995. I can only speculate about the reasons for this decrease. Centres may have changed to another examining group or changed their policy from entering students for WJEC’s Triple Award GCSE to WJEC Double Award Science GCSE. In 1995 the requirement for centres to enter their students for all three separate science subjects with the same examining group was introduced. This may have prompted some of the 1994 centres to move from WJEC for the 1995 examinations, especially if they had been accustomed to picking a particular examining group for each of the science subjects because they perceived it to favour students’ attainment. It would be reasonable to assume that the introduction of new syllabuses would prompt centres to review their entry policies, including whether to continue to enter their students for Triple Award GCSE or for Double or Single Award Science GCSE with WJEC. I intended to explore these sorts of issues when I engaged with teachers in centres as a means of understanding the historical influences on examination comparability (Chapter 6). Appendix 3, however, shows there was a substantial core of common centres associated with each of this study’s examination sessions (1993 - 1995), with an influx of new centres in the 1994 session.
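The growth between the 1993 and 1994 sessions reported at the start of this subsection (53 to 64 centres; 631 to 792 students) can be expressed as a percentage change. The following is a minimal sketch of that arithmetic only; the function name `percent_change` is an illustrative assumption and is not part of the study’s analysis.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed as a percentage of old."""
    if old == 0:
        raise ValueError("old must be non-zero")
    return (new - old) / old * 100.0


# Figures quoted in the text (Appendix 3 profile).
centres_1993, centres_1994 = 53, 64
students_1993, students_1994 = 631, 792

centre_growth = percent_change(centres_1993, centres_1994)
student_growth = percent_change(students_1993, students_1994)

print(f"Centres:  {centre_growth:.1f}% increase")   # 20.8% increase
print(f"Students: {student_growth:.1f}% increase")  # 25.5% increase
```

On these figures the student population grew proportionally somewhat more than the number of centres.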

For each student in each of my study’s populations (see Appendix 4) I had identified their examination centre’s:

(1) status, for example independent, comprehensive, college of further education;
(2) locus of control, for example voluntary aided;
(3) age range of students;
(4) intake of boys and girls,

and calculated the percentage of each population coming from all boys, all girls, and coeducational type centres.
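The percentage calculation described above can be sketched as follows. The records are invented purely for illustration and are not the study’s data (the actual figures are in Appendix 4); the function name `category_percentages` is likewise an assumption.

```python
from collections import Counter


def category_percentages(values):
    """Return the percentage of records falling in each category."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical per-student centre intake types (NOT the study's data).
intake = ["coeducational"] * 8 + ["all girls"] + ["all boys"]

shares = category_percentages(intake)
for category, share in shares.items():
    print(f"{category}: {share:.1f}%")
```

The same function could be applied to any of the four centre attributes listed, since each is a categorical label attached to every student record.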

Even allowing for the influx of new centres in 1994, Appendix 4 shows a striking consistency in the centres’ status and type for all of the study’s populations. More than 85 per cent of each of the study’s populations consists of students attending secondary comprehensive schools, with secondary independent schools the next most common category but at a much lower level. The vast majority of the students in each population are based in maintained schools, the next most common category being independent schools. In this respect I can say that any skew towards high performance for any of my populations is unlikely to be due to a disproportionate number of independent centres (see Chapter 3.2.2). The vast majority of the students in each population are based in schools covering the age range 11-19 years, with percentages ranging from 70.9 (1993) to 82.4 (1995, Tier 03). The next most common category is 11-16 schools, with percentages varying from 5.9 (1995, Tier 02) to 18.7 (1993). This represents a substantial variation in the relative proportion of students in 11-16 schools across the study’s populations. Nevertheless, each of the populations is dominated by students in centres that have the staff and resources to teach post-GCSE courses, for example ‘A’ level. There was a striking consistency in the percentage of students who come from all boys, all girls and coeducational centres in the study’s populations.

My view was that there was sufficient similarity in the nature of the centres associated with the examination sessions of this study to support the usefulness of my comparison of their students’ achieved grades across time (1993 - 1995).

4.1.2 Coursework arrangements

My intention was to identify whether there had been any disparities in the nature of the coursework and its administration between the three science subjects and across the years of the study that might in turn impact on my populations’ achieved grades. The aspects of coursework considered were:

(1) whether it was practical based;
(2) the general types of activity expected of students;
(3) whether it was teacher-assessed;
(4) the percentage of the total marks allocated to it (weighting);
(5) possible changes in the above (1) - (4).

As noted in Chapter 2, coursework weighting changed over the years, specifically increasing from 20 per cent to 25 per cent for all of the science subjects for the 1995 assessments to reflect the National Curriculum at GCSE. The nature of the coursework also changed at this time. For both the 1993 and 1994 examination sessions, biology and chemistry coursework consisted of practical exercises written by teachers, with exemplar materials to serve as guidance, within similar assessment objective frameworks. They were administered and also marked by teachers. Physics had a different arrangement: teachers were expected to assess their students in a similar way to biology and chemistry teachers, but the outcomes counted for only 10 per cent of the coursework weighting; the remaining 10 per cent resulted from a WJEC-set practical test, administered and marked by teachers. The 1995 examination session saw a rationalization of coursework assessment, with GCSE biology, chemistry and physics all requiring the assessment of students’ skills in planning and carrying out investigations, using practical tasks selected, implemented and marked by teachers. For each of the examination sessions of this study, moderation of practical coursework (including the physics practical tests of 1993 and 1994) by WJEC personnel aimed to achieve a measure of consistency in standards within and between the different GCSE science subjects across time.

Although the potential consequences of the physics practical tests of 1993 and 1994 for students’ motivation and overall relative achievements are recognized, they are currently unquantifiable. Similarly, the change in weighting and nature of practical coursework first observed in the 1995 examination session might also have influenced students’ motivation and achievements. Certainly the increase in weighting should have an impact on the profile of expected achievements at any particular grade boundary. It could be argued that moderation of coursework by WJEC should remove variations in standards, so that grade A in 1993 is achieved with the same standard of scientific skills, including those in the practical domain, as in 1995. One could also argue that these identified differences in the nature and weighting of coursework affect the validity of comparing students’ achievements across the different GCSE science subjects and across the study’s examination sessions. I bear these arguments in mind when I discuss emerging patterns in my findings at the end of the chapter.

4.2 Presentation of the findings

Students’ GCSE results from the three WJEC and two SEG consecutive examination sessions constituted a large database. I decided to present first the analysis and interpretation of findings about the nature of students’ performances in different science subjects, followed by the findings about the nature of the relationship between these performances and variables such as the cognitive skill demands of the examination papers. This would allow an understanding of the data to be developed and help make sense of any relationships emerging.

Each of the relationships explored is dealt with in turn. For each analytical treatment used to explore the potential relationship, the data and analysis are first discussed within years and then across years to explore any consistency in findings, when this approach does not introduce unnecessary repetition. Otherwise, the data and analysis from the different years are discussed together for each analytical treatment. The 1995 WJEC examination session saw a different allocation of awarded grades from that of 1994 and 1993. This meant I had to consider students entered for Tier 03 and Tier 02 examination papers as separate groups. The 1995 WJEC data and analysis are therefore presented as separate Tier 03 and Tier 02 outcomes. The SEG data and analysis are presented after those for WJEC. The findings are represented in bar charts and line graphs with different colours for biology, chemistry, physics, English and mathematics. A dark shade and a hatched pattern in the subject’s colour are used to represent boys’ and girls’ performances respectively.

Given my sample sizes and the norms for statistical significance in social science and educational research, I have adopted a five per cent significance level for rejecting the null hypothesis, as recommended by Coolican (1994). Differences or relationships in the examination performance findings are counted as significant when p < 0.05.
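The decision rule stated above can be sketched as a simple check. Note that the inequality is strict, so a result of exactly p = 0.05 would not be counted as significant. The p-values below are hypothetical and chosen only to illustrate the boundary; they are not results from this study, and the name `is_significant` is an assumption.

```python
ALPHA = 0.05  # five per cent significance level stated in the text


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Reject the null hypothesis only when p is strictly below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    return p_value < alpha


# Hypothetical p-values (NOT findings from the study).
examples = [0.003, 0.049, 0.05, 0.2]
decisions = [is_significant(p) for p in examples]
print(decisions)  # [True, True, False, False]
```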

4.3 Exploring relationships between students’ performances in WJEC biology, chemistry and physics GCSE examinations
