4.3.1 Introduction
4.3.1.1 This section first presents a general background to the JL and CE examinations and then focuses on technical aspects of the validity and reliability of these examinations that can be described quantitatively. Other aspects of quality, which can be described qualitatively, are discussed separately in subsequent sections of this chapter.
4.3.2 The May 2006 Junior Lyceum and the Common Entrance examinations
4.3.2.1 The JL and the CE examinations are prepared on the same syllabi, which were drawn up by Education Officers of the primary sector of the Education Division and published by the Ministry of Education, Youth and Employment. The English, Maltese and Religion examinations held in May 2006 were based on the syllabi published in 2005 while the Mathematics and Social Studies examinations were based on the syllabi published in 1997. Both examinations consist of five written papers: English, Maltese, Mathematics, Religion and Social Studies. There are some
differences between the JL and the CE examinations in the structure of the examination papers and the length of the examinations.
4.3.2.2 In May 2006, 3968 children (2183 boys and 1785 girls) sat for the JL entrance examination. These students came from State, Church and Independent primary schools and 52 came from Area secondary schools. In the same month, 1175 boys from Church, State and Independent primary schools took the CE entrance examination and competed for the 454 places made available in the
Church boys’ secondary schools in Malta. In Gozo, 134 boys competed for 52 places at the Sacred Heart Seminary and 72 girls competed for 17 places in the Bishop’s Conservatory School. In both examinations, students with special needs requested and obtained special examination arrangements.12
4.3.2.3 The JL examinations took place in the mornings of 8 to 12 May 2006 and were held in the students’ own schools under the supervision of school teachers, excluding the Year 6 teachers. The CE examinations took place in the afternoons of 15 to 19 May 2006, from 4.30 p.m. to 6.00 p.m. or 6.15 p.m., and were held in the schools offering places for admission under the supervision of teachers from the same schools. Marking took place soon afterwards. In the case of the JL examinations, each paper was marked separately by three markers and the average of their marks was taken as the final score.
4.3.2.4 The JL results were published on 23 June 2006 and students were informed by text message (SMS) and in writing about the grades they obtained. Appeals were received by the 3 July and resolved by the 4 August. All students who obtained at least Grade C in the five subjects of the examination were offered places in Junior Lyceums. The individual CE results were sent by post to candidates on the 3 July and made available for public viewing on 4 - 5 July 2006 in the Church secondary schools offering places. Appeals were received by the 7 July and resolved by 19 July. The top 525 students in the list and their parents were then invited one by one according to their rank order to opt for one of the available places in the Church secondary schools. Students next on the list replaced students who declined the invitation until all available places were filled.
12 In 2006, the CE Examination Board made special arrangements for 88 candidates; nine candidates were exempted from Religion and one from Maltese; six candidates were provided with an English translation of the Social Studies paper and five with an English translation of the Religion paper. The Board also received 56 requests for consideration as grave
4.3.3 Dataset and Analysis
4.3.3.1 The analysis of the CE examination is based on the raw scores obtained by a randomly selected sample of 234 boys who sat for the examination in May 2006. The data included scores for whole questions. With these data, it is possible to obtain a statistical description of the global marks and the marks obtained in each subject. The reliability of the examination in each subject can be estimated using Cronbach’s alpha and the standard error of measurement.
4.3.3.2 Six Church boys’ schools also provided the raw scores and the ranks obtained by 244 of their students in the May 2001 CE examination and the grades that the same students obtained in the May 2006 SEC examinations. With these data, it is possible to obtain a measure of the predictive validity of the CE
examinations. The data also permit an analysis of other aspects of validity. Since the ranking of candidates can be carried out in different ways, it is possible to explore the consequences of using different methods of ranking students.
4.3.3.3 The analysis of the JL examination is based on the scores of 200 students (boys and girls) who sat for the examination in May 2006; these represented about 5% of the total number of candidates. The mean scores and standard deviations in the five subjects of the sample and of the whole cohort of candidates are compared in Table 4.1 as a check that the sample is truly
representative of the population. The differences shown in the table are not statistically significant. As in the case of the CE examination, the analysis of the JL examination includes reliability estimates and standard errors of measurement.
Table 4.1 Sample and population average scores and standard deviations (in parentheses) obtained in the five subjects of the May 2006 JL examination
             English      Maltese      Mathematics  Religion     Social Studies
Sample       51.4 (19.9)  67.1 (16.5)  63.8 (23.3)  74.9 (14.2)  65.5 (17.1)
Population   49.8 (20.0)  66.2 (16.1)  61.5 (23.1)  74.0 (13.9)  64.7 (16.5)
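The report does not say which test was used to check that the sample is representative. The following is a minimal sketch, assuming a one-sample z-test of each sample mean against the cohort mean at the 5% level (critical value 1.96) and ignoring any finite-population correction. The figures are those of Table 4.1 and the sample size of 200 is taken from paragraph 4.3.3.3; the function and variable names are illustrative only.

```python
import math

SAMPLE_SIZE = 200  # JL sample size stated in paragraph 4.3.3.3

# subject: (sample mean, population mean, population SD), copied from Table 4.1
TABLE_4_1 = {
    "English": (51.4, 49.8, 20.0),
    "Maltese": (67.1, 66.2, 16.1),
    "Mathematics": (63.8, 61.5, 23.1),
    "Religion": (74.9, 74.0, 13.9),
    "Social Studies": (65.5, 64.7, 16.5),
}


def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """z statistic for a sample mean, given the population mean and SD."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


z_scores = {
    subject: z_for_sample_mean(m, mu, sd, SAMPLE_SIZE)
    for subject, (m, mu, sd) in TABLE_4_1.items()
}

for subject, z in z_scores.items():
    # Assumed decision rule: two-sided test at the 5% level
    verdict = "significant" if abs(z) >= 1.96 else "not significant"
    print(f"{subject:15s} z = {z:5.2f}  ({verdict})")
```

Under these assumptions every z value stays below 1.96 in absolute size, which is consistent with the statement that the differences are not statistically significant.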
4.3.3.4 Besides the separate analyses of marks obtained in the JL and CE examination, a direct comparison between the two examinations is possible by considering the results of 205 out of the 234 randomly selected boys in the CE examination sample who sat for both examinations in May 2006. With these matched scores, it is possible to compare the distribution of marks in the five examinations and comment on the relative difficulty of the examinations.
Furthermore, calculations of the correlations between the JL and the CE examination scores in each subject would provide a measure of the concurrent validity of the examinations. This means that if the correlation between them is high and either the JL or the CE examination in each subject is taken as a criterion of validity, then the other examination would also be considered valid.
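The correlation described above can be sketched as follows. This is only an illustration, assuming the Pearson product-moment coefficient is the measure intended; the report does not name the coefficient here. The two short score lists are hypothetical and are not taken from the examination data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares about the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical matched percentage scores for five candidates in one subject
jl_scores = [70, 80, 65, 90, 75]
ce_scores = [60, 72, 58, 79, 66]

r = pearson_r(jl_scores, ce_scores)
print(f"correlation between JL and CE scores: r = {r:.3f}")
```

A coefficient close to 1 would indicate that candidates are ordered in much the same way by the two examinations in that subject, even if one paper is harder than the other.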
4.3.3.5 The next sections present the results of the analyses in the following order. First, the distribution of global scores is given for the CE examination only, since the selection of students depends on their ranking according to the global scores they obtain. The global scores are not relevant in the JL examination since students pass by obtaining at least Grade C in each subject. Then the distributions of the scores obtained in each subject of the JL and CE examinations are presented along with relevant statistical data that permit comparisons between the examinations. Next comes a section on reliability estimates and another on issues of validity.
4.3.4 Distribution of Scores
4.3.4.1 The distribution of raw scores indicates the efficiency of the examinations in discriminating between students of different abilities. This is an important function of a selective examination, and concerns especially the CE examination. The CE
examination papers are marked out of a maximum of 200 so that the maximum global score over the five subjects is 1000 marks. Figure 4.1 shows the distribution of global scores in the May 2006 CE examination for the randomly selected sample of 234 boys.
Figure 4.1 Distribution of Global Marks, Common Entrance Examination May 2006 (histogram of frequency against global mark in bands of 100 marks)
The distribution is clearly not normal. Indeed no student obtained a score of less than 290 and only about 10 students scored less than 500, that is, 50% of the global score. Most of the boys were well prepared for the examination and obtained an average score of 705.3, that is, 70.5 percent. With such a distribution, it
becomes more difficult to rank students because many students may be grouped in one rank and the difference between one rank and another may be minimal. Such a distribution demands high reliability in marking and great concentration from the students on the day of the examination since a small error in marking or inattention by the student may result in a drastic drop in rank order. Individual students may also perform better or worse depending on the particular questions and topics chosen for the examination with the consequence that they gain or lose in the ranking. The distribution of global scores in the JL examination is not presented since it is meaningless in the context of that examination.
4.3.4.2 Figures 4.2 to 4.6 show the distribution of marks by subject in both the JL and the CE examinations in Maltese, English, Mathematics, Religion and Social Studies. These figures show marked differences between the distribution of marks in English, Maltese and Mathematics and the distribution in Religion and Social Studies.
Figure 4.2 Maltese: Distribution of Marks for JL and CE
Figure 4.3 English: Distribution of Marks for JL and CE
Figure 4.4 Mathematics: Distribution of Marks for JL and CE
Figure 4.5 Social Studies: Distribution of Marks for JL and CE
(Each figure plots frequency against marks from 0 to 100 for the JL and the CE examination.)
4.3.4.3 Most of the scores in the JL Maltese examination fall within the range 50 per cent to 100 per cent while those of the CE examination fall within the range 40 per cent to 90 per cent. Both ranges show a fair spread of marks and permit
discrimination between students of different abilities in the subject. The peaks of the distributions are also different and show that most students obtained a score of about 70 per cent in the CE examination and about 80 per cent in the JL
examination. Clearly, students found the Maltese paper of the CE examination more difficult than that of the JL examination. This is confirmed when the average marks and standard deviations13 of 205 candidates taking both examinations are compared as in Table 4.2. The t-test14 further confirms that the differences are significant.
Table 4.2 Average scores and standard deviations in the Maltese examinations
Maltese Junior Lyceum Common Entrance
Average percentage score 72.9 61.7
Standard deviation 11.4 12.7
Paired t-test t = 22.549, df=204, p<0.001
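The paired t-test reported in Table 4.2 can be sketched as follows. This is a minimal illustration of the standard calculation, not the software the authors used; the two score lists are hypothetical and serve only to show the mechanics.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for matched scores.

    A positive t means the first list has the higher average.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical matched percentage scores for five candidates
jl_scores = [70, 80, 65, 90, 75]
ce_scores = [60, 72, 58, 79, 66]

t_value, df = paired_t(jl_scores, ce_scores)
print(f"t = {t_value:.3f}, df = {df}")
```

With the 205 matched candidates of this analysis the same calculation gives 205 - 1 = 204 degrees of freedom, as reported in the table.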
13 The standard deviation (SD) is a measure of the spread of the marks; the higher its value the wider is the spread of marks. In an examination marked out of 100, the SD would be about 15.
14 The paired t-test provides a measure of the significance of the difference in average scores which members of the same group obtain on different tests. Statistical significance is shown by a p value. If the p value is less than 0.05 then the difference is significant; the smaller it is, the more significant is the difference. The term df refers to the number of degrees of freedom, which in the case of paired t-tests is one less than the total number of members in the group.
Figure 4.6 Religion: Distribution of Marks for JL and CE
4.3.4.4 In the English examinations, the ranges of scores of both examinations are very similar. Both show a wide spread of marks as most students obtained scores between 30 per cent and 90 per cent. This spread permits good discrimination between students of different abilities. Fewer students obtained high marks in the JL than in the CE examination and the average scores in Table 4.3 reflect this
distribution. The t-test shows that the differences are significant.
Table 4.3 Average scores and standard deviations in the English examinations
English Junior Lyceum Common Entrance
Average percentage score 59.3 62.1
Standard deviation 15.1 15.8
Paired t-test t = -4.614, df=204, p<0.001
4.3.4.5 The distributions of scores in the Mathematics examination are apparently bimodal for the sample of students in this analysis as they show two peaks in both the JL and the CE examination. In the CE examination, the distribution is flatter as more students obtained lower marks than in the JL examination. This distribution allows a better discrimination between students than the distribution of marks in the JL examination. However, one must recall the different purposes of the two
examinations. The students found the JL paper significantly easier, as reflected in the difference in the average marks and confirmed by the t-test (Table 4.4).
Table 4.4 Average scores and standard deviations in the Mathematics examinations
Mathematics Junior Lyceum Common Entrance
Average percentage score 76.9 67.0
Standard deviation 15.6 16.7
Paired t-test t = 15.944, df=204, p<0.001
4.3.4.6 In Religion, the distributions of scores in both JL and CE examinations are practically the same. The great majority of students scored marks between 80 per cent and 100 per cent, as reflected in the relatively low standard deviations (Table 4.5). This narrow range of marks does not allow for good discrimination, as most students seem to have a relatively high level of ability in the subject and found both examinations easy. The t-test shows no statistically significant difference in the scores obtained in the two examinations.
Table 4.5 Average scores and standard deviations in the Religion examinations
Religion Junior Lyceum Common Entrance
Average percentage score 80.8 79.8
Standard deviation 9.6 11.8
Paired t-test t = 1.340, df=204, p>0.05
4.3.4.7 The distributions of scores in Social Studies show that the students found both the JL and the CE examinations relatively easy. Most students in the sample obtained scores between 60 per cent and 100 per cent in the JL examination and even more students obtained scores between 70 per cent and 100 per cent in the CE examination, with a high proportion of them scoring close to 100 per cent in the CE examination. The average scores and the relatively low standard deviations shown in Table 4.6 quantify these observations. The t-test confirms that there is a highly significant difference between the scores in the JL and the CE examinations.
Table 4.6 Average scores and standard deviations in the Social Studies examinations
Social Studies Junior Lyceum Common Entrance
Average percentage score 74.1 82.6
Standard deviation 11.1 10.7
Paired t-test t = 20.222, df=204, p<0.001
4.3.5 Reliability Estimates
Internal consistency
4.3.5.1 Reliability refers to consistency in assessment results, that is, if students are assessed in a subject using equivalent forms of assessment, then the results from these assessments should be consistent and reflect the students’ ‘true’ ability in the subject. Black and Wiliam (2006) identify three sources of error that threaten reliability in assessment: (i) any student may perform better or worse depending on the actual questions used for assessment, (ii) the same student may perform better or worse from day to day, and (iii) different markers may mark the same piece of work differently.
4.3.5.2 The 11+ examinations adopt three processes to reduce these sources of error and to ensure high reliability in marking. First, marking of both JL and CE examinations is regulated by a detailed marking scheme, which the markers agree on before marking begins. Second, in both examinations the chief examiner of each subject moderates the marking to ensure consistency between markers. Third, in each subject of the JL examination, three markers score each script separately and their scores are averaged to obtain each candidate’s final score.
4.3.5.3 The reliability of the results of an examination can be tested by estimating the internal consistency of the scores on different parts of the examination.
Cronbach’s alpha coefficient summarises how strongly the scores obtained in the various parts of the examination correlate with one another and is therefore commonly used to estimate reliability. The alpha coefficient normally ranges from 0 to 1, and in a high-stakes examination high reliability is indicated by a coefficient of about 0.85 or higher. Table 4.7 shows the results of calculations of Cronbach’s alpha coefficient from the raw scores awarded to the sample of students on each question in the five subjects of the May 2006 JL and CE examinations.
Table 4.7 Values of Cronbach’s Alpha coefficient of the five subjects of the May 2006 JL and CE examinations
Subject JL exam CE exam
English 0.868 0.824
Maltese 0.875 0.859
Mathematics 0.945 0.912
Religion 0.805 0.895
Social Studies 0.902 0.872
The values in Table 4.7 show that eight out of ten alpha coefficients exceed the 0.85 criterion, indicating satisfactorily high reliability. The other two values, namely the alpha coefficient of the JL Religion examination (0.805) and that of the CE English examination (0.824), are also close to the criterion value.
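The calculation behind Table 4.7 can be sketched as follows, assuming the usual formula alpha = k/(k - 1) × (1 - sum of question variances / variance of total scores), where k is the number of questions. The small score matrix is hypothetical, since the per-question marks are not reproduced in this report; the coefficients in the second part are copied from Table 4.7.

```python
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with one row per student
    and one column per question."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 questions and 2 students")
    # Variance of each question (column) and of the students' total scores
    item_variances = [statistics.variance(col) for col in zip(*scores)]
    total_variance = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical marks of four students on three questions
demo_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
demo_alpha = cronbach_alpha(demo_scores)
print(f"alpha for the hypothetical data: {demo_alpha:.3f}")

# subject: (JL alpha, CE alpha), copied from Table 4.7
TABLE_4_7 = {
    "English": (0.868, 0.824),
    "Maltese": (0.875, 0.859),
    "Mathematics": (0.945, 0.912),
    "Religion": (0.805, 0.895),
    "Social Studies": (0.902, 0.872),
}
CRITERION = 0.85
n_above = sum(a > CRITERION for pair in TABLE_4_7.values() for a in pair)
print(f"{n_above} of 10 coefficients exceed {CRITERION}")
```

The second part simply counts the published coefficients against the 0.85 criterion.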
Standard Error of Measurement
4.3.5.4 The effects of the sources of error mentioned in paragraphs 4.3.4.1 and 4.3.5.1 on the results of particular candidates cannot be estimated; some candidates may gain while others may lose marks as a result of these errors of measurement. When large numbers of candidates are involved, it is possible to describe the distribution of errors and to calculate its standard deviation, the standard error of measurement, from the reliability of the assessment and the standard deviation of the scores.15 The standard error gives an indication of how far a student’s actual score may be from his/her true score. The true score may be defined as the score which actually reflects the student’s ability in the subject; it is the score that the student would obtain if there were no errors of measurement. Table 4.8 presents the standard errors of measurement of the JL and the CE examinations.
15 The standard error of measurement, s.e.m. = SD × √(1 − r), where SD is the standard deviation of the total scores and r is the reliability of the assessment (Cronbach’s alpha).
Table 4.8 Standard Errors of Measurement (s.e.m.) of the May 2006 JL and CE examinations
Subject s.e.m. JL exam s.e.m. CE exam
English 5.6 6.6
Maltese 4.8 4.8
Mathematics 5.0 5.0
Religion 5.6 3.8
Social Studies 5.2 3.9
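As a rough check of the formula s.e.m. = SD × √(1 − r), the CE column of Table 4.8 can be approximated from figures given elsewhere in this chapter. The sketch below assumes that the relevant standard deviations are those of Tables 4.2 to 4.6 (the 205 matched candidates), which the report does not confirm; the alpha values are those of Table 4.7. The function and variable names are illustrative.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """s.e.m. = SD * sqrt(1 - r), with r the reliability (Cronbach's alpha)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# subject: (CE SD from Tables 4.2 to 4.6 (assumed input),
#           CE alpha from Table 4.7,
#           published CE s.e.m. from Table 4.8)
CE_FIGURES = {
    "English": (15.8, 0.824, 6.6),
    "Maltese": (12.7, 0.859, 4.8),
    "Mathematics": (16.7, 0.912, 5.0),
    "Religion": (11.8, 0.895, 3.8),
    "Social Studies": (10.7, 0.872, 3.9),
}

computed = {
    subject: standard_error_of_measurement(sd, alpha)
    for subject, (sd, alpha, _published) in CE_FIGURES.items()
}

for subject, (_sd, _alpha, published) in CE_FIGURES.items():
    print(f"{subject:15s} computed {computed[subject]:.2f}  published {published}")
```

Under this assumption the computed values fall within about 0.1 of a mark of the published CE figures; small differences are to be expected because the inputs are rounded and may not be exactly those used by the authors.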
4.3.5.5 The values in Table 4.8 indicate the spread of the errors in each subject. These values may be interpreted as follows taking the s.e.m. for Mathematics as an example.
(a) There is a 68% chance that the scores obtained by the students taking the Mathematics examination are within 5 marks (i.e. one standard deviation) of their true scores. In other words, if a student scores 50 marks in the
examination, then there are two out of three chances that the student’s true score in this subject is somewhere between 45 and 55.
(b) There is about a 95% chance that the scores obtained by the students are within 10