Chapter 3 Data, cohort samples and performance measures
3.5 Selected cohorts for correlation analysis
In the following sections we present an exhaustive analysis of the correlation between standardised Simce scores and school marks standardised at school-grade level. Because the National Examination is not taken in every grade every year, we want to find an alternative that can replace the Simce scores when two consecutive measures of academic performance are required.
Figure 3.2 shows the grades in which the Simce exam was taken during the period 2003-2013. The Simce has mainly been taken in 4th, 8th and 10th grades, with 8th and 10th grades tested every other year. Only from 2013 has the National Examination been taken every year in 4th, 6th, 8th and 10th grades. In addition, 2nd grade started being tested, but only in Language.
Figure 3.2: Timeline - Simce
[Figure 3.2 image: timeline of the years 2003 to 2013 showing the grades (2nd, 4th, 6th, 8th and 10th) in which Simce was taken each year, separated into Primary and Secondary.]
Note: The grades we show in this timeline are those where Simce is taken for the corresponding years. We separate between Primary and Secondary grades.
Source: The National Examination (Simce) data set.
The results obtained from the Simce exam can be used as measures of pupil cognitive abilities, which is what we need for modelling Value Added measures. Pupil academic performance, represented by Simce scores, can be explained by unobserved heterogeneities such as teacher and school effects. However, Value Added Models require consecutive measures of pupil achievement to capture the impacts of teachers (and schools) from one year to the next. From Figure 3.2 we can see that cohorts taking the Simce exam in 4th grade do not take the exam in either 3rd or 5th grade, which would be desirable for most Value Added Models.
To address the lack of Simce scores for consecutive grades, we propose to use school marks standardised at school-grade level. Language and Maths school marks correspond to the final grades obtained by each pupil in each subject, and they assess the same pupil cognitive abilities we require for academic measures in the Value Added Models.
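The standardisation at school-grade level described above can be illustrated with a minimal sketch. It assumes that standardising means computing a z-score within each (school, grade) group using the population standard deviation; the source does not state which estimator is used. The record keys (`mrun`, `school`, `grade`, `mark`) and the toy marks are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardise_by_school_grade(records):
    """Return {mrun: z-score} with z-scores computed within each (school, grade).

    `records` is a list of dicts with hypothetical keys 'mrun', 'school',
    'grade' and 'mark'. Groups with zero spread get None (z is undefined).
    """
    # Collect the marks of every school-grade group.
    groups = defaultdict(list)
    for r in records:
        groups[(r["school"], r["grade"])].append(r["mark"])

    # Assumption: population standard deviation within the group.
    stats = {key: (mean(marks), pstdev(marks)) for key, marks in groups.items()}

    z_scores = {}
    for r in records:
        mu, sigma = stats[(r["school"], r["grade"])]
        z_scores[r["mrun"]] = (r["mark"] - mu) / sigma if sigma > 0 else None
    return z_scores


# Toy data: two schools, one grade, illustrative marks only.
records = [
    {"mrun": 1, "school": "A", "grade": 4, "mark": 5.0},
    {"mrun": 2, "school": "A", "grade": 4, "mark": 6.0},
    {"mrun": 3, "school": "A", "grade": 4, "mark": 7.0},
    {"mrun": 4, "school": "B", "grade": 4, "mark": 4.0},
    {"mrun": 5, "school": "B", "grade": 4, "mark": 6.0},
]
z = standardise_by_school_grade(records)
```

Because each pupil is compared only with peers in the same school and grade, differences in marking standards between schools are removed from the standardised measure.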
Although we are aware of the difference in evaluation criteria between school marking and National Examination testing, we investigate whether standardised Language and Maths marks at school level can be used as good proxies for standardised Language and Maths Simce scores. Firstly, we analyse the correlation levels for three different 4th grade cohorts. Secondly, we graphically compare the kernel distributions of both academic measures. Finally, we apply a regression analysis to confirm the positive correlation between standardised school marks and standardised Simce scores.
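As an illustration of the first step, the sketch below computes a correlation coefficient between two standardised measures. It assumes the Pearson coefficient is meant, which the text does not specify, and the input values are invented toy numbers, not results from the data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical standardised marks and Simce scores for four pupils.
z_marks = [-1.0, -0.5, 0.5, 1.0]
z_simce = [-0.8, -0.6, 0.4, 1.0]
r = pearson_r(z_marks, z_simce)
```

A value of `r` close to 1 would indicate that pupils ranked highly by school marks also tend to rank highly in Simce, which is the property a proxy needs.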
We select three 4th grade cohorts to carry out our analyses. The selected cohorts, shown in Figure 3.3, are representative of the five 4th grade cohorts, from 2005 to 2009, which we will use in this thesis.
The selected cohorts are taken from the SPD, which is considered our master base, and we merge the Simce scores (from the National Examination data base) and the Language and Maths marks (from the School Marks data base) for our correlation analysis. The key matching variable between the databases is the unique student identification number (Mrun).
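The merge can be sketched as a left join on Mrun, with the SPD as the master list so that every pupil in the cohort is kept whether or not a match is found. This is a minimal illustration under that assumption; the function name, field names and toy values are hypothetical, and the actual merge may have been carried out with different tools.

```python
def merge_on_mrun(spd, simce, marks):
    """Left-join Simce scores and school marks onto the student panel.

    spd:   list of Mrun values in the cohort (the master base)
    simce: dict mapping Mrun -> Simce score
    marks: dict mapping Mrun -> (language mark, maths mark)
    Pupils without a match keep None, so missing cases stay on record.
    """
    merged = []
    for mrun in spd:
        lang, maths = marks.get(mrun, (None, None))
        merged.append({
            "mrun": mrun,
            "simce": simce.get(mrun),  # None when no Simce score matches
            "lang_mark": lang,
            "maths_mark": maths,
        })
    return merged


# Toy inputs: pupil 102 has no Simce score, pupil 103 has no marks.
spd = [101, 102, 103]
simce = {101: 250.0, 103: 281.0}
marks = {101: (5.5, 6.0), 102: (4.8, 5.1)}
merged = merge_on_mrun(spd, simce, marks)
```

Keeping unmatched pupils with empty values, rather than dropping them, is what allows the missing cases to be counted afterwards.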
Figure 3.3: Selected cohorts
[Figure 3.3 image: timeline covering 2004 to 2009 that marks the three selected 4th grade cohorts (Cohort 1, Cohort 2 and Cohort 3).]
Note: Cohort 1(a) and Cohort 1(b) are mainly composed of the same group of students, who passed from 4th grade to 8th grade without repeating. The differences between the two groups could be explained by students who repeated at least once during this period, plus other attrition problems.
3.5.1 Student panel and performance measure matching
The matching of the SPD to the National Examination and School Marks data bases generates some missing observations. We have created a register of the cases where Simce scores or school marks are not available, and we will use it for selection purposes when we estimate our Value Added Models. Table 3.9 shows the matching between the two performance data bases (Simce and School Marks), where Simce Scores (SScs), Language Marks (LMrk) and Maths Marks (MMrk) are assigned to every pupil in the cohort. The percentage of missing values is presented in every case, so that availability can be compared across cohorts. In all cohorts up to 2007, the rate of missing observations when matching to the individual Simce score stays around 7%, while in 2009 it increases to approximately 11%. It is not clear why the rate of missing Simce scores increased in 2009.
There are several reasons why a school, and therefore a student, may not have a Simce score. These reasons are related to the minimum number of students taking the exam and the absenteeism rate on the day of the exam. There is a list of requirements that schools have to fulfil. When any of the requirements is not met, the results of the exam are not provided. The requirements can vary from year to year depending on the design of the exam.
Table 3.9: Match between performance data bases (selected cohorts)
Student panel: 4th Grade Cohorts

2005                      With Match   Without Match   Total     % Missing
Simce Score (SScs)        248,819      19,343          268,162   7.2%
Lang Marks (LMrk)         263,872      4,290           268,162   1.6%
Maths Marks (MMrk)        264,936      3,226           268,162   1.2%
Both SScs & LMrk          246,854      2,325           249,179   0.9%
Both SScs & MMrk          247,618      2,025           249,643   0.8%
All SScs & LMrk & MMrk    246,498      2,009           248,507   0.8%

2007                      With Match   Without Match   Total     % Missing
Simce Score (SScs)        238,785      18,559          257,344   7.2%
Lang Marks (LMrk)         253,033      4,311           257,344   1.7%
Maths Marks (MMrk)        253,972      3,372           257,344   1.3%
Both SScs & LMrk          236,913      2,439           239,352   1.0%
Both SScs & MMrk          237,554      2,141           239,695   0.9%
All SScs & LMrk & MMrk    236,372      2,131           238,503   0.9%

2009                      With Match   Without Match   Total     % Missing
Simce Score (SScs)        222,933      27,342          250,275   10.9%
Lang Marks (LMrk)         247,385      2,890           250,275   1.2%
Maths Marks (MMrk)        248,424      1,851           250,275   0.7%
Both SScs & LMrk          222,075      2,310           224,385   1.0%
Both SScs & MMrk          222,807      1,725           224,532   0.8%
All SScs & LMrk & MMrk    222,075      1,725           223,800   0.8%
Note: In this table we represent the matching observed between the Student Panel Dataset (SPD), Simce scores (SScs), Language marks (LMrk) and Maths Marks (MMrk), for every selected cohort.
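The missing percentages in Table 3.9 follow from the counts as the number of pupils without a match divided by the cohort total. The short check below reproduces the Simce score rows; the function name is hypothetical and the counts are copied from the table.

```python
def missing_rate(without_match, total):
    """Share of cohort pupils with no matched record, in percent."""
    return 100.0 * without_match / total


# Simce score rows of Table 3.9: (without match, total) per cohort.
simce_rows = {
    2005: (19_343, 268_162),
    2007: (18_559, 257_344),
    2009: (27_342, 250_275),
}
rates = {year: round(missing_rate(*row), 1) for year, row in simce_rows.items()}
print(rates)  # {2005: 7.2, 2007: 7.2, 2009: 10.9}
```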