
Chapter 3 Data, cohort samples and performance measures

3.5 Selected cohorts for correlation analysis

In the following sections we present an exhaustive analysis of the correlation between standardised Simce scores and school marks standardised at school-grade level. Because the National Examination is not taken in every grade every year, we want to find an alternative that can replace the Simce scores when two consecutive measures of academic performance are required.

Figure 3.2 shows the grades in which the Simce exam was taken during the period 2003-2013. The Simce has mainly been taken in 4th, 8th and 10th grades, with 8th and 10th grades tested every other year. Only from 2013 did the National Examination start to be taken every year in 4th, 6th, 8th and 10th grades. In addition, the exam started being taken in 2nd grade, but only in Language.

Figure 3.2: Timeline - Simce

[Figure 3.2 omitted: timeline for 2003-2013 showing, for each year, the grades (2nd, 4th, 6th, 8th and 10th, separated into Primary and Secondary) in which Simce was taken.]

Note: The grades we show in this timeline are those where Simce is taken for the corresponding years. We separate between Primary and Secondary grades.

Source: The National Examination (Simce) data set.

The results obtained from the Simce exam can be used as measures of pupil cognitive abilities, which is what we need for modelling Value Added measures. Pupil academic performance, represented by Simce scores, can be explained by unobserved heterogeneities such as teacher and school effects. However, Value Added Models require consecutive measures of pupil achievement to capture the impacts of teachers (and schools) from one year to the next. From Figure 3.2 we can see that cohorts taking the Simce exam in 4th grade do not take the exam in either 3rd or 5th grade, which would be desirable for most Value Added Models.

To address the lack of Simce scores for consecutive grades, we propose to use school marks standardised at school-grade level. Language and Maths school marks correspond to the final grades obtained by each pupil in each subject, and they assess the same pupil cognitive abilities that we require as academic measures in the Value Added Models.
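As an illustration of what standardising marks at school-grade level involves, the following minimal sketch computes z-scores within each (school, grade) group. The field names (`school`, `grade`, `mark`, `z_mark`), the sample values and the use of the population standard deviation are assumptions made for this example, not details taken from the data sets.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardise_by_school_grade(records):
    """Return copies of the records with a z-score computed within each (school, grade) group."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["school"], r["grade"])].append(r["mark"])

    # Mean and population standard deviation per group (assumed choice of estimator).
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    out = []
    for r in records:
        mu, sigma = stats[(r["school"], r["grade"])]
        # A group with no variation cannot be standardised; flag it as missing.
        z = (r["mark"] - mu) / sigma if sigma > 0 else None
        out.append({**r, "z_mark": z})
    return out


# Hypothetical final marks for two schools in 4th grade.
records = [
    {"mrun": 1, "school": "A", "grade": 4, "mark": 5.0},
    {"mrun": 2, "school": "A", "grade": 4, "mark": 6.0},
    {"mrun": 3, "school": "A", "grade": 4, "mark": 7.0},
    {"mrun": 4, "school": "B", "grade": 4, "mark": 4.0},
    {"mrun": 5, "school": "B", "grade": 4, "mark": 6.0},
]
standardised = standardise_by_school_grade(records)
```

Standardising within the group means that a pupil's z-score expresses their position relative to classmates in the same school and grade, rather than relative to the national distribution.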

Although we are aware of the differences in evaluation criteria between school marking and National Examination testing, we investigate whether Language and Maths marks standardised at school level can be used as good proxies for standardised Language and Maths Simce scores. Firstly, we analyse the correlation levels for three different 4th grade cohorts. Secondly, we graphically compare the kernel density estimates of both academic measures. Finally, we apply a regression analysis in which we confirm the positive correlation between standardised school marks and standardised Simce scores.
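The first and third steps can be sketched as follows, using only the Python standard library. The data values and the names `z_marks` and `z_simce` are hypothetical; the sketch shows the Pearson correlation and the slope of a simple least-squares regression of Simce scores on marks, and is not the estimation code used in this thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical standardised marks and standardised Simce scores for five pupils.
z_marks = [-1.2, -0.5, 0.0, 0.4, 1.3]
z_simce = [-1.0, -0.2, 0.1, 0.3, 0.8]

r = pearson(z_marks, z_simce)
slope = ols_slope(z_marks, z_simce)
```

When both variables have the same variance, the regression slope coincides with the correlation coefficient, so the two steps give consistent information about the strength of the relationship.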

We select three 4th grade cohorts to carry out our analyses. The selected cohorts, shown in Figure 3.3, are representative of the five 4th grade cohorts, from 2005 to 2009, which we will use in this thesis.

The selected cohorts are taken from the SPD, which we treat as our master data base, and we merge in the Simce scores (from the National Examination data base) and the Language and Maths marks (from the School Marks data base) for our correlation analysis. The key matching variable between the data bases is the unique student identification number (Mrun).
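A merge of this kind can be sketched as a left join that keeps every pupil in the panel and attaches scores and marks where the Mrun matches. The record layout (`mrun`, `score`, `lang`, `maths`) and the assumption of one record per Mrun in each source are illustrative choices for this example only.

```python
def left_join_on_mrun(panel, simce, marks):
    """Keep every panel pupil; attach Simce score and marks when the Mrun matches."""
    # Index each source by Mrun (assumes at most one record per pupil per source).
    simce_by_id = {row["mrun"]: row for row in simce}
    marks_by_id = {row["mrun"]: row for row in marks}

    merged = []
    for pupil in panel:
        s = simce_by_id.get(pupil["mrun"])
        m = marks_by_id.get(pupil["mrun"])
        merged.append({
            **pupil,
            # None marks a measure with no match in the corresponding source.
            "simce": s["score"] if s is not None else None,
            "lang_mark": m["lang"] if m is not None else None,
            "maths_mark": m["maths"] if m is not None else None,
        })
    return merged


# Hypothetical records: pupil 102 has no Simce score, pupil 103 has no marks.
panel = [{"mrun": 101}, {"mrun": 102}, {"mrun": 103}]
simce = [{"mrun": 101, "score": 0.4}, {"mrun": 103, "score": -1.1}]
marks = [
    {"mrun": 101, "lang": 0.2, "maths": 0.5},
    {"mrun": 102, "lang": -0.3, "maths": 0.1},
]
merged = left_join_on_mrun(panel, simce, marks)
```

Using the panel as the left-hand side keeps the cohort size fixed, so unmatched pupils remain visible as missing values instead of silently dropping out.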

Figure 3.3: Selected cohorts

[Figure 3.3 omitted: timeline for 2004-2009 marking the three selected 4th grade cohorts (Cohort 1, Cohort 2 and Cohort 3).]

Note: Cohort 1(a) and Cohort 1(b) are mainly composed of the same group of students, who passed from 4th grade to 8th grade without repeating a year. The differences between the two groups could be explained by students who repeated at least once during this period, plus other attrition problems.

3.5.1 Student panel and performance measure matching

The matching process of the SPD to the National Examination and School Marks data bases generates some missing observations. However, we have created a register of the cases where Simce scores and school marks are not available, and we will use it for selection purposes when we estimate our Value Added Models. In Table 3.9 we show the matching between the SPD and the two performance data bases (Simce and School Marks), where we take the Simce Scores (SScs), Language Marks (LMrk) and Maths Marks (MMrk) to be assigned to every pupil in the cohort. The percentage of missing values is presented in every case, so that their availability can be compared across cohorts.
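One way to picture such a register is as a set of availability flags per pupil, from which complete cases can later be selected. The sketch below is a hypothetical illustration; the flag names and the input layout are assumptions and do not describe the actual register.

```python
def build_register(matched):
    """Record, for each pupil, which performance measures are available after matching."""
    return [
        {
            "mrun": row["mrun"],
            "has_simce": row["simce"] is not None,
            "has_lang": row["lang_mark"] is not None,
            "has_maths": row["maths_mark"] is not None,
        }
        for row in matched
    ]


def complete_cases(register):
    """Mruns of pupils with a Simce score and both school marks."""
    return [
        r["mrun"]
        for r in register
        if r["has_simce"] and r["has_lang"] and r["has_maths"]
    ]


# Hypothetical matched records; None means no match was found for that measure.
matched = [
    {"mrun": 101, "simce": 0.4, "lang_mark": 0.2, "maths_mark": 0.5},
    {"mrun": 102, "simce": None, "lang_mark": -0.3, "maths_mark": 0.1},
    {"mrun": 103, "simce": -1.1, "lang_mark": None, "maths_mark": None},
]
register = build_register(matched)
selected = complete_cases(register)
```

Keeping the flags separate from the scores allows different selection rules (for example, Simce and Language only) to be applied without repeating the match.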

In all cohorts up to 2007, we can see how the rate of missing observations when matching to the individual Simce score stays around 7%, while in 2009 it increases to 11%, approximately. It is not clear why the rate of missing Simce Scores increased in 2009.

There are several reasons why a school, and therefore a student, may not have a Simce score. These reasons are related to the minimum number of students taking the exam and the absenteeism rate on the day of the exam. There is a list of requirements that schools have to fulfil. When any of these requirements is not met, the results of the exam are not provided. The requirements can vary from year to year depending on the design of the exam.

Table 3.9: Match between performance data bases (selected cohorts)

Student panel: 4th Grade Cohorts

2005                      With Match  Without Match    Total  % Missing
Simce Score (SScs)           248,819         19,343  268,162       7.2%
Lang Marks (LMrk)            263,872          4,290  268,162       1.6%
Maths Marks (MMrk)           264,936          3,226  268,162       1.2%
Both SSmc & LMrk             246,854          2,325  249,179       0.9%
Both SSmc & MMrk             247,618          2,025  249,643       0.8%
All SSmc & LMrk & MMrk       246,498          2,009  248,507       0.8%

2007                      With Match  Without Match    Total  % Missing
Simce Score (SScs)           238,785         18,559  257,344       7.2%
Lang Marks (LMrk)            253,033          4,311  257,344       1.7%
Maths Marks (MMrk)           253,972          3,372  257,344       1.3%
Both SSmc & LMrk             236,913          2,439  239,352       1.0%
Both SSmc & MMrk             237,554          2,141  239,695       0.9%
All SSmc & LMrk & MMrk       236,372          2,131  238,503       0.9%

2009                      With Match  Without Match    Total  % Missing
Simce Score (SScs)           222,933         27,342  250,275      10.9%
Lang Marks (LMrk)            247,385          2,890  250,275       1.2%
Maths Marks (MMrk)           248,424          1,851  250,275       0.7%
Both SSmc & LMrk             222,075          2,310  224,385       1.0%
Both SSmc & MMrk             222,807          1,725  224,532       0.8%
All SSmc & LMrk & MMrk       222,075          1,725  223,800       0.8%

Note: In this table we represent the matching observed between the Student Panel Dataset (SPD), Simce scores (SScs), Language marks (LMrk) and Maths Marks (MMrk), for every selected cohort.
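The % Missing column of Table 3.9 is the number of pupils without a match divided by the cohort total. The short check below reproduces the Simce Score rows from the reported counts; the function name `pct_missing` is introduced here for illustration only.

```python
def pct_missing(with_match, without_match):
    """Return (total, percentage without a match rounded to one decimal place)."""
    total = with_match + without_match
    return total, round(100 * without_match / total, 1)


# (With Match, Without Match) for the Simce Score rows of Table 3.9.
simce_rows = {
    2005: (248_819, 19_343),
    2007: (238_785, 18_559),
    2009: (222_933, 27_342),
}

results = {year: pct_missing(w, wo) for year, (w, wo) in simce_rows.items()}
for year, (total, pct) in results.items():
    print(year, total, pct)
# 2005 268162 7.2
# 2007 257344 7.2
# 2009 250275 10.9
```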