7. Model Design

7.5 Model validation: stochasticity analysis

7.5.3 SIP intervention analysis

Drawing on the Composite Survey reports, this section incorporates the activities implemented under ESSPIN (as carried out in practice) into the established baseline state. It moves gradually through the roll-out phases in both states, from 2012 to 2016. There are three main data points available: 2012, 2014 and 2016. This structure is useful because it guides the development of the SIPM. The first phase (2012) is treated as a pilot case, and sensitivity tests are carried out on the system in this state before progressively building up the SIPM. Results from the next two phases are presented and discussed in the next chapter.

a) ESSPIN 2012: Phase 1

From 2011 to 2012, the two states selected a percentage of schools (7% in Lagos and 3% in Kaduna) as pilot cases to implement the first phase of the intervention. All components of the SIP were introduced to these schools and communities over the course of two years.90 In a cascading sequence of activities, the school support officers (SSOs) were the first to receive training, which was then stepped down to teachers, headteachers and School communities. A select number of teachers from each school received training on subject knowledge and pedagogical skills, and the headteachers were coached on leadership skills and school management. The communities were advised on constituting a representative SBMC which included members from marginalised groups, and were also trained on strategies to mobilise resources (raise money) for school development purposes. After each training session (typically three to four times a term), trained teachers were expected to share what they had learnt with their colleagues in an organised, formal session so that improved teaching practices could spread. Headteachers would conduct Professional Development Meetings (PDMs) for teachers, maintain good school records, and ensure that relevant measures were taken to support the SIP. Periodic inspections were carried out by the SSOs, during which they observed class lessons and reviewed school administration practices (attendance, accounting, SBMC and PDM sessions, teaching and learning activities, etc.). The school communities would participate in two to three meetings a term with teachers, headteachers, SSOs and pupils to discuss challenges faced by the school and the steps to be taken. In addition, the School communities played a critical role in raising funds to meet any needs highlighted through these meetings.

These activities are modelled into the SIPM with each stakeholder carrying out the responsibilities outlined above at the specified intervals. With these new behaviours introduced, the internal variance of the updated model must be estimated to determine the number of runs that would be sufficient to analyse the outcomes for the first few years. To have confidence in the SIPM results, it is important that the internal variance is relatively low. This process helps to define the parameter space for the model, with every point representing a particular combination of values. Table 7.5 below summarises the variances of a select set of variables for Lagos state in 2012. It presents the average performance of headteachers, a composite of several attributes (job satisfaction, self-efficacy, support for the programme and leadership ability).

90 Parallel measures around organisational restructuring and mid- to long-term financial planning were also initiated at the state education board level.

Table 7.5 Mean and standard deviation of model outputs

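| No of samples | No of iterations | Headteacher performance (Mean) | Headteacher performance (Std_dev) | Budget release | Running time (mins) |
| --- | --- | --- | --- | --- | --- |
| 100 | 100 | 10.41 | 0.125 | 91.2% | 35 |
| 1000 | 100 | 10.16 | 0.122 | | 89 |
| 100 | 1000 | 10.16 | 0.120 | | 110 |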

The table above shows the number of runs (iterations) and the average results over those runs. Averaging in this way reduces the internal stochasticity of the model outputs to the degree that results can be treated as almost deterministic. The mean of 100 stochastic iterations constitutes a single experiment, and 100 samples are taken to calculate the standard deviation in the model. With more iterations, the standard error decreases. However, it is worth highlighting that the individual runs within every experiment are often still highly variable, and any one of these runs might correspond more closely to the real world. In general, the potential for variance in this system is relatively high because the implementation process in this environment was quite staggered and aspects of the SIP were not uniformly rolled out. As a result, it is difficult to determine exactly which components are buffered and which are most influential. This uncertainty is captured using a function referred to as 'contextual factors'. Here, the standard deviation is two orders of magnitude smaller than the mean, which is deemed a sufficiently low internal variance to proceed. Accordingly, any subsequent change to the parameters or variables goes through 100 iterations, and the average value represents the resulting change in the output.
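The replication procedure described above can be sketched as follows. This is a minimal illustration only: `run_once` is a hypothetical stand-in for a single stochastic SIPM run (here simply Gaussian noise around an arbitrary score), and the function names, seed and noise parameters are assumptions that do not come from the SIPM itself. The sketch shows how one experiment is formed as the mean of many iterations, and how the spread across repeated experiments gives an estimate of internal variance.

```python
import random
import statistics


def run_once(rng):
    """Hypothetical stand-in for one stochastic model run.

    Returns a headteacher-performance-like score; the Gaussian noise
    and its parameters are assumptions for illustration only.
    """
    return rng.gauss(10.0, 1.2)


def experiment(rng, n_iterations):
    """One experiment: the mean over n_iterations stochastic runs."""
    return statistics.fmean([run_once(rng) for _ in range(n_iterations)])


def internal_variance(n_samples, n_iterations, seed=0):
    """Repeat the experiment n_samples times; return (mean, std_dev) of the experiment means."""
    rng = random.Random(seed)
    means = [experiment(rng, n_iterations) for _ in range(n_samples)]
    return statistics.fmean(means), statistics.stdev(means)


mean_100, sd_100 = internal_variance(n_samples=100, n_iterations=100)
mean_1000, sd_1000 = internal_variance(n_samples=100, n_iterations=1000)

# Ratio of spread to mean: a small value suggests low internal variance.
ratio_100 = sd_100 / mean_100
```

In this toy setting the spread of the experiment means shrinks as the number of iterations grows. The real model need not behave in the same way, so the figures in Table 7.5 remain the reference.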

With this structure, the results of the model can be compared with average real-world data on the implementation of the SIP. Table 7.6 below shows the results from the first phase of the SIP (pilot run) in Lagos and Kaduna states. The data represent over 200 schools with corresponding teachers, headteachers and School communities.

Table 7.6 Comparison of Composite Survey evaluations and SIPM results for 2012

| SIP component | Lagos: CS1 data | Lagos: Model results | Lagos: Std_error | Kaduna: CS1 data | Kaduna: Model results | Kaduna: Std_error |
| --- | --- | --- | --- | --- | --- | --- |
| Effective headteachers | 8.2% | 8.4% | 1.40 | 8.5% | 9% | 1.79 |
| Functional SBMCs91 | 14.4% | 11.3% | 1.26 | 28% | 31% | 2.51 |
| School development plans | 8.7% | 10.5% | 0.29 | 1.4% | 1.2% | 0.23 |
| Quality Schools | 7.4% | 9.8% | 0.59 | 1% | 2% | 0.37 |

These results indicate that the underlying mechanics of the model capture the dynamics of stakeholders and resources in the system relatively well, and the standard errors sit within a reasonable range, which is encouraging. Notwithstanding this, Dyke (1981) provides useful lessons on computational simulations modelled on human systems. The study notes that even when models produce desirable results, they may still exhibit one or both of two potential errors: methodological and heuristic. Methodological errors capture cases where model processes become over-elaborated in an effort to replicate the real system. The consequence is that, as the model grows more complex, it becomes increasingly difficult to ascertain the precise effect of each element. Heuristic errors centre on the assertion that the model is an accurate representation of the system being studied. Every modeller must make assumptions in the process of building simulations, but it is important to note that observed similarities between the two systems do not necessarily mean that the same underlying principles are at work. They simply mean that there is some degree of logical similarity between the processes.

With these potential errors in mind, additional measures have been taken to reduce uncertainty. Specifically, the SIPM is re-calibrated using a range of assumptions (with extreme and slight variations) to see how the output is affected by varying the assumptions underpinning the model algorithms (see subsequent section). This approach is useful because the "artificial data" generated from models tend to be less fragmented and more complete than "real world data".92 This often means that they can be used for more robust analysis. After this treatment, any similarities still exhibited between the SIPM results and the real-world data (from the Composite Surveys) would be good indicators that, within the limitations of the simulated system, the model can produce results consistent with the observed phenomena. As such, commonalities within the components can be identified and studied in a more complete manner. Ultimately, however, every model depends greatly on the judgement and experience of the modeller. One way of mitigating potential oversights by the modeller is to adopt a collaborative approach (as is the case for this research) whereby the stakeholders in the system and education practitioners are actively engaged throughout the modelling process. In this way they can act as a check on the validity of the model at each stage.

91 School communities

92 This is a reasonable expectation because the rules and behaviours are more explicitly thought out, defined and then encapsulated in models.

b) Sensitivity tests

As was carried out on the baseline state of the model, the first phase is also tested for internal variation of parameters that influence the SIPM results.

The first test investigates the effects of changing the weights in the attributes of teachers. In this case (see Table 7.7 below), the weight is distributed between the teachers and trainers. The relative influence of each component is represented here using different increments: larger step sizes at the extreme ends of the spectrum, and smaller step sizes in the area of interest (closer to the middle). As mentioned above, even with the same inputs, probabilistic models can produce varying results over different runs, and so 1000 iterations are conducted and the mean is taken as a single run. The same sensitivity test is performed on all the agent attributes, and every time step allows for a probability of change in a relationship/attribute status: improve, worsen or stay the same. To execute this, a uniformly distributed random variable U (between 0 and 1) is generated at each time step for each agent relationship and attribute. This is then compared with the change probability vector in the following way. If U < P_improve, the relationship status increases. Otherwise, if U ≥ P_improve and U < (P_improve + P_stay), the relationship status remains the same. Otherwise, the relationship status decreases in the next time step.
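The three-way rule above can be sketched in a few lines. The comparison of U against the change probability vector follows the text; the function names, the unit `step` by which a status moves, the starting status and the example probabilities are all assumptions made for illustration and are not taken from the SIPM.

```python
import random


def next_status(status, p_improve, p_stay, u, step=1):
    """Apply the three-way rule to one relationship/attribute status.

    `u` is a draw from Uniform(0, 1). The unit `step` is an assumption
    for illustration only.
    """
    if u < p_improve:
        return status + step   # improve
    if u < p_improve + p_stay:
        return status          # stay the same
    return status - step       # worsen


def simulate(status, p_improve, p_stay, n_steps, seed=0):
    """Draw a fresh U at every time step and record the status history."""
    rng = random.Random(seed)
    history = [status]
    for _ in range(n_steps):
        status = next_status(status, p_improve, p_stay, rng.random())
        history.append(status)
    return history


# Hypothetical probabilities: 0.3 improve, 0.5 stay, 0.2 worsen.
trajectory = simulate(status=5, p_improve=0.3, p_stay=0.5, n_steps=10)
```

Because the probability of worsening is implied as 1 - P_improve - P_stay, only two of the three probabilities need to be supplied.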

Table 7.7 Weightings sensitivity analysis (attribute changes from the training process)

| Weighting for teachers | Weighting for trainers | Mean learning (teachers) | Mean competence level of teachers | Std_dev |
| --- | --- | --- | --- | --- |
| 1% | 99% | -14.68 | -3.78 | 0.023 |
| 10% | 90% | 22.07 | 5.41 | 0.016 |
| 30% | 70% | 41.21 | 12.03 | 0.017 |
| 50% | 50% | 87.16 | 24.19 | 0.118 |
| 60% | 30% | 31.22 | 19.89 | 0.173 |
| 80% | 20% | 18.23 | 43.09 | 0.333 |
| 99% | 1% | -8.47 | -0.11 | 0.221 |

The results from this test show that the distribution of these weightings matters significantly. Each behaviour (in this case learning) is defined by a fine, intricate and weighted balance of attributes of the recipients and providers of training programmes. For example, on the teacher side, the learning process depends on their level of inquiry (a product of self-efficacy and support for the programme) and the quality of their interactions with their trainers. On the government side, the abilities of the trainer and the amount of investment in the capacity development process drive the improvement in teaching and learning. The teachers who participate in the training are expected to learn and step down their training, and so knowledge spreads through the system. Table 7.7 demonstrates that at the extreme ends of the spectrum, when one half of the equation is weighted as more or less null, there is a negative effect across the board. This is a reasonable expectation, as both sides must be actively engaged in the process for the SIP to be implemented. Closer to the middle, the shifts in the weightings contribute differently to overall teaching competence and learning. As such, after an initial examination and elimination of the boundary conditions, the choice of the most appropriate distribution is a subjective one which relies on observations from the field.

Consultations with key informants and subject matter experts provided more clarity on the degree to which teachers could develop capacity on their own, and how much external support could substantially produce positive shifts in the system. There was unanimous consensus that efforts on the part of individual teachers could only go so far, and that improvements in teaching and learning would greatly depend on the level of resources invested. Results from the Composite Surveys also validate this choice. Consequently, a distribution ranging from 1:9 to 3:7 towards the trainers/government was considered reasonable (highlighted in the shaded segment of Table 7.7). Furthermore, this distribution in the SIPM more accurately reflects the figures reported in that evaluation. Overall, this test highlights that agents in the model are very responsive to slight variations in assumptions, thus emphasising the necessity of treating the model results with great care.

Table 7.8 below summarises the sensitivity analyses carried out for the number of time periods. This test helps in determining the rate at which each attribute in the system changes over the course of a time step; change can occur in a positive or negative direction. In this case, the SIPM references Mital (2015) and sets the step size based on the amount of time it takes an attribute to reach its maximum level (a subjective decision that is not necessarily the same for all attributes). Here, the step size is determined by dividing the maximum value of the attribute by the time it takes to reach that value from zero. On average, a typical academic year runs from September to July (~9.5 months structured in three academic terms). The temporal scale for the SIPM accounts for the early phase of the intervention dedicated to training the SSOs/Master-trainers who would step down training to teachers. This phase occurs prior to the start of the school term. Accordingly, the SIPM is extended to account for the cascading processes of training, and the time horizon for the first phase of the SIPM is set at 10 months of the academic year.

Table 7.8 Time-period sensitivity analysis

| No of time periods | Step size | Headteacher effectiveness levels |
| --- | --- | --- |
| 2 | 5.1 | 10.16 |
| 6 | 1.7 | 7.58 |
| 15 | 0.7 | 7.13 |
| 25 | 0.4 | 6.98 |
| 50 | 0.2 | 6.91 |
| 100 | 0.1 | 6.88 |
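The step-size rule (maximum attribute value divided by the time taken to reach it) can be checked against the values in Table 7.8 with a short calculation. The maximum level of 10.2 used below is an assumption inferred from the table (a step of 5.1 over 2 periods); it is not stated in the text, and the function and variable names are hypothetical.

```python
def step_size(max_value, n_periods):
    """Step size = maximum attribute value / number of periods needed to reach it from zero."""
    return max_value / n_periods


# Assumption: maximum attribute level inferred from Table 7.8 (5.1 x 2 periods).
MAX_LEVEL = 10.2

# Step sizes as reported in Table 7.8, keyed by number of time periods.
reported = {2: 5.1, 6: 1.7, 15: 0.7, 25: 0.4, 50: 0.2, 100: 0.1}

# Recompute each step size and round to one decimal place, as in the table.
computed = {n: round(step_size(MAX_LEVEL, n), 1) for n in reported}
```

Under that assumed maximum, the recomputed values agree with the reported step sizes to one decimal place, which is consistent with the rule described above.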

Here, the change in a variable is represented by the time step. As the number of time periods increases, the size of each step decreases. This is because an attribute in this model can change from zero to its maximum or minimum level, and is expected to do so within the time horizon of the model. Therefore, a larger step is expected when the time horizon is divided into fewer periods, and as the number of periods grows, the magnitude by which the state of a variable changes at each step is expected to reduce. As the model is extended to the second and third phases of ESSPIN (more in the next chapter), time steps will vary within the time horizon to account for a greater frequency of interactions; during periods of high activity, time steps will be further divided into weeks rather than months. For example, the time step for curriculum development and capacity development processes for the school support officers/Master-trainers (occurring peripheral to the school) may be taken as one month, whereas the activities taking place within the school involving teaching, stepping down, community meetings, professional development sessions and overall school administration are modelled on a weekly scale. Further down the line, during term breaks/holidays, a time step may represent a month.

As is the case with all models, the SIPM is a simplified representation of reality. It attempts to illuminate the implicit assumptions, rules and strategies stakeholders use, and make them more overt. The modelling exercise demonstrates that not all variables are of equal importance. Observations from the sensitivity tests show the points of inflection for the SIPM. It is clear that decisions made at this stage go on to shade the results as the model is scaled up and out. As such, the key informants and subject matter experts offer one of the most reliable validation checks, which is then triangulated against the programme reports. On a more general note, these tests highlight the strengths and limitations of this research approach: while it is conceivably possible to build a nuanced and context-specific model depending on the quality of data available, some of the processes that produce these changes are so indirect that it is often a challenge to recreate them in exact detail. Overall, this observation is a useful one because it allows for some freedom in analysing acceptable states further down the line without precisely representing the exact threads that weave together to create the effects.

This concludes the model design and development phase of the research. With this foundation, the model can be extended for the next roll-out phases of the SIP in Lagos and Kaduna states using data points for 2014 and 2016. The process has involved building, testing and validating the SIPM to ensure results can be treated with confidence. Taken in conjunction with the analytical framework presented in Chapter 6, analysis of simulations and results of the SIPM can be applied to the ESSPIN intervention, and its impact over the subsequent years.