Data reporting strategy for SDG 4 learning outcome indicators

1. Setting a strategy to measure learning outcomes

1.3. Data reporting strategy for SDG 4 learning outcome indicators


Since there is no perfect solution, the approach is one long-term reporting strategy with a series of short-term interim stepping stones. Each country’s needs can be addressed by adopting a portfolio approach that offers a menu of reporting tools and is sensitive to country specificities. Because the UIS is working on both interim/immediate and long-term solutions, reporting can remain practical along the road to reaching the most “perfect” or comparable datasets.

The workflow is organized in such a way as to take two time perspectives into consideration:

Long term

The objective is to allow the existing diversity of tools (depending on each case) to be used for reporting on the same scale, based on a linking strategy that enables countries to use the same threshold as a reference, with a minimum set of procedures to ensure data integrity.

Interim/immediate

The objective is to maximise country reporting using national or cross-national initiatives that they have conducted or participated in, but that are not yet globally comparable. The UIS will footnote these criteria for short-term reporting.

1.3.1 Work programme

An ideal programme for reporting will have gone through three steps: a conceptual framework, a methodological framework and a reporting framework, as described in Table 1.2. Each of these contains several complex sub-steps. For various levels and types of assessment, much of this work has already been done and the focus of the work is restricted to some specific dimensions depending on the indicator.

Conceptual framework

The design of an assessment/survey is defined by its purpose and by decisions on what to measure and how to measure it. The decisions made in this phase affect the possibilities of what can be done with the data collected.7 The main questions in terms of comparing different assessment results are:

• What is the construct (for instance, reading or mathematics) and which skills/abilities are measured? For example, depending on the curriculum, a national assessment usually has different content coverage for a given grade than the assessment of another country.

• What population is included? An age- or grade-based school assessment programme does not necessarily assess all children, even within a school, as some may be excluded from the assessment or simply not attend school regularly. The challenge is more serious if a large proportion of children and youth are not enrolled in school.

7 Key dimensions for comparing the designs of assessments are purpose, target population, test construction, domains, potential inferences, sampling procedures and mode of assessment.

Implications for global comparability

The requirement is to define a minimum content alignment in compliance with a global content framework of reference, defining specific skills/abilities that are important for students to learn in order to function well in their communities and later in life in terms of employment.

Definitions for populations are more difficult and depend on political decisions. The sample should be at least representative of in-school children.

Methodological framework

There are many operational issues that affect both quality and comparability. Since SDG 4 data cover many countries and include many different initiatives, it is essential to define some minimum good practices for assessment programmes to follow while respecting national authority and autonomy.

Key questions in terms of comparability are:

• Will the sample framework provide results that are valid for the population of the country? The nature of the sample is critical for the validity of the assessment programme as a measure of student learning progress at the country level, independent of any considerations of international consistency.

• Will the operational design and data generation be reliable? Robust, consistent operations and procedures are an essential part of any large-scale survey to maximise data quality and minimise the impact of procedural variation on results.
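The first question above can be illustrated with a small calculation. The following is a minimal sketch, assuming each sampled student carries a design weight equal to the number of students in the population that he or she represents. The function name `weighted_share_at_or_above`, the scores, the weights and the threshold of 400 are all hypothetical and are not part of any UIS procedure; the sketch only shows why ignoring the sample design can distort a country-level result.

```python
from typing import Sequence


def weighted_share_at_or_above(
    scores: Sequence[float],
    weights: Sequence[float],
    threshold: float,
) -> float:
    """Estimate the population share scoring at or above `threshold`.

    Each weight is assumed to be the number of students in the population
    represented by the sampled student (a hypothetical design weight).
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    at_or_above = sum(w for s, w in zip(scores, weights) if s >= threshold)
    return at_or_above / total


# Illustrative data only: two students from an over-sampled group
# (weight 10 each) and two from an under-sampled group (weight 40 each).
scores = [520.0, 480.0, 390.0, 410.0]
weights = [10.0, 10.0, 40.0, 40.0]
threshold = 400.0  # hypothetical threshold on a hypothetical scale

unweighted = sum(s >= threshold for s in scores) / len(scores)
weighted = weighted_share_at_or_above(scores, weights, threshold)
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

In this invented example the unweighted share overstates the result because the higher-scoring group is over-represented in the sample; the weighted estimate corrects for that.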

Implications for global comparability

There are two aspects to consider:

• Procedural alignment by complying with a minimum set of good practices on how the test was developed and how the data were collected and used in the development of the assessment.

• A variety of tools could serve to inform a given indicator. In some cases, it will be necessary to generate these tools as global public goods.

Reporting framework

Each assessment uses different standard-setting approaches to build levels of performance so that the scores can be classified in different categories. For education systems participating in the same cross-national learning assessments, results are comparable, but results are not comparable across different cross-national learning assessments or between national assessments.

Table 1.2 Key phases in an assessment programme

Phase | What it addresses | Main components
Conceptual framework | What and who to assess? | Assessment/survey framework (cognitive, non-cognitive and contextual); target population
Methodological framework | How to assess? | Test design; sampling frame; operational design; data analysis
Reporting framework | How to report? | Defining scales; benchmarking; defining progress

Source: UNESCO Institute for Statistics (UIS).

Figure 1.1 Interim reporting of SDG 4 indicators

From the point of view of reporting, there are two critical points. The first refers to linking and the second to the definition of the minimum proficiency level. Linking is the general term for relating scores on one test or form to scores on another. One classification of linking methods distinguishes equating, test calibration, projection and moderation; another distinguishes equating, scale aligning and predicting. It is important to moderate differences between tests that were designed for completely different purposes and to express their results in a way that allows some degree of comparability on the same scale. This procedure, in turn, would allow fair inferences about the subjects (countries) compared.

The second point refers to the definition of the minimum proficiency level: what is the minimum set of contents and abilities that each child should master? The SDG indicators are bringing to the table a concept not yet discussed in many countries.
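To make the two points concrete, the following is a minimal sketch of one of the simplest forms of linking, linear (mean-sigma) linking, followed by a count of students at or above a minimum proficiency level on the target scale. It assumes that both tests were taken by equivalent groups of students, which is a strong assumption, and it is not presented as the method used by the UIS. The function name `linear_link`, the score lists and the threshold of 475 are hypothetical.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def linear_link(
    scores_x: Sequence[float], scores_y: Sequence[float]
) -> Callable[[float], float]:
    """Return a function mapping a score on test X onto the scale of test Y.

    Linear linking matches the mean and standard deviation of the two
    score distributions: y = mean_y + (sd_y / sd_x) * (x - mean_x).
    It assumes the two groups of test takers are equivalent.
    """
    mean_x, mean_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    if sd_x == 0:
        raise ValueError("scores on test X must vary to define a linear link")
    slope = sd_y / sd_x
    return lambda x: mean_y + slope * (x - mean_x)


# Illustrative data only: a national test (X) and a reference scale (Y).
scores_x = [10.0, 20.0, 30.0, 40.0, 50.0]
scores_y = [400.0, 450.0, 500.0, 550.0, 600.0]
to_y = linear_link(scores_x, scores_y)

# Hypothetical minimum proficiency level expressed on the Y scale.
minimum_proficiency_level = 475.0
linked = [to_y(x) for x in scores_x]
share = sum(s >= minimum_proficiency_level for s in linked) / len(linked)
print(f"share at or above the minimum proficiency level: {share:.2f}")
```

Once scores are expressed on a common scale, a single threshold can be applied to all of them. The quality of that comparison depends entirely on how defensible the link is, which is why tests designed for very different purposes call for the weaker forms of linking named above rather than equating.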

Implications for global comparability

1. Alignment of results which are linked to a definition of a global point of reference as specified in each of the assessments. The solution demands flexibility and needs dialogue about critical issues, such as what each child must learn and what the minimum is.

2. Different approaches have been proposed. They all have different implications in terms of ownership, policymaking, financial costs and pedagogical implications for teachers. The way forward lies in a hybrid that embeds a portfolio approach.

3. Interim/immediate reporting starts with cross-national assessments with the comparability they permit, and all other initiatives are reported by highlighting the lack of cross-national comparability in footnotes.
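The interim reporting rule in point 3 can be sketched as a simple data structure. This is an illustration under stated assumptions, not a description of the UIS database: the class `ReportedValue`, the function `footnote_for`, the footnote wording and the example records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical footnote wording, for illustration only.
COMPARABILITY_FOOTNOTE = (
    "Based on an assessment that is not cross-nationally comparable."
)


@dataclass(frozen=True)
class ReportedValue:
    country: str
    source: str            # name of the assessment (hypothetical here)
    cross_national: bool    # True if the source is a cross-national assessment
    share_at_minimum: float  # share of students at or above the minimum level


def footnote_for(value: ReportedValue) -> Optional[str]:
    """Return the comparability footnote for a value, or None if not needed.

    Values from cross-national assessments are reported as they are;
    all other values are reported with a footnote on comparability.
    """
    return None if value.cross_national else COMPARABILITY_FOOTNOTE


records = [
    ReportedValue("Country A", "Regional assessment (hypothetical)", True, 0.62),
    ReportedValue("Country B", "National assessment (hypothetical)", False, 0.71),
]

for record in records:
    note = footnote_for(record)
    suffix = f" [{note}]" if note else ""
    print(f"{record.country}: {record.share_at_minimum:.0%}{suffix}")
```

Keeping the footnote as a function of the source type, rather than storing it by hand, means every value is reported and none can silently lose its comparability warning.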

1.4 HOW DOES A COUNTRY REPORT

In document SDG 4 Data Digest 2018: Data to Nurture Learning (Page 31-33)