DRAFT EBRD Evaluation Department Guidance Introductory Notes

Introductory Notes

The Evaluation Department (EvD) at the EBRD evaluates the performance of the Bank’s completed projects and programmes relative to objectives in order to perform two critical functions: reinforcing institutional accountability for the achievement of results; and, providing objective analysis and relevant findings to inform operational choices and to improve performance over time. EvD reports directly to the Board of Directors, and is independent from the Bank’s Management. Whilst EvD considers Management’s views in preparing its evaluations, it makes the final decisions about the content of its reports.

These guidelines have been prepared by EvD and are circulated under the authority of the Chief Evaluator. The views expressed herein do not necessarily reflect those of EBRD Management or its Board of Directors.

Nothing in this document shall be construed as a waiver, renunciation or modification by the EBRD of any immunities, privileges and exemptions of the EBRD accorded under the Agreement Establishing the European Bank for Reconstruction and Development, international convention or any applicable law.

These draft guidelines were prepared by Keith Leonard, Deputy Chief Evaluator of the EBRD Evaluation Department, and Nick Burke, Evaluation Consultant.

London EC2A 2JN United Kingdom Web site: www.ebrd.com

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

Version Date: 07Apr15

DRAFT – EBRD Evaluation Department Guidance Note

Evaluation Performance Rating

These guidelines will apply to all evaluation in EBRD

Effective date: 1 January 2015 (pilot application in EvD); 1 January 2016 (pilot application for self-evaluation)

Evaluation in EBRD¹

The focus of evaluation is on results – the outputs, outcomes and impacts that flow from the inputs the

European Bank for Reconstruction and Development (EBRD) provides (financing, deal structuring,

technical cooperation (TC), policy dialogue, staff support for implementation, etc.). These inputs allow a

range of activities to be carried out that lead to outputs and outcomes and contribute to wider impacts. As

well as being accountable for performance, the Bank must show that the lessons of past and current

experience are integrated into new activities, thereby demonstrating an active pursuit of improved

performance over time.

Evaluation plays an important role in this by providing a basis for institutional accountability for results, and

objective analysis and findings to inform operational choices and improve performance over time.

Evaluation must provide credible evidence, analysis, and independent judgment, along with

evidence-based findings and recommendations that are relevant, valuable, and actionable.

Further, it is likely that EBRD’s clients will increasingly seek ideas, knowledge and expertise from the Bank

to go with the financing provided. Evaluation provides a valuable source of knowledge on what works,

what doesn’t and why. The importance of policy dialogue in addressing policy impediments to transition is

also likely to grow. Evaluation can provide a basis for evidence-based policy, thereby strengthening the

messages EBRD wishes to convey. Finally, the meaning of transition itself is evolving. A move to more

comprehensive results measurement and results-based management can be served by an evaluation

system that takes account of the full range of results from the Bank’s strategies, policies and projects.

In playing this role, evaluation is fundamentally different from monitoring, although evaluation does use the output of monitoring as part of its evidence base (see Table 1 for a comparison between monitoring and evaluation). The purpose of monitoring is not to measure the ‘amount’ of success achieved. Rather, monitoring takes place during implementation to provide guidance as to whether expected results achievement is on- or off-track, and thereby to provide warning of the need for corrective action. The Transition Impact Monitoring System (TIMS) is the tool with which the Bank monitors results being produced by projects. It does this by tracking progress against a limited set of indicators (called benchmarks in TIMS²) during implementation. Monitoring does not need to track a comprehensive set of indicators as its interest is in seeing whether the project is on- or off-track rather than providing an assessment of the full range of results achieved. Monitoring under TIMS stops when the benchmark is achieved or it becomes clear that it is not going to be achieved. Evaluation comes in after implementation is complete, generally when there is at least one year’s operational data available. The focus of evaluation is on the totality of results attributable (or plausibly in part attributable) to the Bank, whether those results were anticipated or not. Evaluation uses all the evidence available on operational, financial, transition, environmental and social results whether previously monitored or not.

¹ See EBRD (2013). Evaluation Policy. Board approved 16 January 2013. Available at http://www.ebrd.com/what-we-do/evaluation-policy.html

² EvD considers that use of the term benchmark contributes to confusion as in practice the benchmark may be either or both an indicator by which performance will be measured and a target in terms of a timebound level of achievement over a baseline level of performance. It is good practice to keep indicators and targets separate to avoid such confusion.

Figure 1: Terminology used in the Guidance Note

[Figure not reproduced. Labels recoverable from the figure: EBRD Corporate & Strategic Objectives (incl. short-term priorities*); Project Sponsor(s); Project-level Business Objectives; "Objectives determine the scope of the project and the investment decision"; Expected Results: Outputs, Outcomes, Impacts; "Expected results are the anticipated effects of the project on key stakeholders, categorized via the evaluation framework"; Unanticipated Results.]

* For example, crisis

Table 1: Comparison of Monitoring and Evaluation

| Monitoring | Evaluation |
|---|---|
| Takes place periodically during implementation | Takes place one-time when operational |
| Can be used to clarify/modify expected results and associated targets | Takes Board-approved expected results and targets as given – assesses against all stated (or inferred) results whether monitored or not |
| Translates objectives into a limited set of performance indicators that should show movement during implementation | Uses a wider range of indicators and all available evidence to assess realised results |
| Tracks delivery of inputs, conduct of activities and whether expected results achievement is on- or off-track | Assesses achievement of expected and unanticipated results and the causal contribution of activities to such |
| Reports progress to managers and alerts them to problems so that corrective action can be taken | Provides findings, lessons and recommendations to improve future operations |
| | Among its performance attributes, goes beyond results to assess relevance, process and allocative efficiency, and sustainability |
| | To the extent possible, incorporates a counterfactual into performance assessment |

Source: Adapted from International Program for Development Evaluation Training presentation (January 2015)

Why rate performance?

Ratings have been used in EBRD since the creation of the Evaluation Department (EvD) and will continue

to be used for both self-evaluations by operational teams and independent evaluation by EvD. The

reasons why EBRD uses performance ratings in evaluation are:



A robust and consistently applied rating system means performance assessments can be

aggregated to track overall performance as well as that at various levels of disaggregation such

as country, region, sector or other dimensions, which would not be possible without the use of

ratings.³



The use of a rating system is good practice under the Evaluation Cooperation Group (ECG)

Good Practice Standards for the Evaluation of Private Sector Operations.⁴ EvD is committed to

following these standards in evaluation, with customisation to its unique mandate as required.



A rating system provides a structure, consistency and transparency to performance assessment

that may not be present or apparent without the discipline of a rating system.



The use of ratings provides a quick means of checking inter-rater consistency among

evaluators, and between self-evaluation and independent evaluation.



Performance improvement means identifying and reinforcing successful features and avoiding

past problems. The clarity provided by a rating system focuses attention on what did or did not

go according to plan, and why.

How ratings are used

For all evaluations where ratings are used, these seek both to describe performance and to explain it. As

such, the ratings help focus discussions between Banking and EvD, within Management and with the

Audit Committee, on what worked, what didn’t and why.

Aggregate ratings and various levels of disaggregation are reported and discussed in EvD’s Annual

Evaluation Review, which is always considered by the Board. Trends in criteria and sub-criteria ratings are

used to help explain trends in overall, sector and regional performance. Performance ratings may also be

used in special studies.

Purpose of these guidelines

Up until early 2013, guidance on the rating of investment operations was included in the Evaluation Policy.

Amendments to this policy, approved by the Board in January 2013 (see footnote 1), established that

henceforth methodological guidance would be issued in the form of stand-alone guidance notes rather

than being part of the policy. This allowed the policy to focus exclusively on strategic issues. It also means

that guidance can be revised without the need for a revision to the policy, which requires Board approval

and external consultation. This guidance note serves the following purposes:

³ EvD maintains a database of over 800 project performance ratings dating from 1996.

⁴ The ECG brings together the independent evaluation departments of all the main international finance institutions/development banks. More information on the ECG and its Good Practice Standards is available at www.ecgnet.org.



To replace the guidance on project performance rating contained in previous versions of the

Evaluation Policy.



To make changes to ensure that the manner in which performance is assessed remains

relevant to EBRD’s evolving business needs, and that it stays ahead of emerging good practice.



To address some problem areas in the current project performance rating system, including:

lack of clarity and transparency in the definition of criteria and sub-criteria; how these should be

assessed; which to include in the overall rating; and how they are aggregated to derive an

overall rating.



To introduce a more internationally-recognised terminology into the performance rating system

– one that not only serves internal needs but which is also recognisable and understandable

outside the Bank. To the extent possible, the terminology of the Evaluation Network of the

OECD-DAC has been used, with some customisation to EBRD’s particular mandate.⁵ The main

terms of concern are those of inputs, activities, outputs, outcomes and impacts. These are

framed within the OECD-DAC evaluation criteria of relevance, effectiveness (termed ‘results’ in

these guidelines), efficiency, sustainability and impact.



To address the absence of guidelines for rating TC.



To provide the basis for performance rating in those special studies where performance rating is

to be used.

The extent and manner of application of the guidelines in special studies will be outlined in the study’s

approach paper. Many sector and thematic studies produced by EvD (including the evaluation of policies

and strategies) have included performance ratings although no guidance has existed for doing so. A

decision on the use of ratings in special studies, whether only at the criteria level or also an overall rating,

should be made on a case-by-case basis and confirmed in the approach paper prepared at the start of the

evaluation. Whether or not ratings are to be used, the approach paper should outline the basis for

performance assessment using these guidelines as the starting point.

Insofar as these guidelines are used for project performance rating, they will be the basis for

self-evaluation by banking teams in the Operations Performance Assessments (OPAs), for EvD’s validation of

these (OPAVs), and for EvD’s independent evaluations (OEs).

What these guidelines do not cover

These guidelines only cover how to derive a performance rating. They are not a complete guide to the

conduct of evaluation in EBRD. Specifically, they do not cover:



Sources of data, including the roles of quantitative and qualitative data (use of mixed methods)

and the use of triangulation.



The analysis and interpretation of data and performance ratings, and their distillation into

findings that explain performance, lessons and/or recommendations.



Tips for carrying out field investigations.

⁵ See http://www.oecd.org/dac/evaluation/glossaryofkeytermsinevaluationandresultsbasedmanagement.htm for the

These and related aspects are or will be covered in other guidance notes. In time, these will provide a

complete ‘how to’ for evaluation in EBRD.

Basic premises underlying the guidelines



A fundamental premise underlying these guidelines is that the boundary around what is being

evaluated is not narrowly drawn. For example, in the case of an individual transaction the

evaluation boundary is not the transaction itself – rather it is the transaction in the context of

why EBRD is involved. The consequence of this is that overall performance assessment takes

into account the performance of EBRD on selected dimensions as well as performance of the

project itself.



The exercise of evaluator judgment and discretion is an essential and legitimate part of

evaluation in EBRD, particularly given the varied and dynamic contexts in which EBRD

operates; the frequent deficiencies in data availability and reliability; the time and resource

constraints for the conduct of evaluations; and the frequent difficulties in attributing results to

EBRD alone. The guidance outlined in this document is the default approach that should be

followed unless there is compelling reason to vary it. Evaluators may exercise discretion in

applying this guidance provided this is done transparently and, ideally, pre-approved in the

evaluation approach paper. In all significant cases where evaluator discretion has been

exercised, it is mandatory that this be fully transparent in evaluation reports, with

justification provided and the consequences for the performance rating made clear.



Because EBRD is a publicly-owned bank with wider objectives based on its mandate, and

because evaluation is generally carried out one to two years after final disbursement, it is

frequently the case that the impacts, and in some cases outcomes, may not be fully apparent or

measurable (for example, relevant data may not yet exist at the time of evaluation). For these

reasons, the achievement of impacts (and possibly outcomes) sometimes has to be inferred

rather than directly measured. Therefore, it is necessary for evaluators to construct a results

framework for the evaluation even where none exists in approval documents, or where one

does exist but is deficient for the purposes of evaluation. In constructing the results framework,

the evaluator is guided by the objectives and targets that are set out in the approval document

or that can be reasonably inferred from it. The results framework should provide a structure for

the evaluation based around the project’s expected outputs, outcomes and impacts. Using a

‘theory of change’, the evaluator is able to build a plausible case that a project’s expected

results have been, or will be, achieved even if they are not fully apparent or measurable.



Again, because of the limitations posed by data and resource constraints, the assessment of

outcome and impact considers the extent and plausibility of EBRD’s contribution to results,

particularly at the impact level, rather than seeking to establish direct attribution.



Evaluation must be evidence-based – findings, lessons and recommendations must flow

exclusively from the evidence presented and not from evaluator beliefs and personal opinions.



It is strongly preferred that there be a mix of qualitative and quantitative data and analysis

(mixed methods) coming from multiple sources (triangulation). Overall or aggregate

performance should avoid double-counting to the extent possible.



Assessment of performance should generally take into account three dimensions: expected

results, being the anticipated outputs, outcomes and impacts (so-called objectives-based

evaluation or ‘before and after’ evaluation); what would have happened without the project (the

counterfactual or ‘with and without’ evaluation); and, as a reality check, performance against

industry standards or other market benchmarks (if not already incorporated in expected results).



Changes to scope during implementation are often needed. Generally, performance is

assessed against the properly-approved revised scope although the effect of the scope-revision

on results may be commented upon. The reasons why scope-change became necessary are

explored under bank handling – if these could not have reasonably been foreseen at approval

and the changes to restore relevance were made in a timely manner, this aspect of bank

handling should be assessed positively, while the reverse situation (inadequate design and/or a

response that was not timely or did not restore relevance) would be assessed negatively.



Because learning is as important as accountability, the criteria considered and evidence

collected for performance assessment must help explain, as well as rate, performance.



A self-evaluation of all TCs is conducted on their completion with the evaluation ratings reported

in EBRD’s institutional scorecard. Transactional TCs are also assessed in the self-evaluation of

the investment operation to which they are attached.



The evaluation framework applies to all evaluations where performance rating is carried out. It

applies to self-evaluation as well as independent evaluation by EvD.

What is new in the guidelines?

The principal changes and refinements to prior practice for performance assessment of projects include:



Closer alignment with OECD-DAC Evaluation Network evaluation criteria as the basic structure

of performance assessment, namely: relevance; effectiveness (termed ‘results’ in this guidance

note); efficiency; sustainability; and impact.



A more explicit, as well as revised, set of sub-criteria.



Application of a results framework consisting of a hierarchy of inputs, activities, outputs,

outcomes and impacts (with definitions derived from OECD-DAC) and the logical connection

between them. This uses a theory of change for assessing performance in (commonly

occurring) situations where results (particularly impacts and sometimes outcomes) are not fully

observable at evaluation, or they are not measurable because data is unavailable or

incomplete.



Eliminating the separate consideration of the achievement of operational objectives, transition

impact, environmental and social objectives and the results of TC, policy dialogue and staff

contribution to capacity development (some of which were not previously routinely considered in

a performance rating). Instead, all anticipated results are now considered under the criterion of

results and identified as outputs, outcomes or impacts. However, to retain continuity, EvD will

derive ratings for transition impact, environmental and social performance, additionality and

sound banking (although the latter was not previously rated), based on a distillation of relevant

findings within the evaluation document.



Creating a clear separation between results that can be observed at evaluation or can plausibly

be inferred, and those where their future achievement is largely speculative. The former are

rated and included in the overall performance rating, while judgments are made about the latter

but there is no rating and so no inclusion in the overall performance rating. Some other changes

have been made to what is included in overall performance rating and what is not.



More explicit attention to unanticipated results (positive and negative) and use of the

counterfactual.



Assessing the plausible contribution of EBRD’s operations to the achievement of outcomes and

impacts, rather than seeking to establish direct attribution, which may not be possible within the

time and resource limitations of evaluation in the Bank.



Adoption of a numeric scoring and weighting system to derive each criterion rating from

component sub-criteria ratings, and the overall performance rating from criteria ratings.⁶



Establishment of clear benchmarks for rating sub-criteria, criteria and overall performance,

whilst recognising the need for some evaluator discretion where observed performance is close

to a boundary.



Incorporation of guidance for the performance rating of TC.



Adoption of six rather than four categories for overall performance rating to allow for a more

granular rating of performance (previously around 80 per cent of performance ratings fell into

the categories of successful and partly successful).



Dropping the word ‘successful’ from rating category descriptors and its replacement where

necessary with ‘satisfactory’ as a less value-laden term.

Evaluation framework

This version of the guidelines covers the performance rating of investment projects and TC. Work is

ongoing to develop the criteria and sub-criteria for performance assessment in sector studies. For other

types of special studies, where ratings are to be used, the criteria and sub-criteria should be established in

the study approach paper. In such cases, performance rating should follow these guidelines to the extent it

is rational and possible to do so. As experience builds up it may be possible to extend guidance to a wider

range of special studies.

Table 2 outlines the evaluation framework’s four criteria for investment projects (with their 19 sub-criteria,

not all of which are always applicable). Only the first three criteria and associated sub-criteria are rated

and included in the overall rating. For self and independent evaluation of investment projects, the default

position is that sub-criteria should have an equal weight in determining their parent criterion rating and,

together, the overall performance rating. However, weights may be varied by evaluators exercising their

discretion in a transparent manner. Table 2 also shows four derived ratings, which reflect the Bank’s

unique mandate: transition impact; environmental and social performance; additionality; and sound

banking.

The guidelines apply the OECD-DAC lens of outputs, outcomes and impacts that are linked to one another by a connecting theory of change. This is different from the way in which EvD has traditionally described results for investment projects where operational results, transition results and environmental and social results all had separate ratings. In order to continue to provide that traditional lens there are four derived ratings. The four match the mandates of the Bank. The first three (D1 to D3 in Table 2) match existing ratings and so their inclusion in this guidance will permit continuity of time-series data (while recognising that there are some differences in the underlying sub-criteria). The fourth derived rating, for sound banking, is new. While it is a core mandate of EBRD, it has not been rated before. Derived ratings D3 and D4 are generated directly and automatically from sub-criteria ratings in Table 2, while D1 and D2 require some evaluator judgment based on a distillation of relevant findings within the results criterion.

⁶ Previously, there was little if any specific guidance on how to rate sub-criteria or derive criteria ratings. The derivation of an overall performance rating was based on an incomplete matrix showing combinations of ratings for four of the seven criteria. This method was largely non-transparent, so that it was unclear how a particular rating had been derived. The method did not permit any determinant analysis of the relationship between sub-criteria, criteria and overall performance.

Table 2: Investment project evaluation criteria and sub-criteria

1. Relevance
    1.1. Strategic relevance
    1.2. Relevance of design
    1.3. Expected additionality
    1.4. Demonstrated additionality

2. Results
    2.1. Achievement of outputs
    2.2. Contribution to expected outcomes
    2.3. Contribution to expected impacts
    2.4. Performance against benchmarks (if relevant)
    2.5. Unanticipated results (positive or negative)

3. Efficient Resource Use
    3.1. Financial performance of project or client
    3.2. Implementation efficiency
    3.3. Bank investment profitability
    3.4. Bank handling
    3.5. Consultant performance (if relevant)

4. Other performance attributes (assessed but not rated)
    4.1. Sustainability of achieved results
    4.2. Client’s contribution
    4.3. Co-financier’s contribution (if any)
    4.4. Innovation features (if applicable)
    4.5. Merit features (if applicable)

Derived ratings
    D1. Transition impact (derived based on evaluator-flagged transition results drawn from 2.1, 2.2, 2.3 and 2.5)
    D2. Environmental and social performance (derived based on evaluator-flagged environmental and social-related results drawn from 2.1, 2.2, 2.3 and 2.5)
    D3. Additionality (rated automatically based on 1.3 and 1.4)
    D4. Sound banking (rated automatically based on 1.2, 3.3 and 3.4)

Table 3 summarises the equivalent performance rating framework for TC, and also shows the differences

between self-evaluation and independent evaluation.⁷ This is to reflect the fact that self-evaluation is

always carried out for individual TCs whereas, apart from the very largest non-transactional TCs, EvD

tends to evaluate TC in clusters defined by a common theme (and in such cases may adopt a broader

range of criteria).

⁷ All TC is self-evaluated (via a Project Completion Report - PCR) whether transactional or not. For investment projects with transactional TC, the TC is not rated separately when independently evaluated by EvD (though the PCR rating would be noted). Rather, the TC outputs and outcomes are incorporated in the results assessment of the transaction as a whole.

Table 3: Technical cooperation evaluation criteria and sub-criteria

1. Relevance
    1.1. Strategic relevance
    1.2. Relevance of design

2. Results
    2.1. Achievement of outputs
    2.2. Contribution to expected outcomes
    2.3. Unanticipated results (positive or negative) ᵃ
    2.4. Sustainability of achieved results ᵃ

3. Efficient Resource Use
    3.1. Bank handling
    3.2. Client’s handling (equivalent of client’s contribution in Table 2)

4. Other performance attributes (assessed but not rated)
    4.1. Donor’s contribution (not required for self-evaluation)
    4.2. Innovation features (not required for self-evaluation)
    4.3. Merit features (not required for self-evaluation)

ᵃ These sub-criteria are rated but are not determinants of the overall rating for TC self-evaluation. Independent evaluators may, however, deem it necessary to consider them in the overall rating.

For self-evaluation of TC, a decision has been made to pre-assign weightings to criteria as follows:

relevance (25%); results (60%); and efficiency (15%).⁸ When validating self-evaluation ratings for

individual TCs, EvD will apply weightings consistent with those used for the self-evaluation. For other

purposes, however, EvD may choose to use different weightings and to include unanticipated results and/or

sustainability as sub-criteria contributing to the overall rating.

Table 4: Four-category rating system for sub-criteria

| Category | Investment Projects (Annex 1) | TC (Annex 2): achievement score (& grey zone) |
|---|---|---|
| Excellent | Performance meets or exceeds the excellent benchmark specified in the Annex. | 75% (65% to 85%). A substantial majority achievement level for the sub-criteria/criteria being assessed. |
| Fully satisfactory | Performance meets or exceeds the fully satisfactory benchmark specified in the Annex. | 50% (40% to 60%). A majority achievement level for the sub-criteria/criteria being assessed. |
| Partly unsatisfactory | Performance meets or exceeds the partly unsatisfactory benchmark specified in the Annex. | 25% (15% to 35%). A minority achievement level for the sub-criteria/criteria being assessed. |
| Unsatisfactory | Performance fails to meet the benchmark for a partly unsatisfactory rating as specified in the Annex. | 0% (no grey zone). Failure to reach even a minority achievement level. |

Other ratings (investment projects and TC):

Not applicable: Certain sub-criteria (e.g., consultant performance in investment evaluations) may not be relevant in all projects and so should be rated not applicable. No score is attributed to ratings of not applicable and so the sub-criterion has no effect on the synthesis criterion or on the overall performance rating.

No opinion possible: Where there is insufficient evidence to assign a rating (for example due to the premature closure of a project before any meaningful data could be collected or inferred), a rating of no opinion possible may be assigned, though this should be a last resort and in most cases should prompt further evaluative research or fieldwork. No score is attributed to ratings of no opinion possible and so the sub-criterion has no effect on the synthesis criterion or on the overall performance rating.

⁸ However, other weightings can be applied, including equal weightings. Evaluators may wish to vary weights according to the type of TC, for example transactional versus non-transactional, or for policy dialogue where a higher weighting for relevance and efficiency might be appropriate since results may take many years to materialise.

Sub-criteria ratings

Sub-criteria ratings are the basic building blocks, from which criteria and overall performance ratings are

derived. Sub-criteria and criteria are rated using a four-category system as shown in Table 4. Guidance on

applying the rating categories for each sub-criterion is given in Annexes 1 and 2 for investment and TC

evaluations respectively.

In recognition of the fact that boundaries between rating categories are almost never certain given data

limitations and the degree of judgment necessary, the evaluator is permitted some discretion where the

evidence points to a rating close to a threshold. For example, in TC evaluations, such discretion is allowed

should the achievement level and system score (per progress or completion reports) fall within the ‘grey

zone’ and therefore close to a boundary. The rating decision in such cases should be based on

clearly-stated, objective considerations.
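As an illustration of the grey-zone idea for TC, the following minimal Python sketch flags when an achievement level sits close to a boundary. It rests on assumptions that the guidance does not state explicitly: the Table 4 percentages (75%, 50%, 25%, 0%) are read as lower-bound thresholds, the grey-zone bounds are treated as inclusive, and the names `BANDS`, `default_category` and `grey_zones` are hypothetical. It is not the rating tool itself, and the final rating decision remains a matter of evaluator judgment.

```python
# Table 4 (TC column): category, benchmark score (%), grey zone (%).
# Assumption: each benchmark is a lower-bound threshold for its category.
BANDS = [
    ("excellent", 75, (65, 85)),
    ("fully satisfactory", 50, (40, 60)),
    ("partly unsatisfactory", 25, (15, 35)),
    ("unsatisfactory", 0, None),  # Table 4 lists no grey zone here
]


def default_category(achievement_pct: float) -> str:
    """Return the category whose benchmark the achievement level meets or exceeds."""
    if not 0 <= achievement_pct <= 100:
        raise ValueError("achievement level must be between 0 and 100")
    for name, benchmark, _zone in BANDS:
        if achievement_pct >= benchmark:
            return name
    raise AssertionError("unreachable: the last benchmark is 0")


def grey_zones(achievement_pct: float) -> list[str]:
    """List the categories whose grey zone contains the achievement level.

    Assumption: grey-zone bounds are inclusive at both ends.
    """
    return [
        name
        for name, _benchmark, zone in BANDS
        if zone is not None and zone[0] <= achievement_pct <= zone[1]
    ]


# A 70% achievement level defaults to fully satisfactory under these
# assumptions, but lies in the grey zone around the excellent benchmark,
# so the evaluator may exercise discretion and must state the reasons.
category = default_category(70)
near = grey_zones(70)
```

Under this reading, a non-empty result from `grey_zones` is only a prompt for the evaluator to document the considerations behind the rating chosen; it does not decide the rating.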

Derivation of criteria ratings

To assist in rating decisions, a rating tool has been developed and is embedded in revised evaluation templates. This tool provides recommended criteria and overall performance ratings based on sub-criteria ratings and weights. The recommended criteria ratings are based on the following algorithm:



• A score (s) is assigned for each sub-criterion rating as follows: excellent = +2.0; fully satisfactory = +0.5; partly unsatisfactory = -0.5; unsatisfactory = -2.0.⁹

• For investment projects, sub-criteria are weighted high, medium or low, with weightings (w) of x2.0, x1.5 and x1.0 respectively. (Accordingly, sub-criteria weighted as high have twice the influence of those weighted low.)

• For TC, sub-criteria are weighted by percentages, such that their combined weightings sum to 100 per cent.

• For each criterion, the weighted average score ($cs_{avg}$) is calculated from the underlying sub-criteria ratings using the equation: $cs_{avg} = \sum ws / \sum w$

• The criterion rating is then determined from $cs_{avg}$ based on the following table:

⁹ Note that the scoring scale is not linear. This is because ratings of either excellent or unsatisfactory tend to reflect project performance at the positive or negative extremes, which is better captured by a non-linear scale, i.e. one having a slight tail at the low and high end.


Table 5: Determination of criterion ratings

Weighted average score of sub-criteria ($cs_{avg}$)     Criterion rating
$cs_{avg} > 0.9$                                        Excellent
$0.0 < cs_{avg} \le 0.9$                                Fully satisfactory
$-0.9 \le cs_{avg} \le 0.0$                             Partly unsatisfactory
$cs_{avg} < -0.9$                                       Unsatisfactory
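The scoring steps and the Table 5 bands can be sketched in a few lines of Python. This is an illustrative sketch only, not the rating tool embedded in the evaluation templates: the names SCORES, WEIGHTS, UNSCORED, weighted_average_score and criterion_rating are hypothetical, the example ratings are invented, and the sketch assumes that ratings of not applicable or no opinion possible are simply left out of both sums.

```python
# Scores per sub-criterion rating, as listed in the guidance.
SCORES = {
    "excellent": 2.0,
    "fully satisfactory": 0.5,
    "partly unsatisfactory": -0.5,
    "unsatisfactory": -2.0,
}
# Weight multipliers for investment projects (TC would pass percentages instead).
WEIGHTS = {"high": 2.0, "medium": 1.5, "low": 1.0}
# Ratings that carry no score and are excluded from the average.
UNSCORED = {"not applicable", "no opinion possible"}


def weighted_average_score(sub_criteria):
    """Return sum(w*s) / sum(w) over scored sub-criteria, or None if none are scored.

    `sub_criteria` is a list of (rating, numeric_weight) pairs.
    """
    scored = [(SCORES[r], w) for r, w in sub_criteria if r not in UNSCORED]
    if not scored:
        return None
    return sum(s * w for s, w in scored) / sum(w for _, w in scored)


def criterion_rating(cs_avg):
    """Map a weighted average score to a criterion rating using the Table 5 bands."""
    if cs_avg > 0.9:
        return "excellent"
    if cs_avg > 0.0:
        return "fully satisfactory"
    if cs_avg >= -0.9:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Hypothetical relevance criterion with four sub-criteria, one of them not applicable.
relevance = [
    ("excellent", WEIGHTS["high"]),
    ("fully satisfactory", WEIGHTS["medium"]),
    ("partly unsatisfactory", WEIGHTS["low"]),
    ("not applicable", WEIGHTS["low"]),
]
cs_avg = weighted_average_score(relevance)  # (4.0 + 0.75 - 0.5) / 4.5
print(round(cs_avg, 2), criterion_rating(cs_avg))
```

Because weights are passed as plain numbers, the same function would accept TC percentage weights without change.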

Derivation of overall performance rating

For the rating of investment projects, framework agreements and TC, only the first three criteria in Tables 2 and 3 (i.e., relevance, results and efficiency) are used for the derivation of the overall performance rating. For special studies, the decision as to what to include in the overall performance assessment is made on a case-by-case basis. The recommended overall performance rating is derived as follows:



• As for criteria ratings, the rating tool should be used to determine the overall performance rating. The overall performance score ($PS_{avg}$) is determined from the weighted average of all sub-criteria scores using the following equation: $PS_{avg} = \sum ws / \sum w$

• A six-category rating system is applied to produce the overall project performance rating.¹⁰ This is determined from $PS_{avg}$ as shown in Table 6 below.

Table 6: Determination of the overall project performance rating

Weighted average score of all sub-criteria ($PS_{avg}$)     Overall performance rating
$PS_{avg} > 0.9$                                            Outstanding
$0.45 < PS_{avg} \le 0.9$                                   Good
$0.0 < PS_{avg} \le 0.45$                                   Acceptable
$-0.45 \le PS_{avg} \le 0.0$                                Below standard
$-0.9 \le PS_{avg} < -0.45$                                 Poor
$PS_{avg} < -0.9$                                           Very poor

Where the recommended performance rating is shown to be close to a boundary, evaluator discretion can still be exercised, though the justification for this needs to be provided in the accompanying text.
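As a companion illustration, the Table 6 bands can be expressed as a simple lookup. The sketch assumes that sub-criteria scores and numeric weights have already been assigned; performance_score and overall_performance_rating are hypothetical names, and the five sub-criteria in the example are invented. A result close to a band edge would still call for the evaluator judgment described above.

```python
def performance_score(scored_sub_criteria):
    """Weighted average of all sub-criteria scores: sum(w*s) / sum(w).

    `scored_sub_criteria` is a list of (score, weight) pairs drawn from the
    relevance, results and efficiency criteria.
    """
    total_weight = sum(w for _, w in scored_sub_criteria)
    if total_weight == 0:
        raise ValueError("at least one weighted sub-criterion is required")
    return sum(s * w for s, w in scored_sub_criteria) / total_weight


def overall_performance_rating(ps_avg):
    """Map the overall performance score to the six Table 6 categories."""
    if ps_avg > 0.9:
        return "outstanding"
    if ps_avg > 0.45:
        return "good"
    if ps_avg > 0.0:
        return "acceptable"
    if ps_avg >= -0.45:
        return "below standard"
    if ps_avg >= -0.9:
        return "poor"
    return "very poor"


# Hypothetical investment project: (score, weight) for five sub-criteria.
project = [
    (0.5, 2.0),   # strategic relevance: fully satisfactory, high
    (0.5, 1.5),   # relevance of design: fully satisfactory, medium
    (2.0, 2.0),   # achievement of outputs: excellent, high
    (-0.5, 2.0),  # contribution to outcomes: partly unsatisfactory, high
    (0.5, 1.0),   # financial performance: fully satisfactory, low
]
ps_avg = performance_score(project)  # 5.25 / 8.5
print(round(ps_avg, 2), overall_performance_rating(ps_avg))
```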

Applying the rating framework

The detailed characteristics of the evaluation sub-criteria and criteria as they apply to investment operations and TC are described in Annexes 1 and 2 respectively. Each EvD corporate or thematic evaluation approach paper should specify the sub-criteria, criteria and weightings to be used to rate overall performance (if performance is to be rated), ensuring that there is consistency with the principles and guidance outlined in this paper.

¹⁰ Previously, EvD used a four-category rating system for overall performance (highly successful, successful, partly successful or unsuccessful). However, almost 80% were rated as successful or partly successful. The new guidelines allow a more granular rating of performance.

Annex 3 provides an example of a pro-forma results framework by which a project's expected results are viewed through the OECD-DAC lens of outputs, outcomes and impacts, linked to one another by a connecting theory of change.

Annex 4 contains document templates for the self-evaluation OPA, and independent validation OPAV and evaluation OE instruments.


Annex 1: Rating criteria and sub-criteria for investment projects

Annex 1: Detailed guidance on rating sub-criteria and criteria for investment projects

Criteria and Sub-Criteria

Evidence

Rating Benchmarks

Guidance

1. Relevance

1.1 Strategic Relevance

Definition: Assesses the degree of relevance to the Bank’s strategic agenda, both at approval and at evaluation.

The evidence provided for strategic relevance should demonstrate how the project actively helped EBRD to deliver on its policy and strategy intentions, rather than happening to be in alignment (or not being in conflict) with (often very generally-worded) strategies and policies.

Excellent: At approval, the project was closely aligned with and capable of helping deliver on the Bank’s strategic agenda, and has remained aligned with evolving Bank strategy over time.

Fully satisfactory: At approval, the project was reasonably well aligned with the Bank’s strategic agenda and was capable of making some contribution to the Bank’s realisation of its strategic agenda, and this has remained so over time.

Partly unsatisfactory: At approval, the project was only weakly aligned with the Bank’s strategic agenda and was capable of making only a limited contribution to the realisation of this agenda, and/or has limited relevance to evolving Bank strategy since approval.

Unsatisfactory: At approval, the project lacked alignment with the Bank’s strategic agenda, and that has remained the case since approval.

The relevant country and sector strategies are an obvious starting point for assessing strategic relevance, but projects may support other parts of the Bank’s strategic agenda such as, for example, the gender strategy.

Each of the rating benchmarks defines the extent to which, at approval, the project could plausibly have helped the Bank deliver on its strategic agenda.

Projects are often approved under one set of strategies and policies but evaluated under more recently approved ones. The evaluator should note any relevant and significant changes to the strategic agenda and discuss whether the project was more or less aligned as a result of these changes. The performance assessment should consider strategic relevance at approval and at evaluation.


1.2 Relevance of Design

Definition: Focuses on: the soundness of the design logic (implicit or explicit theory of change); the evaluability of the operation (i.e., the adequacy of the specification of expected results and the indicators for measuring their achievement); the adequacy of identification and incorporation of the lessons of past experience; and the adequacy of risk identification and mitigation.

Assess the following design features:

(i) The plausibility of the sources of expected transition impact, i.e. evidence of a sound design logic (theory of change) such that the transaction could plausibly produce this impact.

(ii) The completeness and clarity of specification of expected results (outputs, outcomes and impacts) of the project (including any associated TC and policy dialogue) and the adequacy of the TIMS benchmarks and other indicators.

(iii) Use of past experience and lessons to shape design.

(iv) Whether risk factors have been adequately identified and taken into account in design through identification of coping strategies. As part of this assessment, the assumptions implicit in the theory of change should be spelled out to see if these pose unidentified risks.

Excellent: All four aspects of project design proved to be fully appropriate during implementation.

Fully satisfactory: Three of the four aspects of project design proved to be fully appropriate during implementation.

Partly unsatisfactory: Only one or two of the four aspects of project design proved to be fully appropriate during implementation.

Unsatisfactory: None of the four aspects of project design proved to be appropriate during implementation.
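These benchmarks amount to counting how many of the four design aspects proved fully appropriate. A minimal sketch of that count follows; the function name and the aspect keys are hypothetical labels for the four features listed under evidence, and the sketch assumes that a count of zero corresponds to the unsatisfactory benchmark.

```python
# Hypothetical keys for the four design aspects assessed under this sub-criterion.
DESIGN_ASPECTS = ("design_logic", "evaluability", "lessons_applied", "risk_mitigation")


def design_rating(fully_appropriate):
    """Rate relevance of design from a dict mapping each aspect to True/False."""
    missing = set(DESIGN_ASPECTS) - set(fully_appropriate)
    if missing:
        raise ValueError(f"missing aspects: {sorted(missing)}")
    count = sum(bool(fully_appropriate[a]) for a in DESIGN_ASPECTS)
    if count == 4:
        return "excellent"
    if count == 3:
        return "fully satisfactory"
    if count >= 1:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Invented example: evaluability was weak, the other three aspects held up.
print(design_rating({
    "design_logic": True,
    "evaluability": False,
    "lessons_applied": True,
    "risk_mitigation": True,
}))
```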

Evaluability is defined as the extent to which the expected results of a project are verifiable in a reliable and credible manner.

This analysis focuses in the first instance on whether the inputs provided by EBRD in terms of investment and any supporting TC and/or policy dialogue would plausibly have produced the expected results, particularly transition impact. This requires the evaluator to identify, and assess the validity of, the implicit or explicit theory of change in order to arrive at a judgment as to whether the expected results are realistic.¹¹

All projects, TC, strategies and policies should demonstrably take account of the lessons from past experience whether sourced from EvD findings, the team’s experience or elsewhere. Identification of lessons in the approval document is not a sufficient test – something must have been done differently as a result if there is a relevant lesson.

¹¹ At its most basic level a theory of change explores whether the inputs and activities carried out are necessary and sufficient to produce the expected outputs; whether those outputs are necessary and sufficient to achieve the stated outcomes; and whether those outcomes are necessary for contributing to impact achievement. However, a well-constructed theory of change (which should usefully be illustrated in diagrammatic form) goes beyond this technical logic: (i) to make explicit the embedded assumptions in the project ‘storyline’; and (ii) to identify and take into account contextual factors that could hinder or support success in moving along the chain from inputs to impacts (for example, the political economy dimension). Constructing a theory of change should be a reflective process that looks at the project in its context. A review of the theory of change can be found at http://r4d.dfid.gov.uk/pdf/outputs/mis_spc/DFID_ToC_Review_VogelV7.pdf.


1.3 Expected Additionality – were claims made at approval plausible?

Definition: Describes how the Bank planned to add value by one or more of the following: (i) its financial terms and conditions; (ii) the unique attributes the Bank brought to the project; (iii) inclusion of legal covenants that would not have otherwise been agreed by the client; and (iv) mobilisation of additional commercial finance.

Assess whether, at the time of project approval, the additionality claims in the approval document were plausible. For example, the assessment should consider the availability (if any) of alternative sources of finance other than from EBRD and on what terms.

Excellent: All claims justifying additionality were plausible at the time of approval.

Fully satisfactory: All claims justifying important areas of additionality were plausible at the time of approval.

Partly unsatisfactory: One or more claims justifying important areas of additionality were not plausible.

Unsatisfactory: Most or all claims justifying the Bank’s additionality were not plausible.

Based on the information and knowledge that would have been available at the time of approval (that is, not using the benefit of hindsight) the evaluator will judge whether the claimed additionality was plausible at approval, or not.

1.4 Demonstrated Additionality – was EBRD additional in fact?

At the time of the evaluation, assess the extent to which the operation was additional (whether identified as such at approval or not). In particular, the assessment should look at whether the Bank’s attributes, legal covenants and/or the expected additional commercial financing actually materialised.

Excellent: All aspects of claimed additionality were borne out and/or there were significant unforeseen ways in which the Bank was additional.

Fully satisfactory: All important aspects of claimed additionality were borne out and/or there were unforeseen ways in which the Bank was additional.

Partly unsatisfactory: One or more important aspects of claimed additionality were not borne out.

Unsatisfactory: Most or all aspects of claimed additionality were not borne out.

This assessment focuses on whether there is evidence that the additionality statements were in fact borne out during implementation. For example, did the Bank’s attributes come into play during implementation? Were legal covenants met in full or were some waived? Was supplementary finance actually mobilised and used as intended?

The evaluator should also note and take account of unforeseen ways in which the Bank was additional (for example, capturing an opportunity to engage in policy dialogue that did not exist at approval).


2. Results

2.1 Achievement of Outputs

Definition: The extent to which expected outputs were achieved. For evaluation purposes in EBRD, outputs are defined as the products, capital goods and services which result from an operation. In a departure from past practice, where results were assessed under the criteria of achievement of operational objectives and in some cases achievement of transition impact and environmental and social performance, these guidelines group results according to the OECD-DAC criteria of outputs, outcomes and impacts. See guidance in the right-hand column for what constitutes outputs.

Compares achieved outputs with expectations as at approval, or as revised if any revisions were formally approved during implementation. Not all outputs have equal value in scope and scale with regard to achieving an outcome. Also, some outputs will have quantitative targets while others may only be expressed in qualitative terms. The evaluator should transparently use discretion through, for example, using variable weightings for different outputs in reaching a rating for output achievement.

As achievement of outputs is largely within the control of the project, the boundaries for the rating categories have been raised compared with some other aspects of performance, which are subject to greater risk of realisation. To the extent that the client needs to make specific investments in order to address existing environmental and social issues or to ensure compliance of new facilities (e.g., via an Environmental and Social Action Plan), then the delivery of such should be assessed under outputs.

Assuming a score of 1.0 for an achieved output, 0.5 for one partly achieved, and 0.0 for one not achieved, calculate the aggregate level of achievement. The overall proportion of output achievement (in percentage terms) is then derived from this aggregate score divided by the number of expected outputs. The thresholds defining the rating benchmarks below can then be applied directly. For example, if output achievement was 6.0/8=75%, a fully satisfactory rating would be assigned. However, in making the calculation the evaluator may place more emphasis on certain outputs where these are judged to have particular significance, or vice versa.

Excellent: At least 85% of outputs have been achieved.

Fully satisfactory: At least 60% of outputs have been achieved.

Partly unsatisfactory: At least 35% of outputs have been achieved.

Unsatisfactory: Less than 35% of outputs have been achieved.
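The output calculation described above (1.0, 0.5 or 0.0 per output, divided by the number of expected outputs, then compared with the 85/60/35 per cent thresholds) can be sketched as follows. The function names and the status labels are hypothetical, and the example statuses are invented to reproduce the 6.0/8 case in the text; the sketch treats all outputs as equally important and leaves any extra emphasis to the evaluator.

```python
# Points per output status, following the scoring rule in the guidance.
ACHIEVEMENT_POINTS = {"achieved": 1.0, "partly achieved": 0.5, "not achieved": 0.0}


def output_achievement(statuses):
    """Return the share of output achievement as a fraction between 0 and 1."""
    if not statuses:
        raise ValueError("at least one expected output is required")
    return sum(ACHIEVEMENT_POINTS[s] for s in statuses) / len(statuses)


def output_rating(share):
    """Apply the output benchmarks: 85%, 60% and 35% thresholds."""
    if share >= 0.85:
        return "excellent"
    if share >= 0.60:
        return "fully satisfactory"
    if share >= 0.35:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Eight expected outputs: five achieved, two partly achieved, one not achieved.
statuses = ["achieved"] * 5 + ["partly achieved"] * 2 + ["not achieved"]
share = output_achievement(statuses)  # 6.0 / 8 = 0.75
print(f"{share:.0%}", output_rating(share))
```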

Outputs cover all results that fit the definition in column 1, including any outputs among the TI benchmarks. The Board approval document usually describes outputs in different sections of the text or in annexes, most often with indicators and associated target values. Outputs may be described in any or all of operational objectives, transition impact benchmarks, legal covenants, transactional TC, policy dialogue or capacity building (e.g. in procurement) conducted by staff, and environmental and social action plans, etc. Prior to rating results, and in the absence of an adequate results matrix embodying a theory of change in the Board document, the evaluator prepares a results framework (partial in some cases where information in the approval document is absent). This should identify the project’s outputs (and outcomes and impacts) with appropriate indicators, baselines and associated target values, derived from the information contained in the Board approval documents and its annexes. This results framework should include expected outputs whether or not associated indicators and targets were set – the evaluator will need to formulate these in such situations. The results framework is appended to the evaluation.

If the expected outputs were formally revised during implementation, following procedures in the Operations Manual, the project’s achievements will be assessed against these revised expectations. The reasons why revisions were necessary are considered under Bank Handling and judged positively or negatively accordingly.


2.2 Contribution to Expected Outcomes

Definition: The extent to which expected outcomes described in the approval document or revised subsequently have been achieved, plausibly and significantly as a result of the project.

Outcomes are assessed against the expected results using the associated indicators, their baselines and target values, as described in the approval document or as formally revised during implementation.

Evidence would include: (i) baseline and at-evaluation values of performance indicators; and (ii) other quantitative and qualitative information relevant to the expected outcomes. The rating reflects the project’s incremental contribution to observed outcomes, regardless of whether this represented movement in the right or wrong direction (positive or negative). When a positive outcome is achieved but there is evidence that this is primarily due to other factors, the rating may be adjusted downward accordingly. This requires consideration of the counterfactual, i.e. what would have happened in the absence of the project.

For example, if outcome indicators meet or exceed their target values, but there is evidence that this was due mainly to external factors unrelated to the project, a partly unsatisfactory rating may be warranted.

Assuming a score of 1.0 for an achieved outcome, 0.5 for one partly achieved, and 0.0 for one not achieved, calculate the aggregate level of achievement. The overall proportion of outcome achievement (in percentage terms) is then derived from this aggregate score divided by the number of expected outcomes. The thresholds defining the rating benchmarks can then be applied directly. For example, if outcome achievement was 4.5/8=56%, a fully satisfactory rating would be assigned. However, in making the calculation the evaluator may place more emphasis on certain expected outcomes where these are judged to have particular significance, or vice versa.

Excellent: At least 75% of outcomes have been achieved as a result of the project.

Fully satisfactory: At least 50% of outcomes have been achieved as a result of the project.

Partly unsatisfactory: At least 25% of outcomes have been achieved as a result of the project.

Unsatisfactory: Less than 25% of outcomes have been achieved as a result of the project.
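The guidance allows the evaluator to place more emphasis on particular outcomes but does not prescribe how. One possible way to make that emphasis transparent is a weighted share, sketched below under that assumption; the function names, the weights and the example outcomes are all hypothetical, and the 75/50/25 per cent thresholds are those of the benchmarks above.

```python
POINTS = {"achieved": 1.0, "partly achieved": 0.5, "not achieved": 0.0}


def weighted_achievement(outcomes):
    """Share of achievement for (status, emphasis_weight) pairs, between 0 and 1."""
    total_weight = sum(w for _, w in outcomes)
    if total_weight <= 0:
        raise ValueError("emphasis weights must sum to a positive number")
    return sum(POINTS[status] * w for status, w in outcomes) / total_weight


def outcome_rating(share):
    """Apply the outcome benchmarks: 75%, 50% and 25% thresholds."""
    if share >= 0.75:
        return "excellent"
    if share >= 0.50:
        return "fully satisfactory"
    if share >= 0.25:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Equal emphasis reproduces the 4.5/8 example: three achieved, three partly, two not.
equal = [("achieved", 1.0)] * 3 + [("partly achieved", 1.0)] * 3 + [("not achieved", 1.0)] * 2
print(outcome_rating(weighted_achievement(equal)))

# Same outcomes, but one unachieved outcome is judged three times as significant.
emphasised = equal[:-1] + [("not achieved", 3.0)]
print(outcome_rating(weighted_achievement(emphasised)))
```

With equal emphasis the share is 4.5/8; tripling the weight of one unachieved outcome lowers it to 4.5/10, which moves the recommended rating down one category. Any such weighting would need to be stated and justified in the evaluation text.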

For evaluation purposes in EBRD, outcomes are defined as the short-term and medium-term effects directly attributable to delivery of the operation’s outputs. Outcomes should be discernable by project completion.

The results framework will include expected outcomes, their indicators, baselines and target values, as described or implied in the Board document along with any changes approved during implementation and following Operation Manual procedures.

Statements of expected outcomes are typically found in project documents such as: the Board document; President’s Recommendation; Summary Fact Sheet under ‘Project Description/Business Purpose’; Terms and Conditions under ‘Use of Proceeds’ in Section 3; and in annexes including the environment and social action plan. Also, the table of transition impact benchmarks (Board document) could include indicators and target values that are more appropriate for measuring outcome achievement. As well as being assessed against expectations (‘before and after’ assessment), observed outcomes are compared with a plausible without-project counterfactual (‘with and without project’ assessment). It may not be possible to come up with a scientific or rigorous counterfactual. Rather, the intent is for some consideration of the extent to which observed results would have occurred anyway, and whether they were due to the project.¹²

Conversely, if outcome indicators deteriorated, but there is evidence that the decline would have been worse in the absence of the project, a fully satisfactory rating may still be warranted.

If the expected outcomes, indicators or targets were formally revised during implementation, the project will be assessed against these revised expectations.

For projects involving financial intermediaries, the assessment should consider the extent to which such intermediation reached target groups. If the client’s sub-borrowers cannot be identified, a ‘before and after’ comparison of the intermediary’s portfolio is made to determine whether it increased its exposure to the target group. A theory-based method (e.g., a results framework) is then used to establish plausible causality. This might include evidence, for instance, that the intermediary had improved its marketing, screening, and credit procedures with the intention of increasing its reach to small and medium-sized enterprises.


2.3 Contribution to Expected Impacts

Definition: The contribution at the time of evaluation to the achievement of impacts (including environmental or social impacts) at the project or company level, and at the sector or economy level.

The word ‘contribution’ is used in recognition of the fact that it can be difficult to attribute impacts to a single intervention or indeed group of related interventions. However, evidence must be presented to show a plausible and significant contribution by the EBRD operation to the achieved impact.

Focusing on the impact benchmarks and associated description in the Board approval document (but see guidance), an assessment is made of the realised impact at the time of evaluation. If, as a result of achieving compliance (or not) with safeguards, there are identifiable environmental or social impacts on neighbours of the project (positive or negative), then these should be assessed under this sub-criterion.

Other apparent impacts not identified at approval and not covered by transition impact benchmarks are evaluated under unanticipated results (see below).

The analysis focuses here on what has been achieved (what can be observed at the time of evaluation or what can plausibly be inferred from the evidence available at evaluation). It does not consider the potential for achieving further future impact; this is considered under Sustainability of Results (see below). This separation is made to delineate clearly the observable or plausibly inferable (which is included in the overall performance rating) and the speculative (which is not).

Assuming a score of 1.0 for an achieved impact, 0.5 for one partly achieved, and 0.0 for one not achieved, calculate the aggregate level of achievement. The overall proportion of impact achievement (in percentage terms) is then derived from this aggregate score divided by the number of expected impacts. The thresholds defining the rating benchmarks can then be applied directly. For example, if impact achievement was 4.5/8=56%, a fully satisfactory rating would be assigned.

However, in making the calculation the evaluator may place more emphasis on certain expected impacts where these are judged to have particular significance, or vice versa.

Excellent: At least 75% of expected impacts have been achieved.

Fully satisfactory: At least 50% of expected impacts have been achieved.

Partly unsatisfactory: At least 25% of expected impacts have been achieved.

Unsatisfactory: Less than 25% of expected impacts have been achieved.

For the purposes of evaluation in EBRD, impacts are defined as the positive or negative long-term effects, expected or unanticipated, to which an operation contributes, directly or indirectly, as a result of outcomes.

Given its unique mandate, EBRD has its own definition of transition impact, which is rated separately (see D.1 below). By definition and in practice, transition impact and the benchmarks by which it is monitored in the Transition Impact Monitoring System (TIMS) are, under OECD-DAC terminology, a mix of activities, outputs, outcomes and impacts. Accordingly, the transition impact and benchmarks cited in approval documents will need to be re-categorised as outputs, outcomes and impacts in the results framework for the evaluation. The evaluator should therefore consider here only that part of transition impact that accords with the OECD-DAC definition of impact. Where a transition impact is more correctly an expected output or outcome, it is assessed under 2.1 or 2.2 respectively.


2.4 Performance Against Industry Benchmarks or other Standards

Definition: A comparison of project performance with that of peers in the industry and/or with relevant EU or other standards.

At the time of the evaluation, assess the extent to which the performance achieved meets industry and/or sector norms, and/or EU or other relevant standards.

To the extent that local or EU environmental and social performance standards exceed those of EBRD, then compliance with such standards should be assessed under this sub-criterion. This analysis should refer to market benchmarks, regulatory requirements or standards of good practice that prevail at the time of evaluation, regardless of whether these standards existed at the time of approval. The evaluator should comment, however, on the extent to which such benchmarks have changed significantly over time.

There are two ways in which the rating of this sub-criterion may be determined as reflected below:

Excellent: The project or client is in the top quartile of a relevant peer group, or performance is 50% or more above the industry average, benchmark or standard (that is, 1.5 times the average/benchmark/standard).

Fully satisfactory: The project or client is in the second quartile of a relevant peer group, or performance is at least 100% of the industry average, benchmark or standard.

Partly unsatisfactory: The project or client is in the third quartile of a relevant peer group, or performance is between 50% and 99% of the industry average, benchmark or standard.

Unsatisfactory: The project or client is in the bottom quartile of a relevant peer group, or performance is less than 50% of the industry average, benchmark or standard.
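The two routes to a rating under this sub-criterion, peer-group quartile or ratio to a benchmark, can be illustrated with a short sketch. The function names are hypothetical, and two assumptions are made that the guidance does not state: the metric is one where higher values are better, and any ratio from 50% up to but not including 100% is treated as partly unsatisfactory.

```python
def rating_from_quartile(quartile):
    """Rate by position in a relevant peer group (1 = top quartile, 4 = bottom)."""
    ratings = {
        1: "excellent",
        2: "fully satisfactory",
        3: "partly unsatisfactory",
        4: "unsatisfactory",
    }
    if quartile not in ratings:
        raise ValueError("quartile must be 1, 2, 3 or 4")
    return ratings[quartile]


def rating_from_benchmark(performance, benchmark):
    """Rate by the ratio of achieved performance to the average, benchmark or standard."""
    if benchmark <= 0:
        raise ValueError("benchmark must be positive")
    ratio = performance / benchmark
    if ratio >= 1.5:
        return "excellent"
    if ratio >= 1.0:
        return "fully satisfactory"
    if ratio >= 0.5:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Invented example: a client achieving 120 units against an industry average of 100.
print(rating_from_benchmark(120.0, 100.0))
print(rating_from_quartile(3))
```

For metrics where lower is better (for example emissions intensity), the ratio would need to be inverted before applying the bands.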

This sub-criterion is included to control for cases where different projects might be set more or less challenging approval targets, which can lead to lower or higher achievement ratings. Assessing performance against market or regulatory benchmarks therefore provides a reality check on the robustness of results achievement in the context of the project’s wider operating environment.

For projects where the expected results have targets anchored in market-based benchmarks, less weight should be placed on this particular sub-criterion when synthesising the overall rating of results. Conversely, where there is a significant disconnect between results targets and market-based benchmarks, the evaluator should use this sub-criterion as a modifier of the overall performance rating by assigning an appropriate rating and higher weight.


2.5 Unanticipated Results

Definition: Other positive or negative results that are not covered in the above sub-criteria.

Assesses positive or negative results that were not anticipated at approval and not, therefore, covered by the outputs, outcomes or impact sub-criteria.

The rating reflects the incremental results, whether positive or negative. Taking account of the counterfactual might reveal that the quantum of an unanticipated positive or negative result could have been different in the absence of the project.

Excellent: The project made a substantial and plausible contribution to the achievement of significant positive unanticipated results.

Fully satisfactory: The project made a plausible contribution to the achievement of net positive unanticipated results.

Partly unsatisfactory: The project contributed to net negative unanticipated results.

Unsatisfactory: The project contributed to significant negative unanticipated results.

Not applicable: There was no evidence of unanticipated results.

To be included, unanticipated results must be truly unexpected, attributable to the project, quantifiable, of significant magnitude, and at least as well-evidenced as the project’s other results.

Typically, Board documents identify how a project will contribute to two or three transition impact criteria only, and so the evaluation should assess whether in reality the project contributed to other transition impact criteria.

In some cases projects might have both positive and negative unanticipated results. In these cases, the evaluator will need to exercise judgment as to whether on balance the rating should be fully


3. Efficiency

3.1 Financial Performance of the Project and/or Client

Definition: Rating of company financial performance in cases of pure corporate finance, equity investments or balance sheet restructuring. This will also normally be the indicator used for FI operations.

Rating of project financial performance in cases of project finance. This will normally be an indicator for infrastructure projects. In a few cases it will be necessary to consider both analyses. The approach taken in the projections that were presented at approval should serve as a guide.

Refer to the Financial Analysis Summary Sheet attached to the FRM and subsequent monitoring reports and to Equity Valuation Sheets for equity projects. Significant items and ratios from the income statement and balance sheet will vary according to the operation type, but will include all those cited in financial covenants.

Conduct a variance analysis describing the financial performance from the operation’s inception to the time of evaluation, and analyse the reasons for differences from the performance that was expected at approval.

If approval documents included project FIRR and EIRR, review and recalculate those measures.
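Recalculating an FIRR or EIRR means finding the discount rate at which the net present value of the cash-flow stream is zero. The sketch below does this by bisection, which is a generic numerical method and not one prescribed by these guidelines. The function names, the search bounds and both cash-flow series are hypothetical; an actual recalculation would use the cash flows from the approved financial model and from monitoring data.

```python
def npv(rate, cash_flows):
    """Net present value of annual cash flows, with the first flow at time zero."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-9, max_iter=200):
    """Internal rate of return found by bisection on the NPV function.

    Assumes NPV changes sign exactly once between `low` and `high`, which holds
    for a conventional stream (outflows first, inflows afterwards).
    """
    f_low = npv(low, cash_flows)
    f_high = npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign between the search bounds")
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol or (high - low) / 2.0 < tol:
            return mid
        if f_low * f_mid < 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return (low + high) / 2.0


# Hypothetical appraisal projection versus actual performance (year 0 to year 4).
appraisal_flows = [-100.0, 30.0, 40.0, 50.0, 40.0]
actual_flows = [-110.0, 20.0, 35.0, 45.0, 40.0]
variance = irr(actual_flows) - irr(appraisal_flows)
print(f"appraisal FIRR {irr(appraisal_flows):.1%}, "
      f"ex-post FIRR {irr(actual_flows):.1%}, variance {variance:+.1%}")
```

The difference between the two rates gives a starting point for the variance analysis; explaining it (cost overruns, delayed or reduced benefits) remains a matter for the evaluator.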

Excellent: Actual project or company results meet or exceed appraisal estimates, such that the company has been able to service its debt obligations without problem and has generated a return to its shareholders well in excess of the cost of debt.

Fully satisfactory: Actual project or company results are slightly below appraisal estimates, such that the company has been able to meet its debt obligations without problem but has failed to generate a return to its shareholders in excess of the cost of debt.

Partly unsatisfactory: Actual project or company results are below appraisal estimates, such that the company has barely been able to service its debt obligations and has yielded a zero or negative return to its shareholders.

Unsatisfactory: Actual project or company results are significantly below appraisal estimates, such that the company has failed to service its debt obligations in a timely manner.

Most of the variance analysis information should be available in past monitoring reports and Credit Review Memos, but the evaluator will need to provide a succinct overview of financial performance over the operation’s life, not just over the last year.

Any amendment of the financial model after approval should be noted. Unless the amended model was approved at Board or OpsCom, the evaluator should compare actual performance against projections at approval. However, there should be comment on whether performance has been in line with the revised model and whether further revisions have been made.

If it is not possible to prepare an ex-post EIRR, then commentary should be included on the likely directional effect on the expected EIRR from factors such as cost increases/decreases, reduced/increased benefits and delayed or timely realisation of benefits.

The evaluator may exercise discretion if a plausible case exists that performance in the immediate future is likely to be materially different from the historical performance reported in project/company accounts.