Introductory Notes
The Evaluation Department (EvD) at the EBRD evaluates the performance of the Bank’s completed projects and programmes relative to objectives in order to perform two critical functions: reinforcing institutional accountability for the achievement of results; and, providing objective analysis and relevant findings to inform operational choices and to improve performance over time. EvD reports directly to the Board of Directors, and is independent from the Bank’s Management. Whilst EvD considers Management’s views in preparing its evaluations, it makes the final decisions about the content of its reports.
These guidelines have been prepared by EvD and are circulated under the authority of the Chief Evaluator. The views expressed herein do not necessarily reflect those of EBRD Management or its Board of Directors.
Nothing in this document shall be construed as a waiver, renunciation or modification by the EBRD of any immunities, privileges and exemptions of the EBRD accorded under the Agreement Establishing the European Bank for Reconstruction and Development, international convention or any applicable law.
These draft guidelines were prepared by Keith Leonard, Deputy Chief Evaluator of the EBRD Evaluation Department, and Nick Burke, Evaluation Consultant.
© European Bank for Reconstruction and Development, 2014
One Exchange Square
London EC2A 2JN
United Kingdom
Web site: www.ebrd.com
All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.
Version Date: 07Apr15
DRAFT – EBRD Evaluation Department Guidance Note
Evaluation Performance Rating
These guidelines will apply to all evaluation in EBRD
Effective date: 1 January 2015 (pilot application in EvD); 1 January 2016 (pilot application for self-evaluation)
Introductory Notes
Evaluation in EBRD (see footnote 1)
The focus of evaluation is on results – the outputs, outcomes and impacts that flow from the inputs the
European Bank for Reconstruction and Development (EBRD) provides (financing, deal structuring,
technical cooperation (TC), policy dialogue, staff support for implementation, etc.). These inputs allow a
range of activities to be carried out that lead to outputs and outcomes and contribute to wider impacts. As
well as being accountable for performance, the Bank must show that the lessons of past and current
experience are integrated into new activities, thereby demonstrating an active pursuit of improved
performance over time.
Evaluation plays an important role in this by providing a basis for institutional accountability for results, and
objective analysis and findings to inform operational choices and improve performance over time.
Evaluation must provide credible evidence, analysis, and independent judgment, along with
evidence-based findings and recommendations that are relevant, valuable, and actionable.
Further, it is likely that EBRD’s clients will increasingly seek ideas, knowledge and expertise from the Bank
to go with the financing provided. Evaluation provides a valuable source of knowledge on what works,
what doesn’t and why. The importance of policy dialogue in addressing policy impediments to transition is
also likely to grow. Evaluation can provide a basis for evidence-based policy, thereby strengthening the
messages EBRD wishes to convey. Finally, the meaning of transition itself is evolving. A move to more
comprehensive results measurement and results-based management can be served by an evaluation
system that takes account of the full range of results from the Bank’s strategies, policies and projects.
In playing this role, evaluation is fundamentally different from monitoring, although it does use the
output of monitoring as part of its evidence base (see Table 1 for a comparison between
monitoring and evaluation). The purpose of monitoring is not to measure the ‘amount’ of success
achieved. Rather, monitoring takes place during implementation to provide guidance as to whether
expected results achievement is on- or off-track, and thereby to provide warning of the need for corrective
action. The Transition Impact Monitoring System (TIMS) is the tool with which the Bank monitors results
being produced by projects. It does this by tracking progress against a limited set of indicators (called
benchmarks in TIMS; see footnote 2) during implementation. Monitoring does not need to track a comprehensive set of
indicators as its interest is in seeing whether the project is on- or off-track rather than providing an
assessment of the full range of results achieved. Monitoring under TIMS stops when the benchmark is
achieved or it becomes clear that it is not going to be achieved. Evaluation comes in after implementation
is complete, generally when there is at least one year’s operational data available. The focus of evaluation
is on the totality of results attributable (or plausibly in part attributable) to the Bank, whether those results
were anticipated or not. Evaluation uses all the evidence available on operational, financial, transition,
environmental and social results whether previously monitored or not.
1 See EBRD (2013). Evaluation Policy. Board approved 16 January 2013. Available at http://www.ebrd.com/what-we-do/evaluation-policy.html
2 EvD considers that use of the term benchmark contributes to confusion as in practice the benchmark may be either or both an
indicator by which performance will be measured and a target in terms of a timebound level of achievement over a baseline level of performance. It is good practice to keep indicators and targets separate to avoid such confusion.
Figure 1: Terminology used in the Guidance Note
[Figure not reproduced. Labels: EBRD Corporate & Strategic Objectives (incl. short-term priorities*); Project Sponsor(s); Project-level Business Objectives; Expected Results (Outputs, Outcomes, Impacts); Unanticipated Results. Annotations: "Objectives determine the scope of the project and the investment decision"; "Expected results are the anticipated effects of the project on key stakeholders, categorized via the evaluation framework".]
* For example, crisis
Table 1: Comparison of Monitoring and Evaluation
Monitoring | Evaluation
Takes place periodically during implementation | Takes place one-time when operational
Can be used to clarify/modify expected results and associated targets | Takes Board-approved expected results and targets as given – assesses against all stated (or inferred) results whether monitored or not
Translates objectives into a limited set of performance indicators that should show movement during implementation | Uses a wider range of indicators and all available evidence to assess realised results
Tracks delivery of inputs, conduct of activities and whether expected results achievement is on- or off-track | Assesses achievement of expected and unanticipated results and the causal contribution of activities to such
Reports progress to managers and alerts them to problems so that corrective action can be taken | Provides findings, lessons and recommendations to improve future operations
– | Among its performance attributes, goes beyond results to assess relevance, process and allocative efficiency, and sustainability
– | To the extent possible, incorporates a counterfactual into performance assessment
Source: Adapted from International Program for Development Evaluation Training presentation (January 2015)
Why rate performance?
Ratings have been used in EBRD since the creation of the Evaluation Department (EvD) and will continue
to be used for both self-evaluations by operational teams and independent evaluation by EvD. The
reasons why EBRD uses performance ratings in evaluation are:
A robust and consistently applied rating system means performance assessments can be
aggregated to track overall performance as well as that at various levels of disaggregation such
as country, region, sector or other dimensions, which would not be possible without the use of
ratings (see footnote 3).
The use of a rating system is good practice under the Evaluation Cooperation Group (ECG)
Good Practice Standards for the Evaluation of Private Sector Operations (see footnote 4). EvD is committed to
following these standards in evaluation, with customisation to its unique mandate as required.
A rating system provides a structure, consistency and transparency to performance assessment
that may not be present or apparent without the discipline of a rating system.
The use of ratings provides a quick means of checking inter-rater consistency among
evaluators, and between self-evaluation and independent evaluation.
Performance improvement means identifying and reinforcing successful features and avoiding
past problems. The clarity provided by a rating system focuses attention on what did or did not
go according to plan, and why.
How ratings are used
For all evaluations where ratings are used, these seek both to describe performance and to explain it. As
such, the ratings help focus discussions between Banking and EvD, within Management and with the
Audit Committee, on what worked, what didn’t and why.
Aggregate ratings and various levels of disaggregation are reported and discussed in EvD’s Annual
Evaluation Review, which is always considered by the Board. Trends in criteria and sub-criteria ratings are
used to help explain trends in overall, sector and regional performance. Performance ratings may also be
used in special studies.
Purpose of these guidelines
Up until early 2013, guidance on the rating of investment operations was included in the Evaluation Policy.
Amendments to this policy, approved by the Board in January 2013 (see footnote 1), established that
henceforth methodological guidance would be issued in the form of stand-alone guidance notes rather
than being part of the policy. This allowed the policy to focus exclusively on strategic issues. It also means
that guidance can be revised without the need for a revision to the policy, which requires Board approval
and external consultation. This guidance note serves the following purposes:
3 EvD maintains a database of over 800 project performance ratings dating from 1996.
4 The ECG brings together the independent evaluation departments of all the main international finance
institutions/development banks. More information on the ECG and its Good Practice Standards is available at www.ecgnet.org.
To replace the guidance on project performance rating contained in previous versions of the
Evaluation Policy.
To make changes to ensure that the manner in which performance is assessed remains
relevant to EBRD’s evolving business needs, and that it stays ahead of emerging good practice.
To address some problem areas in the current project performance rating system, including:
lack of clarity and transparency in the definition of criteria and sub-criteria; how these should be
assessed; which to include in the overall rating; and how they are aggregated to derive an
overall rating.
To introduce a more internationally-recognised terminology into the performance rating system
– one that not only serves internal needs but which is also recognisable and understandable
outside the Bank. To the extent possible, the terminology of the Evaluation Network of the
OECD-DAC has been used, with some customisation to EBRD’s particular mandate (see footnote 5). The main
terms of concern are those of inputs, activities, outputs, outcomes and impacts. These are
framed within the OECD-DAC evaluation criteria of relevance, effectiveness (termed ‘results’ in
these guidelines), efficiency, sustainability and impact.
To address the absence of guidelines for rating TC.
To provide the basis for performance rating in those special studies where performance rating is
to be used.
The extent and manner of application of the guidelines in special studies will be outlined in the study’s
approach paper. Many sector and thematic studies produced by EvD (including the evaluation of policies
and strategies) have included performance ratings although no guidance has existed for doing so. A
decision on the use of ratings in special studies, whether only at the criteria level or also an overall rating,
should be made on a case-by-case basis and confirmed in the approach paper prepared at the start of the
evaluation. Whether or not ratings are to be used, the approach paper should outline the basis for
performance assessment using these guidelines as the starting point.
Insofar as these guidelines are used for project performance rating, they will be the basis for
self-evaluation by banking teams in the Operations Performance Assessments (OPAs), for EvD’s validation of
these (OPAVs), and for EvD’s independent evaluations (OEs).
What these guidelines do not cover
These guidelines only cover how to derive a performance rating. They are not a complete guide to the
conduct of evaluation in EBRD. Specifically, they do not cover:
Sources of data, including the roles of quantitative and qualitative data (use of mixed methods)
and the use of triangulation.
The analysis and interpretation of data and performance ratings, and their distillation into
findings that explain performance, lessons and/or recommendations.
Tips for carrying out field investigations.
5 See http://www.oecd.org/dac/evaluation/glossaryofkeytermsinevaluationandresultsbasedmanagement.htm for the
These and related aspects are or will be covered in other guidance notes. In time, these will provide a
complete ‘how to’ for evaluation in EBRD.
Basic premises underlying the guidelines
A fundamental premise underlying these guidelines is that the boundary around what is being
evaluated is not narrowly drawn. For example, in the case of an individual transaction the
evaluation boundary is not the transaction itself – rather it is the transaction in the context of
why EBRD is involved. The consequence of this is that overall performance assessment takes
into account the performance of EBRD on selected dimensions as well as performance of the
project itself.
The exercise of evaluator judgment and discretion is an essential and legitimate part of
evaluation in EBRD, particularly given the varied and dynamic contexts in which EBRD
operates: the frequent deficiencies in data availability and reliability; the time and resource
constraints for the conduct of evaluations; and the frequent difficulties in attributing results to
EBRD alone. The guidance outlined in this document is the default approach that should be
followed unless there is a compelling reason to vary it. Evaluators may exercise discretion in
applying this guidance provided this is done transparently and, ideally, pre-approved in the
evaluation approach paper. In all significant cases where evaluator discretion has been
exercised, it is mandatory that this be fully transparent in evaluation reports, with
justification provided and the consequences for the performance rating made clear.
Because EBRD is a publicly-owned bank with wider objectives based on its mandate, and
because evaluation is generally carried out one to two years after final disbursement, it is
frequently the case that the impacts, and in some cases outcomes, may not be fully apparent or
measurable (for example, relevant data may not yet exist at the time of evaluation). For these
reasons, the achievement of impacts (and possibly outcomes) sometimes has to be inferred
rather than directly measured. Therefore, it is necessary for evaluators to construct a results
framework for the evaluation even where none exists in approval documents, or where one
does exist but is deficient for the purposes of evaluation. In constructing the results framework,
the evaluator is guided by the objectives and targets that are set out in the approval document
or that can be reasonably inferred from it. The results framework should provide a structure for
the evaluation based around the project’s expected outputs, outcomes and impacts. Using a
‘theory of change’, the evaluator is able to build a plausible case that a project’s expected
results have been, or will be, achieved even if they are not fully apparent or measurable.
Again, because of the limitations posed by data and resource constraints, the assessment of
outcome and impact considers the extent and plausibility of EBRD’s contribution to results,
particularly at the impact level, rather than seeking to establish direct attribution.
Evaluation must be evidence-based – findings, lessons and recommendations must flow
exclusively from the evidence presented and not from evaluator beliefs and personal opinions.
It is strongly preferred that there be a mix of qualitative and quantitative data and analysis
(mixed methods) coming from multiple sources (triangulation). Overall or aggregate
performance should avoid double-counting to the extent possible.
Assessment of performance should generally take into account three dimensions: expected
results, being the anticipated outputs, outcomes and impacts (so-called objectives-based
evaluation or ‘before and after’ evaluation); what would have happened without the project (the
counterfactual or ‘with and without’ evaluation); and, as a reality check, performance against
industry standards or other market benchmarks (if not already incorporated in expected results).
Changes to scope during implementation are often needed. Generally, performance is
assessed against the properly-approved revised scope although the effect of the scope-revision
on results may be commented upon. The reasons why scope-change became necessary are
explored under bank handling – if these could not have reasonably been foreseen at approval
and the changes to restore relevance were made in a timely manner, this aspect of bank
handling should be assessed positively, while the reverse situation (inadequate design and/or a
response that was not timely or did not restore relevance) would be assessed negatively.
Because learning is as important as accountability, the criteria considered and evidence
collected for performance assessment must help explain, as well as rate, performance.
A self-evaluation of all TCs is conducted on their completion with the evaluation ratings reported
in EBRD’s institutional scorecard. Transactional TCs are also assessed in the self-evaluation of
the investment operation to which they are attached.
The evaluation framework applies to all evaluations where performance rating is carried out. It
applies to self-evaluation as well as independent evaluation by EvD.
What is new in the guidelines?
The principal changes and refinements to prior practice for performance assessment of projects include:
Closer alignment with OECD-DAC Evaluation Network evaluation criteria as the basic structure
of performance assessment, namely: relevance; effectiveness (termed ‘results’ in this guidance
note); efficiency; sustainability; and impact.
A more explicit, as well as revised, set of sub-criteria.
Application of a results framework consisting of a hierarchy of inputs, activities, outputs,
outcomes and impacts (with definitions derived from OECD-DAC) and the logical connection
between them. This uses a theory of change for assessing performance in (commonly
occurring) situations where results (particularly impacts and sometimes outcomes) are not fully
observable at evaluation, or they are not measurable because data is unavailable or
incomplete.
Eliminating the separate consideration of the achievement of operational objectives, transition
impact, environmental and social objectives and the results of TC, policy dialogue and staff
contribution to capacity development (some of which were not previously routinely considered in
a performance rating). Instead, all anticipated results are now considered under the criterion of
results and identified as outputs, outcomes or impacts. However, to retain continuity, EvD will
derive ratings for transition impact, environmental and social performance, additionality and
sound banking (although the latter was not previously rated), based on a distillation of relevant
findings within the evaluation document.
Creating a clear separation between results that can be observed at evaluation or can plausibly
be inferred, and those where their future achievement is largely speculative. The former are
rated and included in the overall performance rating, while judgments are made about the latter
but there is no rating and so no inclusion in the overall performance rating. Some other changes
have been made to what is included in overall performance rating and what is not.
More explicit attention to unanticipated results (positive and negative) and use of the
counterfactual.
Assessing the plausible contribution of EBRD’s operations to the achievement of outcomes and
impacts, rather than seeking to establish direct attribution, which may not be possible within the
time and resource limitations of evaluation in the Bank.
Adoption of a numeric scoring and weighting system to derive each criterion rating from
component sub-criteria ratings, and the overall performance rating from criteria ratings (see footnote 6).
Establishment of clear benchmarks for rating sub-criteria, criteria and overall performance,
whilst recognising the need for some evaluator discretion where observed performance is close
to a boundary.
Incorporation of guidance for the performance rating of TC.
Adoption of six rather than four categories for overall performance rating to allow for a more
granular rating of performance (previously around 80 per cent of performance ratings fell into
the categories of successful and partly successful).
Dropping the word ‘successful’ from rating category descriptors and its replacement where
necessary with ‘satisfactory’ as a less value-laden term.
Evaluation framework
This version of the guidelines covers the performance rating of investment projects and TC. Work is
ongoing to develop the criteria and sub-criteria for performance assessment in sector studies. For other
types of special studies, where ratings are to be used, the criteria and sub-criteria should be established in
the study approach paper. In such cases, performance rating should follow these guidelines to the extent it
is rational and possible to do so. As experience builds up it may be possible to extend guidance to a wider
range of special studies.
Table 2 outlines the evaluation framework’s four criteria for investment projects (with their 19 sub-criteria,
not all of which are always applicable). Only the first three criteria and associated sub-criteria are rated
and included in the overall rating. For self and independent evaluation of investment projects, the default
position is that sub-criteria should have an equal weight in determining their parent criterion rating and,
together, the overall performance rating. However, weights may be varied by evaluators exercising their
discretion in a transparent manner. Table 2 also shows four derived ratings, which reflect the Bank’s
unique mandate: transition impact; environmental and social performance; additionality; and sound
banking.
The guidelines apply the OECD-DAC lens of outputs, outcomes and impacts that are linked to one another
by a connecting theory of change. This is different from the way in which EvD has traditionally described
results for investment projects where operational results, transition results and environmental and social
6 Previously, there was little if any specific guidance on how to rate sub-criteria or derive criteria ratings. The derivation of an
overall performance rating was based on an incomplete matrix showing combinations of ratings for four of the seven criteria. This method was largely non-transparent, so that it was unclear how a particular rating had been derived. The method did not permit any determinant analysis of the relationship between sub-criteria, criteria and overall performance.
results all had separate ratings. In order to continue to provide that traditional lens there are four derived
ratings. The four match the mandates of the Bank. The first three (D1 to D3 in Table 2) match existing
ratings and so their inclusion in this guidance will permit continuity of time-series data (while recognising
that there are some differences in the underlying sub-criteria). The fourth derived rating, for sound
banking, is new. While it is a core mandate of EBRD, it has not been rated before. Derived ratings D3 and
D4 are generated directly and automatically from sub-criteria ratings in Table 2, while D1 and D2 require
some evaluator judgment based on a distillation of relevant findings within the results criterion.
Table 2: Investment project evaluation criteria and sub-criteria
1. Relevance
1.1. Strategic relevance
1.2. Relevance of design
1.3. Expected additionality
1.4. Demonstrated additionality
2. Results
2.1. Achievement of outputs
2.2. Contribution to expected outcomes
2.3. Contribution to expected impacts
2.4. Performance against benchmarks (if relevant)
2.5. Unanticipated results (positive or negative)
3. Efficient Resource Use
3.1. Financial performance of project or client
3.2. Implementation efficiency
3.3. Bank investment profitability
3.4. Bank handling
3.5. Consultant performance (if relevant)
4. Other performance attributes (assessed but not rated)
4.1. Sustainability of achieved results
4.2. Client’s contribution
4.3. Co-financier’s contribution (if any)
4.4. Innovation features (if applicable)
4.5. Merit features (if applicable)
Derived ratings
D1. Transition impact (derived based on evaluator-flagged transition results drawn from 2.1, 2.2, 2.3 and 2.5)
D2. Environmental and social performance (derived based on evaluator-flagged environmental and social-related results drawn from 2.1, 2.2, 2.3 and 2.5)
D3. Additionality (rated automatically based on 1.3 and 1.4)
D4. Sound banking (rated automatically based on 1.2, 3.3 and 3.4)
Table 3 summarises the equivalent performance rating framework for TC, and also shows the differences
between self-evaluation and independent evaluation (see footnote 7). This is to reflect the fact that self-evaluation is
always carried out for individual TCs whereas, apart from the very largest non-transactional TCs, EvD
tends to evaluate TC in clusters defined by a common theme (and in such cases may adopt a broader
range of criteria).
7 All TC is self-evaluated (via a Project Completion Report - PCR) whether transactional or not. For investment projects with
transactional TC, the TC is not rated separately when independently evaluated by EvD (though the PCR rating would be noted). Rather, the TC outputs and outcomes are incorporated in the results assessment of the transaction as a whole.
Table 3: Technical cooperation evaluation criteria and sub-criteria
1. Relevance
1.1. Strategic relevance
1.2. Relevance of design
2. Results
2.1. Achievement of outputs
2.2. Contribution to expected outcomes
2.3. Unanticipated results (positive or negative) (a)
2.4. Sustainability of achieved results (a)
3. Efficient Resource Use
3.1. Bank handling
3.2. Client’s handling (equivalent of client’s contribution in Table 2)
4. Other performance attributes (assessed but not rated)
4.1. Donor’s contribution (not required for self-evaluation)
4.2. Innovation features (not required for self-evaluation)
4.3. Merit features (not required for self-evaluation)
(a) These sub-criteria are rated but are not determinants of the overall rating for TC self-evaluation. Independent evaluators may,
however, deem it necessary to consider them in the overall rating.
For self-evaluation of TC, a decision has been made to pre-assign weightings to criteria as follows:
relevance (25%); results (60%); and efficiency (15%) (see footnote 8). When validating self-evaluation ratings for
individual TCs, EvD will apply weightings consistent with those used for the self-evaluation. For other
purposes, however, EvD may choose to use different weightings and to include unanticipated results and/or
sustainability as sub-criteria contributing to the overall rating.
Table 4: Four-category rating system for sub-criteria
Category | Investment Projects (Annex 1) | TC (Annex 2): Achievement score (& grey zone)
Excellent | Performance meets or exceeds the excellent benchmark specified in the Annex. | 75% (65% to 85%). A substantial majority achievement level for the sub-criteria/criteria being assessed.
Fully satisfactory | Performance meets or exceeds the fully satisfactory benchmark specified in the Annex. | 50% (40% to 60%). A majority achievement level for the sub-criteria/criteria being assessed.
Partly unsatisfactory | Performance meets or exceeds the partly unsatisfactory benchmark specified in the Annex. | 25% (15% to 35%). A minority achievement level for the sub-criteria/criteria being assessed.
Unsatisfactory | Performance fails to meet the benchmark for a partly unsatisfactory rating as specified in the Annex. | 0% (No grey zone). Failure to reach even a minority achievement level.
Other Ratings:
Not applicable | Certain sub-criteria (e.g., consultant performance in investment evaluations) may not be relevant in all projects and so should be rated not applicable. No score is attributed to ratings of not applicable and so the sub-criterion has no effect on the synthesis criterion or on the overall performance rating.
No opinion possible | Where there is insufficient evidence to assign a rating (for example due to the premature closure of a project before any meaningful data could be collected or inferred), a rating of no opinion possible may be assigned, though this should be a last resort and in most cases should prompt further evaluative research or fieldwork. No score is attributed to ratings of no opinion possible and so the sub-criterion has no effect on the synthesis criterion nor on the overall performance rating.
8 However, other weightings can be applied, including equal weightings. Evaluators may wish to vary weights according to the
type of TC, for example transactional versus non-transactional, or for policy dialogue where a higher weighting for relevance and efficiency might be appropriate since results may take many years to materialise.
Sub-criteria ratings
Sub-criteria ratings are the basic building blocks, from which criteria and overall performance ratings are
derived. Sub-criteria and criteria are rated using a four-category system as shown in Table 4. Guidance on
applying the rating categories for each sub-criterion is given in Annexes 1 and 2 for investment and TC
evaluations respectively.
In recognition of the fact that boundaries between rating categories are almost never certain given data
limitations and the degree of judgment necessary, the evaluator is permitted some discretion where the
evidence points to a rating close to a threshold. For example, in TC evaluations, such discretion is allowed
should the achievement level and system score (per progress or completion reports) fall within the ‘grey
zone’ and therefore close to a boundary. The rating decision in such cases should be based on
clearly-stated, objective considerations.
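The grey-zone idea for TC can be illustrated with a minimal Python sketch. It assumes that each achievement score in Table 4 (75%, 50%, 25%) acts as the lower boundary of its category and that the bracketed range is the band around that boundary in which discretion applies. This reading, and the names TC_BANDS and provisional_tc_rating, are illustrative assumptions and not part of the rating tool.

```python
# Hypothetical sketch: boundaries and grey zones copied from Table 4.
# Assumption: each achievement score is the lower boundary of its category.
TC_BANDS = [
    ("excellent", 75.0, (65.0, 85.0)),
    ("fully satisfactory", 50.0, (40.0, 60.0)),
    ("partly unsatisfactory", 25.0, (15.0, 35.0)),
]


def provisional_tc_rating(achievement_pct):
    """Return (provisional rating, in_grey_zone) for a TC achievement level in per cent."""
    rating = "unsatisfactory"  # default when no boundary is reached
    for name, boundary, _ in TC_BANDS:
        if achievement_pct >= boundary:
            rating = name
            break
    # A score inside any grey zone is close to a boundary, so the evaluator
    # may exercise discretion and must state the considerations applied.
    in_grey_zone = any(low <= achievement_pct <= high for _, _, (low, high) in TC_BANDS)
    return rating, in_grey_zone


rating, needs_judgment = provisional_tc_rating(70.0)
print(rating, needs_judgment)
```

A score of 70 per cent falls below the 75 per cent boundary but inside the 65 to 85 per cent grey zone, so under these assumptions the sketch flags it for evaluator judgment rather than settling the rating mechanically.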
Derivation of criteria ratings
To assist in rating decisions, a rating tool has been developed and is embedded in revised evaluation
templates. This tool provides recommended criteria and overall performance ratings based on sub-criteria
ratings and weights. The recommended criteria ratings are based on the following algorithm:
A score (s) is assigned for each sub-criterion rating as follows: excellent = +2.0;
fully satisfactory = +0.5; partly unsatisfactory = –0.5; unsatisfactory = –2.0
9
For investment projects, sub-criteria are weighted high, medium or low, with weightings (w) of
×2.0, ×1.5 and ×1.0 respectively. (Accordingly, sub-criteria weighted as high have twice the
influence of those weighted low.)
For TC, sub-criteria are weighted by percentages, such that their combined weightings sum to
100 per cent.
For each criterion, the weighted average score ($cs_{avg}$) is calculated from the underlying
sub-criteria ratings using the equation: $cs_{avg} = \sum ws \,/\, \sum w$
The criterion rating is then determined from $cs_{avg}$ based on the following table:
9 Note that the scoring scale is not linear. This is because ratings of either excellent or unsatisfactory tend to reflect project
performance at the positive or negative extremes, which is better captured by a non-linear scale, i.e. one having a slight tail at the low and high end.
Table 5: Determination of criterion ratings
Weighted Average Score of Sub-Criteria (cs_avg)    Criterion Rating
cs_avg > 0.9                                       Excellent
0.0 < cs_avg <= 0.9                                Fully satisfactory
–0.9 <= cs_avg <= 0.0                              Partly unsatisfactory
cs_avg < –0.9                                      Unsatisfactory
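The derivation above can be illustrated with a minimal sketch. It is not the rating tool embedded in the evaluation templates; the function names and the example ratings are hypothetical, and it assumes that sub-criteria rated no opinion possible (represented here as None) are simply left out of both sums, in line with the statement that such ratings carry no score.

```python
# Scores per sub-criterion rating, as stated in the text (non-linear scale).
SCORES = {
    "excellent": 2.0,
    "fully satisfactory": 0.5,
    "partly unsatisfactory": -0.5,
    "unsatisfactory": -2.0,
}

# Investment-project weightings, as stated in the text.
WEIGHTS = {"high": 2.0, "medium": 1.5, "low": 1.0}


def weighted_average_score(sub_criteria):
    """Return cs_avg = sum(w * s) / sum(w) for (rating, weight) pairs.

    A rating of None stands for 'no opinion possible' and is excluded
    (assumption based on the text: such ratings have no effect).
    Returns None if no sub-criterion could be rated.
    """
    rated = [(SCORES[r], w) for r, w in sub_criteria if r is not None]
    if not rated:
        return None
    return sum(s * w for s, w in rated) / sum(w for _, w in rated)


def criterion_rating(cs_avg):
    """Map cs_avg to a criterion rating using the Table 5 boundaries."""
    if cs_avg > 0.9:
        return "excellent"
    if cs_avg > 0.0:
        return "fully satisfactory"
    if cs_avg >= -0.9:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Hypothetical criterion with four sub-criteria, one of them unrated.
example = [
    ("fully satisfactory", WEIGHTS["high"]),
    ("excellent", WEIGHTS["low"]),
    ("partly unsatisfactory", WEIGHTS["medium"]),
    (None, WEIGHTS["medium"]),
]
cs_avg = weighted_average_score(example)
print(cs_avg, criterion_rating(cs_avg))
```

In this example the weighted sum is 1.0 + 2.0 – 0.75 = 2.25 over a total weight of 4.5, so the unrated sub-criterion changes neither the numerator nor the denominator.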
Derivation of overall performance rating
For the rating of investment projects, framework agreements and TC, only the first three criteria in Tables
2 and 3 (i.e., relevance, results and efficiency) are used for the derivation of the overall performance
rating. For special studies, the decision as to what to include in the overall performance assessment is
made on a case-by-case basis. The recommended overall performance rating is derived as follows:
As for criteria ratings, the rating tool should be used to determine the overall performance
rating. The overall performance score ($PS_{avg}$) is determined from the weighted average of all
sub-criteria scores using the following equation: $PS_{avg} = \sum ws \,/\, \sum w$
A six-category rating system is applied to produce the overall project performance rating (see footnote 10).
This is determined from $PS_{avg}$ as shown in Table 6 below.
Table 6: Determination of the overall project performance rating
Weighted Average Score of All Sub-Criteria (PS_avg)    Overall Performance Rating
PS_avg > 0.9                                           Outstanding
0.45 < PS_avg <= 0.9                                   Good
0.0 < PS_avg <= 0.45                                   Acceptable
–0.45 <= PS_avg <= 0.0                                 Below standard
–0.9 <= PS_avg < –0.45                                 Poor
PS_avg < –0.9                                          Very poor
Where the recommended performance rating is shown to be close to a boundary, evaluator discretion can
still be exercised, though the justification for this needs to be provided in the accompanying text.
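As an illustration only, the mapping in Table 6 might be sketched as follows. The function names are hypothetical, and the margin used to flag a score as close to a boundary is an assumption made for this example; the guidance does not define a numerical margin.

```python
# Upper-category boundaries taken from Table 6.
BOUNDARIES = (0.9, 0.45, 0.0, -0.45, -0.9)


def overall_score(scored_sub_criteria):
    """Return PS_avg = sum(w * s) / sum(w) over (score, weight) pairs
    for all rated sub-criteria under relevance, results and efficiency."""
    total_weight = sum(w for _, w in scored_sub_criteria)
    return sum(s * w for s, w in scored_sub_criteria) / total_weight


def overall_rating(ps_avg):
    """Map PS_avg to the six-category overall performance rating."""
    if ps_avg > 0.9:
        return "outstanding"
    if ps_avg > 0.45:
        return "good"
    if ps_avg > 0.0:
        return "acceptable"
    if ps_avg >= -0.45:
        return "below standard"
    if ps_avg >= -0.9:
        return "poor"
    return "very poor"


def near_boundary(ps_avg, margin=0.05):
    """Flag scores within `margin` of any boundary.

    The margin of 0.05 is a hypothetical value chosen for illustration.
    """
    return any(abs(ps_avg - b) <= margin for b in BOUNDARIES)


# Hypothetical project: (score, weight) for four rated sub-criteria.
ps_avg = overall_score([(2.0, 2.0), (0.5, 1.5), (0.5, 1.0), (-0.5, 1.0)])
print(round(ps_avg, 3), overall_rating(ps_avg), near_boundary(ps_avg))
```

A flag of this kind would only prompt the evaluator to consider discretion; the justification still has to be set out in the accompanying text.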
Applying the rating framework
The detailed characteristics of the evaluation sub-criteria and criteria as they apply to investment
operations and TC are described in Annexes 1 and 2 respectively. Each EvD corporate or thematic
evaluation approach paper should specify the sub-criteria, criteria and weightings to be used to rate
overall performance (if performance is to be rated), ensuring that there is consistency with the principles
and guidance outlined in this paper.
10 Previously, EvD used a four-category rating system for overall performance (highly successful, successful, partly successful
or unsuccessful). However, almost 80% were rated as successful or partly successful. The new guidelines allow a more granular rating of performance.
Annex 3 provides an example of a pro-forma results framework by which a project's expected results are
viewed through the OECD-DAC lens of outputs, outcomes and impacts, linked to one another by a
connecting theory of change.
Annex 4 contains document templates for the self-evaluation OPA, and independent validation OPAV and
evaluation OE instruments.
Annex 1: Rating criteria and sub-criteria for investment projects
Annex 1: Detailed guidance on rating sub-criteria and criteria for investment projects
Criteria and Sub-Criteria
Evidence
Rating Benchmarks
Guidance
1. Relevance
1.1 Strategic Relevance
Definition: Assesses the degree of
relevance to the Bank’s strategic agenda, both at approval and at evaluation.
The evidence provided for strategic relevance should demonstrate how the project actively helped EBRD to deliver on its policy and strategy intentions, rather than happening to be in alignment (or not being in conflict) with (often very generally-worded) strategies and policies.
Excellent: At approval, the project was closely
aligned with and capable of helping deliver on the Bank’s strategic agenda, and has remained aligned with evolving Bank strategy over time.
Fully satisfactory: At approval, the project was
reasonably well aligned with the Bank’s strategic agenda and was capable of making some contribution to the Bank’s realisation of its strategic agenda, and this has remained so over time.
Partly unsatisfactory: At approval, the project was
only weakly aligned with the Bank’s strategic agenda and was capable of making only a limited contribution to the realisation of this agenda, and/or has limited relevance to evolving Bank strategy since approval.
Unsatisfactory: At approval, the project lacked
alignment with the Bank’s strategic agenda, and that has remained the case since approval.
The relevant country and sector strategies are an obvious starting point for assessing strategic relevance, but projects may support other parts of the Bank’s strategic agenda such as, for example, the gender strategy.
Each of the rating benchmarks defines the extent to which, at approval, the project could plausibly have helped the Bank deliver on its strategic agenda.
Projects are often approved under one set of strategies and policies but evaluated under more recently approved ones. The evaluator should note any relevant and significant changes to the strategic agenda and discuss whether the project was more or less aligned as a result of these changes. The performance assessment should consider strategic relevance at approval and at evaluation.
1.2 Relevance of Design
Definition: Focuses on: the
soundness of the design logic (implicit or explicit theory of change); the evaluability of the operation (i.e., the adequacy of the specification of expected results and the indicators for measuring their achievement); the adequacy of identification and incorporation of the lessons of past experience; and the adequacy of risk identification and mitigation.
Assess the following design features:
(i) The plausibility of the sources of expected transition impact, i.e. evidence of a sound design logic (theory of change) such that the transaction could plausibly produce this impact.
(ii) The completeness and clarity of specification of expected results (outputs, outcomes and impacts) of the project (including any associated TC and policy dialogue) and the adequacy of the TIMS benchmarks and other indicators.
(iii) Use of past experience and lessons to shape design.
(iv) Whether risk factors have been adequately identified and taken into account in design through identification of coping strategies. As part of this assessment, the assumptions implicit in the theory of change should be spelled out to see if these pose unidentified risks.
Excellent: All four aspects of project design proved
to be fully appropriate during implementation.
Fully satisfactory: Three of the four aspects of
project design proved to be fully appropriate during implementation.
Partly unsatisfactory: Only one or two of the four
aspects of project design proved to be fully appropriate during implementation.
Unsatisfactory: None of the four aspects of project
design proved to be appropriate during implementation.
Evaluability is defined as the extent to which the expected results of a project are verifiable in a reliable and credible manner.
This analysis focuses in the first instance on whether the inputs provided by EBRD in terms of investment and any supporting TC and/or policy dialogue would plausibly have produced the expected results, particularly transition impact. This requires the evaluator to identify, and assess the validity of, the implicit or explicit theory of change in order to arrive at a judgment as to whether the expected results are realistic.11
All projects, TC, strategies and policies should demonstrably take account of the lessons from past experience whether sourced from EvD findings, the team’s experience or elsewhere. Identification of lessons in the approval document is not a sufficient test – something must have been done differently as a result if there is a relevant lesson.
11 At its most basic level a theory of change explores whether the inputs and activities carried out are necessary and sufficient to produce the expected outputs; whether those outputs are necessary and sufficient to
achieve the stated outcomes; and whether those outcomes are necessary for contributing to impact achievement. However, a well-constructed theory of change (which should usefully be illustrated in diagrammatic form) goes beyond this technical logic: (i) to make explicit the embedded assumptions in the project ‘storyline’; and (ii) to identify and take into account contextual factors that could hinder or support success in moving along the chain from inputs to impacts (for example, the political economy dimension). Constructing a theory of change should be a reflective process that looks at the project in its context. A review of the theory of change can be found at http://r4d.dfid.gov.uk/pdf/outputs/mis_spc/DFID_ToC_Review_VogelV7.pdf.
1.3 Expected Additionality – were
claims made at approval plausible?
Definition: Describes how the Bank
planned to add value by one or more of the following: (i) its financial terms and conditions; (ii) the unique attributes the Bank brought to the project; (iii) inclusion of legal covenants that would not have otherwise been agreed by the client; and (iv) mobilisation of additional commercial finance.
Assess whether, at the time of project approval, the additionality claims in the approval document were plausible. For example, the assessment should consider the availability (if any) of alternative sources of finance other than from EBRD and on what terms.
Excellent: All claims justifying additionality were
plausible at the time of approval.
Fully satisfactory: All claims justifying important
areas of additionality were plausible at the time of approval.
Partly unsatisfactory: One or more claims justifying
important areas of additionality were not plausible.
Unsatisfactory: Most or all claims justifying the
Bank’s additionality were not plausible.
Based on the information and knowledge that would have been available at the time of approval (that is, not using the benefit of hindsight) the evaluator will judge whether the claimed additionality was plausible at approval, or not.
1.4 Demonstrated Additionality –
was EBRD additional in fact?
At the time of the evaluation, assess the extent to which the operation was additional (whether identified as such at approval or not). In particular, the assessment should look at whether the Bank’s attributes, legal covenants and/or the expected additional commercial financing actually materialised.
Excellent: All aspects of claimed additionality were
borne out and/or there were significant unforeseen ways in which the Bank was additional.
Fully satisfactory: All important aspects of claimed
additionality were borne out and/or there were unforeseen ways in which the Bank was additional.
Partly unsatisfactory: One or more important
aspects of claimed additionality were not borne out.
Unsatisfactory: Most or all aspects of claimed
additionality were not borne out.
This assessment focuses on whether there is evidence that the additionality statements were in fact borne out during implementation. For example, were the Bank’s attributes evident during implementation? Were legal covenants met in full or were some waived? Was supplementary finance actually mobilised and used as intended?
The evaluator should also note and take account of unforeseen ways in which the Bank was additional (for example, capturing an opportunity to engage in policy dialogue that did not exist at approval).
2. Results
2.1 Achievement of Outputs
Definition: The extent to which
expected outputs were achieved. For evaluation purposes in EBRD, outputs are defined as the products, capital goods and services which result from an operation. In a departure from past practice, where results were assessed under the criteria of achievement of operational objectives and, in some cases, achievement of transition impact and environmental and social performance, these guidelines group results according to the OECD-DAC criteria of outputs, outcomes and impacts. See guidance in the right-hand column for what constitutes outputs.
Compares achieved outputs with expectations as at approval, or as revised if any revisions were formally approved during implementation. Not all outputs have equal value in scope and scale with regard to achieving an outcome. Also, some outputs will have quantitative targets while others may only be expressed in qualitative terms. The evaluator should transparently use discretion through, for example, using variable weightings for different outputs in reaching a rating for output achievement.
As achievement of outputs is largely within the control of the project, the boundaries for the rating categories have been raised compared with some other aspects of performance, which are subject to greater risk of realisation. To the extent that the client needs to make specific investments in order to address existing environmental and social issues or to ensure compliance of new facilities (e.g., via an
Environmental and Social Action Plan), then the delivery of such should be assessed under outputs.
Assuming a score of 1.0 for an achieved output, 0.5 for one partly achieved, and 0.0 for one not achieved, calculate the aggregate level of achievement. The overall proportion of output achievement (in percentage terms) is then derived from this aggregate score divided by the number of expected outputs. The thresholds defining the rating benchmarks below can then be applied directly. For example, if output achievement was 6.0/8=75%, a fully satisfactory rating would be assigned. However, in making the calculation the evaluator may place more emphasis on certain outputs where these are judged to have particular significance, or vice versa.
Excellent: At least 85% of outputs have been
achieved.
Fully satisfactory: At least 60% of outputs have
been achieved.
Partly unsatisfactory: At least 35% of outputs have
been achieved.
Unsatisfactory: Less than 35% of outputs have
been achieved.
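The output achievement calculation described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the optional weights are one possible way of placing more emphasis on certain outputs, which the guidance leaves to the evaluator's transparent discretion.

```python
# Points per expected output, as stated in the text.
POINTS = {"achieved": 1.0, "partly achieved": 0.5, "not achieved": 0.0}


def achievement_share(statuses, weights=None):
    """Return achievement as a percentage of expected outputs.

    With no weights, every output counts equally (aggregate score divided
    by the number of expected outputs). The optional weights are an
    assumed mechanism for giving some outputs more emphasis.
    """
    if weights is None:
        weights = [1.0] * len(statuses)
    total = sum(POINTS[s] * w for s, w in zip(statuses, weights))
    return 100.0 * total / sum(weights)


def output_rating(share):
    """Apply the output rating benchmarks (85%, 60%, 35%)."""
    if share >= 85.0:
        return "excellent"
    if share >= 60.0:
        return "fully satisfactory"
    if share >= 35.0:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Eight expected outputs: five achieved, two partly achieved, one not achieved.
statuses = ["achieved"] * 5 + ["partly achieved"] * 2 + ["not achieved"]
unweighted = achievement_share(statuses)

# Hypothetical emphasis: the output that was not achieved is judged four
# times as significant as each of the others.
weighted = achievement_share(statuses, weights=[1, 1, 1, 1, 1, 1, 1, 4])

print(unweighted, output_rating(unweighted))
print(round(weighted, 1), output_rating(weighted))
```

The unweighted case reproduces the 6.0/8 = 75% example in the text. With the hypothetical weighting the share falls to 6.0/11, or about 55%, which would move the rating down one category.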
Outputs cover all results that fit the definition in column 1, including any outputs among the TI benchmarks. The Board approval document usually describes outputs in different sections of the text or in annexes, most often with indicators and associated target values. Outputs may be described in any or all of operational objectives, transition impact benchmarks, legal covenants, transactional TC, policy dialogue or capacity building (e.g. in procurement) conducted by staff, and environmental and social action plans, etc. Prior to rating results, and in the absence of an adequate results matrix embodying a theory of change in the Board document, the evaluator prepares a results framework (partial in some cases where information in the approval document is absent). This should identify the project’s outputs (and outcomes and impacts) with appropriate indicators, baselines and associated target values, derived from the information contained in the Board approval documents and its annexes. This results framework should include expected outputs whether or not associated indicators and targets were set – the evaluator will need to formulate these in such situations. The results framework is appended to the evaluation.
If the expected outputs were formally revised during implementation, following procedures in the Operations Manual, the project’s achievements will be assessed against these revised expectations. The reasons why revisions were necessary are considered under Bank Handling and judged positively or negatively accordingly.
2.2 Contribution to Expected Outcomes
Definition: The extent to which
expected outcomes described in the approval document or revised subsequently have been achieved, plausibly and significantly as a result of the project.
Outcomes are assessed against the expected results using the associated indicators, their baselines and target values, as described in the approval document or as formally revised during implementation.
Evidence would include: (i) baseline and at-evaluation values of
performance indicators; and (ii) other quantitative and qualitative information relevant to the expected outcomes. The rating reflects the project’s incremental contribution to observed outcomes, regardless of whether this represented movement in the right or wrong direction (positive or negative). When a positive outcome is achieved but there is evidence that this is primarily due to other factors, the rating may be adjusted downward
accordingly. This requires consideration of the counterfactual, i.e. what would have happened in the absence of the project.
For example, if outcome indicators meet or exceed their target values, but there is evidence that this was due mainly to external factors unrelated to the project, a partly unsatisfactory rating may be warranted.
Assuming a score of 1.0 for an achieved outcome, 0.5 for one partly achieved, and 0.0 for one not achieved, calculate the aggregate level of achievement. The overall proportion of outcome achievement (in percentage terms) is then derived from this aggregate score divided by the number of expected outcomes. The thresholds defining the rating benchmarks can then be applied directly. For example, if outcome achievement was 4.5/8=56%, a fully satisfactory rating would be assigned. However, in making the calculation the evaluator may place more emphasis on certain expected outcomes where these are judged to have particular significance, or vice versa.
Excellent: At least 75% of outcomes have been
achieved as a result of the project.
Fully satisfactory: At least 50% of outcomes have
been achieved as a result of the project.
Partly unsatisfactory: At least 25% of outcomes
have been achieved as a result of the project.
Unsatisfactory: Less than 25% of outcomes have
been achieved as a result of the project.
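The following sketch compares how the same achievement share is rated at different results levels, using the benchmark thresholds stated in this annex for outputs, outcomes and impacts. The names are hypothetical, and the sketch covers only the arithmetic; any adjustment for the counterfactual remains a matter of evaluator judgment and is not modelled.

```python
# Minimum shares (in per cent) for excellent, fully satisfactory and
# partly unsatisfactory, as stated in the rating benchmarks of this annex.
THRESHOLDS = {
    "outputs": (85.0, 60.0, 35.0),
    "outcomes": (75.0, 50.0, 25.0),
    "impacts": (75.0, 50.0, 25.0),
}
LABELS = ("excellent", "fully satisfactory", "partly unsatisfactory")


def rate(level, share):
    """Return the rating for an achievement share at a given results level."""
    for label, floor in zip(LABELS, THRESHOLDS[level]):
        if share >= floor:
            return label
    return "unsatisfactory"


# The example from the text: an aggregate score of 4.5 over 8 expected results.
share = 100.0 * 4.5 / 8
for level in THRESHOLDS:
    print(level, round(share), rate(level, share))
```

An aggregate of 4.5 out of 8 is rated fully satisfactory for outcomes, whereas the same share for outputs would fall in the partly unsatisfactory band, reflecting the higher boundaries set for results that are largely within the control of the project.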
For evaluation purposes in EBRD, outcomes are defined as the short-term and medium-term effects directly attributable to delivery of the operation’s outputs. Outcomes should be discernable by project completion.
The results framework will include expected outcomes, their indicators, baselines and target values, as described or implied in the Board document, along with any changes approved during implementation following Operations Manual procedures.
Statements of expected outcomes are typically found in project documents such as: the Board document; President’s Recommendation; Summary Fact Sheet under ‘Project Description/Business Purpose’; Terms and Conditions under ‘Use of Proceeds’ in Section 3; and in annexes including the environment and social action plan. Also, the table of transition impact benchmarks (Board document) could include indicators and target values that are more appropriate for measuring outcome achievement. As well as being assessed against expectations (‘before and after’ assessment), observed outcomes are compared with a plausible without-project counterfactual (‘with and without project’ assessment). It may not be possible to come up with a scientific or rigorous counterfactual. Rather, the intent is for some consideration of the extent to which observed results would have occurred anyway, and whether they were due to the project.12
Conversely, if outcome indicators deteriorated, but there is evidence that the decline would have been worse in the absence of the project, a fully satisfactory rating may still be warranted.
If the expected outcomes, indicators or targets were formally revised during implementation, the project will be assessed against these revised expectations.
For projects involving financial intermediaries, the assessment should consider the extent to which such intermediation reached target groups. If the client’s sub-borrowers cannot be identified, a ‘before and after’ comparison of the intermediary’s portfolio is made to determine whether it increased its exposure to the target group. A theory-based method (e.g., a results framework) is then used to establish plausible causality. This might include evidence, for instance, that the intermediary had improved its marketing, screening, and credit procedures with the intention of increasing its reach to small and medium-sized enterprises.
2.3 Contribution to Expected Impacts
Definition: The contribution at the
time of evaluation to the achievement of impacts (including environmental or social impacts) at the project or company level, and at the sector or economy level.
The word ‘contribution’ is used in recognition of the fact that it can be difficult to attribute impacts to a single intervention or indeed group of related interventions. However, evidence must be presented to show a plausible and significant
contribution by the EBRD operation to the achieved impact.
Focusing on the impact benchmarks and associated description in the Board approval document (but see guidance), an assessment is made of the realised impact at the time of evaluation. If, as a result of achieving compliance (or not) with safeguards, there are identifiable environmental or social impacts on neighbours of the project (positive or negative), then these should be assessed under this sub-criterion.
Other apparent impacts not identified at approval and not covered by transition impact benchmarks are evaluated under unanticipated results (see below).
The analysis focuses here on what has been achieved (what can be observed at the time of evaluation or what can plausibly be inferred from the evidence available at evaluation). It does not consider the potential for achieving further future impact – this is considered under Sustainability of Results (see below). This separation is made to delineate clearly between the observable or plausibly inferable (which is included in the overall performance rating) and the speculative (which is not).
Assuming a score of 1.0 for an achieved impact, 0.5 for one partly achieved, and 0.0 for one not achieved, calculate the aggregate level of achievement. The overall proportion of impact achievement (in percentage terms) is then derived from this aggregate score divided by the number of expected impacts. The thresholds defining the rating benchmarks can then be applied directly. For example, if impact achievement was 4.5/8=56%, a
fully satisfactory rating would be assigned.
However, in making the calculation the evaluator may place more emphasis on certain expected impacts where these are judged to have particular significance, or vice versa.
Excellent: At least 75% of expected impacts have
been achieved.
Fully satisfactory: At least 50% of expected
impacts have been achieved.
Partly unsatisfactory: At least 25% of expected
impacts have been achieved.
Unsatisfactory: Less than 25% of expected impacts
have been achieved.
For the purposes of evaluation in EBRD, impacts are defined as the positive or negative long-term effects, expected or unanticipated, to which an operation contributes, directly or indirectly, as a result of outcomes.
Given its unique mandate, EBRD has its own definition of transition impact, which is rated separately (see D.1 below). By definition and in practice, transition impact and the benchmarks by which it is monitored in the Transition Impact Monitoring System (TIMS) are, under OECD-DAC terminology, a mix of activities, outputs, outcomes and impacts. Accordingly, the transition impact and benchmarks cited in approval documents will need to be re-categorised as outputs, outcomes and impacts in the results framework for the evaluation. The evaluator should therefore consider here only that part of transition impact that accords with the OECD-DAC definition of impact. Where a transition impact is more correctly an expected output or outcome, it is assessed under 2.1 or 2.2 respectively.
2.4 Performance Against Industry Benchmarks or other Standards
Definition: A comparison of project
performance with that of peers in the industry and/or with relevant EU or other standards.
At the time of the evaluation, assess the extent to which the performance achieved meets industry and/or sector norms, and/or EU or other relevant standards.
To the extent that local or EU environmental and social performance standards exceed those of EBRD, then compliance with such standards should be assessed under this sub-criterion. This analysis should refer to market benchmarks, regulatory requirements or standards of good practice that prevail at the time of evaluation, regardless of whether these standards existed at the time of approval. The evaluator should comment, however, on the extent to which such
benchmarks have changed significantly over time.
There are two ways in which the rating of this sub-criterion may be determined as reflected below:
Excellent: The project or client is in the top quartile
of a relevant peer group, or performance is 50% or more above the industry average, benchmark or standard (that is, at least 1.5 times the average, benchmark or standard).
Fully satisfactory: The project or client is in the
second quartile of a relevant peer group, or performance is at least 100% of the industry average, benchmark or standard.
Partly unsatisfactory: The project or client is in the
third quartile of a relevant peer group, or performance is between 50% and 99% of the industry average, benchmark or standard.
Unsatisfactory: The project or client is in the bottom
quartile of a relevant peer group, or performance is less than 50% of the industry average, benchmark or standard.
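The two routes to a rating under this sub-criterion can be sketched as below. The function names are hypothetical. The sketch assumes a performance measure for which higher values are better, and it treats any ratio of at least 50% but below 100% of the benchmark as partly unsatisfactory; measures where lower values are better would need to be inverted first.

```python
def rating_from_quartile(quartile):
    """Rate by position in a relevant peer group (1 = top quartile)."""
    return {
        1: "excellent",
        2: "fully satisfactory",
        3: "partly unsatisfactory",
        4: "unsatisfactory",
    }[quartile]


def rating_from_benchmark(performance, benchmark):
    """Rate by the ratio of performance to an industry average,
    benchmark or standard (assumes higher values are better)."""
    ratio = performance / benchmark
    if ratio >= 1.5:
        return "excellent"
    if ratio >= 1.0:
        return "fully satisfactory"
    if ratio >= 0.5:
        return "partly unsatisfactory"
    return "unsatisfactory"


# Hypothetical figures: performance of 120 against an industry average of 100.
print(rating_from_benchmark(120.0, 100.0))
print(rating_from_quartile(3))
```

Where both a peer-group position and a benchmark ratio are available and point to different ratings, the choice between them is left to the evaluator and should be explained in the text.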
This sub-criterion is included to control for cases where different projects might be set more or less challenging approval targets, which can lead to lower or higher achievement ratings. Assessing performance against market or regulatory benchmarks therefore provides a reality check on the robustness of results achievement in the context of the project’s wider operating environment.
For projects where the expected results have targets anchored in market-based benchmarks, less weight should be placed on this particular sub-criterion when synthesising the overall rating of results. Conversely, where there is a significant disconnect between results targets and market-based benchmarks, the evaluator should use this sub-criterion as a modifier of the overall
performance rating by assigning an appropriate rating and higher weight.
2.5 Unanticipated Results
Definition: Other positive or negative
results that are not covered in the above sub-criteria.
Assesses positive or negative results that were not anticipated at approval and not, therefore, covered by the outputs, outcomes or impact sub-criteria.
The rating reflects the incremental results, whether positive or negative. Taking account of the counterfactual might reveal that the quantum of an unanticipated positive or negative result could have been different in the absence of the project.
Excellent: The project made a substantial and
plausible contribution to the achievement of significant positive unanticipated results.
Fully satisfactory: The project made a plausible
contribution to the achievement of net positive unanticipated results.
Partly unsatisfactory: The project contributed to net
negative unanticipated results.
Unsatisfactory: The project contributed to
significant negative unanticipated results.
Not applicable: There was no evidence of
unanticipated results.
To be included, unanticipated results must be truly unexpected, attributable to the project, quantifiable, of significant magnitude, and at least as well-evidenced as the project’s other results.
Typically, Board documents identify how a project will contribute to two or three transition impact criteria only, and so the evaluation should assess whether in reality the project contributed to other transition impact criteria.
In some cases projects might have both positive and negative unanticipated results. In these cases, the evaluator will need to exercise judgment as to whether on balance the rating should be fully
3. Efficiency
3.1 Financial Performance of the Project and/or Client
Definition: Rating of company
financial performance in cases of pure corporate finance, equity investments or balance sheet restructuring. This will also normally be the indicator used for FI operations.
Rating of project financial performance in cases of project finance. This will normally be an indicator for infrastructure projects. In a few cases it will be necessary to consider both analyses. The
approach taken in the projections that were presented at approval should serve as a guide.
Refer to the Financial Analysis Summary Sheet attached to the FRM and subsequent monitoring reports and to Equity Valuation Sheets for equity projects. Significant items and ratios from the income statement and balance sheet will vary according to the operation type, but will include all those cited in financial covenants.
Conduct a variance analysis describing the financial performance from the operation’s inception to the time of evaluation, and analyse the reasons for differences from the performance that was expected at approval.
If approval documents included project FIRR and EIRR, review and recalculate those measures.
Excellent: Actual project or company results meet
or exceed appraisal estimates, such that the company has been able to service its debt obligations without problem and has generated a return to its shareholders well in excess of the cost of debt.
Fully satisfactory: Actual project or company results
are slightly below appraisal estimates, such that the company has been able to meet its debt obligations without problem but has failed to generate a return to its shareholders in excess of the cost of debt.
Partly unsatisfactory: Actual project or company
results are below appraisal estimates, such that the company has barely been able to service its debt obligations and has yielded a zero or negative return to its shareholders.
Unsatisfactory: Actual project or company results
are significantly below appraisal estimates, such that the company has failed to service its debt obligations in a timely manner.
Most of the variance analysis information should be available in past monitoring reports and Credit Review Memos, but the evaluator will need to provide a succinct overview of financial
performance over the operation’s life, not just over the last year.
Any amendment of the financial model after approval should be noted. Unless the amended model was approved at Board or OpsCom, the evaluator should compare actual performance against projections at approval. However, there should be comment on whether performance has been in line with the revised model and whether further revisions have been made.
If it is not possible to prepare an ex-post EIRR, then commentary should be included on the likely directional effect on the expected EIRR from factors such as cost increases/decreases, reduced/increased benefits and delayed or timely realisation of benefits.
The evaluator may exercise discretion if a plausible case exists that performance in the immediate future is likely to be materially different from the historical performance reported in project/company accounts.