
Chapter 2 Theoretical foundations of STA

2.3 Indicators and frameworks

2.3.2 Representativity challenges

161 The first key point about indicators understood as scientific variables is that they can be measured⁴⁹ or observed by following a defined procedure. To be credible, indicators should first represent the phenomenon they intend to measure and be consistent in this representation. In other words, they should meet criteria for scientific soundness, which Gudmundsson et al. (2016) call representativity: they should be clear (well defined), valid (based on confirmed causal mechanisms), reliable (predictable and reproducible via a measurement process), sensitive (accurately capture changes), and robust (insensitive to interferences).

Closely related to this are concerns about measurability. These are mostly practical concerns about data availability or the cost-effectiveness of obtaining data (in terms of effort, time, resources and skills), and in some cases ethical concerns (e.g. privacy when collecting detailed travel data).

162 Below is an example of an indicator of car ownership compared to a national average in the Nørrebro district, the Copenhagen neighbourhood where I live. The accompanying picture shows what a street with ‘Meget lav’ (very low) car ownership looks like in real life. Although the legend uses plain words, the underlying data is a numerical variable on an interval scale (deciles).

Figure 9: Example of an indicator of car ownership levels in Copenhagen using a named variable (from ‘very low’ to ‘very high’) based on an interval scale (numerically equal population deciles), compared to a national average, disaggregated geographically on a 100m² raster. The lowest decile in car ownership is shown in orange. But is it useful?
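As a minimal sketch of how a decile-based indicator like the one in Figure 9 could be derived, the Python snippet below assigns a local value to a decile of a reference distribution and attaches a plain-word label. The function names, the reference data, and the choice to label only the two extreme deciles are assumptions made for illustration; they are not taken from the source of the map.

```python
from bisect import bisect_right
from statistics import quantiles


def decile_of(value, reference_values):
    """Return the decile (1 to 10) of `value` within `reference_values`."""
    # quantiles(..., n=10) returns the 9 cut points separating the deciles.
    cut_points = quantiles(reference_values, n=10)
    return bisect_right(cut_points, value) + 1


def label_for(decile):
    """Map a decile to a plain-word label (hypothetical mapping)."""
    # Assumption: only the extreme labels are known from the figure legend.
    if decile == 1:
        return "very low"
    if decile == 10:
        return "very high"
    return f"decile {decile}"


# Hypothetical reference distribution: cars per 1000 inhabitants in 100 areas.
reference = list(range(1, 101))

# A hypothetical raster cell with 5 cars per 1000 inhabitants.
cell_decile = decile_of(5, reference)
print(cell_decile, label_for(cell_decile))
```

The label communicates the position of the cell relative to the reference distribution, but it discards both the underlying value and any explanation of why the value is low, which is the limitation discussed in this section.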

163 Indicators summarise and simplify, and they are often indirect and approximate. What the geographically disaggregated interval indicator in Figure 9 illustrates is that indicators can be quite elaborate, but at the same time they can never fully represent or explain a phenomenon. While it may be considered a fact that the cell above indeed represents a low car ownership rate for this geographical area, it does not explain that a large portion of the streets are shared use areas limited to 15 km/h with no provision for on-street parking, that most of the neighbourhood has a number of bollards preventing through traffic, that Nørrebro is well known for its high level of provision of wide and segregated cycling infrastructure, or that most residents may not be able to afford to buy or even to drive a car. Indicators, by being reductionist, are limited and therefore partial.

49 Measures are variables based on a standard unit, whereas a metric is a quantitative indicator based on two or more measures. Cardinal variables use an interval scale (with equal numerical intervals, e.g. time) or a ratio scale (same as an interval scale but with a true zero, e.g. distance travelled or CO2 concentrations) and are therefore quantitative by definition. Categorical variables are qualitative and use a nominal scale (e.g. categories such as vehicle types, or a binary state such as ‘implemented’). Ordinal variables use a ranking scale (e.g. judgments such as modal preference on a Likert scale, where the intervals between positions on the scale cannot be said to truly represent equal distances between judgments). These variables can be absolute, relative to a specific target or norm, ratios between multiple aspects, or more complex aggregates forming an index. They can also be expanded on a timeline or in space.

In other words: “Not everything that can be counted counts, and not everything that counts can be counted”⁵⁰.

164 In many disciplines there is a tacit preference for putting numbers on what we know about reality as a way to express it scientifically: “Indeed, it is maintained that one of the essential functions of indicators is to quantify” (Gallopín 1996). Joumard & Gudmundsson explain that quantitative indicators are preferred “because of the potential precision and reproducibility provided by standard numerical metrics” (2010). According to Gallopín, qualitative indicators can be preferable when “the attribute of interest is not inherently quantifiable”, or when the cost of obtaining (or modelling) quantitative data becomes prohibitive (1996). For example, measuring wider economic impacts of large transport infrastructure projects such as high-speed rail remains difficult due to the complex land-use interactions they induce (Mackie, Worsley, and Eliasson 2014). What makes quantification attractive is that under certain conditions, numerical values can be more easily compared, which is one of the fundamental purposes of indicators (Astleithner et al. 2004).

The European Environment Agency’s (EEA) yearly Transport and Environment Reporting Mechanism (TERM) report is a good illustration of this: all its indicators are numerical and serve to compare values, whether between years, countries, transport modes, or engine and vehicle types. In addition, most of the EEA indicators are intended to be compared to a numerical target (e.g. 95 grams of CO2 per kilometre as the fleet average to be achieved by all new cars by 2021⁵¹).

165 Yet the dispassionate assessment intended by numerical values risks hiding the subjectivity of the methodological choices and the assumptions made to devise such specific values. Money is often used as a common unit to allow comparison of transport impacts. However, monetisation faces similar criticism: where do the valuations come from, and are they credible? (Mackie, Worsley, and Eliasson 2014). Setting a credible value can be particularly relevant for externalities of transport that diffuse internationally, since institutions may be reluctant to impose restrictions that bring little or no benefit to their own geographical remit.

166 A particularly intricate example of this is the CO2 valuations used for transport appraisal, which are nationally and politically determined based on EU guidance. The value per ton varies between countries (7.8€/ton in the UK⁵², 10.75-25.4€/ton in Denmark⁵³ for 2016). But values can vary even more widely depending on whether they are tied to more practical reduction targets (e.g. Marginal Avoidance/Abatement Costs - MAC) or to attempts to monetise wider and long-term, potentially catastrophic social costs (Social Cost of Carbon - SCC). In comparison to the figures above, some guidance suggests more abrupt carbon values starting at 100€/ton to better reflect the cost of sparing one ton of CO2 today, arguing that the current practice of increasing valuations of carbon up to 2050 effectively signals that action can be delayed (Meunier and Quinet 2015; Maibach et al. 2008)⁵⁴. Such variance and uncertainty in one of the more obvious environmental variables of sustainable transport assessment raises questions about the credibility of conventional assessment methods based on what were originally intended as value-neutral numerical indicators.

50 This quote is usually attributed to Albert Einstein, but sociologist William Bruce Cameron may have been the original source: http://quoteinvestigator.com/2010/05/26/everything-counts-einstein/

51 European Commission climate action for road transport: http://ec.europa.eu/clima/policies/transport/vehicles/cars/index_en.htm

52 UK Government Department of Energy & Climate Change updated short-term traded carbon values used for policy appraisal (2015): https://www.gov.uk/government/publications/updated-short-term-traded-carbon-values-used-for-uk-policy-appraisal-2015

53 Danish Ministry for the Environment values: http://www2.mst.dk/common/Udgivramme/Frame.asp?http://www2.mst.dk/udgiv/publikationer/2010/978-87-92708-52-6/html/kap03.htm , but for transport models at DTU Transport the suggested value is based on European Emissions Trading System allowances (EU ETS), which are lower and set at a central value of 80kr/ton for 2016 (10.75€): http://www.modelcenter.transport.dtu.dk/Noegletal/Transportoekonomiske-Enhedspriser (v1.6).
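To give a sense of what this spread in carbon valuations means in practice, the short Python sketch below monetises the same quantity of CO2 with each of the per-ton values quoted in paragraph 166. The emission quantity of 10,000 tons is a purely hypothetical figure chosen for illustration, and the dictionary labels are my own shorthand, not official names.

```python
# Carbon values in euros per ton of CO2, as quoted in paragraph 166.
CARBON_VALUES_EUR_PER_TON = {
    "UK (2016)": 7.8,
    "Denmark, lower value (2016)": 10.75,
    "Denmark, upper value (2016)": 25.4,
    "Higher starting value in some guidance": 100.0,
}

# Assumption: a hypothetical project emitting 10,000 tons of CO2.
HYPOTHETICAL_EMISSIONS_TONS = 10_000


def monetised_cost(tons_co2, value_per_ton):
    """Return the monetised cost in euros of emitting `tons_co2`."""
    return tons_co2 * value_per_ton


costs = {
    label: monetised_cost(HYPOTHETICAL_EMISSIONS_TONS, value)
    for label, value in CARBON_VALUES_EUR_PER_TON.items()
}

# Ratio between the highest and lowest monetised cost for identical emissions.
spread = max(costs.values()) / min(costs.values())

for label, cost in costs.items():
    print(f"{label}: {cost:,.0f} EUR")
print(f"Highest / lowest valuation: {spread:.1f}x")
```

The physical quantity is identical in every case. Only the valuation changes, and with these figures the monetised cost differs by more than a factor of ten between the lowest and the highest value.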

167 Aside from measurement and quantification methodology, partiality is also intrinsic in the selection of sets of indicators and in their aggregation. Indices are popular tools to provide an overview of a specific but complex phenomenon. For example, Denmark finds itself in the top 10 ‘greenest’ countries on the 2015 Trilemma index by the World Energy Council (World Energy Council 2015), it is first on the 2016 World Happiness Index (Helliwell, Layard, and Sachs 2016), yet it is also the Western country with the highest ecological footprint per capita in the world (WWF 2014). While each of these indices claims to provide a more holistic picture, they have in common that they rely on a small set (6 to 8) of individual indicators. But why these and not others? For each index the choice of indicators and the methodologies for quantifying each of them are clearly outlined. However, the types of aggregation differ. Often when addressing issues of sustainability it is said that a balanced view is needed. But assigning equal weights or no weights (or, as in the case of the Trilemma index, assigning equal weights to each of three dimensions) is a choice in itself that has potentially significant influence on the final values produced. CBA methods internalise the weighting process in monetary valuations that are pre-set, and indices tend to leave the weighting process up to the index designer. MCA, on the other hand, recognises and explicitly treats the weighting process as a value judgment, and MAMCA allows comparing various stakeholder perspectives in those judgments. It is precisely this concern for weighting that led me to explore MCA approaches further in articles II and III.
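The influence of weighting on an index can be illustrated with a minimal Python sketch of a weighted-sum aggregation. The two options, their normalised scores, the three dimension names and both weight sets are hypothetical; the sketch is not a reproduction of the Trilemma index or of any MCA method discussed in this thesis.

```python
def composite(scores, weights):
    """Weighted average of normalised scores (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    return sum(scores[dim] * weights[dim] for dim in weights) / total_weight


def rank(options, weights):
    """Return option names ordered from highest to lowest composite score."""
    return sorted(
        options,
        key=lambda name: composite(options[name], weights),
        reverse=True,
    )


# Hypothetical normalised scores (0 = worst, 1 = best) for two options.
options = {
    "A": {"environment": 0.9, "economy": 0.4, "social": 0.5},
    "B": {"environment": 0.5, "economy": 0.9, "social": 0.6},
}

# Two hypothetical weight sets: 'balanced' versus environment-focused.
equal_weights = {"environment": 1, "economy": 1, "social": 1}
environment_weights = {"environment": 3, "economy": 1, "social": 1}

print(rank(options, equal_weights))        # B ranks first
print(rank(options, environment_weights))  # A ranks first
```

With identical underlying scores, the ranking reverses when the weights change. Equal weighting is therefore not a neutral default but one value judgment among others.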

168 An internationally recognised, high-profile sustainable transport index has yet to emerge. There are, however, numerous attempts at creating urban sustainability and sustainable transport indices, but the exercise is fraught with the same difficulties regarding the selection, normalisation, weighting and aggregation of indicators. It would be interesting to provide a full review here, but that could easily be the topic of an article in itself. Zito and Salvo (2011), Santos and Ribeiro (2013), Jeon et al. (2013), Dur and Yigitcanlar (2015) and Alonso et al. (2015) provide recent reviews and systematic attempts at developing urban passenger transport indices. Yet it is interesting that, for example, Alonso et al. conclude from their Sustainability Composite Indicator that the “richest and largest cities usually have more sustainable transport systems”, while richer countries appear to fall in the most unsustainable end of the Sustainable Transport Space indicator by Holden et al. (2013). Tanguay et al. examined in more detail 17 urban sustainability indicators and conclude that problems related to Sustainable Development Indicators (SDIs) are conceptual and operational: there is no standard interpretation of sustainable development, nor any standard approach to designing SDIs, and, furthermore, SDI development is often constrained by data availability (2010).

169 The key point made here is that indicators are constructed: the targets they intend to support, the selection of the appropriate indicator(s), the elaboration of the method, their level of aggregation, and their presentation are all choices. I illustrate this partiality of indicators in Figure 10 below by showing how subjective choices affect the representativity of indicators at every step in their design and selection, from the choice of method to the building of indices. As Tanguay et al. point out, also citing Niemeijer and de Groot (2008) who reach a similar conclusion: “selection of indicators is invariably subject to arbitrary decisions at one stage of the process or another” (2010:p417). Gallopín had already reached the same conclusion more than a decade earlier: “value judgements enter the characterization of indicators at different levels” (Gallopín 1996).

54 See also https://www.gov.uk/government/publications/carbon-valuation-in-uk-policy-appraisal-a-revised-approach for a detailed discussion on the “impossibility of deriving a scientifically valid, ethically sound or policy-useful estimate of the social cost of carbon” (Ekins 2011) on the UK Government web site.

Figure 10: Construction of indicators, indicator sets, and indices (“Russian dolls”) and the path to knowledge (“Chinese whispers”) (Lyytimäki, Gudmundsson, and Sørensen 2014); figure adapted from (Waas et al. 2014) and types of indicators from (Lehtonen, Sébastien, and Bauler 2016).

170 I therefore question the suggestion by Heink and Kowarik (2010) of the possibility of purely ‘descriptive’ indicators (although I do keep the term in Figure 10 for illustration). In order to ‘make sense’, indicators need to be compared to a reference value, whether that is “a goal, a target, a norm, a standard or a benchmark”, which is what distinguishes indicators from simple variables (Waas et al. 2014). For example, a value of 400ppm CO2 concentration becomes relevant when compared to the 350ppm threshold said to be a safe boundary. But to address this problem of partiality in indicators, should the answer be to try to improve objectivity, or to embrace subjectivity, or perhaps, if possible, to do both? Gallopín (1996) suggests, for example, keeping value judgments confined to targets, norms or standards, and seeking to define performance indicators based on these. Connecting indicators to policy goals has received significant attention in the literature, to which I now turn.

In document Indicators and beyond: Assessing the sustainability of transport projects (Page 67-70)