Normalization - Developing composite indicators

Chapter Three: Methodology

3.4 Developing composite indicators

3.4.2 Normalization

A variety of transformation and/or normalization techniques are available (e.g., Blanc et al., 2008; Freudenberg, 2003; Nardo et al., 2009; Pomerol & Barba-Romero, 2012), but only the five most widely employed techniques (Nardo et al., 2009; Tofallis, 2014) are shown in Table 3.6. These five techniques are ranking, distance to target, z-score, min-max and proportionate normalization. The first four are the most commonly used normalization techniques (Saisana & Saltelli, 2011; OECD, 2008). For example, min-max is used in the Human Development Index. The proportionate normalization technique was considered because of its suitability for the development of composite indicators.

Table 3.6: Selected normalization techniques for this study

Name Formula Explanation

1_Ranking _𝑁

𝑖𝑎𝑠 = 𝑅𝑎𝑛𝑘(𝑋𝑖𝑎𝑠)

Where Nias = Normalized value of indicator 𝑖 for agricultural systems 𝑎𝑠, 𝑋_𝑖𝑎𝑠 = Variable 𝑋 for indicator 𝑖 for

agricultural systems 𝑎𝑠 1_{Distance to}

target 𝑁𝑖𝑎𝑠 =

𝑋𝑖𝑎𝑠

𝑇𝑎𝑟𝑔𝑒𝑡 𝑋_𝑖𝑎𝑠

Where 𝑁_𝑖𝑎𝑠= Normalized value of

indicator 𝑖 for agricultural systems 𝑎𝑠, 𝑋_𝑖𝑎𝑠 = Variable 𝑋 for indicator 𝑖 for

agricultural systems 𝑎𝑠

1_Z-score

(Standardization) 𝑁𝑖𝑎𝑠 = (𝑋𝑖𝑎𝑠− µ)/ 𝜎

Where 𝑁_𝑖𝑎𝑠 = Normalized value of

indicator 𝑖 for agricultural systems 𝑎𝑠, 𝑋_𝑖𝑎𝑠= Variable 𝑋 for indicator 𝑖 for agricultural systems 𝑎𝑠, µ (Mu) = Mean of indicator values, σ (Sigma) = Standard deviation (square root of the variance) of indicators

1_Min-max

𝑁_𝑖𝑎𝑠

= 𝑋𝑖𝑎𝑠− 𝑚𝑖𝑛𝑎𝑠(𝑋𝑖) 𝑚𝑎𝑥_𝑎𝑠(𝑋_𝑖𝑎𝑠) − 𝑚𝑖𝑛_𝑎𝑠(𝑋_𝑖)′

Where 𝑁_𝑖𝑎𝑠 = Normalized value of

indicator 𝑖 for agricultural systems 𝑎𝑠,

𝑋_i= Indicator, 𝑋_𝑖𝑎𝑠= Variable 𝑋 for indicator 𝑖 for agricultural systems 𝑎𝑠, 𝑚𝑎𝑥𝑎𝑠 and 𝑚𝑖𝑛𝑎𝑠are the largest and smallest observed values

2,3_{Proportionate} _𝑁_𝑖𝑎𝑠 ₌ 𝐼𝑖

𝐼_𝑖 _𝑖 0 < 𝑁𝑖𝑎𝑠< 1

Where 𝐼_𝑖 = Indicator value, 𝐼_𝑖 _𝑖 =

Sum of the indicators

3.4.2.1 Ranking normalization

Ranking normalization replaces measurements with their rank. In the rank normalization process each data point is replaced by its rank, that is, by values ranging from 1 (lowest) to N (highest) (Mitchell, 2012). In this system, there is no score, only a rank; the absolute-level information is lost. This technique, while simple, cannot lead to any conclusion about the differences among performances of the indicator being assessed because there is no measure of the distance between values of the indicators (Jacob et al., 2004). Ranking normalization is employed in the “Information and Communications Technology Index” (Freudenberg, 2003) and “Medicare Study on Healthcare Performance across the United States” (Jencks et al., 2003).

3.4.2.2 Distance to target normalization

In the distance to target normalization technique, the indicator’s value is divided by the target value to normalize the indicator (Saisana & Saltelli, 2011) so that the normalized values represent a fraction of the highest value. The highest value of the indicator set or any reference point can be the target value. The results of this technique are easy to handle and understand, but imbalance between scores and rankings remains, and the normalization results are more influenced by outliers than in other techniques. This method is useful for further analysis (e.g., geometric aggregation) since it does not generate any zero values. However, if outliers are chosen as target points, the result can be misleading. The distance to target normalization technique is used in “Eco-indicator 99” and the “Summary Innovation Index” (Saisana & Tarantola, 2002).

3.4.2.3 Z-score normalization

Z-score normalization is calculated by subtracting the mean from an indicator value and then dividing by its standard deviation. If the standard deviation is calculated for a set of variables with a mean of 0 and then all values are divided by the standard deviation, the resulting set of values will have a standard deviation of 1 (Salzman, 2003). After performing normalization, the data have a common scale with a 0 mean and standard deviation of 1. Since all Z-score distributions have the same mean and standard deviation, individual scores from different distributions can be directly compared. The advantage of this technique is that it provides no distortion from the mean, adjusting for different scales and variance. The output is dimensionless, and the relative differences are maintained due to the application of a linear transformation (Mei & Grummer- Strawn, 2007). Z-score is preferred when extreme values exist in the dataset (Nardo et al., 2005; Tate, 2012). Although the technique does not fully adjust for outliers, the minimum and maximum values are not as influential as in other techniques such as distance to target. As an extreme value for an indicator has a greater effect on a composite indicator, this technique is more representative of the original data. It is desirable that exceptional behaviour should be rewarded if excellent performance on a few indicators is considered to be better than other average performances (OECD, 2008; Salzman, 2003). The Z-scores technique is widely employed, such as in the knowledge-based economy index (WB, 2009). The World Health Organization also used it for its child growth standards index (de Onis, 2006).

3.4.2.4 Min-max normalization

The Min-max technique rescales data into different intervals based on minimum and maximum values. The advantage of this method is that boundaries can be set and all indicators have an identical range (0, 1). However, the normalized values do not maintain proportionality, and

normalized values reflect the percentage of the range of 𝑚𝑎𝑥_𝑎𝑠(𝑋_𝑖𝑎𝑠) − 𝑚𝑖𝑛_𝑎𝑠(𝑋_𝑖𝑎𝑠). This technique is based on extreme values (minimum and maximum), but because these two values can be outliers, the range of max and min strongly influences the final output. Another disadvantage is that the difference in variance is not fully eliminated (OECD, 2008). Nevertheless, this technique is very popular and has been applied in the construction of many composite indicators, the best-known of which is the Human Development Index (HDI, UNDP, 2014).

3.4.2.5 Proportionate normalization

In proportionate normalization, the single attribute value is divided by the sum total of the values of attributes (Dailey, 2000; Tofallis, 2014). The normalized values maintain proportionality such that the normalized value reflects the percentage of the sum of the total value of the indicators. Here, values of the indicator are relatively normalized. Normalizing the indicators by dividing them by their sums has a number of attractive properties, including that the normalized values are identical to the original except for a scaling factor and the process is easily understandable. The value differences among indicators become narrow. Dividing by the sum ensures that even the smallest value greater than zero comes out with a positive normalized value (Benini, 2012; Tofallis, 2014). The proportionate normalization technique is frequently used in normalizing census data in ArcView GIS (Geographical Information System, Dailey, 2000). Benini (2012) also suggested using this technique for developing composite measures for disaster impact assessment.

In document Multi-Criteria Decision Analysis (MCDA) for Agricultural Sustainability Assessment (Page 90-93)