Numerical Dataset Analysis - Empirical Case Study

6.3 Empirical Case Study

6.3.1 Numerical Dataset Analysis

The dataset used in this case study was collected according to the qualitative taxonomy for the Unified ICT baskets. The ICT variables used here are presented in Table 4.1. The data are freely and readily available from the annual reports issued by the organizations mentioned earlier. The set covers a varying number of economies, as reported by the sources, each of which was given a score and a rank across the eleven filtered variables for three consecutive years: 2009, 2010 and 2011. In the subsequent sections several measures are taken to test to what degree the ICT variables are related and comparable.

6. Unified Macro-Knowledge Competitiveness Framework

6.3.1.1 Correlation

To test the degree of relationship between the filtered variables, a correlation analysis is first conducted on the collected dataset. The correlation coefficient matrix for the ICT-related variables is summarised in Table 6.1 (see footnote 1). The result revealed a “moderate” to “strong” positive relation between almost all indicators. The highest correlation coefficient is 0.967, which is significant and occurs between WEF-NRI Individual Usage (coded as I) and ITU-IDI ICT Use (coded as K). The lowest correlation is 0.379 (“weak”), between WEF-NRI Government Readiness (coded as G) and ITU-IDI ICT Price (coded as H). Based on these results we can conclude that the indicators are correlated and comparable. Several studies, including Roessner et al. (1996), Porter et al. (2009) and Johnson et al. (2010), investigated the relation between high-technology competitiveness indicators and reached a similar conclusion: these indicators complement each other, and their differences are mainly due to the limitations and variations of the traditional methods used to weight and aggregate the input variables. Hence, it would be highly desirable to unify such efforts into a fully rounded result. It is therefore feasible to normalise and aggregate these ICT indicators into a “one for all” solution that would reflect, measure and rank the combined level of ICT and e-services in and between countries.
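The correlation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores below are made up, the labels A, B and C are hypothetical and do not correspond to the coded indicators in the text, and NumPy's Pearson coefficient is assumed to be the coefficient meant here.

```python
import numpy as np

def correlation_matrix(scores):
    """Pearson correlation matrix; rows are economies, columns are indicators."""
    return np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)

def strongest_and_weakest(corr, labels):
    """Return ((label_i, label_j), r) for the highest and lowest off-diagonal r."""
    n = len(labels)
    pairs = [((labels[i], labels[j]), float(corr[i, j]))
             for i in range(n) for j in range(i + 1, n)]
    return max(pairs, key=lambda p: p[1]), min(pairs, key=lambda p: p[1])

# Made-up scores for five economies on three hypothetical indicators.
scores = [
    [5.1, 6.0, 4.0],
    [4.2, 5.1, 2.0],
    [3.0, 3.8, 3.5],
    [2.5, 3.1, 2.5],
    [1.9, 2.0, 1.5],
]
labels = ["A", "B", "C"]

corr = correlation_matrix(scores)
strongest, weakest = strongest_and_weakest(corr, labels)
print("strongest pair:", strongest)
print("weakest pair:", weakest)
```

Scanning only the upper triangle avoids reporting each pair twice and skips the diagonal, which is always 1.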

6.3.1.2 Outlier Detection

To check for outliers within the collected dataset, the Mahalanobis distance is used (Maesschalck et al., 2000). It is a distance measure that takes the correlations between variables into account and detects any point that lies unusually far from the rest of the sample. The Mahalanobis distance test, depicted in Figure 6.2, flagged two points that lie slightly far from the rest of the countries. The existence of such outliers is not problematic, so it was decided to keep them in the dataset.

Figure 6.2: Outliers detection between variables, N=57.

1 The rest of this text is censored as copyright material, which can be retrieved from the following article: Ahmad Al Shami, Ahmad Lotfi and Simeon Coleman, “Intelligent Synthetic Composite Indicators with Application,” Soft Computing, Volume 17, Issue 12 (2013), Pages 2349-2364, Springer Berlin Heidelberg, DOI: 10.1007/s00500-013-1098-3, ISSN: 1432-7643.
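As a rough illustration of the outlier check, the sketch below computes the Mahalanobis distance of every observation from the sample mean. The data are synthetic (a small grid plus one deliberately distant point), and the choice of the sample covariance and of a pseudo-inverse are assumptions made for this example, not details taken from the study.

```python
import numpy as np

def mahalanobis_distances(data):
    """Distance of each row from the sample mean, scaled by the sample covariance."""
    x = np.asarray(data, dtype=float)
    centred = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False)
    # The pseudo-inverse tolerates a near-singular covariance matrix.
    inv_cov = np.linalg.pinv(cov)
    # Quadratic form centred[i] @ inv_cov @ centred[i], evaluated for every row i.
    d2 = np.einsum("ij,jk,ik->i", centred, inv_cov, centred)
    return np.sqrt(np.maximum(d2, 0.0))

# Synthetic data: a 5x5 grid of points plus one point placed far away.
grid = [(float(i % 5), float(i // 5)) for i in range(25)]
data = grid + [(20.0, -15.0)]

distances = mahalanobis_distances(data)
most_distant = int(np.argmax(distances))
print("most distant row:", most_distant)
```

Ranking the distances, rather than applying a fixed cut-off, leaves the decision about what counts as an outlier to the analyst, which matches the visual inspection described above.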

6.3.1.3 Multivariate Analysis

PCA is a multivariate, input-reduction method. Its goal is to reveal how different variables change in relation to each other and how they are associated. PCA is useful when there are two or more variables and there is reason to believe that some of them are redundant. In this case, redundancy means that some of the variables are correlated with one another, possibly because they measure the same construct. Because of this redundancy, it should be reasonable to reduce the observed variables to a smaller number of principal components (“artificial variables”) without significant loss of information. PCA is employed in this study to serve three purposes: first, to test whether the eleven variables could be reduced; second, to reduce the number of indicators to a smaller subset; third, to explore the possibility of filtering out the trivial components before clustering is applied. The trivial components usually act as noise and could stand in the way of a sound and meaningful clustering result. Figure 6.3 shows the result of the PCA: the first component has the highest eigenvalue and explains 76.25% of the variability in the data, the second explains 9.42%, the third 4.52%, and so on. The scree test suggests that only the first two components are meaningful, so only these two were retained. Combined, components 1 and 2 account for 85.67% of the variability in the data. The plot levels off after the second component; the remaining eigenvalues represent the trivial components, which together account for 14.33% and can be discarded.

Figure 6.3: PCA result showing the scree plot of eigenvalues of covariance for 57 countries.
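The eigenvalue shares behind a scree plot can be reproduced with a short sketch. It assumes PCA on the sample covariance matrix, as the figure caption suggests, and runs on synthetic data driven by a single latent factor. The final lines only re-add the shares reported in the text; the shares of the remaining components are not given there and are not guessed.

```python
import numpy as np

def explained_variance(data):
    """Eigenvalues of the sample covariance, largest first, and their variance shares."""
    x = np.asarray(data, dtype=float)
    cov = np.cov(x, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return eigenvalues, eigenvalues / eigenvalues.sum()

# Synthetic data: three columns driven mostly by one latent factor t.
t = np.linspace(0.0, 1.0, 20)
data = np.column_stack([
    t,
    2.0 * t + 0.05 * np.sin(7.0 * t),
    0.5 * t + 0.05 * np.cos(11.0 * t),
])
eigenvalues, ratios = explained_variance(data)
print("variance shares:", np.round(ratios, 4))

# Shares reported in the text for the first three components only.
reported = np.array([0.7625, 0.0942, 0.0452])
cumulative = np.cumsum(reported)
retained = cumulative[1]          # components 1 and 2 combined
discarded = 1.0 - retained        # everything after the second component
print(f"retained {retained:.2%}, discarded {discarded:.2%}")
```

Plotting `eigenvalues` against the component index would give a scree plot; the point where the curve levels off is what the scree test reads.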

6.3.1.4 Variables Standardisation

Normalisation is usually used to transform different measurement units into a uniform unit, so that they form clearly comparable elements, and to avoid problems in mixing measurement units (e.g. money, talent, skills) (Freudenberg, 2003). The issue at hand is not the use of different measurement units but the differing ranges of the score scales. Hence, to unify the score ranges between the selected indicators, Min-Max normalisation (as formulated in Equation 5.3) was applied: every score range in the dataset is transformed to a value between 0 and 1, where the lowest (min) value is set to 0 and the highest (max) value is set to 1. Where a high value implies an inferior result, as with ICT Price, the reverse Min-Max normalisation of Equation 5.4 is used instead. In addition to converting the series to the [0, 1] range, it inverts the series, so that 0 implies the poorest and 1 the best possible performance.
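A minimal sketch of the two transformations follows. Equations 5.3 and 5.4 are not reproduced in this section, so the code assumes the usual textbook forms, (x - min) / (max - min) and (max - x) / (max - min); the function names and the sample values are hypothetical.

```python
def min_max(values):
    """Scale values to [0, 1]; the lowest maps to 0 and the highest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalise a constant series")
    return [(v - lo) / (hi - lo) for v in values]

def reverse_min_max(values):
    """Scale values to [0, 1] and invert; the lowest maps to 1 and the highest to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalise a constant series")
    return [(hi - v) / (hi - lo) for v in values]

# Hypothetical scores where higher is better.
usage = [2.0, 4.0, 6.0]
# Hypothetical prices where higher is worse, so the scale is inverted.
price = [10.0, 30.0, 50.0]

print(min_max(usage))
print(reverse_min_max(price))
```

A constant series is rejected explicitly because the denominator would be zero; how such a case should be scored is left open here.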
