10.6 Application of PCA on speleothem proxy time series - Spatio-temporal coherence of speleothem proxy time series
10.6.1 The behaviour of the distribution of rs values for proxy time series
In the previous sections (Sec. 10.4 and 10.5), the distribution of the eigenvalues of the PCs was used to decide whether a certain PC is significant or not (Preisendorfer’s Rule N). However, another property of a PC is its correlation with the investigated proxy time series. If the PC time series computed for a given PC shares common variations with the investigated proxy time series, the correlation between the PC time series and the proxy time series is significant. The value of the correlation coefficient rs, Spearman’s rank correlation coefficient, indicates the degree of the correlation. This property is explained in detail in the following. Moreover, a further criterion for the significance of a PC (and of the related PC time series) is developed from this property. For this purpose, the compilations of four artificial AR-1 and WN time series (Sec. 10.4 and 10.5) are used.
Furthermore, a compilation of four speleothem proxy time series is used; the four time series are chosen to be similar in order to achieve a perfect correlation. Here, the δ18O time series of stalagmite BU-4 is used. A total of 1000 MC simulations are performed, resulting in ensembles of 1000 eigenvalues and 1000 time series for each PC.
10. Principal component analysis in speleothem science
According to Preisendorfer’s Rule N, the 1st PC is significant and all PCs of higher order are not, because only the mean eigenvalue of the 1st PC lies above the 95 % level of the eigenvalues for the compilations of AR-1 and WN time series, respectively (Fig. 10.19).
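The comparison against the 95 % level can be sketched in a few lines. This is a minimal illustration and not the procedure of Sec. 10.4 and 10.5: it assumes white-noise surrogates only (no AR-1 surrogates), tests a single data set instead of the mean eigenvalue of an MC ensemble, and the names eigenvalues and rule_n as well as the synthetic test data are hypothetical.

```python
import numpy as np

def eigenvalues(data):
    # Eigenvalues of the correlation matrix of the columns, largest first.
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

def rule_n(data, n_surrogates=1000, level=95, seed=0):
    # Assumption: the null ensemble consists of white-noise data sets
    # with the same shape as the data.
    rng = np.random.default_rng(seed)
    null = np.array([eigenvalues(rng.standard_normal(data.shape))
                     for _ in range(n_surrogates)])
    threshold = np.percentile(null, level, axis=0)  # one threshold per PC rank
    return eigenvalues(data) > threshold            # True = PC counts as significant

# Synthetic stand-in for four near-identical proxy series:
# one common signal plus weak independent noise per series.
rng = np.random.default_rng(42)
signal = rng.standard_normal((300, 1))
data = signal + 0.3 * rng.standard_normal((300, 4))
print(rule_n(data, n_surrogates=200))
```

For data of this kind only the first entry of the result should be True, which corresponds to the situation described above where only the 1st PC exceeds the 95 % level.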
However, in addition to the eigenvalues of each PC, Spearman’s rank correlation coefficient (rs) can be calculated between each PC time series and each proxy time series. Hence, for each MC run, rs is calculated between the derived PC time series and the proxy time series (four proxy time series in the example conducted here).
Consequently, a total of 1000 rs values are calculated for the correlation between each PC time series and each proxy time series. This results in n² distributions of rs values, where n is the number of investigated proxy time series. The distributions of rs values between the proxy time series and the 1st PC for the WN time series, the AR-1 time series and the single speleothem collection are illustrated in the left panels of Fig. 10.24, 10.25 and 10.26, respectively. Values of rs are only shown if the corresponding p-value is smaller than 0.05. For the WN and AR-1 time series the distribution is bimodal, with a minimum centred at rs = 0. The distribution for the single speleothem collection, on the other hand, resembles a Gaussian distribution with a mean value of c. 0.7. Furthermore, the variance of each peak of the WN and AR-1 distributions is higher than the variance of the distribution for the single speleothem collection. The bimodal distribution of rs values is a result of the up-side-down effect and of the mathematical theory behind PCA (Sec. 10.1).
The properties of the bimodal distribution of rs values make it possible to decide whether a PC is significant or not. Moreover, they reveal which of the investigated proxy time series are correlated with the related PC time series. The following problem might occur: if an ensemble of PC time series is investigated and the distribution of rs values between the PC time series and the speleothem proxy time series is bimodal, only those PC time series are analysed which have a phase relation (exp{i · φ}) of either φ = 0 or φ = π. Consequently, only one part of the bimodal distribution is considered for each speleothem proxy time series if the speleothem proxy time series share a common signal. For time series that are a priori not correlated with each other (such as WN and AR-1 time series) and consequently do not share a common signal, no such outcome is expected.
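How the n² distributions of rs values arise can be illustrated with a short sketch. It rests on several assumptions that are not taken from the text: each MC run simply draws four fresh white-noise series, PCA is performed on standardised series, ties in the ranks are ignored, and the p < 0.05 filter is omitted. All function names are hypothetical.

```python
import numpy as np

def rank(x):
    # Ranks 0..N-1; ties are not handled, which is fine for continuous noise.
    return np.argsort(np.argsort(x))

def spearman(x, y):
    # Spearman's rs is the Pearson correlation of the ranks.
    return np.corrcoef(rank(x), rank(y))[0, 1]

def pc_time_series(data):
    # data has shape (n_samples, n_series); PCA on standardised columns.
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(z, rowvar=False))
    order = np.argsort(eigval)[::-1]      # largest eigenvalue first
    # The sign of each eigenvector is arbitrary and is left as returned.
    return z @ eigvec[:, order]           # column k is the time series of PC k+1

def rs_ensemble(make_data, n_runs, rng):
    # Returns an array of shape (n_runs, n_series, n_pcs).
    out = []
    for _ in range(n_runs):
        data = make_data(rng)
        pcs = pc_time_series(data)
        n = data.shape[1]
        out.append([[spearman(data[:, i], pcs[:, k]) for k in range(n)]
                    for i in range(n)])
    return np.array(out)

def white_noise(rng, n_samples=200, n_series=4):
    return rng.standard_normal((n_samples, n_series))

rng = np.random.default_rng(0)
rs = rs_ensemble(white_noise, n_runs=200, rng=rng)
print(rs.shape)  # (200, 4, 4): one distribution per (series, PC) pair
```

A histogram of rs[:, i, 0] then gives the distribution of rs values between series i and the 1st PC, i.e. the kind of distribution shown in the left panels of the figures.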
To test this assumption, only one part of the bimodal distribution between a PC time series and a selected speleothem is chosen (the positive or the negative) and the behaviour of the distribution of rs values between the selected PC time series and the remaining speleothems is investigated. If the remaining speleothems are correlated with the PC time series, the distribution of rs values should be sensitive to the selection; if they are not correlated with the PC time series, it should not. This procedure is called the "Fork-tool" and is explained in the following.
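The selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used for the figures: the MC runs are replaced by freshly drawn synthetic data, the sign of the leading eigenvector is fixed by requiring a positive sum of loadings (a convention chosen only for this sketch), and the names first_pc_rs, fork_tool and shared_signal are hypothetical.

```python
import numpy as np

def spearman(x, y):
    # Pearson correlation of the ranks (ties ignored).
    rank = lambda v: np.argsort(np.argsort(v))
    return np.corrcoef(rank(x), rank(y))[0, 1]

def first_pc_rs(data):
    # rs between each series (column) and the 1st PC time series.
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    _, eigvec = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    v = eigvec[:, -1]                 # eigh sorts eigenvalues in ascending order
    if v.sum() < 0:                   # sign convention assumed for this sketch
        v = -v
    pc1 = z @ v
    return np.array([spearman(z[:, j], pc1) for j in range(z.shape[1])])

def fork_tool(rs, ref=0, positive=True):
    # rs has shape (n_runs, n_series); keep only the runs that fall on
    # one branch of the distribution of the reference series.
    keep = rs[:, ref] > 0 if positive else rs[:, ref] < 0
    return rs[keep]

rng = np.random.default_rng(1)
n_runs, n_samples = 300, 200

# Uncorrelated case: four independent white-noise series per run.
rs_wn = np.array([first_pc_rs(rng.standard_normal((n_samples, 4)))
                  for _ in range(n_runs)])

# Correlated case: one common signal plus independent noise per series.
def shared_signal(rng):
    signal = rng.standard_normal((n_samples, 1))
    return signal + 0.5 * rng.standard_normal((n_samples, 4))

rs_sh = np.array([first_pc_rs(shared_signal(rng)) for _ in range(n_runs)])

for name, rs in [("white noise", rs_wn), ("shared signal", rs_sh)]:
    sel = fork_tool(rs, ref=0)
    frac_pos = (sel[:, 1:] > 0).mean(axis=0)   # share of positive rs, series 2-4
    print(name, len(sel), np.round(frac_pos, 2))
```

In the white-noise case the remaining series are expected to keep rs values of both signs after the selection, whereas in the shared-signal case all remaining rs values should lie on the selected branch.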
Fig. 10.24: Distribution of rs values between the WN time series and the computed 1st PC time series for four different time series. The labelling of the y-axis indicates the proxy-age relation of stalagmite BU-4 used as the basis for the WN time series. The panels on the left side depict the distribution before the application of the Fork-tool and the panels on the right side after the application of the Fork-tool.
Fig. 10.25: Distribution of rs values between the AR-1 time series and the computed 1st PC time series for four different time series. The labelling of the y-axis indicates the proxy-age relation of stalagmite BU-4 used as the basis for the AR-1 time series. The panels on the left side depict the distribution before the application of the Fork-tool and the panels on the right side after the application of the Fork-tool.
Fig. 10.26: Distribution of rs values between identical speleothem proxy time series and the computed 1st PC time series for four different time series. Here, the δ18O time series of BU-4 is used. The panels on the left side depict the distribution before the application of the Fork-tool and the panels on the right side after the application of the Fork-tool.
Fig. 10.27: Distribution of rs values between identical speleothem proxy time series and the computed 2nd PC time series for four different time series. Here, the δ18O time series of BU-4 is used. The panels on the left side depict the distribution before the application of the Fork-tool and the panels on the right side after the application of the Fork-tool.
For the WN and AR-1 time series, the positive part of the bimodal distribution of time series 1 is selected in each case (Fig. 10.24, 10.25; right column). Before the application of the Fork-tool to the bimodal distributions of WN and AR-1 time series 1, 948 and 965 rs values contributed to the distributions, whereas after the application 599 and 604 rs values contribute to the distributions, respectively. Similar numbers are found for the distributions of time series 2, 3 and 4. However, the shape of the distributions of these time series differs from that of time series 1: they remain bimodal. From this result, it can be concluded that if proxy time series are