• No results found

Identification of Extreme Events in Climate Data from Multiple Sites

N/A
N/A
Protected

Academic year: 2021

Share "Identification of Extreme Events in Climate Data from Multiple Sites"

Copied!
7
0
0

Loading.... (view fulltext now)

Full text

(1)

Procedia Engineering 125 ( 2015 ) 304 – 310

1877-7058 © 2015 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of organizing committee of The 5th International Conference of Euro Asia Civil Engineering Forum (EACEF-5) doi: 10.1016/j.proeng.2015.11.067

ScienceDirect

The 5th International Conference of Euro Asia Civil Engineering Forum (EACEF-5)

Identification of extreme events in climate data from multiple sites

Heri Kuswanto

a,b,

*, Shofi Andari

b

, Erma Oktania Permatasari

b

aResearch Center for Earth, Disaster and Climate Change, Institut Teknologi Sepuluh Nopember, Sukolilo, Surabaya 60111, Indonesia bDepartment of Statistics, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, Surabaya 60111, Indonesia

Abstract

Extreme weather events have introduced significant losses and it still becomes an important issue in natural disaster management field. This paper proposes a procedure to identify extreme rainfall events occurred in a region, which is an important information for water resource management. The extreme identification can be easily conducted if it involves only climate data recorded in a single site. However, specific treatment has to be defined when the region has multiple sites. It particularly becomes a problem when one needs to study the large scale meteorological pattern dealing with the event. This paper combines the Extreme Value Theory (EVT) with some field considerations for identifying the dates of extreme rainfall events in Indramayu, Indonesia. The results show that extreme rainfall event is defined as rainfall observed in at least 5 stations with extreme amount in each station. © 2015 The Authors. Published by Elsevier Ltd.

Peer-review under responsibility of organizing committee of The 5th International Conference of Euro Asia Civil Engineering Forum (EACEF-5).

Keywords: Extreme; rainfall; multiple sites; EVT.

1.Introduction

Climate change has introduced a complexity in predicting the weather events. Many researches showed that climate change has a significant connection with the occurrence of extreme weather events (see Rosenzweig et al. (2001; Huber and Gulledge, 2011,Scott et al., 2013 among others). Some obvious extreme events are heavy rainfall, heat waves, storm and many others leading to severe disasters such as flooding and drought. Heavy rainfall and heat waves frequently happen in tropical climate. Considering the fact that extreme events may introduce significant looses (e.g. human, resources, capital and psychological), hence predicting the events is an important issue. To forecast the extreme events accurately, forecaster needs to study the nature behavior prior to the occurrence of the event. In this

* Corresponding author. E-mail address: [email protected]

© 2015 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of organizing committee of The 5th International Conference of Euro Asia Civil Engineering Forum (EACEF-5)

(2)

case, the meteorological pattern of the corresponding event has to be identified. Many researches considered a large scale meteorological pattern (LSMP) by looking at some other related fields or variables (Polonski and Basharin, 2008; Siebert, Frank and Formayer, 2007; Schlüter and Schädler, 2010). Grotjahn and Faure (2008) focused on developing composite maps for exploring the LSMP for extreme events. An important issue arises on the projection of surface area with the atmospheric field which covers a very large area.

It is a common case that a region (can be province or district) has many stations to record time series of weather or climate conditions for obtaining more precise and accurate surface data. Dealing with the extreme event study using LSMP, any extreme events detected in a surface site will be connected to a large atmospheric field. The availability of multiple sites or stations in a region with different patterns induced a complication in the analysis. Different in this case refers to a situation that extreme event observed in a station on a certain date might not be observed in another station, although it is located in the same region with similar topography condition. This paper discusses the problem of identifying the extreme dates under this circumstance. The case study is extreme rainfall events in Indramayu,

Indonesia. Despite the availability of the rainfall dataset, Indramayu plays important role in Indonesia’s agriculture.

The district has been well known as the largest producer of rice in Indonesia, contributes to the 60% of total rice production in West Java. Therefore, studying the extreme events in Indramayu is an important work, which would be guidance toward prediction of the event in the future. A lot of studies have been conducted in Indramayu dealing with the rainfall prediction. Sutikno and Bei (2003) detected the extreme rainfall events using monthly dataset. Robertson et al. (2009) study the seasonal predictability of daily rainfall in Indramayu, Rahayu (2013) uses Generalized Extreme Value (GEV) distribution to investigate the climate change over Indramayu using ten days recorded rainfall data. This paper differs to the others by identifying extreme events (dates) based on daily rainfall dataset recorded in multi stations in Indramayu. Similar study has been conducted by Buishand (1991) that estimates the extreme rainfall events in Netherland. However, the study used GEV with annual maxima yielding on inconsistence GEV distribution. The extreme identification in this paper will be determined using Peak Over Threshold (POT). The results in this paper will be a fundamental step to conduct further analysis i.e. studying the LMSP of the extreme rainfall events in Indramayu such as in Kuswanto et al. (2014), which will be an important information for water resource management. 2.Literature Review

The generalized extreme value (GEV) distribution is a generalization of three distributions i.e. Gumbel, Fréchet and the Weibull distributions. It was firstly introduced by Jenkinson (1955), who combined the above three distributions (see Hosking et al., 1985, Galambos, 1987). The GEV distribution has cumulative distribution function (cdf) as follow:

ܩሺݖǢ ߤǡ ߪǡ ߦሻ ൌ ݁ݔ݌ሾെ ቄͳ ൅కሺ௭ିఓሻ ఙ ቅ

ିଵȀక

ሿ (1)

where െλ ൏ ߤ ൏ λǡ ߪ ൐ Ͳǡ െλ ൏ ߦ ൏ λǡdefined as location, scale and shape parameters respectively. The three distribution differs in the tail behavior, and it is characterized by the different shape parameters. The Weibull distribution has negative ߦ, Frechet distribution has positive ߦǡ while Gumbel distribution has ߦ ՜ λǤ

There are two approaches can be applied to fit the GEV distribution i.e. by block maxima and Peak Over Threshold (POT). The block maxima approach takes a single maximum value over the specified block as the extreme data, while the POT uses a specific threshold for filtering the extreme data. Consequently, the number of extreme data will depend on how large the block is. While using POT means that the number of extreme data depends on the threshold. Once the threshold is determined, the extreme data (above threshold) will fit the Generalized Pareto Distribution (GPD) (see Gilleland and Katz (2006) for further detail). In fact, the POT method has been fast developed which make it possible to choose the optimum threshold using Mean Excess Plot or Threshold choice plot. The parameters of the GEV distribution can be estimated using Maximum Likelihood or other estimation procedures such as Method of Moment (MME) or Probability Weighted Moment (PWT). A comprehensive discussion about this can be found in Coles (2011).

(3)

3.Data and Research Methodology

The dataset used in this paper are daily series of rainfall observed in several rain gauges (denoted as stations hereafter) in Indramayu. There are 30 stations in Indramayu, however we end up with 10 sites after conducting pre-processing the data by considering the completeness of the series. Therefore, the extreme rainfall events in Indramayu will be investigated using dataset from these 10 selected stations i.e. Bangkir, Bugel, Cikedung, Jatinyuat, Kroya, Krangkeng, Losarang, Gabus Wetan, Leuweung Semut, Tulang Kacang. All dataset span from 1979 to 2006. The identification of the extreme dates involves of several steps as follows:

x apply Generalized Extreme Value (GEV) for determining the best threshold. This paper applies the Generalized Pareto Distribution (GPD) using Peak Over Threshold (POT) method. The use of Block maxima is considered as inappropriate in this study as the extreme event can happen more than once in any specified block.

x set several thresholds and check the consistency of the GEV parameters using Mean Excess Plot. The optimum threshold is the one that yields on consistent shape and scale parameters.

x estimate the GEV parameters and conduct diagnostic checking x filter the data above the thresholds for each station

x perform the frequency analysis to determine the appropriate selection based on the extreme occurrence i.e. number stations observing extreme.

4.Results and Discussion

We begin the analysis by performing the density plots of the raw series. The following figure. depicts the density of rainfall observed in four stations. We perform only four stations for the sake of space. In fact, the other plots show similar distribution.

Fig. 1. Density plot of rainfall dataset in four stations

Fig. 1 clearly shows that the dataset has positive distribution with long tail indicating some extreme data. Null value dominates the density which means also there is no rain at the day. It is well known that daily rainfall series has tail distribution such as Gamma, Weibull, Gumbel, etc. In general term, it belongs to Generalized Extreme Value Distribution. The idea now is to determine a threshold of extreme for each dataset so that the extreme data will fit the GEV distribution. From figures 2, the optimum thresholds can be determined. For Bangkir, it is very obvious that the optimum threshold is 53 shown by relatively abrupt change of the parameters, followed by stable values of the parameters, both on the threshold choice and mean excess plots. List of the estimated optimum threshold can be seen in Table 1.

Comparing the threshold obtained from those three procedures i.e. POT, 95th and 99th percentiles of the data, it is observed that the POT thresholds lie between percentile values, which is a good choice for rainfall case. The 95th

percentile yields on a very low value, and in it does not necessarily to be extreme case in the reality. BMKG Indonesia defined the heavy rainfall as the rainfall > 50 mm. Therefore, using 95thpercentile as the threshold may lead to misspecification of the result as it consists of too many small values. Furthermore, using 99th percentile as the threshold seems reasonable, but it yields on a very high values and reduces the number of extreme events too much. Clark et al. (2006), Meehl and Tebaldi (2004) used 99th percentile for determining the threshold of extreme heat waves.

(4)

Gilleland and Katz (2006) argued that too high threshold can discard too much data leading to high variance of the estimate, while too low threshold could lead to the bias to satisfy the asymptotic result of the GPD. Therefore, using POT Threshold is a reasonable choice as it still maintains the goodness of fit of the distribution estimated by Maximum Likelihood (MLE) method. Table 1 provides the threshold.

(a) (b) Fig. 2. Threshold Choice Plot (a) and Mean Excees Plot (b) for Bangkir

Table1. Comparison of threshold for each station Station Threshold POT 95% 99% Bangkir 53 30 63.74 Bugel 30 17 45 Cikedung 60 26 63 Gb.Wetar 40 23 54 Krangkeng 50 26 57 Kroya 48 25 57 Losarang 60 25 65 Lw. Sewu 41 23 60.74 Tl.Kacang 50 19 50 Jatinyuat 42 18 50.74

Having determined the thresholds, we filter the raw data using the threshold value for each station. The sample of the values can be seen in Table 2 as an illustration. From Table 2, the coloured cells indicate that the extreme rainfall at the date observed in the corresponding. We see some inconsistency in the table, which means that extreme event observed at a certain station is not necessarily extreme in other station. For instance, on 1 January 1979, extreme rainfall only occurred in Bugel, while other stations have normal rainfall.

Similar results are observed for the other dates. Our analysis shows that extreme occurrences in all stations are observed only on a very few dates. Fig. 3(a) shows the frequency of stations showing extreme occurrence at the same dates. Among 950 dates, there are 630 dates with only one station shows extreme, while only two days in which all stations shows the occurrence of extreme. In related to the study of large scale meteorological pattern, it will introduce some difficulty. Bumbaco et al. (2013) addresses the similar issue to investigate the heat waves in North Pacific. They suggest to exclude some stations from the analysis by considering the correlation coefficients between each station and the average. The station with very low (not significant) correlation should be dropped as it considered as nonhomogeneous. We do the similar procedure for the same purpose, and correlations between each station and average value are listed in Table 3.

The idea of performing two correlation coefficients is as follow. Tables 3 listed the correlation between each station with average all as well as within region respectively. The former means that we simply take daily average of all values and calculate the correlation. It does assume that all regions where the station located are considered homogeneous towards some external factors that may influence the rainfall event. Meanwhile within region means

(5)

that we assume a homogeneity only within region. The easiest way to do it is by grouping the region based on the location. Region 1 consists of Bangkir, Jatinyuat, Krangkeng, while region 2 are for Buleg, Losarang and Tl. Kacang. The rest belongs to the region 3.

Table 2. Sample of extreme dates

Date Bangkir Jatinyuat Krangkeng Bugel Wetar 10011979 33 0 7 35 30 11011979 54 20 17 49 47 16011979 46 5 14 70 35 17011979 44 51 72 25 40 01021979 0 16 0 43 30 (a) (b)

Fig. 3. Number of extreme dates observed in station (a), Distribution of the identified extreme events.(b)

The rainfall in region 1 might be influenced by the wind from the North-West while region 2 is influenced by wind from North-East part of the island. Region 3 located far from the sea which may have specific pattern in the rainfall distribution. In fact, based on the coefficient correlations in Table 3, the values increase significantly. Three kinds of correlation are performed due to the fact that Pearson correlation requires normality assumption of the distribution of data. Therefore this correlation is no more valid for this case. The two others are rank-based nonparametric correlations which hold for any dataset, regardless of the distribution of the data.

All correlations are statistically significant with 95% confidence level. However, Bugel and Wetar have low correlation either with the total average or within region average, and might be considered as inhomogeneous in the region, and can be from the analysis. This fact is important for the future direction of the analysis and may reduce the bias of the result.

As all correlations are significant, all dataset from all stations are used to be further analyzed. Note that Fig. 3(a) has performed the identified extreme dates using POT threshold and there are 950 days in total. Let us focus now on the definition on extreme that may fit to the case. Referring to the number of extreme events displayed in Table 4, it is recommend to using more than 5 occurrences of extreme as the threshold. Therefore, 95.8% of the total identified dates having occurrence less than 5 stations can be discarded. The total number of extreme date is 40 dates. It is interesting to know the fact that the selected extreme event mostly happened in November, December, January and February, as shown in Fig. 4(b). The rainy season in Indonesia has been defined to happen on October – March. Therefore, it is reasonable to have more events within these months. In fact, some extreme dates within April –

September have been discarded. The rainfalls happened in dry season are unusual and it happens due to some causes such as increasing sea surface temperature level leading to intensive evaporation and forming condensation that causes rain. According to the Agency for Meteorology, Climatology and Geophysics (BMKG) Indonesia, it can be influenced by La Nina which yields on air mass accumulation with water evaporation on Indonesia’s atmosphere.

(6)

Table 3. Correlation coefficient between station and average

Station All region Within region

Pearson Spearmann Tau Kendall Pearson Spearmann Tau Kendall Bangkir 0.447 0.317 0.43 0.671 0.488 0.639 Jatinyuat 0.615 0.476 0.634 0.611 0.425 0.56 Krangkeng 0.445 0.327 0.444 0.642 0.465 0.618 Bugel 0.329 0.104 0.138 0.537 0.333 0.426 Losarang 0.505 0.316 0.491 0.69 0.508 0.667 Kacang 0.714 0.532 0.703 0.789 0.522 0.679 Cikedung 0.509 0.444 0.589 0.598 0.459 0.607 Kroya 0.552 0.431 0.582 0.732 0.544 0.724 Semut 0.548 0.399 0.525 0.64 0.464 0.611 Wetar 0.369 0.111 0.149 0.455 0.232 0.298

Table 4. Statistic of the extreme date

Occurrence Freq. Percent Valid Percent Cum. Percent Occurrence Freq. Percent Valid Percent Cum.Percent 1.00 630 66.3 66.3 66.3 6.00 8 0.8 0.8 97.9 2.00 198 20.8 20.8 87.2 7.00 8 0.8 0.8 98.7 3.00 57 6.0 6.0 93.2 8.00 7 0.7 0.7 99.5 4.00 25 2.6 2.6 95.8 9.00 3 0.3 0.3 99.8 5.00 12 1.3 1.3 97.1 10.00 2 0.2 0.2 100.0 5.Conclusion

This paper investigated the appropriate definition for identifying the extreme events for dataset from multiple sites. The procedure of identifying extreme rainfall events has been proposed i.e. using the POT threshold combined with the number of occurrence in the stations. It shows that the procedure work well and can be applied simply. The result shows a reasonable choice of the extreme dates. Most of the detected extreme rainfalls happened on the period of November to January with the following definition of extreme rainfall i.e. rainfall observed in at least 5 stations, in which each station show extreme case.

Acknowledgements

The authors acknowledge the financial support from PEER Science Research Project funded by USAID-NSF Cycle 2. Authors also thank to BMKG Indonesia for providing the data.

References

[1] Buishand, T.A. (1991) Extreme rainfall estimation by combining data from several sites. Hydrologkal Sciences - Journal - des Sciences Hydrologiques, 36 (4),

[2] Bumbaco, K. A., Kathie D. D. and Nicholas A. B., (2013) History of Pacific Northwest Heat Waves: Synoptic Pattern and Trends*. J. Appl. Meteor. Climatol., 52, 1618–1631.

[3] Clark, R. T., Brown, S.J., and J. M. Murphy, (2006) Modeling Northern Hemisphere summer heat extreme changes and their uncertainties using a physics ensemble of climate sensitivity experiments. J. Climate, 19, 4418–4435

[4] Coles, S.G. (2001). An introduction to statistical modeling of extreme values, Springer-Verlag, London. [5] Galambos, J. The asymptotic theory of extreme order statistics, 2nd ed.Melbourne, Florida: Krieger, 1987.

[6] Gilleland, E., and R.W. Katz, (2006) Analyzing seasonal to interannual extreme weather and climate variability with the extremes toolkit (extRemes). 18th Conference on Climate Variability and Change, Atlanta, GA, American Meteorological Society

(7)

[7] Grotjahn, R., and Faure, G. (2008) Composite Predictor Maps of Extraordinary Weather Events in the Sacramento, California, Region. Weather and Forecasting 23 (3), 313-335

[8] Hosking, J.R.M. and Wallis, J. R. (1987) Parameter and quantile estimation for the generalized Pareto distribution,” Technometrics, vol. 29

(3), 339−349, 1987.

[9] Jenkinson, A.F. (1995) “The frequency distribution of the annual maximum (or minimum) values of meteorological elements,” Quart. J. Roy.Meteor. Soc. vol. 81, pp. 158−171.

[10] Kuswanto, H., Grotjahn, R., Shofi, A., Suhermi, N., Oktavia, E. and Wijaya, Y., Aldrian, E. (2014). Large Scale Meteorological Pattern of Extreme Rainfall in Indonesia. Geophysical Research Abstracts. Vol. 16, EGU2014-1338, 2014

[11] Meehl, G. A., and Tebaldi, C. (2004) More intense, more frequent, and longer lasting heat waves in the 21st century. Science, 305, 994–997. [12] Polonski, A.B. and Baharin, D.V. (2013) Large-scale patterns of Eurasian surface meteorological fields influenced by the climate shift of

1976–1977. Russian Meteorology and Hydrology , 33 (5), 280-289

[13] Rahayu, A. (2013) Identification of Climate Change with Generalized Extreme Value (GEV) Distribution Approach. Journal of.Physics:Conference Series 423.

[14] Rosenzweig, C., Iglesius, A., Yang, X. B., Epstein, P.R., and Chivian, E., (2001) Climate change and extreme weather events - Implications for food production, plant diseases, and pests. NASA Publications. Paper 24.

[15] Robertson, A. W., Moron, V. and Swarinoto, Y. (2009), Seasonal predictability of daily rainfall statistics over Indramayu district, Indonesia. Int. J. Climatol., 29, 1449–1462

[16] Schlüter, I. and Schädler, G. (2010) Sensitivity of Heavy Precipitation Forecasts to Small Modifications of Large-Scale Weather Patterns for the Elbe River. J. Hydrometeor, 11, 770–780.

[17] Siebert, P., Frank, A. and Formayer, H. (2007) Synoptic and regional patterns of heavy precipitation in Austria. Theoretical and Applied Climatology , 87 (1-4), 139 – 153.

[18] Scott, P. A., M. Allen, N. Christidis, R. Dole, M. Hoerling, C. Huntingford, P. Pall, J. Perlwitz, and D. A. Stone ( 2013). Attribution of weather and climate-related extreme events. In: Climate Science for Serving Society: Research, Modelling and Prediction Priorities, Eds: G. R. Asrar and J. W. Hurrell. Springer 307-337

[19] Sutikno and Bei, A. (2003), Probability analysis for predicting extreme climate, case study: rainfall distribution in Kab. Karawang and Subang. J.Agrometeorologi, 17 (1-2), 62-67

References

Related documents

This paper investigates the trends of future projections of extreme rainfall at hourly scale with the use of AWE- GEN model in Peninsular Malaysia.. In the next

Data comprising of annual maximum series ( AMS ) of extreme rainfall in Alor Setar were fitted to Generalized Extreme Value ( GEV ) distribution using method of maximum likelihood

distributions of the Gumbel, Fréchet or Weibull distributions which are generalized as the Generalized extreme value distribution (GEV) and known as the Block Maxima (Minima)

of monthly rainfall distribution, significant seasonal trends, changes in variance and extreme monthly values, in order to establish the magnitude of the seasonal climatic

The life cycle of extreme rainfall events over western Saudi Arabia simulated by a regional climate model: Case study of November

The developed hybrid GLM-ANN seasonal rainfall downscaling models were then used to simulate seasonal future rainfall using a set of input variables generated by global

To evaluate if the magnitude of flow peak is likely to increase in the 21st century as a result of the projected climate change, a Generalized Extreme Value (GEV) distribution function

Time series of 99 th percentile daily rainfall values of each station for the two halves of the year were computed to examine trends in extreme rainfall events. The results