Weibull Probability Density Function - Uncertainty of Quantile Estimates

5.1.2 Weibull Probability Density Function

Figures 5.1.5 to 5.1.7 all show plots of the histogram of the dataset, together with the Weibull pdf, with the parameters estimated by the method of maximum likelihood, the method of moments, and the method of L-moments, respectively. Each of these figures also shows a close-up of the higher return levels.

Figure 5.1.5: The Weibull pdf, with parameters determined by the method of maximum likelihood, together with the histogram of the dataset. The figure on the right is a close-up of the higher return levels of the figure on the left.

Figure 5.1.8 is a combination of Figures 5.1.5 to 5.1.7. It contains a comparison of the fits of the Weibull probability density functions for the three different methods of parameter estimation.

Based on Figures 5.1.5 to 5.1.8, the method of moments appears to be slightly safer than the other two methods for estimating the Weibull parameters when the aim is to avoid under-estimating the frequency of return levels. As can be seen in Figure 5.1.8, however, the three methods yield very similar fits of the Weibull pdf to the histogram. In each of the three cases the pdf intersects the histogram bars approximately in the middle for the higher return levels. Hence, the use of any of the three parameter estimation methods in conjunction with the Weibull distribution seems to yield reliable results.
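The three estimation methods compared above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this study: it assumes a two-parameter Weibull distribution with the location fixed at zero, and the function names (`weibull_mle`, `weibull_moments`, `weibull_lmoments`) are hypothetical. The L-moment estimator relies on the standard Weibull relations λ1 = σΓ(1 + 1/k) and λ2 = λ1(1 − 2^(−1/k)).

```python
import numpy as np
from scipy import optimize, special, stats


def weibull_mle(data):
    """Maximum likelihood fit; location is fixed at zero (an assumption)."""
    shape, _loc, scale = stats.weibull_min.fit(data, floc=0)
    return shape, scale


def weibull_moments(data):
    """Method of moments: match the sample mean and coefficient of variation."""
    mean = np.mean(data)
    cv = np.std(data, ddof=1) / mean

    def cv_gap(k):
        # Theoretical coefficient of variation for shape k, minus the sample value.
        g1 = special.gamma(1.0 + 1.0 / k)
        g2 = special.gamma(1.0 + 2.0 / k)
        return np.sqrt(g2 / g1**2 - 1.0) - cv

    # The bracket [0.1, 50] is an arbitrary but wide search range for the shape.
    shape = optimize.brentq(cv_gap, 0.1, 50.0)
    scale = mean / special.gamma(1.0 + 1.0 / shape)
    return shape, scale


def weibull_lmoments(data):
    """Method of L-moments using the first two sample L-moments."""
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    l1 = x.mean()
    # Unbiased estimate of the first probability-weighted moment b1.
    b1 = np.sum(np.arange(n) / (n - 1) * x) / n
    l2 = 2.0 * b1 - l1
    shape = -np.log(2.0) / np.log(1.0 - l2 / l1)
    scale = l1 / special.gamma(1.0 + 1.0 / shape)
    return shape, scale
```

Applied to the same sample, the three functions each return a (shape, scale) pair, so the resulting pdfs can be overlaid on one histogram in the manner of Figure 5.1.8.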

In this chapter, the generalized Pareto and the Weibull probability density functions were plotted with histograms of the empirical data. The parameters of the probability density functions were estimated by each of three estimation methods, namely, the method of maximum likelihood, the method of moments, and the method of L-moments. It was concluded that, when the GPD is used, the method of maximum likelihood is the optimal method for estimating the distribution’s parameters if under-estimation of return levels is to be avoided. In the case of the Weibull distribution, all three methods for parameter estimation yielded similar results, all of which fitted the histograms well. Hence, it was concluded that the use of any of the three methods in conjunction with the Weibull distribution seems to yield reliable results.

Figure 5.1.6: The Weibull pdf, with parameters determined by the method of moments, together with the histogram of the dataset. The figure on the right is a close-up of the higher return levels of the figure on the left.

Figure 5.1.7: The Weibull pdf, with parameters determined by the method of L-moments, together with the histogram of the dataset. The figure on the right is a close-up of the higher return levels of the figure on the left.

Figure 5.1.9 contains a comparison of the GPD and Weibull probability density functions, when the parameters of both distributions are estimated by the method of maximum likelihood. It can be seen that both distributions produce similar results, with the GPD curve being only slightly above the curve of the Weibull distribution. The higher the return levels become, the closer the two curves move together. It can therefore be concluded that the use of either the GPD or the Weibull distribution, in conjunction with the method of maximum likelihood as parameter estimation method, yields reliable return level estimates. The GPD pdf gave only slightly lower estimates than the Weibull pdf.

Figure 5.1.8: The Weibull pdf, with parameters determined by the method of maximum likelihood, the method of moments, and the method of L-moments, respectively, together with the histogram of the dataset.

Figure 5.1.9: The GPD and Weibull probability distributions, with parameters determined by the method of maximum likelihood, together with the histogram of the dataset.
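A comparison of this kind can be sketched as follows. The snippet is illustrative only and is not the code behind Figure 5.1.9: it assumes that the data are already positive exceedances over a threshold, so that both locations can be fixed at zero, and the function name `tail_densities` is hypothetical.

```python
import numpy as np
from scipy import stats


def tail_densities(data, levels):
    """Fit the GPD and the Weibull by maximum likelihood and evaluate both pdfs.

    Assumes `data` holds positive values with location zero for both fits.
    """
    data = np.asarray(data, dtype=float)

    # Each fit returns (shape, loc, scale); loc is held at zero via floc=0.
    gpd_shape, _, gpd_scale = stats.genpareto.fit(data, floc=0)
    wb_shape, _, wb_scale = stats.weibull_min.fit(data, floc=0)

    # Evaluate the two fitted densities at the requested return levels.
    gpd_pdf = stats.genpareto.pdf(levels, gpd_shape, loc=0, scale=gpd_scale)
    wb_pdf = stats.weibull_min.pdf(levels, wb_shape, loc=0, scale=wb_scale)
    return gpd_pdf, wb_pdf
```

Evaluating the two returned arrays on a grid of high return levels shows directly how far apart the fitted curves lie in the upper tail.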

missing values. In the next chapter, the handling of these missing values (i.e., “gaps”) will be discussed.

Chapter 6

Gaps in the Dataset

Many datasets have periods in which data are absent; these periods are referred to as gaps in the data (Burke (2001)). In particular, the dataset used in this study (i.e., the Slangkop dataset) has gaps because the Datawell buoy was unable to take measurements at certain points in time (for example, when the buoy was removed for maintenance). Figure 6.0.1 shows plots of the Hmo values for the years 2001 to 2003. The red circles in the plot of the year 2001 indicate positions where absent data values are clearly visible. The other years also have absent data, but they do not have enough consecutive missing values for the gaps to be visible in their Hmo plots.

All the analyses done on the dataset thus far were done with the absent data (or gaps) replaced by zeros. This was done in order to keep the duration of the dataset fixed. Different results would have been obtained if the zeros were removed and the length of the dataset shortened. However, the specific values of the estimates based on the dataset were not of particular interest thus far; the focus was rather on the methods used to make these estimates.

There are, however, far more sophisticated methods for the treatment of missing data. These methods include casewise deletion, mean substitution and imputation, and will be discussed briefly in the following sections.
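To make the simpler of these treatments concrete, the sketch below shows zero replacement, casewise deletion and mean substitution applied to a one-dimensional series. It is a minimal illustration, not the procedure used in this study: marking gaps with NaN is an assumption, and the function names are hypothetical.

```python
import numpy as np


def zero_fill(series):
    """Replace gaps (NaN) by zeros, keeping the series length fixed."""
    filled = np.array(series, dtype=float)
    filled[np.isnan(filled)] = 0.0
    return filled


def casewise_deletion(series):
    """Drop every record that contains a gap, shortening the series."""
    values = np.asarray(series, dtype=float)
    return values[~np.isnan(values)]


def mean_substitution(series):
    """Replace gaps by the mean of the observed values."""
    filled = np.array(series, dtype=float)
    filled[np.isnan(filled)] = np.nanmean(filled)
    return filled
```

For a series such as `[1.0, nan, 3.0]`, zero replacement lowers the sample mean, casewise deletion leaves it unchanged but shortens the record, and mean substitution keeps both the length and the mean while reducing the variance.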

6.1 Treatment of Gaps in the Dataset

In document Analysis of Extreme Events in the Coastal Engineering Environment (Page 82-86)