In this work we propose a new estimator for Zenga’s inequality measure in heavy-tailed populations. The new estimator is based on the Weissman estimator for high quantiles. We show that, under fairly general conditions, it is asymptotically normal. Further, we present the results of a simulation study in which we compare confidence intervals based on the new estimator with those based on the plug-in estimator.
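
For context, the Weissman construction extrapolates an intermediate order statistic beyond the sample range using a Hill estimate of the tail index. The sketch below shows only that ingredient, not the paper's new Zenga estimator; the Pareto sample and the choices of `k` and `p` are illustrative assumptions:

```python
import numpy as np

def weissman_quantile(sample, k, p):
    """Weissman-type estimate of the (1 - p)-quantile from the k upper order
    statistics: anchor at the (k+1)-th largest value and extrapolate with a
    Hill estimate of the extreme-value index."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    tail = x[n - k:]                                # k largest observations
    threshold = x[n - k - 1]                        # (k+1)-th largest, the anchor
    gamma = np.mean(np.log(tail / threshold))       # Hill estimator
    return threshold * (k / (n * p)) ** gamma       # extrapolated high quantile

# Illustrative check: Pareto(alpha = 2), so the true 0.999-quantile is ~31.6.
rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=10_000) + 1.0         # survival P(X > x) = x**-2, x >= 1
q_hat = weissman_quantile(sample, k=200, p=1e-3)
```

The estimate extrapolates well beyond the 0.98 empirical quantile used as anchor, which is exactly what makes the construction useful for heavy-tailed populations.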

In Chapter 3, we incorporate heavy-tailed distributions into the renewal risk model under two dependence assumptions, namely dependence among claim sizes and dependence between claims and their interarrival times. We give results on the ruin probability and on large deviations of sums of random variables under the different dependence assumptions. In Chapter 4, we introduce the renewal risk model with a dependence structure and provide another approach to studying the ruin probability. In this case, an equation for the tail probability of the maximal present value of the aggregate net loss is derived, and hence some insights into the ruin probability can be obtained. Chapter 4 is based on the joint work of Chen et al. (2013).

We consider large-scale network data sets from different disciplines, namely social networks, collaboration networks, citation networks, web graphs, biological networks, product co-purchasing networks, temporal networks, communication networks, ground-truth networks, and brain networks. We study several individual data sets from each discipline. These data sets are publicly available at http://snap.stanford.edu/data/index.html. They are standard network data sets that exhibit heavy-tailed behavior and are used for modeling in the statistical paradigm [19], [8], [20]. Previous studies focused on using standard single statistical distributions, namely the power-law, Lomax (Pareto Type II), exponential, and log-normal distributions, for modeling this wide variety of network data sets [3], [9]. But these models fail to capture the lower-degree nodes when modeling the degree distributions. To overcome this drawback, we consider the proposed GLM family of distributions. Note that the proposed family of heavy-tailed GLM distributions can model these large-scale network data sets over their whole range. An overview of these publicly available network data sets is presented in Table III. Some statistical measures, for example the mean, standard deviation (s.d.), and calculated CV corresponding to the degree distribution of each network data set, are also given in Table III. It is important to note that the empirical CV for all the data sets is greater than one, as reported in Table III.
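
The coefficient of variation used above as a screening statistic is simple to compute from a degree sequence. A minimal illustration with synthetic degree data (the Pareto and Poisson sequences are assumptions standing in for the real data sets of Table III):

```python
import numpy as np

def coefficient_of_variation(degrees):
    """Empirical CV = sample s.d. / sample mean of a degree sequence;
    CV > 1 is the heavy-tail signature reported in Table III."""
    degrees = np.asarray(degrees, dtype=float)
    return degrees.std(ddof=1) / degrees.mean()

rng = np.random.default_rng(1)
heavy = (rng.pareto(1.5, size=50_000) + 1.0).astype(int)  # power-law-like degrees
light = rng.poisson(10.0, size=50_000)                    # light-tailed comparison
```

For the Poisson sequence the CV is near 1/sqrt(10), well below one, while the infinite-variance Pareto sequence yields a CV far above one.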

This paper is structured as follows. In Section 2 we provide a discussion of independent heavy-tailed random variables. This theory is now at an advanced level and well understood. Several alternative definitions of heavy tails are reviewed, and their relations and main properties are studied. In particular, we pay attention to the rich class of subexponential distributions and discuss how its defining property provides useful insight into the occurrence of large values of a sum, a characteristic behavior known as the principle of the single big jump. Moreover, it has been recognized that the subexponential property goes beyond the independent case, and this is now an area of active research. One of the main contributions of this author is on this front. The main result in [4] states that a sum of lognormals possesses the subexponential property even when the random variables involved are correlated via a Gaussian dependence structure. A general overview of Monte Carlo methods is provided in Section 3, with particular emphasis on the area known as rare event simulation. The notions of rare event and efficient estimator are formalized there in order to provide the proper framework for analyzing Monte Carlo estimators of rare event probabilities. We discuss the classical tools, such as importance sampling, exponential change of measure, and conditional Monte Carlo, and briefly discuss the limitations of some standard methods when applied in a heavy-tailed setting. Section 4 is devoted exclusively to the approximation of tail probabilities of sums of heavy-tailed random variables; a recount of available methods is given there, followed by a more detailed exposition of a set of estimators based on conditional Monte Carlo, known as the Asmussen-Binswanger [5] and Asmussen-Kroese [6] estimators. In particular, we provide extensions to the non-independent case and prove the efficiency of these estimators.
We stress that Theorems 4.1–4.3 are original contributions and that their efficiency proofs can be found in the Appendix in Section 6. Finally, Section 5 contains some concluding remarks.
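
To make the Asmussen-Kroese idea concrete: the estimator symmetrizes over which summand is the largest and integrates that summand out analytically, simulating only the remaining n − 1. A sketch for the i.i.d. Pareto case (the parameters are illustrative; the paper's extensions to dependent summands are not reproduced here):

```python
import numpy as np

def asmussen_kroese(sbar, sampler, n, u, reps, rng):
    """Conditional Monte Carlo estimate of P(X_1 + ... + X_n > u) for i.i.d.
    heavy-tailed summands: simulate X_1, ..., X_{n-1} and handle X_n
    analytically via its survival function, conditioning on X_n being the
    maximum (the single big jump)."""
    x = sampler(rng, (reps, n - 1))
    m = x.max(axis=1)                          # M_{n-1}: max of simulated summands
    s = x.sum(axis=1)                          # S_{n-1}: their partial sum
    z = n * sbar(np.maximum(m, u - s))         # unbiased replicate for P(S_n > u)
    return z.mean(), z.std(ddof=1) / np.sqrt(reps)

# Pareto(alpha = 2) example: P(X > t) = t**-2 for t >= 1, inverse-transform sampler.
alpha = 2.0
sbar = lambda t: np.where(t < 1.0, 1.0, t) ** (-alpha)
sampler = lambda rng, shape: (1.0 - rng.random(shape)) ** (-1.0 / alpha)

rng = np.random.default_rng(0)
est, se = asmussen_kroese(sbar, sampler, n=5, u=100.0, reps=100_000, rng=rng)
```

By the single-big-jump principle the true probability is close to n · P(X > u) = 5 × 10⁻⁴, and the conditional estimator reaches it with a far smaller standard error than crude Monte Carlo would at this rarity.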

Suppose the tallest person you have ever seen was 2 meters (6 feet 8 inches); someday you may meet a taller person. How tall do you think that person will be, 2.1 meters (7 feet)? What is the probability that the first person you meet who is taller than 2 meters will be more than twice as tall, 13 feet 4 inches? Surely that probability is infinitesimal. The tallest person in the world, Bao Xishun of Inner Mongolia, China, is 2.36 m, or 7 ft 9 in. Prior to 2005 the most costly hurricane in the US was Hurricane Andrew (1992), at $41.5 billion USD (2011). Hurricane Katrina was the next record hurricane, weighing in at $91 billion USD (2011). People’s height follows a "thin-tailed" distribution, whereas hurricane damage is "fat-tailed" or "heavy-tailed". The ways in which we reason from historical data and the ways we think about the future are, or should be, very different depending on whether we are dealing with thin- or fat-tailed phenomena. This monograph gives an intuitive introduction to fat-tailed phenomena, followed by a rigorous mathematical treatment of many of these intuitive features. A major goal is to provide a definition of Obesity that applies equally to finite data sets and to parametric distribution functions.

A topical area of Markov chain Monte Carlo (MCMC) in theoretical statistics centers on the following problem: given a fixed “target” density or distribution known up to a constant multiplier (a normalizing constant), how does one construct a (Markov) process which has this density as its (unique) invariant one and which converges to it at a rate that can be theoretically evaluated? In particular, there has been sustained interest in recent decades in dealing with “heavy-tailed” densities with polynomial decay at infinity. With this problem in mind, let us consider a polynomially decreasing probability density π on the line R^1; in the precise setting it will be restricted to the half-line R^1_+.
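
As a concrete instance of this setup, a random-walk Metropolis chain needs the target only up to its normalizing constant. The specific polynomial target below, π(x) ∝ (1 + x)⁻³ on the half-line, is an assumption chosen to mimic the setting described, not the density studied in the work itself:

```python
import numpy as np

def rw_metropolis(log_target, x0, steps, scale, rng):
    """Random-walk Metropolis chain: the target enters only through ratios,
    so the unknown normalizing constant cancels."""
    x = x0
    chain = np.empty(steps)
    for t in range(steps):
        prop = x + scale * rng.standard_normal()      # Gaussian random-walk proposal
        # accept with probability min(1, pi(prop) / pi(x))
        if np.log(rng.random()) < log_target(prop) - log_target(x):
            x = prop
        chain[t] = x
    return chain

# Polynomially decaying target on the half-line: pi(x) proportional to (1+x)**-3, x >= 0.
log_target = lambda x: -3.0 * np.log1p(x) if x >= 0 else -np.inf

rng = np.random.default_rng(0)
chain = rw_metropolis(log_target, x0=1.0, steps=50_000, scale=2.0, rng=rng)
```

For this target the normalized density is 2(1 + x)⁻³, whose median is √2 − 1 ≈ 0.414; the chain's empirical median should settle near that value even though the tail is too heavy for the variance to exist.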

tunnel study of scalar dispersion within plant canopy turbulence. Equally good agreement with this data is obtained using Thomson’s (1987) Gaussian model. This bolsters confidence in the application of this simple model to the prediction of spore dispersal within plant canopy turbulence. Contact distributions (the probability distribution function for the distance of viable fungal spore movement until deposition) are predicted to have "heavy" inverse power-law tails. It is known that heavy-tailed contact distributions also characterize the dispersal of spores which pass through the canopy turbulence and enter the overlying atmospheric boundary layer. Plant disease epidemics due to the airborne dispersal of fungal spores are therefore predicted to develop as accelerating waves over a vast range of scales, from the within-field scale to intercontinental scales. This prediction is consistent with recent analyses of field and historical data for rusts in wheat. Such plant disease epidemics are shown to be governed by space-fractional diffusion equations and by Lévy flights.

unpredictable. These complex systems possess both resilience against change and a capacity to transform in unanticipated ways as local reactions interact with each other and lead to an emergent response [9]. Although heavy-tailed distributions are known to arise in complex systems, the reason for this is not yet clear [30]. Recent work suggests that heavy-tailed distributions may offer an efficient distribution (in information-theoretic terms) with respect to members of a group of items, in contrast to a population of individual items [31]. Systems whose group membership follows a heavy-tailed distribution may represent an optimal trade-off between robustness and adaptability [32].

In this paper, we suggest using the LSE for the estimation of a GARCH(1,1) model. The estimator is based on the log transformation of the squared data. We establish the consistency and asymptotic normality of the proposed estimator. Our results are obtained under mild regularity conditions that allow for heavy-tailed error distributions, which can be of particular interest in financial applications. The estimator's finite sample properties have been investigated via a simulation study, which shows that, in the presence of extreme non-normality, the proposed LSE can offer some efficiency gains with respect to the QMLE. We also provide empirical evidence that applying the LSE can yield better volatility forecasts than the standard QMLE. Our estimates also fit the autocorrelation function of the squared returns quite well.

We prove the uniqueness of linear i.i.d. representations of heavy-tailed processes whose distribution belongs to the domain of attraction of an α-stable law with α < 2. This establishes the possibility of identifying nonparametrically both the sequence of two-sided moving average coefficients and the distribution of the heavy-tailed i.i.d. process.

This paper is constructed as follows. First, the probability model for heavy-tailed noise and the robust wavelet threshold technique are introduced and analyzed. Then, based on an appropriate effectiveness measure for removing heavy-tailed noise, the classical threshold method and the robust wavelet threshold method are compared by Monte Carlo simulation. Moreover, the soft and hard thresholds for the robust wavelet threshold technique are also analyzed and compared. An application to gas sensor signal denoising shows the promise of the robust wavelet threshold technique. Finally, some conclusions are given, as well as hints for future work.
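
For reference, the two threshold rules being compared have simple closed forms. The MAD-based level below is one common robust choice for heavy-tailed noise, an assumption for illustration rather than the paper's exact rule:

```python
import numpy as np

def soft_threshold(w, t):
    """Soft rule: shrink every wavelet coefficient toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def hard_threshold(w, t):
    """Hard rule: keep a coefficient only if its magnitude exceeds t."""
    return np.where(np.abs(w) > t, w, 0.0)

def mad_threshold(w, factor=3.0):
    """Robust threshold level from the median absolute deviation; dividing by
    0.6745 makes the MAD a consistent estimate of the Gaussian sigma, and the
    median is insensitive to heavy-tailed outliers."""
    sigma = np.median(np.abs(w - np.median(w))) / 0.6745
    return factor * sigma

rng = np.random.default_rng(0)
noisy = rng.standard_normal(10_000)    # pure-noise coefficients to be suppressed
level = mad_threshold(noisy)           # about 3 for unit-variance noise
```

Soft thresholding shrinks the surviving coefficients (smoother reconstructions); hard thresholding leaves them untouched (better amplitude preservation), which is exactly the trade-off the paper's comparison addresses.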

Our main contribution is to show that it is indeed possible to design a throughput optimal scheduling policy that guarantees light-tailed response times for the light-tailed flow. Our design entails a careful choice of inter-queue scheduling policy, as well as intra-queue scheduling policies. The inter-queue scheduling policy determines which queue to serve in each slot, given the current queue lengths and connectivity state, whereas the intra-queue scheduling policies specify which waiting packet to serve from the queue selected for service by the inter-queue scheduling policy. We consider inter-queue scheduling policies from a class of generalized max-weight policies, which guarantee throughput optimality, while providing a relative priority to the light-tailed flow. Our analysis highlights how much relative priority the inter-queue policy needs to award to the light-tailed flow to make light-tailed response times possible. Importantly, our results suggest that the response time tail of the heavy-tailed flow remains unaffected in this process; we prove this formally for the special case in which both queues are always connected to the server. Additionally, our analysis reveals that the correct choice of intra-queue scheduling policies is crucial in order to obtain good response time tail behavior.
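
The inter-queue rule can be written compactly. The queue-length exponents below are illustrative stand-ins for the relative-priority weights in the class of generalized max-weight policies; the specific numbers are assumptions, not the paper's calibrated values:

```python
def generalized_max_weight(queues, connected, alpha):
    """Serve the connected, non-empty queue maximizing Q_i ** alpha_i.
    A larger exponent alpha_i awards queue i more relative priority, which is
    how the light-tailed flow can be protected from the heavy-tailed one."""
    candidates = [i for i in connected if queues[i] > 0]
    if not candidates:
        return None                      # nothing to serve in this slot
    return max(candidates, key=lambda i: queues[i] ** alpha[i])

# Two queues: index 0 carries the heavy-tailed flow, index 1 the light-tailed flow.
# With a modest exponent the longer heavy-tailed queue wins; boosting the
# light-tailed queue's exponent flips the decision.
choice_low = generalized_max_weight([5, 2], {0, 1}, [1.0, 2.0])   # 5 > 2**2
choice_high = generalized_max_weight([5, 2], {0, 1}, [1.0, 3.0])  # 2**3 > 5
```

This mirrors the analysis above: the exponent choice controls how much backlog the light-tailed flow must accumulate before it preempts the heavy-tailed flow's service.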

The empirical facts presented in Section 1.1 show that we should use heavy-tailed alternatives to the Gaussian law in order to obtain acceptable estimates of market losses. In this section we apply the techniques discussed so far to two samples of financial data: the Dow Jones Industrial Average (DJIA) index and the Polish WIG20 index. Both are blue-chip stock market indexes. The DJIA is composed of 30 major U.S. companies that trade on the NYSE and NASDAQ. The WIG20 consists of 20 major Polish companies listed on the Warsaw Stock Exchange. We use daily closing index values from the period January 3, 2000 – December 31, 2009. After eliminating missing values (mostly U.S. and Polish holidays) we end up with 2494 (log-)returns for each index; see the top left panels in Figures 1.5 and 1.6.
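
The (log-)returns mentioned above are computed in the usual way; a minimal helper (the synthetic price series is just a check, not the DJIA or WIG20 data):

```python
import numpy as np

def log_returns(prices):
    """Daily log-returns r_t = ln(P_t / P_{t-1}); N closing values yield
    N - 1 returns (e.g. 2495 closings give the 2494 returns quoted above)."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

# Placeholder series of 2495 daily closings after removing holidays.
r = log_returns(np.linspace(100.0, 120.0, 2495))
```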

More details on the method can be found in Grahovac et al. (2015). It is important to note that the estimation does not depend on the particular form of the underlying distribution; the only assumption is that the sample comes from the class of heavy-tailed distributions, which in particular includes Student’s distribution. The empirical scaling function computed on the sample of CROBEX log-returns R_1, ..., R_n with n = T = 2523 is shown in figure 4(b). A clear departure from the line q/2 confirms that the log-returns are heavy-tailed. The scaling function has the shape of a broken line and breaks at around the value 5. Computing the estimator by equation (10) gives the value α̂ = 4.827. The estimated value appears as a break in the plot of the scaling function in figure 4(a). The plot of τ_∞

Second, we explore consequences for probabilistic limit laws and prove three theorems on precise large and moderate deviations for sums of independent identically distributed (i.i.d.) random variables that are heavy-tailed [7] and integer-valued. The theorems generalize results on stretched exponential laws by A. V. Nagaev [15] which have recently attracted interest in the context of the zero-range process [2]. They are close in spirit to results by S. V. Nagaev [17], however with more concrete conditions on the domain of validity of the theorems, and provide deviation results “on the whole axis” [20]. Our assumptions are more restrictive than one may wish from a probabilistic perspective; in return, they allow for sharp results and may

In this chapter the partition function and its asymptotic properties are presented first. The result shows that the limit behaviour of the partition function is strongly affected by the existence of moments of the underlying distribution. Moreover, the existence of moments is directly indicated by the tail index of heavy-tailed data. This motivates applying the partition function in the context of heavy-tailed analysis. Therefore, a graphical method via the scaling function to detect heavy tails is proposed. Indeed, the plot of the scaling function not only indicates the existence of heavy tails but also reflects the tail behaviour at a single point. More precisely, there is a correspondence between the breakpoint of the scaling function and the tail index α, which allows us to establish estimation methods for α.
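
The graphical method can be sketched as follows: estimate the scaling function at each moment order q as the regression slope of log partition sums against log block size. For light tails the slope stays near q/2 over all q, while for heavy-tailed data it bends at the tail index. The Gaussian check below is an illustrative assumption, not the chapter's data:

```python
import numpy as np

def scaling_function(x, q, block_sizes):
    """Empirical scaling function at order q: partition the sample into blocks
    of size t, average |block sum|**q, and regress the log of that partition
    sum on log t; the fitted slope estimates tau(q)."""
    logs = []
    for t in block_sizes:
        m = len(x) // t
        blocks = x[: m * t].reshape(m, t).sum(axis=1)   # non-overlapping block sums
        logs.append(np.log(np.mean(np.abs(blocks) ** q)))
    slope, _ = np.polyfit(np.log(block_sizes), logs, 1)
    return slope

rng = np.random.default_rng(0)
gauss = rng.standard_normal(200_000)                    # all moments finite
block_sizes = np.array([10, 20, 50, 100, 200, 500])
tau2 = scaling_function(gauss, 2.0, block_sizes)        # expect about q/2 = 1
```

For heavy-tailed input with tail index α the same plot departs from q/2 for q beyond α, and the location of that break is what the chapter's estimators of α exploit.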

likelihood ratio merit function are derived in a K-distributed clutter background for Swerling type 1 targets; the implementation issues of the merit function are also discussed.

Abstract. Stochastic daily precipitation models are commonly used to generate scenarios of climate variability or change on a daily timescale. The standard models consist of two components describing the occurrence and intensity series, respectively. Binary logistic regression is used to fit the occurrence data, and the intensity series is modeled using a continuous-valued right-skewed distribution, such as the gamma, Weibull or lognormal. The precipitation series is then modeled using the joint density, and standard software for generalized linear models can be used to perform the computations. A drawback of these precipitation models is that they do not produce a sufficiently heavy upper tail for the distribution of daily precipitation amounts; they tend to underestimate the frequency of large storms. In this study, we adapted the approach of Furrer and Katz (2008) based on hybrid distributions in order to correct for this shortcoming. In particular, we applied hybrid gamma–generalized Pareto (GP) and hybrid Weibull–GP distributions to develop a stochastic precipitation model for daily rainfall at Ihtiman in western Bulgaria. We report the results of simulations designed to compare the models based on the hybrid distributions with those based on the standard distributions. Some potential difficulties are outlined.
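
The splicing idea can be prototyped directly: a gamma body below a threshold and a GP-shaped tail above it, matched so the density is continuous at the join. The parameter values are placeholders, and the density is left unnormalized (the hybrid approach of Furrer and Katz renormalizes the spliced density; that constant would be computed numerically):

```python
import math
import numpy as np

def gamma_pdf(x, k, theta):
    """Gamma(shape k, scale theta) density."""
    x = np.asarray(x, dtype=float)
    return x ** (k - 1.0) * np.exp(-x / theta) / (math.gamma(k) * theta ** k)

def hybrid_gamma_gp_pdf(x, k, theta, u, xi, sigma):
    """Unnormalized hybrid gamma-GP density: gamma body for x <= u, and above
    the threshold a GP-shaped tail (shape xi, scale sigma) rescaled to equal
    the gamma density at u, so the splice is continuous."""
    x = np.asarray(x, dtype=float)
    z = np.maximum(x - u, 0.0)                               # excess over threshold
    gp_shape = (1.0 + xi * z / sigma) ** (-1.0 / xi - 1.0)   # GP density shape, value 1 at u
    tail = gamma_pdf(u, k, theta) * gp_shape
    return np.where(x <= u, gamma_pdf(x, k, theta), tail)

# Placeholder parameters for illustration only.
k, theta, u, xi, sigma = 2.0, 3.0, 10.0, 0.2, 4.0
```

With xi > 0 the tail decays polynomially, which is precisely the heavier upper tail that the standard gamma intensity model lacks.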

In this article we describe a method for carrying out Bayesian inference for the double Pareto lognormal (dPlN) distribution, which has recently been proposed as a model for heavy-tailed phenomena. We apply our approach to inference for the dPlN/M/1 and M/dPlN/1 queueing systems. These systems cannot be analyzed using standard techniques because the dPlN distribution does not possess a closed-form Laplace transform. This difficulty is overcome using some recent approximations to the Laplace transform for the Pareto/M/1 system. Our procedure is illustrated with applications in internet traffic analysis and risk theory.
