Auto dependence structure of arch models: tail dependence coefficients

(1)

BIROn - Birkbeck Institutional Research Online

Brummelhuis, R. (2006) Auto-dependence structure of arch-models: tail

dependence coefficients. Working Paper. Birkbeck, University of London,

London, UK.

Downloaded from:

Usage Guidelines:

Please refer to usage guidelines at

or alternatively

(2)

ISSN 1745-8587

Birkbeck Workin

g

Pa

p

ers in Economics & Finance

School of Economics, Mathematics and Statistics

BWPEF 0605

Auto-Dependence Structure of

Arch-Models: Tail Dependence Coefficients

Raymond Brummelhuis

May 2006

(3)

DEPENDENCE COEFFICIENTS

RAYMOND BRUMMELHUIS

Abstract. We study autodependence in ARCH-models by computing the

auto-lower tail dependence coefficients and certain generalizations thereof, for both stationary and non-stationary time series. This study is inspired by financial risk-management issues, and our results are relevant for estimating probabilities of consecutive value-at-risk violations.

1. Introduction

One of the striking aspects of empirical financial return-series is volatility-clustering: although successive returns are roughly uncorrelated, in the well-known phrase of Mandelbrot [19], ‘large changes tend to be followed by large changes - of either sign - and small changes tend to be followed by small changes’. Auto-dependence in time series is mostly studied in terms of auto-correlation, and for financial returns the squared returns typically exhibit an auto-correlation pattern which is remines-cent of an ARMA-process. A popular class of financial econometrical models that capture this kind of behavior is that of the GARCH-models introduced by Engle [13] and Bollerslev [7]; see also [8], [14].

Although auto-correlation of squared returns provides a satisfactory qualitative explanation for volatility clustering, and for Mandelbrot’s observation cited above, it does not always easily provide answers to questions related to the multi-variate distribution functions associated to the process, like for example the behavior of conditional quantiles at lags greater than 1 for a GARCH(1, 1) - cf. [9]. Motivated by recent developments in risk-management (see Embrechts, McNeil and Strau-mann [12],), we propose to study auto-dependence in time-series from the point of view of alternative, copula-based, dependence measures like rank-correlations, concordance measures or, in this paper, tail dependence coefficients. Indeed, as in risk-management, where linear correlation is the natural dependence measure for multi-variate normal, or more generally elliptic, linear models, but is less suitable for non-linear models or for more general classes of multi-variate distributions, we ar-gue that autocorrelations, while adequate for linear processes, should, for non-linear processes like the GARCH, be supplemented by other measures of auto-dependence. As a further motivation to consider alternative auto-dependence measures in time series, linear autocorrelations are not always defined: in the case of a stationary GARCH only for a limited parameter range, and not at all if the i.i.d. innovations which drive the process do not posses a finite variance. Similarly, if one is interested in the squared process they should at least have a finite fourth moment. (This does not preclude studying sample autocorrelation functions of such processes; cf. [4] and its references).

As already mentioned, the particular dependence measure we will study in this paper is the lower tail dependence coefficient, and certain generalizations thereoff. The lower tail dependence coefficient λX|Y of two random variables X and Y is

(4)

than the α-th quantile of X, given that Y will be below its ownα-th quantile; cf. [12]. As we will see, this measure will pick up non-trivial auto-dependence in a stationary ARCH, but, somewhat surprisingly, not in a non-stationary one. For this reason we will introduce certain generalizations, which we will callgeneralized lower tail dependence coefficients, a typical example of which would be the limit, forαtending to 0, of the probability thatX will be smaller than its√α-th quantile given thatY than itsα-th quantile. Here, the square root can be replaced by more general functions ofαtending to 0 at a suitably slower rate thanαitself.

The reasons for considering these particular dependence measures instead of others like Kendall’s τ or Spearman’sρare, firstly, that since these are asymptotic quantities, they are more amenable to a complete theoretical analysis: in terms of copulas, to determineλX|Y and its generalizations, we only need to understand the

behavior of the copula ofX andY near the lower left corner point of its domain of definition, while τ and ρrequire a global knowledge of the copula. Secondly, and from the point of view of applications more importantly, the lower tail dependence coefficient has a direct relevance for financial risk-management, since it can be interpreted in terms of value-at-risk: if X = (Xn)n is a stationary time-series,

representing the daily returns of an investment portfolio, the (unconditional) daily value-at-risk at a confidence level of 1−α is simply the (absolute value of) the lowerα-th quantile ofXn. The financial interpretation is that with a probability of

1−α, daily (percentage) losses will be less than this value-at-risk. Under the Basle rules, for a financial institution to suffer losses on its market portfolio exceeding the value-at-risk at some specified confidence level has regulatory consequences; cf [3]. The lower tail dependence coefficientλXn+1|Xn will provide information on the probability of violating one’s value-at-risk limit on two consecutive days, and similarly for other time lags.

Value-at-risk, as a risk-measure, has been criticized for two, closely related, rea-sons: it does not give an estimate of the size of the loss when it occurs, and, when considering more than one risky asset or portfolio of risky assets, it can fail to be sub-additive. In the terminology of [2], it is not coherent. (It is coherent when restricted to portfolios made up of jointly elliptically distributed assets, cf.[12]). The failure of subadditivity is less relevant if (Xn)n models a financial institutions

entire market portfolio. We note in this respect that Berkowitz and O’Brien [5] re-port that a simple univariate ARMA + GARCH-model of the total re-portfolio return is at least as effective, if not more, for value-at-risk forecasting than the detailed structural value-at-risk models commonly used by banks. As regards the loss-size, this can be estimated by the expected shortfall, which is basically the expectation of Xn given that it is smaller than its α-th quantile. If properly defined for

non-continuous distributions, expected shortfall can be shown to be coherent, cf. [1], [21]. We won’t however consider expected shortfall here, but limit attention to quantiles and value-at-risk, this being at the moment the industry standard.

To investigate the relevance of lower tail dependence as auto-dependence mea-sure for non-linear time-series, we do in this paper a detailed study of the particular, but representative, example of an ARCH(1)-model, the simplest of the GARCH-models. Our results are expected to extend to more general GARCH-processes, and in particular to GARCH(1, 1)’s. If (Xn)n≥0 is an ARCH(1), we will compute the lower tail dependence coefficientλXn+k|Xnboth for stationary processes (taking for X0 a stationary solution of the ARCH(1)-equation), and for non-stationary ones having an a.s. initial value X0 = x0 ∈ R. The latter are of interest for

condi-tional value-at-risk estimation given today’s return; cf. [15]. Under the assumption that n is symmetric we will give, in theorem 2.2 below, an explicit expression

(5)

auto-regressive ARCH(1)-parameter a1and the Pareto exponent of the stationary distribution (which can be computed from the former two), but no other parame-ters. In particular, more detailed knowledge of the stationary distribution itself is not required.

The non-stationary case is more delicate, and requires additional hypotheses on n. We will suppose that f, and its first two derivatives have tails with a

Pareto-type inverse power decay. This class of f was singled out because of its

importance in empirical modelling: see e.g. [15]. Somewhat surprisingly, the tail dependence coefficientsλXn+k|Xnnow turn out to be 0, and to still be able to detect a non-trivial auto-dependence in the tails, we have to use the generalized coefficient mentioned above (see definition 2.5 below). The generalized lower tail dependence coefficients turn out to be all equal to 1/2, both for stationary and non-stationary processes. Their value is in particular independent of the ARCH(1)-parameters, which contrasts with the traditional picture provided by the linear auto-correlations of the squared process.

The paper is organized as follows: in section 2 we recall a number of basic definitions and facts, define the generalized tail dependence coefficients and state our main results, which are theorems 2.2 and 2.7 for stationary ARCH(1)’s, and theorems 2.4 and 2.6 for non-stationary ones. In section 3 we prove the results for stationary ARCH(1)’s. Sections 4 to 6 are devoted to the technically more complicated non-stationary case. We first determine, in section 4, the asymptotic behavior of the probability density function ofXngiven an a.s. initial value forX0. The main technical tool in this section is the Mellin transform. The asymptotics of section 4 are first used in section 5 to derive the asymptotics of the quantile function, and then, in section 6, the asymptotic behavior of the tail dependence function. Using these, our results for non-stationary ARCH(1)’s quickly follow. Finally, in section 7 we use Monte Carlo simulations to study lower tail dependence at small but non-zero values of the confidence parameterα. An appendix summa-rizes, for convenience of the reader, the relationship between Mellin transform and asymptotic expansions needed in sections 4 and 6.

2. _{Main Results}

All random variables will be defined on some common probability space (Ω,F,_P). Recall that the left-inverse F← of an increasing functionF :_R→_Ris defined by F←(α) = inf{x : FX(x) ≥ α} for α ∈ R;, cf. [6], [11]. If FX(x) = P(X ≤ x),

x ∈ R is the cumulative distribution function of a random variable X, then the

α-th quantileqX(α) can defined as as the left-inverse ofFX; explicitly,

qX(α) =FX←(α) = inf{x∈R:P(X ≤x)≥α}, α∈[0,1].

All probability distributions we will encounter in this paper, or the functions by which we will be approximate them, will be continuous and strictly increasing in the regions of interest, so that the left-inverse reduces locally to the ordinary one, at least locally.

Let (Xn)n≥0 be an ARCH(1)-process, defined by

(1) Xn+1=

p

a0+a1X2

nn,

wherea0, a1>0, and (n)n≥1is an i.i.d. sequence of random variables with mean 0. We will mostly suppose, for symplicity, thatnis symmetric. We will consider both

stationary and non-stationary ARCH(1)’s. The stationary case amounts to taking X0=dX∞, whereX∞is the stationary solution of (1). Stationary solutions exists if a0>0 and E loga121

(6)

have a Pareto-tailed probability distribution, cf. [18], [17], [16], and [11] for a text-book treatment of the case of normally distributed n. Suppose that =d n has

a positive density such that for someh0 ∈(0,∞], _E(||h₎_<_∞_{for all} _{h < h0} _and

E(||h0_{) =}_∞_{, and let}_κ

∞= 2k∞, where k∞>0 is the unique positive solution to

(2) _E (a12)k∞

= 1.

Then the probability distribution of the stationary solutionX∞ has a Pareto-type inverse power decay of the tails: there exists a constantc∞>0 such that

(3) FX∞(x)'

c∞

|x|κ∞, x→ −∞,

and similarly for the right tail FX∞(x) = 1−FX∞(x). This result was recently

generalized to arbitrary GARCH(p, q) in [4].

We next turn to tail dependence.

Definition 2.1. (cf. [12]) Let (X, Y) be a random vector. The(lower) tail depen-dence functionofX onY is the function λX|Y : [0,1]→[0,1] defined by

(4) λX|Y(α) :=P(X ≤qX(α)|Y ≤qY(α)), α∈[0,1].

Thecoefficient of lower tail dependence of X on Y is defined by

(5) λX|Y = lim

α→0λX|Y(α), provided the limit exists.

If the limit (5) does not exist, one can consider instead the lim sup and the lim inf, and interpret these as an upper, respectively lower bound on the dependence ofX on Y in the extreme lower tail. If X and Y are independent, thenλX|Y(α) = α

and λX|Y = 0, and if X andY areco-monotonic or counter-monotonic (meaning

that one is an increasing respectively decreasing function of the other),λX|Y(α) =

λX|Y = 1 respectively -1.

In the case of continuously distributed X and Y, λX|Y(α) will only depend on

thecopulaCX,Y ofX andY, which in this case can be characterized as the unique

function CX,Y : [0,1]×[0,1]→ [0,1] such that the joint probability distribution

FX,Y can be written as

FX,Y(x, y) =CX,Y(FX(x), FY(y)).

Since for continuous increasingF,F◦F←=id, it follows thatλX|Y(α) =α−1CX,Y(α, α).

The lower tail dependence coefficient λX|Y can therefore be interpreted as the

di-rectional derivative, in the direction of the diagonal, of CX,Y in (0,0).

In financial modelling terms, thinking of X and Y as percentage returns of financial assets over some fixed common time-period,−qX(α) is interpreted as the

value-at-risk, VaRα, with confidence 1−αover the time period under consideration

(to be multiplied by the amount of capital invested, to be precise). In practice, α will be small,α= 0.05 or 0.01, andqα(X) will be negative; by convention, losses are

recorded as non-negative numbers, whence the minus-sign. With this terminology λX|Y(α) is the probability of the losses ofX exceeding VaRα(X), given that the

losses of Y already exceed VaRα(Y). We refer to [12] for a detailed discussion of

copulas and dependence measures, and their applications to risk management.

(7)

Theorem 2.2. Let(Xn)n≥0be a strictly stationary ARCH(1) having symmetrically

distributed innovationsn =dwith densityf such that the stationary distribution

has a Pareto-type tail decay (3). Letκ∞be defined by (2) and letF be the

cumula-tive distribution function of =dn. Then the coefficient of lower tail dependence

of Xn+p on Xn is given by

λXn+p|Xn=κ∞ Z −1

−∞ Z

Rp−1

F −

1

ap/₁ 2|x1| · · · |xp−1||z| ! p−1

Y

j=1

f(xj)|z|−κ∞−1dxdz,

where dx=dx1· · ·dxp−1. In particular,λXn+p|Xn6= 0.

The integral on the right is convergent, sinceFis bounded by 1 andκ∞>0. It is easily verified that its value is between 0 and 1. Note thatλXn+p|Xn only depends onF,a1and κ∞. In particular, it does not depend on any further characteristics of the, in general unknown, stationary distributionFX∞, not even on the constant

c∞ in (3).

We next turn to non-stationary ARCH(1)’s starting with an a.s. initial condition X0 =x0 ∈R. Let Px0 _{be the conditional probability}

Px0 ₌

P(·|X0 =x0) and let Fx0

X , q x0

X(α) and λ x0

X|Y denote, respectively, the cumulative distribution function,

quantile and lower tail dependence coefficient with respect to Px0_{. In the}

non-stationary case we will suppose that =d n has a twice differentiable symmetric

probability density f whose derivatives up to order 2 satisfy a Pareto-type decay

condition:

Condition 2.3. There exists constantsκ, c>0 such that

(6) f(x) =

c

|x|κ+1 +ρ(x), x6= 0, with remainderρ satisfying

(7) |ρ|, |x

d dxρ|,|x

2 d2 dx2ρ| ≤

C

|x|κ+1+η0, |x|>0,

for suitable constantsC, η0>0.

We use κ+ 1 as Pareto exponent in (6) since this corresponds to an exponent of

κ forF. The hypotheses onf cover cases like the Student distributions and the

Pareto-type distributions originating from extreme value theory; cf. [12], [15]. It follows from (6) and (7) that the j-th derivative satisfiesf(j)(x) =cj|x|−κ−j−1+

O |x|−κ−j−1−η0_{, for}_j_{= 0}_,₁_,_{2. We then have:}

Theorem 2.4. Suppose the pdf f of n is symmetric and satisfies condition 2.3.

Then the conditional tail dependence coefficients all vanish: λx0

Xn+k|Xn = 0 for all x0∈_R,n, k≥1.

This is a somewhat unexpected result, whose apparent contradiction with theorem 2.2 as n → ∞ will be clarified in sections 5 and 6 below, cf. remark 6.2. There does not seem to be anything special about the n’s satisfying condition 2.3: using

the results of [9], theorem 2.4 can be shown to be also true for GARCH(1, 1)’s with normal innovations

(8)

Definition 2.5. let (X, Y) be a real random vector, and letψ: (0,1]→(0,1], such that limα→0ψ(α) = 0. We define the generalized lower tail dependence coefficient

of X onY,λψ_X_|_Y, by

(8) λψ_X_|_Y := lim

α→0P(X ≤qX(ψ(α))|Y ≤qY(α)), assuming the limit exists.

Only the behavior ofψat 0 matters andψonly needs to be defined on some small sub-interval (0, δ), δ > 0. It is easily seen that for continuous FX and FY, λ

ψ X|Y

again only depends on the copula CX,Y, and that

λψ_X_|_Y = lim

α→0 1

αCX,Y(ψ(α), α),

the directional derivative of CX,Y in (0,0) along the curve α → (ψ(α), α). We

specifically allow curves which are tangential to one of the axes of the copula’s domain of definition, [0,1]×[0,1]. If X and Y are independent, then clearly

P(X < qX(ψ(α))|Y < qY(α)) = ψ(α), and λψ_X_|_Y = 0. Embrechts, McNeil and

Strauman [12] call two random variables X andY asymptotically independent in the lower tail ifλX|Y = 0. More generally, one might callX andY strongly

asymp-totically independent(in the lower tail) if λψ_X_|_Y = 0, for all positiveψ on (0,1] for which limα→0ψ(α) = 0. Theorem 2.4 showed that consecutive valuesXn, Xn+p of

a conditional ARCH(1) with a.s. initial condition are asymptotically independent in the lower-tail. The next theorem shows that they are far from being strongly asymptotically independent. Letλx0,ψ

X|Y be the twisted lower tail dependence

coeffi-cient (8) computed with respect to the measurePx0_.

Theorem 2.6. Under the hypotheses of theorem 2.4, we have that

(9) λx0,ψ

Xn+p|Xn= 1 2,

for all functions ψ: (0,1]→(0,1]satisfying

(10) lim

α→0ψ(α) = limα→0

α(logψ(α)−1₎p−1 ψ(α) = 0.

There is an analogue of theorem 2.6 for stationary ARCH(1)’s:

Theorem 2.7. Under the conditions of theorem 2.2, if (Xn)n≥0 is a stationary

ARCH(1), then λψ_X

n+p|Xn = 1

2, for allψ=ψ(α)satisfying lim

α→0ψ(α) = limα→0 α ψ(α) = 0.

Typical examples of functions ψ(α) satisfying the conditions of theorems 2.6 and 2.7 areψ(α) =α1−ε, for any∈(0,1].

It is instructive to compare these results with the picture provided by classical correlation. A straightforward computation shows that, if (Xn)n is a stationary

ARCH(1) with E(2) = 1 such thata21E(4)<1 (which implies existence of fourth moments), then the linear correlation of Xn2 and Xn2+p exists and decays as a

p

1 if a1→0. To quantify auto-dependence and heteroscedasticity in an ARCH if fourth moments do not exists one has to look for alternatives to linear correlation. The lower tail dependence coefficientλXn+p|Xnwill always exist, and also converges to 0 asa1→0, by theorem 2.2 and the dominated convergence theorem - the exact order of convergence is less obvious, though. The generalized tail dependence coefficients λψ_X

(9)

It would be interesting to determine the precise behavior of λXn+p|Xn for a stationary ARCH(1) as eithera1→0 orp→ ∞.

Theorems 2.6 and 2.7 are asymptotic results forαtending to 0, and it is reason-able to ask how indicative they are for what happens at a small but non-zero α. Taking for exampleα= 0.01 and ψ(α) =√α, how close to 50% is the probability that Xn+1 will be below its 10% quantile, given that Xn will have been below its

1%-one? In risk management terms, what is the probability that a portfolio will, on day n+ 1, incur a loss exceeding the 10% value-at-risk if on daynit will have suffered a loss greater than its 1% VaR? In section 7 we briefly investigate this question using Monte Carlo simulation. As we will see, depending on the value of a1, this generalized lower tail dependence at sayα= 0.01 can be very close to its asymptotic value of 0.5, and even for values ofa1as small as 0.1 the difference with the i.i.d. case (corresponding to ana1 equal to 0) is significant.

Another interesting question for future research is whether empirical time series of financial returns behave like in theorems 2.2, 2.6 and 2.7. We note in this respect that these theorems provide an attractive quantitative reformulation of Mandelbrot’s observation quoted in the first sentence of the introduction.

3. Lower tail dependence for stationary ARCH(1)

In this section we will prove theorems 2.2, 2.7 and 2.6. Let us put

FXn+p(x|Xn=y) =P(Xn+p≤x|Xn =y).

Lemma 3.1. Given y∈R andn, p∈N, p≥1, define functions sp(y;·) :Rp−1 → R≥0, by

sp(y;x) :=

(a0+a0a1x2

p−1+a0a21x2p−2x2p−1+· · ·+a0a

p−1

1 x21· · ·x2p−1+a

p

1x21· · ·x2p−1y2)1/2,

were x= (x1,· · ·, xp−1). Then for all n≥0 andp≥1,

(11) FXn+p(x|Xn =y) = Z

Rp−1

F

x sp(y;x)

p−1 Y

j=1

f(xj)dx.

Proof. Straightforward induction onp, observing that

P(Xn+1≤x|Xn=y) =F

x p

a0+a1y2 !

,

and using

P(Xn+p+1≤x|Xn=y) =

Z

R

F



 x q

a0+a1x2

p



fXn+p(xp|Xn=y)dxp,

for the induction step. Here, fXn+p(x|Xn =y) stands for the conditional density, d

dxF 0

Xn+p(x|Xn=y) (whose existence also follows by induction). QED Corollary 3.2. Let x <0 be arbitrary. Then the functiony →FXn+p(x|Xn=y)

is decreasing on {y <0}.

Proof. Fixx <0, and lety≤y0 <0. Theny2 _≥_y02 _{and therefore 0}_≤_s

p(y0,x)≤

sp(y;x), for allx∈R. Sincex <0, it follows thatsp(y0;x)−1x≤sp(y;x)−1xand

hence

F

x sp(y0;x)

≤F

x sp(y;x)

.

(10)

One can also give a proof by straightforward differentiation. Let qn(α) be theα-th

quantile of FXn.

Lemma 3.3. Suppose thatqn(α)<0 and letx <0. Then

(12) FXn+p(x|Xn=qn(α))≤P(Xn+p≤x|Xn≤qn(α))≤ 1 2.

Proof. We have that

(13) P(Xn+p≤x|Xn≤qn(α)) =

1 α

Z qn(α)

−∞

FXn+p(x|Xn=y)fXn(y)dy,

where fXn = F 0

Xn, the probability density. Integrating by parts, and using that FXn(qn(α)) =α, we see that (13) equals

FXn+p(x|Xn=qn(α))− Z qn(α)

−∞ d

dyFXn+p((x|Xn =y)

FXn(y) α dy

≥ FXn+p(x|Xn=qn(α)),

by corollary 3.2. The other inequality in (3.3) follows from

P(Xn+p≤0|Xn≤qn(α)) =

1 α

Z qn(α)

−∞

FXn+p(0|Xn=y)fXn(y)dy

= 1

2,

by (11), the symmetry of and the definition ofqn(α). QED

Recall the definition of the twisted tail-dependence coefficients from the intro-duction. We then can state:

Corollary 3.4. Letψ=: (0,1]→(0,1]be defined in a neighborhood of0 such that

(14) lim

α→0

qn+p(ψ(α))

qn(α)

= 0.

Thenλψ_X

n+p|Xn= 1/2.

Proof. We first observe that (14) implies that qn+p(ψ(α))

sp(x, qn(α))

→0, α→0.

By (11) and dominated convergence,FXn+p(qn+p(α)|Xn=qn(α))→ 1

2. The

corol-lary now follows from (12). QED

Proof of theorem 2.7. If (Xn)n≥0 is stationary, with X0 = X∞, then qn(α) =

F_X←_∞(α)' (c∞/α)1/κ∞, as α→0, for all n. It follows that (14) is equivalent to

α=o(ψ(α)),α→0. QED

Proof of theorem 2.2. Takingx==qn(α) =FX←∞(α) =:q∞(α) in lemma 3.1 (where

X∞denotes the stationary ARCH(1)) and using thatqn+p(α) =q∞(α) also, lemma 3.1 implies that

(15) λXn+p|Xn(α) = 1 α

Z q∞(α)

−∞ Z

Rp−1

F

_q

∞(α) sp(y;x)

p−1 Y

j=1

(11)

(compare (13), whereF∞(y) =FX∞(y), the distribution function ofX∞.

Integrat-ing by parts with respect to y, and using thatF∞(q∞(α)) =α, we find that

λXn+p|Xn(α) = Z

Rp−1

F

_q

∞(α) sp(q∞(α);x)

dx (16)

+q∞(α) α

Z q∞(α)

−∞ Z

Rp−1

f

q∞(α) sp(y;x)

_ap

1x21· · ·x2p−1y sp(y;x)3

p−1 Y

j=1

f(xj)F∞(y)dxdy.

As α→0, the integrand of the first term on the right tends to

F

_q

∞(α) sp(q∞(α);x)

→F

−1 ap/₁ 2|x1| · · · |xp−1|

! .

We next change variables in the second integral on the right hand side of (16): y = |q∞(α)|z . Since F∞(x) ' c∞|x|−κ∞ and |q∞(α)| ' (c∞/α)1/κ∞, it follows that

α−1F∞(|q∞(α)|z)→ |z|−κ∞, α→0. Since |q∞(α)|3/sp(q∞(α)z,x)3→a−13p/2|x1|−

3_{· · · |}_x

p−1|−3|z|−3. We therefore find that, momentarily writing A=ap/₁ 2|x1| · · · |xp−1|,

λXn+p|Xn = Z

Rp−1

F

−1

A p−1

Y

j=1

f(xj)dx

−

Z −1

−∞ Z

Rp−1

f

− 1

A|z|

1 A

z

|z|3

p−1 Y

j=1

f(xj)|z|−κ∞dxdz

= κ∞ Z −1

−∞ Z

Rp−1

F

− 1

A|z|

p−1 Y

j=1

f(xj)|z|−κ∞−1dxdz,

after a second integration by parts, which proves (6). QED

Remark 3.5. The proof would simplify if we would know thatX∞has a density F_∞0 (x) with the differentiated asymptotics κ∞c∞|x|−κ∞−1, |x| → ∞, since then the partial integrations would no longer be necessary.

4. Asymptotic probabilities for non-stationary ARCH(1)’s

Let (Xn)n∈N be the ARCH(1)-process (1), starting with an a.s. initial value

X0=x0∈R. We assume thatf is symmetric and satisfies condition 2.3: letting

κ0=κ+ 1, then

(17) f(x) =

c

xκ0 +ρ(x),

where (18) xd dx ν ρ(x)

≤ C

xκ0₊_η₀, ν= 0,1,2,

for suitable constantsη0, C >0, for allx >0.The successive realizationsXnof the

ARCH(1) will then have a density fXn which, because of the Markov property of the ARCH(1), will be given by

(19) fXn+1(x) = Z

R

fε

x p

a0+a1y2 !

fXn(y) dy p

a0+a1y2,

with fX1 = σ

−1_f

(x/σ1), σ1 =

p

a0+a1x2

(12)

Theorem 4.1. With the above notations we have that

(20) fXn(x) =

1

σ1a(₁n−1)/2 ϕn

x

σ1a(₁n−1)/2 !

,

with the function ϕn having a truncated expansion

(21) ϕn(x) =

n−1 X

ν=0 cν;n

(logx)ν

xκ0 +rn(x), x >0,

whose error rn(x)satisfies, for any positiveη < η0, an estimate

(22) |rn(x)| ≤

Cη

xκ0₊_η.

The coefficients cν;n and the remainder-function rn may depend on a0, a1 and σ1,

but both the cν;n and the constants Cη in (22) will remain bounded as long as

max(σ−₁1a−₁1/2a0,· · · , σ₁−1a−₁(n−1)/2a0) remains bounded. Moreover, the top order coefficient is given by

(23) cn−1;n= 2n−1

cn

(n−1)!,

and is therefore independent of a0, a1 andσ1.

We obtain a corresponding result for the cumulative distribution function FXn by integration. We limit ourselves to the top-order asymptotics. Recalling that κ=κ0−1, a simple integration by parts gives:

Corollary 4.2. We have that

(24) FXn(x) = Φn

x

σ1a(₁n−1)/2 !

,

where, asx→ −∞,

(25) Φn(x) =Cn

(log|x|)n−1

|x|κ +Rn(x),

with Cn=cn−1;n/κ= 2n−1cn/κ(n−1)!, and errorRn(x)satisfying

(26) |Rn(x)| ≤C

|log|x||n−2

|x|κ , x <0,

with uniformly bounded constantCwhenevermax(σ−1_a−1/2

1 a0,· · ·, σ −1 1 a

−(n−1)/2

1 a0)

stays bounded.

Remark 4.3. The asymptotics forxfixed, n→ ∞are different from those forn fixed |x| → ∞.It would be interesting to derive joint asymptotics in (x, n).

(13)

4.1. The case a1 = 1. We first take both a1 and σ1 equal to 1, and look for a truncated asymptotic expansion of fXn which is uniform in the parameter a0 for bounded a0: a0 ∈ [0,1] say (we might have let a0 be restricted to any bounded interval in [0,∞)). Define the operatorF onL1_([0_,_∞_{)) by}

(27) F(v)(x) =

Z ∞

0 fε

x p

a0+y2 !

v(y)_p dy a0+y2;

F is a positivity-preserving operator on L1([0,∞)) of norm 1. Let || · ||1 be the L1-norm with respect to Lebesgue-measure on [0,∞).

Lemma 4.4. Let {v(· ;a0) ;a0 ≥ 0} be a family of non-negative functions in C1_([0_,_∞₎₎_{, such that} _||_v₍_· _;_a

0)||1 = 1, for all a0, and such that v(x;a0) has the

truncated asymptotic expansion

(28) v(x;a0) =

n−1 X

ν=0 cν

(logx)ν

xκ0 +r(x), x >0,

with constants cν = cν(v;a0) uniformly bounded in a0 ≤ 1, and with remainder r(x) =r(x;v, a0)satisfying

(29)

x d dx

ν r(x)

≤ C

xκ0₊_η, x >0, ν= 0,1,

for suitable positive constants C and η with η < 1, uniformly for a0 ≤ 1. Let

u=u(·;a0)be given by

u(x;a0) =F(v(·;a0)) (x).

Thenu∈C1([0,∞)),u≥0, anduhas a truncated expansion

(30) u(x;a0) =

n

X

ν=0

c0_ν(logx)

ν

xκ0 +r

0₍_x₎_{, x >}₀_,

with the coefficients c0_ν =cν(u;a0)again uniformly bounded in a0≤1 andr0(x) = r0(x;v, a0) satisfying estimates (29) with η replaced by any smaller η0 < η, also uniformly in a0≤1. Moreover

(31) c0n=

ccn−1 n .

The truncated expansion (28) is compatible withv(·;a0) being inL1 _since_κ0_>₁_.

Proof. The idea is to write uas a Mellin convolution (see Appendix), by making the change of variables z=py2₊_a0_{. This gives}

u(x) = f∗ev(x)

=

Z ∞

0 f

x

z

e v(z)dz

z ,

with

(32) _ev(z) = √ z

z2₋_a 0

vpz2₋_a0₁

[√a0,∞)(z),

1A being the indicator function of a set A. However, in doing so we introduce

a singularity at z = √a0, which would spoil the decay properties of the Mellin transform needed later, and which we will therefore eliminate by a preliminary smooth cut-off. Let χ ∈C∞([0,∞)) be such that 0≤χ≤1, supp (χ)⊂[A,∞), χ= 1 on [2A,∞), where A >0 will be chosen below, and writev=v(x;a0) as (33) v=v0+v1, v1=χv, v0= (1−χ)v.

(14)

First of all, by the hypothesis (17) onf, we have that

F(v0)(x) = R∞

0 f

x

√

y2₊_a 0

v0(y)√dy

a0+y2

=x−κ0c

R∞ 0 (y

2₊_a0₎κ0 −1

2 v0(y)dy+R∞

0 (y

2₊_a0₎−1/2_ρ

(y2+a0)−1/2x

v0(y)dy

=: c00 xκ0 +r

0 0(x),

with

(34) c00=c00(a0) :=c

Z ∞

0

(y2+a0)κ0 −21_v0₍_y₎_dy,

and

(35) r0₀(x) =r₀0(x;a0) =

Z ∞

0 ρ

x p

y2₊_a0 !

dy p

y2₊_a0.

Since κ0 > 1 and v0 is supported in [0,2A], it follows that c00(a0) is uniformly bounded, for a0≤1, by

c(4A2+ 1) κ0 −1

2 ||v0||₁=c(4A2+ 1)

κ0 −1 2 ,

since 0≤(1−χ)≤1 and||v||1= 1.

Likewise, forr₀0(x;a0) we estimate, using (18) (withν= 0):

|r₀0(x;a0)| ≤ C

xκ0₊_η

0

Z A

0

(y2+a0)κ

0₊_η

0−1

2 v0(y)dy

≤ C

0

xκ0₊_η

0,

withC0=C(4A2₊_a0₎(κ0+η0−1)/2_._{This proves that}_F₍_v0_{) has a truncated}

asymp-totic series of the desired form (30), (29).

We next turn tou1=F(v1). First rewriteu1as a Mellin transform,u1=f∗v1_e,

with_ev1defined by (32) , withvreplaced byv1. Observe that if we takeA >1≥a0, then v1_e will be C1 _{on [0}_,_∞_{). From (28), (29), and remembering that} _{η <} _{1, we} easily obtain that

e

v1(x) = (χ v)px2₋_a0_√ x x2₋_a0

=

n−1 X

ν=0 cν

(logx)ν

xκ0 +er1(x),

with _er1 = er1(x;a0) satisfying (29), uniformly in a0 ≤ 1 (with the same η but possibly a differentC). It follows that the Mellin transform_ev₁#=_ev₁#is holomorphic on{s∈C:−κ0<Res <0}, and meromorphic on{s:−κ0−η <Re (s)<0}with

(36) _ev₁#=

n

X

ν=1 a−ν

(s+κ0₎ν +h(s),

with a−ν = (ν−1)!cν−1 andh=h(s) holomorphic on{s:−κ0−η <Re (s)<0} (cf. the Appendix). It is easily verified that (28) and (29), together with the differentiability ofv, imply that the truncated series (28) can be differentiated, and that

xd

dxev1(x) =

n−1 X

ν=0

c00_ν(logx)

ν

xκ0 +x

(15)

for suitable constantsc00_ν, which are simply obtained by term-by term differentiation of the truncated expansion for _ev1. (A similar remark applies to (17) and (18).) Taking Mellin transforms, we conclude that, given that_er1satisfies (29) withν= 1,

s_ev₁#(s) =

n−1 X

ν=0 Aν

(s+κ0₎ν +g(s),

with g =g(s) holomorphic on{s :−κ0−η <Re (s)<0}, and bounded on each closed sub-strip {s : −κ0−η+ε ≤Re (s)≤ −ε}, ε > 0. From this we conclude that

(i) Ifσ∈(−κ0−η,0) andσ6=−κ0, then

(37) t→_ev₁#(σ+it) ∈ L2(R),

and

(ii) For any closed sub-interval [a, b]⊂(−κ0−η,0), we have that

(38) max

σ∈[a,b] ev

#

1(σ±iR)

→0 asR→ ∞.

Similar assertions can be proved, in the same way, forf#₍_s_{), starting from (17)} and (18) with ν = 0 and 1.It follows that for arbitrary ρ∈(−κ0,0) we can apply the inversion formula for the Mellin transform to write u1(x) =F(v1)(x) as

(39) u1(x) = 1

2πi

Z σ+i∞

σ−i∞

f#(s)_ev₁#(s)xsds,

(the integral converges since the productf#₍_s₎ e

v#₁(s) is inL1 _{of the line}_{_{Re (}_s_{) =} σ}). Moreover, the right conditions are met to shift the integration path to{Res=

−κ−η0}, where η0 < η is arbitrary. In doing so, we will pick up the residue in s=−κ0:

1

2πiRess=−κ0

_c

s+κ0

n

X

ν=1

(ν−1)!cν−1 (s+κ0₎ν +h(s)

! xs

!

=

n

X

ν=0

c0_ν(logx)

ν

xκ0 ,

(40)

as one sees by writing xs₌_x−κ0_{exp ((}_s₊_κ0_{) log}_x_{). It follows in particular that}

(41) c0_n= ccn−1

n . Hence

u1(x) =

n

X

ν=0

c0_ν(logx)

ν

xκ0 +r

0₍_x₎_,

where the remainder term is given by the line-integral over Re s=−κ−η0:

r0(x) = 1 2πi

Z

Res=−κ−η0

f#(s)_ev#₁(s)xsds,

which, by an obvious estimate, will satisfy

(42) |r0(x)| ≤C/xκ0+η0.

To show thatx(d/dx)r0 satisfies the same estimate, we first differentiateu1, to find

x d dxu1=

xdf

dx

(16)

Now sincexdfε/dxsatisfies (17) and (18) forν= 0,1, we can repeat the argument

above to conclude that there exist constants c00_ν such that

x d dxu1=

c

X

ν=0

c00_ν(logx)

ν

xκ0 +r

00₍_x₎_,

with r00(x) satisfying an estimate of the type (42). That is, x(d/dx)u1 will have a truncated asymptotic series of the same type asu1. A moment’s thought then shows that, since the differential operatorxd/dx, formally applied to the expansion ofu1, yields an expansion of the same form, we necessarily must have thatr00_{= (}_xd/dx₎_r0_. This shows thatr0 satisfies (29) (withη0 instead ofη), and completes the proof of

the Main Lemma. QED

4.2. Proof of theorem 4.1. Lemma 4.4 allows us to quickly prove theorem 4.1, using induction on n. Fix an initial value x0 ∈ R and put σ1 = pa0+a1x20. Our induction hypothesis is that fXn(x) is of the form (20), with ϕn(x) having a truncated expansion (21), with the cν;n and rn satisfying the stated uniformity

conditions. For n= 1, this is satisfied withϕ1=f, by the hypothesis onf.

LetFa0,a1 be the norm-preserving positive operator onL

1₍

R≥0 defined by

(43) Fa0,a1(v)(x) = 2

Z ∞

0 fε

x p

a0+a1y2 !

v(y)_p dy a0+a1y2

,

so that theF defined by (27) equalsFa0,1.Then we have, by the Markov property

of our ARCH(1)-process, and the symmetry of f, that

fXn+1(x) = 2Fa0,a1(fXn) (x)

= 2Fa0,a1

1

σ1a(₁n−1)/2 ϕn

y

σ1a(₁n−1)/2 !!

(x)

= 2

σ1an/₁ 2

F_σ−1_a−n/2

1 a0,1(ϕn)

x

σ1an/₁ 2 !

,

by an easy change of variables in the integral definingFa0,a1.Put

ϕn+1:=F_σ−1_a−n/2

1 a0,1(ϕn).

Then by lemma 4.4, applied tov=ϕn, we find that

ϕn+1(x) =

n

X

ν=0 cν;n+1

(logx)ν

xκ0 +rn+1(x), withrn+1satisfying (29), uniformly in max(σ1−1a

−1/2

1 a0,· · ·, σ −1 1 a

−n/2

1 a0), and uni-formly bounded coefficients cν;n+1 for these values of the parameters. Moreover, cn;n+1= 2ccn−1;n/nand therefore, sincec0;1=cε,cn−1;n= 2n−1cnε/(n−1)!.QED

5. Asymptotic behavior of the non-stationary VaR

For the computation of the α→0 asymptotics of the tail-dependence function λx0

Xn+p|Xn(α) we need to know the asymptotic behavior of the quantile functions qx0

Xn(α) =F x0,←

Xn (α). This can be computed from corollary 4.2: Lemma 5.1. Asα→0,

(44) qx0

Xn(α)'σ1a (n−1)/2 1

_C

n

α

1/κ _log(_α−1_C

n)

κ

n−κ1 ,

(17)

Proof. Since it is not true in general that the map F → F← is continuous with respect to the uniform topologies, we proceed cautiously.

First of all, sinceFx0

Xn(x) = Φn(x/σ1a (n−1)/2

1 ), we immediately have that

(45) qx0

Xn(α) =σ1a (n−1)/2

1 Φ

← n(α).

From corollary 4.2 we see that for any ε > 0 we can find a positive real number R(ε) such that ifx <−R(ε), then

(Cn−ε)

(log|x|)n−1

|x|κ ≤ΦXn(x)≤(Cn+ε)

(log|x|)n−1

|x|κ .

Call the two functions on the left and right of this inequality Φ±ε

n . Since these are

strictly increasing, and since Φ−ε

n (x)≤Φn(x)≤Φεn(x), we will have that

(Φε_n)−1(α)≤Φ←_n (α)≤(Φ−_nε)−1(α),

at least for those αfor which the biggest of these three numbers is≤ −R(ε). (By theorem 4.1, Φ0

n(x) =ϕn(x) is strictly positive ifxis sufficiently small, so that we

could have replaced Φ←

n by its ordinary inverse also). To compute (Φεn)−1(α) we

use the following elementary lemma.

Lemma 5.2. Leta >0 and letx=x(a)be the (unique) positive solution to

(logx)n−1 xκ =a.

Then

x(a) =a−1/κ

_log_a−1

κ

n−κ1 g(a),

where g(a)→1 asa→0.

Proof. Put logx=y,x=x(a). Theny has to solve y−n−1

κ logy= log(a

−1/κ_{) =:}_A.

If we try a solution of the form y=A+n−_κ1logA+y0, theny0 =y0(A) will have to solve:

y0 = n−1 κ log

1 +n−1 κ

logA+y0 A

≤

_n₋₁ κ

2_log_A₊_y0

A

.

Hence we find that1

0 < y0 ≤

_n₋₁ κ

2 _log_A/A

1− n−1

κ

2 A−1

≤ 2

_n₋₁ κ

2_log_A

A ,

provided that A≥2κ−2₍_n₋₁₎2_{. Exponentiating, and putting}_g₍_a_{) := exp}_y0₍_A_), we find that

x(a) =a−1/κ

_log_a−1

κ

(n−1)/κ g(a),

where

(46) 1≤g(a)≤exp 2(n−1)2logA/κ2A ,

1_{observe that if}_y

0:=A+ (n−1) logA/κ, theny0−(n−1) logy0/κ≤A, so thaty0 will have

(18)

if loga−1/κ_≥₂_κ−2₍_n₋₁₎2_{. Hence}_g₍_a₎_→_{1 as}_a_→₀_. _QED Returning to the proof of lemma 5.1, it follows that

Φ±_nε−1 =

₍_C

n±ε)

α

1/κ_log(_α−1₍_C

n±ε))

κ

(n−1)/κ

g±ε(α),

with g±ε(α) → 1 as α → 0, uniformly in 0 ≤ ε ≤ε0 < Cn (as a glance at (46)

shows). Putting

Qn(α) =

_C

n

α

1/κ_log(_α−1_C

n)

κ

(n−1)/κ ,

we see that

_C

n−ε

Cn

log(α−1_C

n−ε)

log(α−1_C

n)

(n−1)/κ g−ε(α)

≤Φ

← n (α)

Qn(α)

≤

Cn+ε

Cn

log(α−1Cn+ε)

log(α−1_C

n)

(n−1)/κ gε(α).

Lettingα→0, we conclude that for all sufficiently smallε >0,

(1− ε

Cn

)≤lim inf

α→0

Φ←_n(α) Qn(α)

≤lim sup

α→0

Φ←_n (α) Qn(α)

≤(1 + ε Cn

),

and the lemma follows by letting ε→0. QED

6. Asymptotics of the non-stationary tail-dependence function

The asymptotic behavior of the conditional lower tail dependence function can be determined quite precisely, and turns out to beuniversalfor the class of ARCH(1)’s whose innovations (n)n satisfy condition 2.3 with a givenκ and c, in the sense

that it does not depend on either a0, a1 or the initial valuex0, nor on the further details of f:

Theorem 6.1. Let (Xn)n∈N be an ARCH(1) with a.s. initial value X0 = x0

and a1 >0 strictly positive. Suppose that n has a twice differentiable symmetric

probability density satisfying condition 2.3. Then for all n, p≥1,

(47) λx0

Xn+p|Xn(α)'γn,p

log(logα−1₎p

(logα−1₎p , α→0,

where

(48) γn,p=

22n+2p−3 p!(n−1)!(n+p−1)!

c2n+2p

κ2

.

In particular, λx0

Xn+p|Xn =limα→0λ x0

Xn+p|Xn(α) = 0.

Remarks 6.2. (i) Observe that the conditional lower tail dependence function (47) tends to 0 at an exponentially slower rate than for independent random variables. For values of a1 such that the ARCH(1) has a stationary solution, the different conclusions of theorems 2.2 and 6.1 may appear strange, in view of the geometric ergodicity of the process (cf. [4] and its references). The explanation is that, in case of a stationary process,qXn(α) =qX∞(α) =qXn+p(α) for allnandp, while for a conditional ARCH(1) and n, p≥1,qXn+p(α)/qXn(α)→ ∞ as α→0 (although slowly), as follows from lemma 5.1.

(ii) The results of [9] can be used to derive the asymptotics of λx0

(19)

Proof of theorem 6.1 To simplify notations, we will writeqn(α) forqXx0n(α), and we will generally leave off the super-index x0. Since the joint probability density of (Xn+p, Xn) is a product of the conditional density of Xn+p|Xn and the density of

Xn, we see as before that

P(Xn+p≤qn+p(α)|Xn≤qn(α)

= 1 α

Z qn(α)

−∞

FXn+p(qn+p(α)|Xn =y)fXn(y)dy

= 1 α

Z qn(α)

−∞ Φp

qn+p(α)

a(₁p−1)/2pa0+a1y2 !

ϕn

y

σ1a(₁n−1)/2 !

dy

σ1a(₁n−1)/2 ,

= 1 α

Z ∞

1

Φp −

qn+p(α)/qn(α)

a(₁p−1)/2pa0+a1y2 !

ϕn

qn(α)y

σ1a(₁n−1)/2 !

qn(α)

σ1a(₁n−1)/2_; dy,

where we have used (24) for FXn+p(·|Xn = y), withn instead of p and Xn = y insteadX0=x0. Sinceqn(α)/σ1a

(n−1)/2

1 = Φ←n (α), we have that

qn+p(α)/qn(α)

a(₁p−1)/2pa0+a1y2 =

Φ←_n₊_p(α)/Φ←_n (α) p

e a0+y2

,

where we have put _ea0 := a−₁1a0. In what follows we will, without further com-ment, replace various quantities by their asymptotic equivalents: this can be made rigorous by standard estimates. First of all, by theorem 4.1,

P(Xn+p≤qn+p(α)|Xn≤qn(α))

(49)

'cn−1;n

α (Φ ← n (α))

−κZ ∞

1 Φp

Φ←

n+p(α)/Φ←n (α)

p e a0+y2

!

(log Φ←_n (α)y)n−1 yκ+1 dy.

By lemma 5.1,

(Φ←_n ) (α))−κ' α

Cn

_log_α−1_C

n

κ

−(n−1) ,

where the α will cancel the 1/α in front of (49). To simplify notations, we put λ:=α−1_C

n and Λ := log Φ←n (α)'κ−1logλ. Then, using lemma 5.1 once more,

we see that

Φ←_n₊_p(α)/Φ←_n (α)'γ(logλ)p/κ, γ=γn,p:= (Cn−1Cn+p)1/κ

and (49) equals

cn−1;n

Cn

_κ

logλ

n−1 n−1 X

ν=0

n−1 ν

·

(50)

Λn−1−ν

Z ∞

1

Φp −

γ(logλ)p/κ

p e a0+y2

!

(logy)ν

yκ

dy y .

If we make the change of variables z = p_ea0+y2_{, we recognize the integrals as} being the Mellin convolution, evaluated inγ(logλ)p/κ), of Φp(z)1(−∞,0)with

gν(z) :=

(log√z2₋ e a0)ν (z2₋

ea0) (κ+1)/2

z

√

z2₋ e a0

1[ea0+1,∞)(z),

(20)

Res <0}, 0< η <1, with a unique pole of orderpin s=−κand principal part Pp

ν=0a−ν(s+κ)

−ν_{, with}_a

p= (p−1)!Cp. The Mellin transform ofgν equals

g_ν#(s) =

Z ∞

ea0+1

(logy)ν

yκ (y

2₊ e a0)−s/2

dy y

=

Z ∞

1

(logy)ν

yκ y −s

1 +ea0

y2

s/2_dy y

= ν!

(s+κ)ν+1 +H(s), where

H(s) =

Z ∞

1

1 + ea0 y2

s/2

−1 !

(logy)ν yκ+s+1dy,

is holomorphic on {−κ−1 < Res < 0}, as follows easily form the asymptotic behavior of (1 +_ea0y−2)s/2−1 as y → ∞. Using the arguments of section 3, one finds that

Λn−1−ν·

Z ∞

1

Φp −

γ(logλ)p/κ

p e a0+y2

!

(logy)ν

yκ

dy y

' Λn−1−ν·Constν·

log(γ(logλ)p/κ₎p+ν γκ_(log_λ₎p

= O

(logλ)n−1−ν·(log(logλ))

p+ν

(logλ)p

.

It follows that the dominant term in (50) is the one with ν = 0. Computing the constant Const0, which givesCp/p, we find that (49) is asymptotically equivalent

to

cn−1;nCp

p Cn

_κ

logλ n−1

Λn−1 (log logλ)

p

(Cn+p/Cn)(logλ)p

' cn−1;nCpCp+n

p

(log logα−1)n−1 (logα−1₎p .

This proves theorem 6.1. QED

Proof of theorem 2.4. Immediate from (47). QED

Proof of theorem 2.6. We may use lemma 3.3 and corollary 3.4., withP,qn(α) and

λψ_X_|_Y replaced by _Px0_, _qx0

n (α), and λ ψ,x0

Xn+p|Xn. By lemma 5.1, (44), we see that, modulo an immaterial constant,

qx0

n+p(ψ(α))

qx0 n (α)

'

_α

ψ(α)

1/κ _log_ψ₍_α₎−1_C

n+p

(n+p−1)/κ

(logα−1_C

n)(n−1)/κ

,

forα→0. Ifψ(α)→0 andα(logψ(α)−1₎p_/ψ₍_α₎_→_{0, then certainly}_α/ψ₍_α₎_→₀

and consequently logψ(α)−1 ₌ _O_(log_α−1_{) (using that} _ψ₍_α₎ _< _{1 for sufficiently} small α). It follows that the expression on the right hand side will be dominated

by a constant times α(logψ(α)−1₎p_/ψ₍_α₎1/κ

and hence tends to 0 withα. QED

7. Generalized lower tail dependence functions at non-zero α

Consider a stationary ARCH(1) (Xn)n≥0 and let

(51) λψ(α) :=P Xn+p< qXn+p(ψ(α))|Xn< qXn(α)

,

(21)

conditions onψ. To see what happens at a small but non-zeroαwe have computed some λψ₍_{α, p}_{) using Monte Carlo simulations. What follows is not intended as a}

complete simulation study into the behavior of these functions, but as a simple ex-ploratory investigation. Its main aim is to illustrate that our theorems are relevant at values of α equal to 0.01 or 0.05, which are the values typically used in risk management. We have not, in particular, tried to improve the precision by using any variance reduction or importance sampling tecniques, leaving these for future work. We will also limit ourselves toψ(α) =√α, as a typical example of a function ψ satisfying the conditions in theorems 2.6 and 2.7.

It is important to note that what is important at non-zeroαis not the conditional tail probability (51) itself, but rather its difference with ψ(α), the tail probability for the independent case. Remembering that we took ψ(α) = √α, we therefore introduce the (generalized) excess lower tail probability

(52) e(α, p) :=λ

√

·₍_{α, p}₎₋√_α.

We first look at stationary processes. Taking a0 = 0.001, we have simulated 250 ARCH(1)’s of length 104 _{with Student} _t

4 innovations n and a1 ranging from 0 to 1.6 with step-size 0.1. We note that such an ARCH(1) is strictly stationary if a1 <

√

e ' 1.6489. For each Monte Carlo run we computed the empirical conditional probabilities (51) and the difference (52), and taken their averages over the different runs, to be denoted by ˆλ

√

·₍_{α, p}_{), ˆ}_e₍_{α, p}_{), as well as their standard}

deviation of, to assess the significance of the estimated values. The results, rounded off to two significant figures, are given in the following table.

a1 ˆλ √

·₍₀_.₀₅_,₁₎ _ˆ_e₍₀_.₀₅_,₁₎ _std _λˆ√·₍₀_.₀₁_,₁₎ _ˆ_e₍₀_.₀₁_,₁₎ _std

0 0.23 0 0.02 0.1 0 0.03

0.1 0.29 0.06 0.02 0.23 0.13 0.04

0.2 0.32 0.1 0.02 0.3 0.2 0.05

0.3 0.35 0.13 0.02 0.34 0.24 0.05

0.4 0.37 0.15 0.02 0.38 0.28 0.05

0.5 0.39 0.17 0.02 0.40 0.30 0.05

0.6 0.41 0.19 0.02 0.43 0.33 0.05

0.7 0.43 0.20 0.02 0.44 0.34 0.05

0.8 0.44 0.22 0.02 0.45 0.35 0.05

0.9 0.45 0.23 0.02 0.47 0.37 0.05

1.0 0.47 0.24 0.02 0.48 0.38 0.05

1.1 0.47 0.25 0.02 0.49 0.39 0.05

1.2 0.48 0.26 0.02 0.49 0.39 0.05

1.3 0.49 0.26 0.02 0.49 0.39 0.05

1.4 0.49 0.27 0.02 0.5 0.4 0.05

1.5 0.49 0.27 0.02 0.49 0.39 0.05

1.6 0.49 0.27 0.02 0.5 0.4 0.05

Table 7.1 - Stationary tail probabilities as function of a1.

The table shows for example that for a value of a1 = 0.5 there is a close to 40% change thatXn+1 will exceed its 95% VaR, given thatXn will have done so. This

conditional probability is about the same for the 99% VaR, and in that case about 30% bigger than what it would have been had succesive returns been independent. The effect becomes more pronounced with increasing values ofa1.

(22)

p ˆe(p)(std),a1= 0.5 eˆ(p)(std),a1= 1 ˆe(p)(std),a1= 1.5

1 0.31 (0.05) 0.38 (0.05) 0.40 (0.05)

2 0.20 (0.05) 0.35 (0.06) 0.39 (0.05)

3 0.13 (0.05) 0.30 (0.06) 0.38 (0.05)

4 0.07 (0.04) 0.26 (0.06) 0.37 (0.06)

5 0.05 (0.04) 0.22 (0.06) 0.36 (0.05)

6 0.03 (0.04) 0.19 (0.06) 0.35 (0.06)

7 0.02 (0.03) 0.16 (0.06) 0.34 (0.06)

8 0.01 (0.04) 0.14 (0.06) 0.32 (0.06)

9 0.01 (0.03) 0.11 (0.05) 0.31 (0.07)

10 0 (0.03) 0.10 (0.05) 0.29 (0.07)

Table 7.2 - Stationary excess tail probabilities as function of the lag p.

As was to be expected, the excess tail probabilities decrease with p. This decrease is slower the largera1is (and the more pronounced therefore the ARCH-effect): for a1 = 0.5, our estimate of e(0.01, p) is no longer significantly different from 0 from p = 6 onwards, while for a1 = 1.5, we found for example that ˆe(0.01, p = 20) = 0.16(0.08), which indicates a long persistence of return shocks at time 1.

We have repeated these computations for non-stationary ARCH(1)’s with a.s. initial valueX0=x0. Let

(53) λ

√

·;x0₍_α_;_{k, k}₊_p_{) :=}

Px0 Xn+p< qXn+p(

√

α)|Xn< qXn(α)

andex0₍_α_;_{k, k}₊_p_{) =}_λ

√

·;x0₍_α_;_{k, k}₊_p₎₋√_α_{. We will limit ourselves to}_k_{= 1, since}

this would seem to be the most relevant case for day-to-day risk-management: one would want to know the after-effects of a possible value-at-risk violation tomorrow. Note that for large values of k it is to be expected thatex0₍_α_;_{k, k}₊_p₎_'_e₍_{α, p}_),

at least for such values ofa1for which the ARCH(1) has a unique (in distribution) stationary solution.

We estimated ex0₍_α_{; 1}_,_{2) for} _a1_{’s between 0 and 2, with}_x0_{= 1 and}_a0_{= 0}_.₀₀₁

and ∼t4 as before. It turns out that ˆex0₍_α_{; 1}_,_{2) now increases very rapidly for}

a1’s between 0 and 0.1, and then remains practically constant:

a1 eˆx0=1₍₀_._{05; 1}_,₂₎ _std _ˆ_ex0=1₍₀_._{01; 1}_,₂₎ _std

0 0 0.02 0 0.03

0.01 0.06 0.02 0.12 0.04

0.02 0.11 0.02 0.2 0.05

0.03 0.14 0.02 0.23 0.05

0.04 0.16 0.02 0.25 0.05

0.05 0.17 0.02 0.26 0.05

0.06 0.18 0.02 0.28 0.04

0.07 0.18 0.02 0.27 0.04

0.08 0.19 0.02 0.28 0.05

0.09 0.19 0.02 0.28 0.05

0.1 0.19 0.02 0.28 0.05

Table 7.3 - Non-stationary excess tail probabilities as function of a1.

(23)

or ˆπx0₍₀_._{01; 1}_,_{2) = ˆ}_ex0₍₀_._{01; 1}_,_{2) + 0}_._{1 come as close to the}_α_→_{0-limit of 0.5 as}

in the stationary case, for the range of a1’s considered.

We next look at the dependence of e1₍₁_,_{1 +}_p_{) :=}_ex0=1₍_α_{; 1}_,_{1 +}_p_{) on the lag}

p, forα= 0.01 anda1= 0.05,0.1,0.2 and 0.5; as before, the number in brackets is the standard deviation of our Monte Carlo estimate.

p\ˆe1₍₁_,_{1 +}_p₎ _a

1= 0.05 a1= 0.1 a1= 0.2 a1= 0.5 1 0.27 (0.05) 0.28 (0.05) 0.29 (0.05) 0.29 (0.05) 2 0.1 (0.04) 0.17 (0.04) 0.21 (0.04) 0.22 (0.05) 3 0.01 (0.03) 0.07 (0.04) 0.14 (0.04) 0.18 (0.05) 4 0.01 (0.03) 0.02 (0.03) 0.07 (0.04) 0.15 (0.04)

Table 7.4 - Non-stationary excess tail probabilites as function of the lag.

The excess probabilities start at roughly the same level, but their decay with in-creasingpis slower the largera1is (fora1= 0.05 they are already indistinguishable from 0, within the precision of our MC calculations, fromp= 3 onwards).

We finally take a look at the dependence onx0. The next table gives an example, again withα= 0.01,k= 1 and lagp= 1, and for three values ofa1:

x0\eˆx0₍₁_,₂ _a1_{= 0}_.₀₅ _a1_{= 0}_.₁ _a1_{= 0}_.₁₅

0 0.07 (0.04) 0.11 (0.04) 0.14 (0.04) 0.1 0.1 (0.04) 0.16 (0.04) 0.18 (0.05) 0.2 0.14 (0.04) 0.21 (0.05) 0.23 (0.05) 0.3 0.18 (0.04) 0.24 (0.05) 0.26 (0.05) 0.4 0.20 (0.05) 0.25 (0.05) 0.27 (0.05) 0.5 0.22 (0.05) 0.26 (0.05) 0.27 (0.05) 0.6 0.24 (0.05) 0.28 (0.05) 0.28 (0.05) 0.7 0.25 (0.05) 0.28 (0.05) 0.28 (0.05) 0.8 0.25 (0.05) 0.28 (0.05) 0.28 (0.05) 0.9 0.26 (0.05) 0.28 (0.05) 0.28 (0.05) 1 0.27 (0.05) 0.28 (0.05) 0.29 (0.05)

Table 7.5 - Non-stationary excess tail probabilities as function of x0.

For small x0 there is a notable effect on the size of the excess probability, with initially an roughly linear increase which for large values of x0 flattens out.

Appendix A. Asymptotic expansions Mellin transforms

We recall some basic facts regarding the Mellin transform, and its relation with asymptotic expansions of integrals. Ifu: [0,∞)→Ris a locally integrable function

which is bounded near 0, and has polynomial decay|u(x)| ≤C|x|−k_{, then its Mellin}

transformu#₍_s_{), defined by}

(54) u#(s) =

Z ∞

0

u(x)x−s−1dx, Res <0.

The integral is absolutely convergent, and defines a holomorphic function on {s= σ+iη∈C:−k < σ <0}, which is bounded on each sub-strip{−k+ε < σ <−ε},

ε >0.

If u#(s) is integrable on the line (σ−i∞, σ+i∞), −k < σ < 0, then we can recuperateufromu# using Mellin’s inversion formula:

(55) u(x) = 1

2πi

Z σ+i∞

σ−i∞

(24)

Integrability and other decay-properties ofu#₍_s_{) will generally follow from} smooth-ness properties ofu, using integration by parts. A very useful property of the Mellin transform is that it turns convolution on the multiplicative group_R>0into a prod-uct: if

(56) u∗v(x) :=

Z ∞

0

u(y)v(y−1x)dy y ,

then

(57) (u∗v)#(s) =u#(s)v#(s), whenever both sides make sense.

The (classical) connection between the Mellin transform ofuand the asymptotics of u(x) forx→ ∞is based on the following observation: suppose that, under the above hypothesis, u#₍_s_), _s ₌ _σ₊_iη_{, extends to a meromorphic function on a} slightly larger strip {−k0 < σ < 0} (where k0 > k), and has a single simple pole in s=−k∈R<0. Then by shifting the integration path of (55) from Res=σto Res =−k−ε, ε < k0−k, and formally applying Cauchy’s residue theorem, we obtain that

u(x) = Ress=−ku#(s)x−s+ (2πi)−1

Z

Re(s)=−k−ε

u#(s)x−sds

= cx−k+O(x−k−ε),

with

c= Res s=−k

u#(s)= lim

s→−k(s+k)u

#₍_s₎_.

This can easily be made rigorous under the condition that u#(s) is integrable on the line Re (s) = −k−ε and that |u#(σ+iη)| → 0 as η → ±∞, uniformly for s[−k−ε, k+ε].Multiple poles can be handled similarly, but will introduce logaritmic terms: if u#(s) has a pole of orderpins=−k, then

Ress=−k

u#(s)x−s = lim

s→−k

dp−1

dsp−1 (s+k)

p_u#₍_s₎_xs (58)

=

p−1 X

ν=0

cν(logx)νx−k.

The coefficients cν can be computed by noting that if

u#(s) = a−p (s+k)p +

a−(p−1)

(s+k)p−1 +· · ·+ a−1

s+k +a0+a1(s+k) +· · ·, is the Laurent expansion of u#(s) around s = −k, then the residue in (58) is the coefficient of (s+k)−1 in the Laurent expansion of the product u#(s)xs = x−k_u#₍_s₎_e(s+k) logx_{. This gives}

(59) cν=

a₋(ν+1) ν! .

In case there is more than one pole, one simply adds the contributions the individual poles. than one pole, we simply add up the contribution of each of the poles to the asymptotics. In particular, ifu#₍_σ₊_iη_{) extends merorphically to}_{−_k₋_{N < σ <} 0}, with poles We summarize this discussion in the following lemma:

Lemma A.1. (i) Suppose that, for suitablep,cν ∈Candη >0, we have

(60) u(x) =

p−1 X

ν=1

(25)

Thenu# _{is of the form}

(61) u#(s) =

p

X

j=1

a−j(s+k)−j+h(s),

with h=h(s)holomorphic on−k−η <Re s <0.

(ii) Conversely, suppose that u#(s)is given by (61), with t→u#(σ+it) inte-grable, and |u#(σ+it)| → 0 as t → ±∞, uniformly for σ in compact subsets of

(−k−η,0). Then we have that, for all ε >0, u(x)has the truncated asymptotic expansion (60), with η replaced by η−ε.

Proof. (i) This follows from

u#(s) =

p−1 X

ν=0

Z ∞

1 cν

(logx)ν

xk+s+1dx+h(s),

with h(s) holomorphic on−k−α <Res <0, and

Z ∞

1

(logx)ν

xν+1 x

−s_dx_{= (}₋₁₎ν

_d

ds νZ ∞

1 dx xk+s+1 =

ν! s+k.

(ii) By shifting the integration path, as explained above. QED

References

[1] Acerbi, C., Tasche, D., 2002. On the coherence of expected shortfall, J. of Banking and Finance, 26, 1487-1503.

[2] Artzner, P., Delbaen, F., Eber , J. - M., Heath, D., 1999. Coherent measures of risk, Math. Finance, 9, 203-228.

[3] Bank for International Settlements website http://www.bis.org/bcbs/

[4] Basrak, B., Davis, R. A., Mikosch, T.,2002. Regular variation of GARCH processes, Stoch. Process. Appl., 99, 95-115.

[5] Berkowitz, J., O’Brien, J., 2002. How accurate are value-at-risk models at commercial banks? J. Finance LVII, 1093-1111.

[6] Bingham, N. H., Goldie, G. M., Teugels, J. L.,1987.Regular Variation, Cambridge University Press, Cambridge.

[7] Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity, J. of Econo-metrics 31, 307 - 325.

[8] Bollerslev, T., Engle, R. F., Nelson, D. B., 1994. ARCH models, in: Engle, R. F., McFadden, D., eds.,Handbook of Econometrics, vol. 4, North Holland, Amsterdam.

[9] Brummelhuis, R. , Gu´egan, D. , 2001. Multi-period conditional distribution functions for conditionally normal GARCH(1, 1)-models, Univ. of Reims preprint; to appear in J. Appl. Probability, 2005.

[10] Campbell, J. Y., Lo, A. W., MacKinlay, A. C. (1997).The Econometrics of Financial Mar-kets, Princeton University Press, Princeton, New Jersey.

[11] Embrechts, P., Kl¨uppelberg, C., Mikosch, T., 1997.Modelling Extremal Events for Insurance and Finance, Springer, Berlin.

[12] Embrechts, P., McNeil, A., Straumann, D.: Correlation and dependence in risk manage-ment: properties and pitfalls In: Risk Managemanage-ment: Value at Risk and Beyond, ed. M.A.H. Dempster, Cambridge University Press, Cambridge, pp. 176-223

[13] Engle, R. F., 1982. Autoregressive conditional heteroskedacity with estimates of the variance of United Kingdom inflation, Econometrica 50, 907 - 1007.

[14] Engle, R. F. (Editor) (2000).ARCH - Selected Readings, Advanced Texts in Econometrics, Oxford University Press.

[15] Frey, R., McNeil, A., 2000. Estimation of tail-related risk measures for heteroscedastic finan-cial time series : an extreme value approach, J. Empirical Finance, 7, 271 - 300.

[16] Goldie, C. M. , 1991. Implicit renewal theory and tails of solutions of random equations, Ann. Appl. Prob. 1, 126 - 166.

(26)

[18] Kesten, H., 1973. Random difference equations and renewal theory for products of random variables, Acta Math. 131, 207-248.

[19] Mandelbrot, B. (1963). The variation of certain speculative prices, J. Business, 36, 394 - 419. [20] Nelson, D. B., 1990. Stationarity and persistence in the GARCH(1, 1) model, Econometric

Theory, 6, 318-334.

[21] Rockafellar, A. T., Uryasev, S. (2002). Conditional value-at-risk for general loss distributions, J. Banking Finance, 26, 1443-1471.