BIROn - Birkbeck Institutional Research Online
Brummelhuis, R. (2006) Auto-dependence structure of arch-models: tail
dependence coefficients. Working Paper. Birkbeck, University of London,
London, UK.
ISSN 1745-8587
Birkbeck Working Papers in Economics & Finance
School of Economics, Mathematics and Statistics
BWPEF 0605
Auto-Dependence Structure of
Arch-Models: Tail Dependence Coefficients
Raymond Brummelhuis
May 2006
Abstract. We study autodependence in ARCH-models by computing the
auto-lower tail dependence coefficients and certain generalizations thereof, for both stationary and non-stationary time series. This study is inspired by financial risk-management issues, and our results are relevant for estimating probabilities of consecutive value-at-risk violations.
1. Introduction
One of the striking aspects of empirical financial return series is volatility clustering: although successive returns are roughly uncorrelated, in the well-known phrase of Mandelbrot [19], ‘large changes tend to be followed by large changes - of either sign - and small changes tend to be followed by small changes’. Auto-dependence in time series is mostly studied in terms of auto-correlation, and for financial returns the squared returns typically exhibit an auto-correlation pattern which is reminiscent of an ARMA-process. A popular class of financial econometric models that capture this kind of behavior is that of the GARCH-models introduced by Engle [13] and Bollerslev [7]; see also [8], [14].
Although auto-correlation of squared returns provides a satisfactory qualitative explanation for volatility clustering, and for Mandelbrot’s observation cited above, it does not always easily provide answers to questions related to the multivariate distribution functions associated with the process, such as the behavior of conditional quantiles at lags greater than 1 for a GARCH(1, 1) - cf. [9]. Motivated by recent developments in risk management (see Embrechts, McNeil and Straumann [12]), we propose to study auto-dependence in time series from the point of view of alternative, copula-based, dependence measures like rank correlations, concordance measures or, in this paper, tail dependence coefficients. Indeed, in risk management linear correlation is the natural dependence measure for multivariate normal, or more generally elliptic, linear models, but is less suitable for non-linear models or for more general classes of multivariate distributions. In the same way we argue that autocorrelations, while adequate for linear processes, should, for non-linear processes like the GARCH, be supplemented by other measures of auto-dependence. As a further motivation to consider alternative auto-dependence measures in time series, linear autocorrelations are not always defined: in the case of a stationary GARCH they exist only for a limited parameter range, and not at all if the i.i.d. innovations which drive the process do not possess a finite variance. Similarly, if one is interested in the squared process, the innovations should at least have a finite fourth moment. (This does not preclude studying sample autocorrelation functions of such processes; cf. [4] and its references.)
As already mentioned, the particular dependence measure we will study in this paper is the lower tail dependence coefficient, and certain generalizations thereof. The lower tail dependence coefficient $\lambda_{X|Y}$ of two random variables $X$ and $Y$ is the limit, for $\alpha$ tending to 0, of the probability that $X$ will be smaller than the $\alpha$-th quantile of $X$, given that $Y$ will be below its own $\alpha$-th quantile; cf. [12]. As we will see, this measure will pick up non-trivial auto-dependence in a stationary ARCH, but, somewhat surprisingly, not in a non-stationary one. For this reason we will introduce certain generalizations, which we will call generalized lower tail dependence coefficients, a typical example of which would be the limit, for $\alpha$ tending to 0, of the probability that $X$ will be smaller than its $\sqrt{\alpha}$-th quantile given that $Y$ is smaller than its $\alpha$-th quantile. Here, the square root can be replaced by more general functions of $\alpha$ tending to 0 at a suitably slower rate than $\alpha$ itself.
The reasons for considering these particular dependence measures instead of others like Kendall’s $\tau$ or Spearman’s $\rho$ are, firstly, that since these are asymptotic quantities, they are more amenable to a complete theoretical analysis: in terms of copulas, to determine $\lambda_{X|Y}$ and its generalizations, we only need to understand the behavior of the copula of $X$ and $Y$ near the lower left corner point of its domain of definition, while $\tau$ and $\rho$ require a global knowledge of the copula. Secondly, and from the point of view of applications more importantly, the lower tail dependence coefficient has a direct relevance for financial risk management, since it can be interpreted in terms of value-at-risk: if $X = (X_n)_n$ is a stationary time series, representing the daily returns of an investment portfolio, the (unconditional) daily value-at-risk at a confidence level of $1 - \alpha$ is simply the absolute value of the lower $\alpha$-th quantile of $X_n$. The financial interpretation is that with a probability of $1 - \alpha$, daily (percentage) losses will be less than this value-at-risk. Under the Basle rules, for a financial institution to suffer losses on its market portfolio exceeding the value-at-risk at some specified confidence level has regulatory consequences; cf. [3]. The lower tail dependence coefficient $\lambda_{X_{n+1}|X_n}$ will provide information on the probability of violating one’s value-at-risk limit on two consecutive days, and similarly for other time lags.
Value-at-risk, as a risk measure, has been criticized for two closely related reasons: it does not give an estimate of the size of the loss when it occurs, and, when considering more than one risky asset or portfolio of risky assets, it can fail to be sub-additive. In the terminology of [2], it is not coherent. (It is coherent when restricted to portfolios made up of jointly elliptically distributed assets, cf. [12].) The failure of sub-additivity is less relevant if $(X_n)_n$ models a financial institution’s entire market portfolio. We note in this respect that Berkowitz and O’Brien [5] report that a simple univariate ARMA + GARCH-model of the total portfolio return is at least as effective, if not more, for value-at-risk forecasting than the detailed structural value-at-risk models commonly used by banks. As regards the loss size, this can be estimated by the expected shortfall, which is basically the expectation of $X_n$ given that it is smaller than its $\alpha$-th quantile. If properly defined for non-continuous distributions, expected shortfall can be shown to be coherent, cf. [1], [21]. We will not, however, consider expected shortfall here, but limit attention to quantiles and value-at-risk, this being at the moment the industry standard.
To investigate the relevance of lower tail dependence as an auto-dependence measure for non-linear time series, we carry out in this paper a detailed study of the particular, but representative, example of an ARCH(1)-model, the simplest of the GARCH-models. Our results are expected to extend to more general GARCH-processes, and in particular to GARCH(1, 1)’s. If $(X_n)_{n\ge 0}$ is an ARCH(1), we will compute the lower tail dependence coefficient $\lambda_{X_{n+k}|X_n}$ both for stationary processes (taking for $X_0$ a stationary solution of the ARCH(1)-equation), and for non-stationary ones having an a.s. initial value $X_0 = x_0 \in \mathbb{R}$. The latter are of interest for conditional value-at-risk estimation given today’s return; cf. [15]. Under the assumption that $\varepsilon_n$ is symmetric we will give, in theorem 2.2 below, an explicit expression for $\lambda_{X_{n+k}|X_n}$ in the stationary case, which involves only the distribution function of $\varepsilon_n$, the auto-regressive ARCH(1)-parameter $a_1$ and the Pareto exponent of the stationary distribution (which can be computed from the former two), but no other parameters. In particular, more detailed knowledge of the stationary distribution itself is not required.
The non-stationary case is more delicate, and requires additional hypotheses on $\varepsilon_n$. We will suppose that its density $f$ and the first two derivatives of $f$ have tails with a Pareto-type inverse power decay. This class of $f$ was singled out because of its importance in empirical modelling: see e.g. [15]. Somewhat surprisingly, the tail dependence coefficients $\lambda_{X_{n+k}|X_n}$ now turn out to be 0, and to still be able to detect a non-trivial auto-dependence in the tails, we have to use the generalized coefficient mentioned above (see definition 2.5 below). The generalized lower tail dependence coefficients all turn out to be equal to 1/2, both for stationary and non-stationary processes. Their value is in particular independent of the ARCH(1)-parameters, which contrasts with the traditional picture provided by the linear auto-correlations of the squared process.
The paper is organized as follows: in section 2 we recall a number of basic definitions and facts, define the generalized tail dependence coefficients and state our main results, which are theorems 2.2 and 2.7 for stationary ARCH(1)’s, and theorems 2.4 and 2.6 for non-stationary ones. In section 3 we prove the results for stationary ARCH(1)’s. Sections 4 to 6 are devoted to the technically more complicated non-stationary case. We first determine, in section 4, the asymptotic behavior of the probability density function of $X_n$ given an a.s. initial value for $X_0$. The main technical tool in this section is the Mellin transform. The asymptotics of section 4 are first used in section 5 to derive the asymptotics of the quantile function, and then, in section 6, the asymptotic behavior of the tail dependence function. Using these, our results for non-stationary ARCH(1)’s quickly follow. Finally, in section 7 we use Monte Carlo simulations to study lower tail dependence at small but non-zero values of the confidence parameter $\alpha$. An appendix summarizes, for the convenience of the reader, the relationship between Mellin transform and asymptotic expansions needed in sections 4 and 6.
2. Main Results
All random variables will be defined on some common probability space $(\Omega, \mathcal{F}, P)$. Recall that the left-inverse $F^{\leftarrow}$ of an increasing function $F : \mathbb{R} \to \mathbb{R}$ is defined by $F^{\leftarrow}(\alpha) = \inf\{x : F(x) \ge \alpha\}$ for $\alpha \in \mathbb{R}$, cf. [6], [11]. If $F_X(x) = P(X \le x)$, $x \in \mathbb{R}$, is the cumulative distribution function of a random variable $X$, then the $\alpha$-th quantile $q_X(\alpha)$ can be defined as the left-inverse of $F_X$; explicitly,
$$q_X(\alpha) = F_X^{\leftarrow}(\alpha) = \inf\{x \in \mathbb{R} : P(X \le x) \ge \alpha\}, \qquad \alpha \in [0,1].$$
All probability distributions we will encounter in this paper, or the functions by which we will approximate them, will be continuous and strictly increasing in the regions of interest, so that the left-inverse reduces locally to the ordinary inverse.
Let $(X_n)_{n\ge 0}$ be an ARCH(1)-process, defined by
$$X_{n+1} = \sqrt{a_0 + a_1 X_n^2}\;\varepsilon_n, \tag{1}$$
where $a_0, a_1 > 0$, and $(\varepsilon_n)_{n\ge 1}$ is an i.i.d. sequence of random variables with mean 0. We will mostly suppose, for simplicity, that $\varepsilon_n$ is symmetric. We will consider both stationary and non-stationary ARCH(1)’s. The stationary case amounts to taking $X_0 =_d X_\infty$, where $X_\infty$ is the stationary solution of (1). Stationary solutions exist if $a_0 > 0$ and $E \log(a_1 \varepsilon_1^2) < 0$, and are known to have a Pareto-tailed probability distribution, cf. [18], [17], [16], and [11] for a textbook treatment of the case of normally distributed $\varepsilon_n$. Suppose that $\varepsilon =_d \varepsilon_n$ has a positive density such that for some $h_0 \in (0, \infty]$, $E(|\varepsilon|^h) < \infty$ for all $h < h_0$ and $E(|\varepsilon|^{h_0}) = \infty$, and let $\kappa_\infty = 2k_\infty$, where $k_\infty > 0$ is the unique positive solution to
$$E\bigl((a_1 \varepsilon^2)^{k_\infty}\bigr) = 1. \tag{2}$$
Then the probability distribution of the stationary solution $X_\infty$ has a Pareto-type inverse power decay of the tails: there exists a constant $c_\infty > 0$ such that
$$F_{X_\infty}(x) \simeq \frac{c_\infty}{|x|^{\kappa_\infty}}, \qquad x \to -\infty, \tag{3}$$
and similarly for the right tail $\overline{F}_{X_\infty}(x) = 1 - F_{X_\infty}(x)$. This result was recently generalized to arbitrary GARCH(p, q) in [4].
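As an illustration of how the Pareto exponent in (2) can be obtained in practice, the following sketch solves (2) numerically under the assumption, made here only for the example, that the innovations are standard normal. In that case $E|\varepsilon|^{2k} = 2^k \Gamma(k + 1/2)/\sqrt{\pi}$, so that (2) becomes a one-dimensional root-finding problem. The function names `log_moment` and `pareto_exponent` are hypothetical helpers, not part of the paper.

```python
import math


def log_moment(k: float, a1: float) -> float:
    """Return log E[(a1 * eps**2)**k] for a standard normal eps (an assumption).

    Uses the Gaussian moment formula E|eps|**(2k) = 2**k * Gamma(k + 1/2) / sqrt(pi).
    """
    return k * math.log(2.0 * a1) + math.lgamma(k + 0.5) - 0.5 * math.log(math.pi)


def pareto_exponent(a1: float, iterations: int = 200) -> float:
    """Return kappa_inf = 2 * k_inf, where k_inf > 0 solves equation (2)."""
    if a1 <= 0.0:
        raise ValueError("a1 must be positive")
    hi = 1e-3
    # k -> log_moment(k, a1) is convex and vanishes at k = 0. It is negative just
    # to the right of 0 when E log(a1 * eps**2) < 0; this is only checked at 1e-3.
    if log_moment(hi, a1) >= 0.0:
        raise ValueError("stationarity condition not detected for this a1")
    while log_moment(hi, a1) < 0.0:  # bracket the positive root by doubling
        hi *= 2.0
    lo = hi / 2.0
    for _ in range(iterations):  # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if log_moment(mid, a1) < 0.0:
            lo = mid
        else:
            hi = mid
    return 2.0 * (0.5 * (lo + hi))
```

For example, with normal innovations and $a_1 = 1$ equation (2) is solved by $k_\infty = 1$ because $E\varepsilon^2 = 1$, so `pareto_exponent(1.0)` returns a value close to 2; smaller values of $a_1$ give larger exponents, that is, thinner tails.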
We next turn to tail dependence.
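Definition 2.1. (cf. [12]) Let $(X, Y)$ be a random vector. The (lower) tail dependence function of $X$ on $Y$ is the function $\lambda_{X|Y} : [0,1] \to [0,1]$ defined by
$$\lambda_{X|Y}(\alpha) := P\bigl(X \le q_X(\alpha) \mid Y \le q_Y(\alpha)\bigr), \qquad \alpha \in [0,1]. \tag{4}$$
The coefficient of lower tail dependence of $X$ on $Y$ is defined by
$$\lambda_{X|Y} = \lim_{\alpha \to 0} \lambda_{X|Y}(\alpha), \tag{5}$$
provided the limit exists.
If the limit (5) does not exist, one can consider instead the lim sup and the lim inf, and interpret these as an upper, respectively lower, bound on the dependence of $X$ on $Y$ in the extreme lower tail. If $X$ and $Y$ are independent, then $\lambda_{X|Y}(\alpha) = \alpha$ and $\lambda_{X|Y} = 0$. If $X$ and $Y$ are co-monotonic (meaning that one is an increasing function of the other), then $\lambda_{X|Y}(\alpha) = \lambda_{X|Y} = 1$. If they are counter-monotonic (one is a decreasing function of the other) and continuously distributed, then $\lambda_{X|Y}(\alpha) = 0$ for $\alpha < 1/2$, so that $\lambda_{X|Y} = 0$: being a conditional probability, $\lambda_{X|Y}(\alpha)$ always lies in $[0,1]$.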
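The tail dependence function (4) has a direct sample analogue, which may help to fix ideas. The sketch below is a minimal empirical estimator, assuming paired observations and using the left-inverse of the empirical distribution function as quantile; the names `empirical_quantile` and `tail_dependence` are hypothetical, and the Gaussian test data are only an illustration.

```python
import math
import random


def empirical_quantile(sample, alpha):
    """Left-inverse of the empirical cdf: smallest order statistic x with F_hat(x) >= alpha."""
    ordered = sorted(sample)
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[index]


def tail_dependence(xs, ys, alpha):
    """Sample version of (4): P(X <= q_X(alpha) | Y <= q_Y(alpha))."""
    qx = empirical_quantile(xs, alpha)
    qy = empirical_quantile(ys, alpha)
    conditioned = [x for x, y in zip(xs, ys) if y <= qy]
    return sum(1 for x in conditioned if x <= qx) / len(conditioned)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
ys = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
independent = tail_dependence(xs, ys, 0.05)  # close to alpha for independent samples
comonotone = tail_dependence(xs, xs, 0.05)   # equals 1 when Y = X
```

For small $\alpha$ the estimate rests on roughly $\alpha N$ observations only, so it becomes noisy quickly; this is one reason why the asymptotic results below are of interest.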
In the case of continuously distributed $X$ and $Y$, $\lambda_{X|Y}(\alpha)$ will only depend on the copula $C_{X,Y}$ of $X$ and $Y$, which in this case can be characterized as the unique function $C_{X,Y} : [0,1] \times [0,1] \to [0,1]$ such that the joint probability distribution $F_{X,Y}$ can be written as
$$F_{X,Y}(x, y) = C_{X,Y}\bigl(F_X(x), F_Y(y)\bigr).$$
Since for continuous increasing $F$, $F \circ F^{\leftarrow} = \mathrm{id}$, it follows that $\lambda_{X|Y}(\alpha) = \alpha^{-1} C_{X,Y}(\alpha, \alpha)$. The lower tail dependence coefficient $\lambda_{X|Y}$ can therefore be interpreted as the directional derivative, in the direction of the diagonal, of $C_{X,Y}$ in $(0,0)$.
In financial modelling terms, thinking of $X$ and $Y$ as percentage returns of financial assets over some fixed common time period, $-q_X(\alpha)$ is interpreted as the value-at-risk, $\mathrm{VaR}_\alpha$, with confidence $1 - \alpha$ over the time period under consideration (to be multiplied by the amount of capital invested, to be precise). In practice, $\alpha$ will be small, $\alpha = 0.05$ or $0.01$, and $q_X(\alpha)$ will be negative; by convention, losses are recorded as non-negative numbers, whence the minus sign. With this terminology $\lambda_{X|Y}(\alpha)$ is the probability of the losses of $X$ exceeding $\mathrm{VaR}_\alpha(X)$, given that the losses of $Y$ already exceed $\mathrm{VaR}_\alpha(Y)$. We refer to [12] for a detailed discussion of copulas and dependence measures, and their applications to risk management.
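The identity $\lambda_{X|Y}(\alpha) = \alpha^{-1} C_{X,Y}(\alpha, \alpha)$ can be checked on copulas with a closed form. The sketch below does this for the independence copula, the co-monotonic copula and, purely as an outside illustration that is not used anywhere in this paper, the Clayton copula, whose lower tail dependence coefficient is known to be $2^{-1/\theta}$. All function names are hypothetical.

```python
def independence_copula(u, v):
    """Copula of independent X and Y."""
    return u * v


def comonotone_copula(u, v):
    """Copula of co-monotonic X and Y."""
    return min(u, v)


def clayton_copula(u, v, theta):
    """Clayton copula with parameter theta > 0 (illustration only, not from the paper)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def tail_dependence_function(copula, alpha):
    """lambda(alpha) = C(alpha, alpha) / alpha for continuous marginals."""
    return copula(alpha, alpha) / alpha


alpha = 1e-6
lam_indep = tail_dependence_function(independence_copula, alpha)    # equals alpha
lam_comon = tail_dependence_function(comonotone_copula, alpha)      # equals 1
lam_clayton = tail_dependence_function(lambda u, v: clayton_copula(u, v, 2.0), alpha)
```

Only the values of the copula on the diagonal near $(0,0)$ enter, which is the sense in which the tail dependence coefficient is a local quantity.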
Theorem 2.2. Let $(X_n)_{n\ge 0}$ be a strictly stationary ARCH(1) having symmetrically distributed innovations $\varepsilon_n =_d \varepsilon$ with density $f$ such that the stationary distribution has a Pareto-type tail decay (3). Let $\kappa_\infty$ be defined by (2) and let $F$ be the cumulative distribution function of $\varepsilon =_d \varepsilon_n$. Then the coefficient of lower tail dependence of $X_{n+p}$ on $X_n$ is given by
$$\lambda_{X_{n+p}|X_n} = \kappa_\infty \int_{-\infty}^{-1} \int_{\mathbb{R}^{p-1}} F\!\left(-\frac{1}{a_1^{p/2}\,|x_1| \cdots |x_{p-1}|\,|z|}\right) \prod_{j=1}^{p-1} f(x_j)\, |z|^{-\kappa_\infty - 1}\, d\mathbf{x}\, dz,$$
where $d\mathbf{x} = dx_1 \cdots dx_{p-1}$. In particular, $\lambda_{X_{n+p}|X_n} \ne 0$.
The integral on the right is convergent, since $F$ is bounded by 1 and $\kappa_\infty > 0$. It is easily verified that its value is between 0 and 1. Note that $\lambda_{X_{n+p}|X_n}$ only depends on $F$, $a_1$ and $\kappa_\infty$. In particular, it does not depend on any further characteristics of the, in general unknown, stationary distribution $F_{X_\infty}$, not even on the constant $c_\infty$ in (3).
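For lag $p = 1$ the formula of theorem 2.2 involves no integration over $\mathbf{x}$, and the substitution $t = |z|^{-\kappa_\infty}$ turns it into $\lambda_{X_{n+1}|X_n} = \int_0^1 F(-t^{1/\kappa_\infty}/\sqrt{a_1})\,dt$. The sketch below evaluates this one-dimensional integral with a midpoint rule. It assumes standard normal innovations for the default `cdf`, and takes $\kappa_\infty$ as an input that has to be computed from (2) beforehand; the function names are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function (assumed innovation law for this example)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def lag1_tail_dependence(a1: float, kappa: float, cdf=normal_cdf, steps: int = 100_000) -> float:
    """Midpoint-rule value of the integral of F(-t**(1/kappa) / sqrt(a1)) over t in [0, 1].

    This is the p = 1 case of theorem 2.2 after the substitution t = |z|**(-kappa).
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += cdf(-(t ** (1.0 / kappa)) / math.sqrt(a1))
    return total * h
```

As a check, for normal innovations and $a_1 = 1$ one has $\kappa_\infty = 2$, and the integral can be computed by hand to be $1/2 - e^{-1/2}/\sqrt{2\pi} \approx 0.258$, which `lag1_tail_dependence(1.0, 2.0)` reproduces.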
We next turn to non-stationary ARCH(1)’s starting with an a.s. initial condition $X_0 = x_0 \in \mathbb{R}$. Let $P^{x_0}$ be the conditional probability $P^{x_0} = P(\,\cdot \mid X_0 = x_0)$ and let $F_X^{x_0}$, $q_X^{x_0}(\alpha)$ and $\lambda^{x_0}_{X|Y}$ denote, respectively, the cumulative distribution function, quantile and lower tail dependence coefficient with respect to $P^{x_0}$. In the non-stationary case we will suppose that $\varepsilon =_d \varepsilon_n$ has a twice differentiable symmetric probability density $f$ whose derivatives up to order 2 satisfy a Pareto-type decay condition:
Condition 2.3. There exist constants $\kappa, c > 0$ such that
$$f(x) = \frac{c}{|x|^{\kappa+1}} + \rho(x), \qquad x \ne 0, \tag{6}$$
with remainder $\rho$ satisfying
$$|\rho|,\ \left|x \frac{d}{dx}\rho\right|,\ \left|x^2 \frac{d^2}{dx^2}\rho\right| \;\le\; \frac{C}{|x|^{\kappa+1+\eta_0}}, \qquad |x| > 0, \tag{7}$$
for suitable constants $C, \eta_0 > 0$.
We use $\kappa + 1$ as Pareto exponent in (6) since this corresponds to an exponent of $\kappa$ for $F$. The hypotheses on $f$ cover cases like the Student distributions and the Pareto-type distributions originating from extreme value theory; cf. [12], [15]. It follows from (6) and (7) that the $j$-th derivative satisfies $f^{(j)}(x) = c_j |x|^{-\kappa-j-1} + O\bigl(|x|^{-\kappa-j-1-\eta_0}\bigr)$, for $j = 0, 1, 2$. We then have:
Theorem 2.4. Suppose the pdf $f$ of $\varepsilon_n$ is symmetric and satisfies condition 2.3. Then the conditional tail dependence coefficients all vanish: $\lambda^{x_0}_{X_{n+k}|X_n} = 0$ for all $x_0 \in \mathbb{R}$, $n, k \ge 1$.
This is a somewhat unexpected result, whose apparent contradiction with theorem 2.2 as $n \to \infty$ will be clarified in sections 5 and 6 below, cf. remark 6.2. There does not seem to be anything special about the $\varepsilon_n$’s satisfying condition 2.3: using the results of [9], theorem 2.4 can be shown to be also true for GARCH(1, 1)’s with normal innovations.
Definition 2.5. Let $(X, Y)$ be a real random vector, and let $\psi : (0,1] \to (0,1]$ be such that $\lim_{\alpha \to 0} \psi(\alpha) = 0$. We define the generalized lower tail dependence coefficient of $X$ on $Y$, $\lambda^{\psi}_{X|Y}$, by
$$\lambda^{\psi}_{X|Y} := \lim_{\alpha \to 0} P\bigl(X \le q_X(\psi(\alpha)) \mid Y \le q_Y(\alpha)\bigr), \tag{8}$$
assuming the limit exists.
Only the behavior of $\psi$ at 0 matters and $\psi$ only needs to be defined on some small sub-interval $(0, \delta)$, $\delta > 0$. It is easily seen that for continuous $F_X$ and $F_Y$, $\lambda^{\psi}_{X|Y}$ again only depends on the copula $C_{X,Y}$, and that
$$\lambda^{\psi}_{X|Y} = \lim_{\alpha \to 0} \frac{1}{\alpha}\, C_{X,Y}(\psi(\alpha), \alpha),$$
the directional derivative of $C_{X,Y}$ in $(0,0)$ along the curve $\alpha \to (\psi(\alpha), \alpha)$. We specifically allow curves which are tangential to one of the axes of the copula’s domain of definition, $[0,1] \times [0,1]$. If $X$ and $Y$ are independent, then clearly $P(X < q_X(\psi(\alpha)) \mid Y < q_Y(\alpha)) = \psi(\alpha)$, and $\lambda^{\psi}_{X|Y} = 0$. Embrechts, McNeil and Straumann [12] call two random variables $X$ and $Y$ asymptotically independent in the lower tail if $\lambda_{X|Y} = 0$. More generally, one might call $X$ and $Y$ strongly asymptotically independent (in the lower tail) if $\lambda^{\psi}_{X|Y} = 0$ for all positive $\psi$ on $(0,1]$ for which $\lim_{\alpha \to 0} \psi(\alpha) = 0$. Theorem 2.4 showed that the values $X_n$, $X_{n+p}$ of a conditional ARCH(1) with a.s. initial condition are asymptotically independent in the lower tail. The next theorem shows that they are far from being strongly asymptotically independent. Let $\lambda^{x_0,\psi}_{X|Y}$ be the generalized lower tail dependence coefficient (8) computed with respect to the measure $P^{x_0}$.
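Theorem 2.6. Under the hypotheses of theorem 2.4, we have that
$$\lambda^{x_0,\psi}_{X_{n+p}|X_n} = \frac{1}{2}, \tag{9}$$
for all functions $\psi : (0,1] \to (0,1]$ satisfying
$$\lim_{\alpha \to 0} \psi(\alpha) = \lim_{\alpha \to 0} \frac{\alpha \bigl(\log \psi(\alpha)^{-1}\bigr)^{p-1}}{\psi(\alpha)} = 0. \tag{10}$$
There is an analogue of theorem 2.6 for stationary ARCH(1)’s:
Theorem 2.7. Under the conditions of theorem 2.2, if $(X_n)_{n\ge 0}$ is a stationary ARCH(1), then $\lambda^{\psi}_{X_{n+p}|X_n} = \frac{1}{2}$, for all $\psi = \psi(\alpha)$ satisfying
$$\lim_{\alpha \to 0} \psi(\alpha) = \lim_{\alpha \to 0} \frac{\alpha}{\psi(\alpha)} = 0.$$
Typical examples of functions $\psi(\alpha)$ satisfying the conditions of theorems 2.6 and 2.7 are $\psi(\alpha) = \alpha^{1-\varepsilon}$, for any $\varepsilon \in (0,1)$.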
It is instructive to compare these results with the picture provided by classical correlation. A straightforward computation shows that, if $(X_n)_n$ is a stationary ARCH(1) with $E(\varepsilon^2) = 1$ such that $a_1^2 E(\varepsilon^4) < 1$ (which implies existence of fourth moments), then the linear correlation of $X_n^2$ and $X_{n+p}^2$ exists and decays as $a_1^p$ if $a_1 \to 0$. To quantify auto-dependence and heteroscedasticity in an ARCH if fourth moments do not exist one has to look for alternatives to linear correlation. The lower tail dependence coefficient $\lambda_{X_{n+p}|X_n}$ will always exist, and also converges to 0 as $a_1 \to 0$, by theorem 2.2 and the dominated convergence theorem - the exact order of convergence is less obvious, though. The generalized tail dependence coefficients $\lambda^{\psi}_{X_{n+p}|X_n}$, by contrast, are equal to 1/2 for every $a_1 > 0$, by theorem 2.7.
It would be interesting to determine the precise behavior of $\lambda_{X_{n+p}|X_n}$ for a stationary ARCH(1) as either $a_1 \to 0$ or $p \to \infty$.
Theorems 2.6 and 2.7 are asymptotic results for $\alpha$ tending to 0, and it is reasonable to ask how indicative they are for what happens at a small but non-zero $\alpha$. Taking for example $\alpha = 0.01$ and $\psi(\alpha) = \sqrt{\alpha}$, how close to 50% is the probability that $X_{n+1}$ will be below its 10% quantile, given that $X_n$ will have been below its 1% quantile? In risk management terms, what is the probability that a portfolio will, on day $n + 1$, incur a loss exceeding the 10% value-at-risk if on day $n$ it will have suffered a loss greater than its 1% VaR? In section 7 we briefly investigate this question using Monte Carlo simulation. As we will see, depending on the value of $a_1$, this generalized lower tail dependence at say $\alpha = 0.01$ can be very close to its asymptotic value of 0.5, and even for values of $a_1$ as small as 0.1 the difference with the i.i.d. case (corresponding to an $a_1$ equal to 0) is significant.
Another interesting question for future research is whether empirical time series of financial returns behave as in theorems 2.2, 2.6 and 2.7. We note in this respect that these theorems provide an attractive quantitative reformulation of Mandelbrot’s observation quoted in the first sentence of the introduction.
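The finite-$\alpha$ question raised above can be explored with a few lines of simulation. The sketch below is not the Monte Carlo study of section 7; it is a minimal illustration that assumes standard normal innovations, approximates the stationary law by one long path after a burn-in period, and estimates $P(X_{n+1} \le q(\psi(\alpha)) \mid X_n \le q(\alpha))$ from empirical quantiles. The function names and parameter values are hypothetical.

```python
import math
import random


def simulate_arch1(a0, a1, n, burn_in=1000, seed=0):
    """Simulate X_{k+1} = sqrt(a0 + a1 X_k^2) * eps with standard normal eps (an assumption)."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for i in range(n + burn_in):
        x = math.sqrt(a0 + a1 * x * x) * rng.gauss(0.0, 1.0)
        if i >= burn_in:  # discard the transient so the path is close to stationary
            path.append(x)
    return path


def empirical_quantile(sample, alpha):
    """Left-inverse of the empirical cdf."""
    ordered = sorted(sample)
    return ordered[max(0, math.ceil(alpha * len(ordered)) - 1)]


def generalized_tail_dependence(path, alpha, psi, lag=1):
    """Estimate P(X_{n+lag} <= q(psi(alpha)) | X_n <= q(alpha)) along one path."""
    q_cond = empirical_quantile(path, alpha)
    q_target = empirical_quantile(path, psi(alpha))
    later = [path[i + lag] for i in range(len(path) - lag) if path[i] <= q_cond]
    return sum(1 for value in later if value <= q_target) / len(later)
```

A call such as `generalized_tail_dependence(simulate_arch1(0.1, 0.9, 200_000), 0.01, math.sqrt)` estimates the probability discussed in the text for $\psi(\alpha) = \sqrt{\alpha}$; with $a_1 = 0$ the same estimator should return a value near $\psi(\alpha) = 0.1$, the i.i.d. benchmark.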
3. Lower tail dependence for stationary ARCH(1)
In this section we will prove theorems 2.2, 2.7 and 2.6. Let us put
$$F_{X_{n+p}}(x \mid X_n = y) = P(X_{n+p} \le x \mid X_n = y).$$
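Lemma 3.1. Given $y \in \mathbb{R}$ and $n, p \in \mathbb{N}$, $p \ge 1$, define functions $s_p(y; \cdot) : \mathbb{R}^{p-1} \to \mathbb{R}_{\ge 0}$ by
$$s_p(y; \mathbf{x}) := \bigl(a_0 + a_0 a_1 x_{p-1}^2 + a_0 a_1^2 x_{p-2}^2 x_{p-1}^2 + \cdots + a_0 a_1^{p-1} x_1^2 \cdots x_{p-1}^2 + a_1^p x_1^2 \cdots x_{p-1}^2 y^2\bigr)^{1/2},$$
where $\mathbf{x} = (x_1, \cdots, x_{p-1})$. Then for all $n \ge 0$ and $p \ge 1$,
$$F_{X_{n+p}}(x \mid X_n = y) = \int_{\mathbb{R}^{p-1}} F\!\left(\frac{x}{s_p(y; \mathbf{x})}\right) \prod_{j=1}^{p-1} f(x_j)\, d\mathbf{x}. \tag{11}$$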
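Proof. Straightforward induction on $p$, observing that
$$P(X_{n+1} \le x \mid X_n = y) = F\!\left(\frac{x}{\sqrt{a_0 + a_1 y^2}}\right),$$
and using
$$P(X_{n+p+1} \le x \mid X_n = y) = \int_{\mathbb{R}} F\!\left(\frac{x}{\sqrt{a_0 + a_1 x_p^2}}\right) f_{X_{n+p}}(x_p \mid X_n = y)\, dx_p,$$
for the induction step. Here, $f_{X_{n+p}}(x \mid X_n = y)$ stands for the conditional density, $\frac{d}{dx} F_{X_{n+p}}(x \mid X_n = y)$ (whose existence also follows by induction). QED
Corollary 3.2. Let $x < 0$ be arbitrary. Then the function $y \to F_{X_{n+p}}(x \mid X_n = y)$ is decreasing on $\{y < 0\}$.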
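Proof. Fix $x < 0$, and let $y \le y' < 0$. Then $y^2 \ge y'^2$ and therefore $0 \le s_p(y'; \mathbf{x}) \le s_p(y; \mathbf{x})$, for all $\mathbf{x} \in \mathbb{R}^{p-1}$. Since $x < 0$, it follows that $s_p(y'; \mathbf{x})^{-1} x \le s_p(y; \mathbf{x})^{-1} x$ and hence
$$F\!\left(\frac{x}{s_p(y'; \mathbf{x})}\right) \le F\!\left(\frac{x}{s_p(y; \mathbf{x})}\right).$$
The corollary now follows from (11). QED
One can also give a proof by straightforward differentiation. Let $q_n(\alpha)$ be the $\alpha$-th quantile of $F_{X_n}$.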
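The function $s_p(y; \mathbf{x})$ of lemma 3.1 is simply the conditional scale of $X_{n+p}$ given $X_n = y$ and given that the intermediate innovations take the values $x_1, \ldots, x_{p-1}$. The sketch below computes it both from the recursion $s_{j+1}^2 = a_0 + a_1 x_j^2 s_j^2$ and from the explicit sum in the lemma, and illustrates the monotonicity of corollary 3.2 for $p = 1$. Normal innovations and all function names are assumptions made for the example.

```python
import math


def s_p(y, xs, a0, a1):
    """s_p(y; x) via the recursion s_1^2 = a0 + a1 y^2, s_{j+1}^2 = a0 + a1 x_j^2 s_j^2."""
    variance = a0 + a1 * y * y
    for xj in xs:  # xs = (x_1, ..., x_{p-1})
        variance = a0 + a1 * xj * xj * variance
    return math.sqrt(variance)


def s_p_explicit(y, xs, a0, a1):
    """s_p(y; x) from the explicit sum in lemma 3.1."""
    p = len(xs) + 1
    total = a0
    product = 1.0
    for k in range(1, p):
        product *= xs[p - 1 - k] ** 2  # multiplies in x_{p-k}^2
        total += a0 * a1 ** k * product
    total += a1 ** p * product * y * y
    return math.sqrt(total)


def conditional_cdf_lag1(x, y, a0, a1):
    """F_{X_{n+1}}(x | X_n = y) for standard normal innovations (an assumption)."""
    return 0.5 * math.erfc(-(x / s_p(y, (), a0, a1)) / math.sqrt(2.0))
```

With, say, $a_0 = 0.2$ and $a_1 = 0.7$, the values `conditional_cdf_lag1(-1.0, y, 0.2, 0.7)` decrease as $y$ increases through $-3, -2, -1$, in line with corollary 3.2.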
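Lemma 3.3. Suppose that $q_n(\alpha) < 0$ and let $x < 0$. Then
$$F_{X_{n+p}}(x \mid X_n = q_n(\alpha)) \le P(X_{n+p} \le x \mid X_n \le q_n(\alpha)) \le \frac{1}{2}. \tag{12}$$
Proof. We have that
$$P(X_{n+p} \le x \mid X_n \le q_n(\alpha)) = \frac{1}{\alpha} \int_{-\infty}^{q_n(\alpha)} F_{X_{n+p}}(x \mid X_n = y)\, f_{X_n}(y)\, dy, \tag{13}$$
where $f_{X_n} = F'_{X_n}$ is the probability density. Integrating by parts, and using that $F_{X_n}(q_n(\alpha)) = \alpha$, we see that (13) equals
$$F_{X_{n+p}}(x \mid X_n = q_n(\alpha)) - \int_{-\infty}^{q_n(\alpha)} \frac{d}{dy} F_{X_{n+p}}(x \mid X_n = y)\, \frac{F_{X_n}(y)}{\alpha}\, dy \;\ge\; F_{X_{n+p}}(x \mid X_n = q_n(\alpha)),$$
by corollary 3.2. The other inequality in (12) follows, since $x < 0$, from
$$P(X_{n+p} \le 0 \mid X_n \le q_n(\alpha)) = \frac{1}{\alpha} \int_{-\infty}^{q_n(\alpha)} F_{X_{n+p}}(0 \mid X_n = y)\, f_{X_n}(y)\, dy = \frac{1}{2},$$
by (11), the symmetry of $\varepsilon$ and the definition of $q_n(\alpha)$. QED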
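Recall the definition of the generalized tail dependence coefficients (definition 2.5). We can then state:
Corollary 3.4. Let $\psi : (0,1] \to (0,1]$ be defined in a neighborhood of 0 such that
$$\lim_{\alpha \to 0} \frac{q_{n+p}(\psi(\alpha))}{q_n(\alpha)} = 0. \tag{14}$$
Then $\lambda^{\psi}_{X_{n+p}|X_n} = 1/2$.
Proof. We first observe that, since $s_p(y; \mathbf{x}) \ge a_1^{p/2} |x_1| \cdots |x_{p-1}|\, |y|$, (14) implies that
$$\frac{q_{n+p}(\psi(\alpha))}{s_p(q_n(\alpha); \mathbf{x})} \to 0, \qquad \alpha \to 0,$$
for almost every $\mathbf{x} \in \mathbb{R}^{p-1}$. By (11) and dominated convergence, $F_{X_{n+p}}(q_{n+p}(\psi(\alpha)) \mid X_n = q_n(\alpha)) \to \frac{1}{2}$. The corollary now follows from (12). QED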
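Proof of theorem 2.7. If $(X_n)_{n\ge 0}$ is stationary, with $X_0 =_d X_\infty$, then $q_n(\alpha) = F^{\leftarrow}_{X_\infty}(\alpha) \simeq -(c_\infty/\alpha)^{1/\kappa_\infty}$, as $\alpha \to 0$, for all $n$. It follows that (14) is equivalent to $\alpha = o(\psi(\alpha))$, $\alpha \to 0$. QED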
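Proof of theorem 2.2. Taking $x = q_n(\alpha) = F^{\leftarrow}_{X_\infty}(\alpha) =: q_\infty(\alpha)$ in lemma 3.1 (where $X_\infty$ denotes the stationary ARCH(1)) and using that $q_{n+p}(\alpha) = q_\infty(\alpha)$ also, lemma 3.1 implies that
$$\lambda_{X_{n+p}|X_n}(\alpha) = \frac{1}{\alpha} \int_{-\infty}^{q_\infty(\alpha)} \int_{\mathbb{R}^{p-1}} F\!\left(\frac{q_\infty(\alpha)}{s_p(y; \mathbf{x})}\right) \prod_{j=1}^{p-1} f(x_j)\, d\mathbf{x}\, dF_\infty(y) \tag{15}$$
(compare (13)), where $F_\infty(y) = F_{X_\infty}(y)$, the distribution function of $X_\infty$. Integrating by parts with respect to $y$, and using that $F_\infty(q_\infty(\alpha)) = \alpha$, we find that
$$\lambda_{X_{n+p}|X_n}(\alpha) = \int_{\mathbb{R}^{p-1}} F\!\left(\frac{q_\infty(\alpha)}{s_p(q_\infty(\alpha); \mathbf{x})}\right) \prod_{j=1}^{p-1} f(x_j)\, d\mathbf{x} \;+\; \frac{q_\infty(\alpha)}{\alpha} \int_{-\infty}^{q_\infty(\alpha)} \int_{\mathbb{R}^{p-1}} f\!\left(\frac{q_\infty(\alpha)}{s_p(y; \mathbf{x})}\right) \frac{a_1^p x_1^2 \cdots x_{p-1}^2\, y}{s_p(y; \mathbf{x})^3} \prod_{j=1}^{p-1} f(x_j)\, F_\infty(y)\, d\mathbf{x}\, dy. \tag{16}$$
As $\alpha \to 0$, the integrand of the first term on the right tends to
$$F\!\left(\frac{q_\infty(\alpha)}{s_p(q_\infty(\alpha); \mathbf{x})}\right) \to F\!\left(\frac{-1}{a_1^{p/2} |x_1| \cdots |x_{p-1}|}\right).$$
We next change variables in the second integral on the right hand side of (16): $y = |q_\infty(\alpha)|\, z$. Since $F_\infty(x) \simeq c_\infty |x|^{-\kappa_\infty}$ and $|q_\infty(\alpha)| \simeq (c_\infty/\alpha)^{1/\kappa_\infty}$, it follows that $\alpha^{-1} F_\infty(|q_\infty(\alpha)|\, z) \to |z|^{-\kappa_\infty}$, $\alpha \to 0$. Moreover, $|q_\infty(\alpha)|^3 / s_p(|q_\infty(\alpha)|\, z; \mathbf{x})^3 \to a_1^{-3p/2} |x_1|^{-3} \cdots |x_{p-1}|^{-3} |z|^{-3}$. We therefore find that, momentarily writing $A = a_1^{p/2} |x_1| \cdots |x_{p-1}|$,
$$\lambda_{X_{n+p}|X_n} = \int_{\mathbb{R}^{p-1}} F\!\left(-\frac{1}{A}\right) \prod_{j=1}^{p-1} f(x_j)\, d\mathbf{x} \;-\; \int_{-\infty}^{-1} \int_{\mathbb{R}^{p-1}} f\!\left(-\frac{1}{A|z|}\right) \frac{1}{A} \frac{z}{|z|^3} \prod_{j=1}^{p-1} f(x_j)\, |z|^{-\kappa_\infty}\, d\mathbf{x}\, dz$$
$$= \kappa_\infty \int_{-\infty}^{-1} \int_{\mathbb{R}^{p-1}} F\!\left(-\frac{1}{A|z|}\right) \prod_{j=1}^{p-1} f(x_j)\, |z|^{-\kappa_\infty - 1}\, d\mathbf{x}\, dz,$$
after a second integration by parts, which proves the theorem. QED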
Remark 3.5. The proof would simplify if we knew that $X_\infty$ has a density $F'_\infty(x)$ with the differentiated asymptotics $\kappa_\infty c_\infty |x|^{-\kappa_\infty - 1}$, $|x| \to \infty$, since then the partial integrations would no longer be necessary.
4. Asymptotic probabilities for non-stationary ARCH(1)’s
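Let $(X_n)_{n \in \mathbb{N}}$ be the ARCH(1)-process (1), starting with an a.s. initial value $X_0 = x_0 \in \mathbb{R}$. We assume that $f$ is symmetric and satisfies condition 2.3: letting $\kappa_0 = \kappa + 1$, then
$$f(x) = \frac{c}{x^{\kappa_0}} + \rho(x), \tag{17}$$
where
$$\left|\left(x \frac{d}{dx}\right)^{\nu} \rho(x)\right| \le \frac{C}{x^{\kappa_0 + \eta_0}}, \qquad \nu = 0, 1, 2, \tag{18}$$
for suitable constants $\eta_0, C > 0$, for all $x > 0$. The successive realizations $X_n$ of the ARCH(1) will then have a density $f_{X_n}$ which, because of the Markov property of the ARCH(1), will be given by
$$f_{X_{n+1}}(x) = \int_{\mathbb{R}} f_\varepsilon\!\left(\frac{x}{\sqrt{a_0 + a_1 y^2}}\right) f_{X_n}(y)\, \frac{dy}{\sqrt{a_0 + a_1 y^2}}, \tag{19}$$
with $f_{X_1}(x) = \sigma_1^{-1} f(x/\sigma_1)$, $\sigma_1 = \sqrt{a_0 + a_1 x_0^2}$.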
Theorem 4.1. With the above notations we have that
$$f_{X_n}(x) = \frac{1}{\sigma_1 a_1^{(n-1)/2}}\, \varphi_n\!\left(\frac{x}{\sigma_1 a_1^{(n-1)/2}}\right), \tag{20}$$
with the function $\varphi_n$ having a truncated expansion
$$\varphi_n(x) = \sum_{\nu=0}^{n-1} c_{\nu;n} \frac{(\log x)^{\nu}}{x^{\kappa_0}} + r_n(x), \qquad x > 0, \tag{21}$$
whose error $r_n(x)$ satisfies, for any positive $\eta < \eta_0$, an estimate
$$|r_n(x)| \le \frac{C_\eta}{x^{\kappa_0 + \eta}}. \tag{22}$$
The coefficients $c_{\nu;n}$ and the remainder function $r_n$ may depend on $a_0$, $a_1$ and $\sigma_1$, but both the $c_{\nu;n}$ and the constants $C_\eta$ in (22) will remain bounded as long as $\max\bigl(\sigma_1^{-1} a_1^{-1/2} a_0, \cdots, \sigma_1^{-1} a_1^{-(n-1)/2} a_0\bigr)$ remains bounded. Moreover, the top order coefficient is given by
$$c_{n-1;n} = \frac{2^{n-1} c^n}{(n-1)!}, \tag{23}$$
and is therefore independent of $a_0$, $a_1$ and $\sigma_1$.
We obtain a corresponding result for the cumulative distribution function $F_{X_n}$ by integration. We limit ourselves to the top-order asymptotics. Recalling that $\kappa = \kappa_0 - 1$, a simple integration by parts gives:
Corollary 4.2. We have that
$$F_{X_n}(x) = \Phi_n\!\left(\frac{x}{\sigma_1 a_1^{(n-1)/2}}\right), \tag{24}$$
where, as $x \to -\infty$,
$$\Phi_n(x) = C_n \frac{(\log |x|)^{n-1}}{|x|^{\kappa}} + R_n(x), \tag{25}$$
with $C_n = c_{n-1;n}/\kappa = 2^{n-1} c^n / \bigl(\kappa (n-1)!\bigr)$, and error $R_n(x)$ satisfying
$$|R_n(x)| \le C\, \frac{|\log |x||^{n-2}}{|x|^{\kappa}}, \qquad x < 0, \tag{26}$$
with uniformly bounded constant $C$ whenever $\max\bigl(\sigma_1^{-1} a_1^{-1/2} a_0, \cdots, \sigma_1^{-1} a_1^{-(n-1)/2} a_0\bigr)$ stays bounded.
Remark 4.3. The asymptotics for $x$ fixed, $n \to \infty$ are different from those for $n$ fixed, $|x| \to \infty$. It would be interesting to derive joint asymptotics in $(x, n)$.
4.1. The case $a_1 = 1$. We first take both $a_1$ and $\sigma_1$ equal to 1, and look for a truncated asymptotic expansion of $f_{X_n}$ which is uniform in the parameter $a_0$ for bounded $a_0$: $a_0 \in [0,1]$ say (we might have let $a_0$ be restricted to any bounded interval in $[0,\infty)$). Define the operator $F$ on $L^1([0,\infty))$ by
$$F(v)(x) = \int_0^\infty f_\varepsilon\!\left(\frac{x}{\sqrt{a_0 + y^2}}\right) v(y)\, \frac{dy}{\sqrt{a_0 + y^2}}; \tag{27}$$
$F$ is a positivity-preserving operator on $L^1([0,\infty))$ of norm at most 1. Let $\|\cdot\|_1$ be the $L^1$-norm with respect to Lebesgue measure on $[0,\infty)$.
Lemma 4.4. Let $\{v(\cdot\,; a_0) ;\ a_0 \ge 0\}$ be a family of non-negative functions in $C^1([0,\infty))$, such that $\|v(\cdot\,; a_0)\|_1 = 1$, for all $a_0$, and such that $v(x; a_0)$ has the truncated asymptotic expansion
$$v(x; a_0) = \sum_{\nu=0}^{n-1} c_\nu \frac{(\log x)^{\nu}}{x^{\kappa_0}} + r(x), \qquad x > 0, \tag{28}$$
with constants $c_\nu = c_\nu(v; a_0)$ uniformly bounded in $a_0 \le 1$, and with remainder $r(x) = r(x; v, a_0)$ satisfying
$$\left|\left(x \frac{d}{dx}\right)^{\nu} r(x)\right| \le \frac{C}{x^{\kappa_0 + \eta}}, \qquad x > 0,\ \nu = 0, 1, \tag{29}$$
for suitable positive constants $C$ and $\eta$ with $\eta < 1$, uniformly for $a_0 \le 1$. Let $u = u(\cdot\,; a_0)$ be given by
$$u(x; a_0) = F(v(\cdot\,; a_0))(x).$$
Then $u \in C^1([0,\infty))$, $u \ge 0$, and $u$ has a truncated expansion
$$u(x; a_0) = \sum_{\nu=0}^{n} c'_\nu \frac{(\log x)^{\nu}}{x^{\kappa_0}} + r'(x), \qquad x > 0, \tag{30}$$
with the coefficients $c'_\nu = c_\nu(u; a_0)$ again uniformly bounded in $a_0 \le 1$ and $r'(x) = r'(x; v, a_0)$ satisfying estimates (29) with $\eta$ replaced by any smaller $\eta' < \eta$, also uniformly in $a_0 \le 1$. Moreover
$$c'_n = \frac{c\, c_{n-1}}{n}. \tag{31}$$
The truncated expansion (28) is compatible with $v(\cdot\,; a_0)$ being in $L^1$ since $\kappa_0 > 1$.
Proof. The idea is to write $u$ as a Mellin convolution (see Appendix), by making the change of variables $z = \sqrt{y^2 + a_0}$. This gives
$$u(x) = (f * \tilde{v})(x) = \int_0^\infty f\!\left(\frac{x}{z}\right) \tilde{v}(z)\, \frac{dz}{z},$$
with
$$\tilde{v}(z) = \frac{z}{\sqrt{z^2 - a_0}}\; v\!\left(\sqrt{z^2 - a_0}\right) \mathbf{1}_{[\sqrt{a_0}, \infty)}(z), \tag{32}$$
$\mathbf{1}_A$ being the indicator function of a set $A$. However, in doing so we introduce a singularity at $z = \sqrt{a_0}$, which would spoil the decay properties of the Mellin transform needed later, and which we will therefore eliminate by a preliminary smooth cut-off. Let $\chi \in C^\infty([0,\infty))$ be such that $0 \le \chi \le 1$, $\mathrm{supp}(\chi) \subset [A, \infty)$, $\chi = 1$ on $[2A, \infty)$, where $A > 0$ will be chosen below, and write $v = v(x; a_0)$ as
$$v = v_0 + v_1, \qquad v_1 = \chi v, \qquad v_0 = (1 - \chi) v. \tag{33}$$
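First of all, by the hypothesis (17) on $f$, we have that
$$F(v_0)(x) = \int_0^\infty f\!\left(\frac{x}{\sqrt{y^2 + a_0}}\right) v_0(y)\, \frac{dy}{\sqrt{a_0 + y^2}} = x^{-\kappa_0}\, c \int_0^\infty (y^2 + a_0)^{\frac{\kappa_0 - 1}{2}} v_0(y)\, dy + \int_0^\infty (y^2 + a_0)^{-1/2} \rho\bigl((y^2 + a_0)^{-1/2} x\bigr) v_0(y)\, dy =: \frac{c'_0}{x^{\kappa_0}} + r'_0(x),$$
with
$$c'_0 = c'_0(a_0) := c \int_0^\infty (y^2 + a_0)^{\frac{\kappa_0 - 1}{2}} v_0(y)\, dy, \tag{34}$$
and
$$r'_0(x) = r'_0(x; a_0) = \int_0^\infty \rho\!\left(\frac{x}{\sqrt{y^2 + a_0}}\right) v_0(y)\, \frac{dy}{\sqrt{y^2 + a_0}}. \tag{35}$$
Since $\kappa_0 > 1$ and $v_0$ is supported in $[0, 2A]$, it follows that $c'_0(a_0)$ is uniformly bounded, for $a_0 \le 1$, by
$$c\,(4A^2 + 1)^{\frac{\kappa_0 - 1}{2}} \|v_0\|_1 \le c\,(4A^2 + 1)^{\frac{\kappa_0 - 1}{2}},$$
since $0 \le (1 - \chi) \le 1$ and $\|v\|_1 = 1$.
Likewise, for $r'_0(x; a_0)$ we estimate, using (18) (with $\nu = 0$):
$$|r'_0(x; a_0)| \le \frac{C}{x^{\kappa_0 + \eta_0}} \int_0^{2A} (y^2 + a_0)^{\frac{\kappa_0 + \eta_0 - 1}{2}} v_0(y)\, dy \le \frac{C'}{x^{\kappa_0 + \eta_0}},$$
with $C' = C (4A^2 + a_0)^{(\kappa_0 + \eta_0 - 1)/2}$. This proves that $F(v_0)$ has a truncated asymptotic series of the desired form (30), (29).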
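We next turn to $u_1 = F(v_1)$. First rewrite $u_1$ as a Mellin convolution, $u_1 = f * \tilde{v}_1$, with $\tilde{v}_1$ defined by (32), with $v$ replaced by $v_1$. Observe that if we take $A > 1 \ge a_0$, then $\tilde{v}_1$ will be $C^1$ on $[0,\infty)$. From (28), (29), and remembering that $\eta < 1$, we easily obtain that
$$\tilde{v}_1(x) = (\chi v)\!\left(\sqrt{x^2 - a_0}\right) \frac{x}{\sqrt{x^2 - a_0}} = \sum_{\nu=0}^{n-1} c_\nu \frac{(\log x)^{\nu}}{x^{\kappa_0}} + \tilde{r}_1(x),$$
with $\tilde{r}_1 = \tilde{r}_1(x; a_0)$ satisfying (29), uniformly in $a_0 \le 1$ (with the same $\eta$ but possibly a different $C$). It follows that the Mellin transform $\tilde{v}_1^{\#} = \tilde{v}_1^{\#}(s)$ is holomorphic on $\{s \in \mathbb{C} : -\kappa_0 < \mathrm{Re}\, s < 0\}$, and meromorphic on $\{s : -\kappa_0 - \eta < \mathrm{Re}(s) < 0\}$ with
$$\tilde{v}_1^{\#}(s) = \sum_{\nu=1}^{n} \frac{a_{-\nu}}{(s + \kappa_0)^{\nu}} + h(s), \tag{36}$$
with $a_{-\nu} = (\nu - 1)!\, c_{\nu-1}$ and $h = h(s)$ holomorphic on $\{s : -\kappa_0 - \eta < \mathrm{Re}(s) < 0\}$ (cf. the Appendix). It is easily verified that (28) and (29), together with the differentiability of $v$, imply that the truncated series (28) can be differentiated, and that
$$x \frac{d}{dx} \tilde{v}_1(x) = \sum_{\nu=0}^{n-1} c''_\nu \frac{(\log x)^{\nu}}{x^{\kappa_0}} + x \frac{d}{dx} \tilde{r}_1(x),$$
for suitable constants $c''_\nu$, which are simply obtained by term-by-term differentiation of the truncated expansion for $\tilde{v}_1$. (A similar remark applies to (17) and (18).) Taking Mellin transforms, we conclude that, given that $\tilde{r}_1$ satisfies (29) with $\nu = 1$,
$$s\, \tilde{v}_1^{\#}(s) = \sum_{\nu=0}^{n-1} \frac{A_\nu}{(s + \kappa_0)^{\nu+1}} + g(s),$$
with $g = g(s)$ holomorphic on $\{s : -\kappa_0 - \eta < \mathrm{Re}(s) < 0\}$, and bounded on each closed sub-strip $\{s : -\kappa_0 - \eta + \varepsilon \le \mathrm{Re}(s) \le -\varepsilon\}$, $\varepsilon > 0$. From this we conclude that
(i) If $\sigma \in (-\kappa_0 - \eta, 0)$ and $\sigma \ne -\kappa_0$, then
$$t \to \tilde{v}_1^{\#}(\sigma + it) \in L^2(\mathbb{R}), \tag{37}$$
and
(ii) For any closed sub-interval $[a, b] \subset (-\kappa_0 - \eta, 0)$, we have that
$$\max_{\sigma \in [a,b]} \left|\tilde{v}_1^{\#}(\sigma \pm iR)\right| \to 0 \quad \text{as } R \to \infty. \tag{38}$$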
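Similar assertions can be proved, in the same way, for $f^{\#}(s)$, starting from (17) and (18) with $\nu = 0$ and 1. It follows that for arbitrary $\sigma \in (-\kappa_0, 0)$ we can apply the inversion formula for the Mellin transform to write $u_1(x) = F(v_1)(x)$ as
$$u_1(x) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} f^{\#}(s)\, \tilde{v}_1^{\#}(s)\, x^s\, ds, \tag{39}$$
(the integral converges since the product $f^{\#}(s)\, \tilde{v}_1^{\#}(s)$ is in $L^1$ of the line $\{\mathrm{Re}(s) = \sigma\}$). Moreover, the right conditions are met to shift the integration path to $\{\mathrm{Re}\, s = -\kappa_0 - \eta'\}$, where $\eta' < \eta$ is arbitrary. In doing so, we will pick up the residue in $s = -\kappa_0$:
$$\mathrm{Res}_{s = -\kappa_0} \left( \frac{c}{s + \kappa_0} \left( \sum_{\nu=1}^{n} \frac{(\nu - 1)!\, c_{\nu-1}}{(s + \kappa_0)^{\nu}} + h(s) \right) x^s \right) = \sum_{\nu=0}^{n} c'_\nu \frac{(\log x)^{\nu}}{x^{\kappa_0}}, \tag{40}$$
as one sees by writing $x^s = x^{-\kappa_0} \exp\bigl((s + \kappa_0) \log x\bigr)$. It follows in particular that
$$c'_n = \frac{c\, c_{n-1}}{n}. \tag{41}$$
Hence
$$u_1(x) = \sum_{\nu=0}^{n} c'_\nu \frac{(\log x)^{\nu}}{x^{\kappa_0}} + r'(x),$$
where the remainder term is given by the line integral over $\mathrm{Re}\, s = -\kappa_0 - \eta'$:
$$r'(x) = \frac{1}{2\pi i} \int_{\mathrm{Re}\, s = -\kappa_0 - \eta'} f^{\#}(s)\, \tilde{v}_1^{\#}(s)\, x^s\, ds,$$
which, by an obvious estimate, will satisfy
$$|r'(x)| \le C / x^{\kappa_0 + \eta'}. \tag{42}$$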
To show that $x (d/dx) r'$ satisfies the same estimate, we first differentiate $u_1$, to find
$$x \frac{d}{dx} u_1 = \left(x \frac{df}{dx}\right) * \tilde{v}_1.$$
Now since $x\, df_\varepsilon/dx$ satisfies (17) and (18) for $\nu = 0, 1$, we can repeat the argument above to conclude that there exist constants $c''_\nu$ such that
$$x \frac{d}{dx} u_1 = \sum_{\nu=0}^{n} c''_\nu \frac{(\log x)^{\nu}}{x^{\kappa_0}} + r''(x),$$
with $r''(x)$ satisfying an estimate of the type (42). That is, $x (d/dx) u_1$ will have a truncated asymptotic series of the same type as $u_1$. A moment’s thought then shows that, since the differential operator $x\, d/dx$, formally applied to the expansion of $u_1$, yields an expansion of the same form, we necessarily must have that $r'' = (x\, d/dx) r'$. This shows that $r'$ satisfies (29) (with $\eta'$ instead of $\eta$), and completes the proof of lemma 4.4. QED
4.2. Proof of theorem 4.1. Lemma 4.4 allows us to quickly prove theorem 4.1, using induction on n. Fix an initial value x0 ∈ R and put σ1 = pa0+a1x20. Our induction hypothesis is that fXn(x) is of the form (20), with ϕn(x) having a truncated expansion (21), with the cν;n and rn satisfying the stated uniformity
conditions. For n= 1, this is satisfied withϕ1=f, by the hypothesis onf.
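Let $\mathcal{F}_{a_0,a_1}$ be the norm-preserving positive operator on $L^1(\mathbb{R}_{\geq 0})$ defined by
\[ \mathcal{F}_{a_0,a_1}(v)(x) = 2 \int_0^\infty f_\varepsilon\!\left( \frac{x}{\sqrt{a_0 + a_1 y^2}} \right) v(y)\, \frac{dy}{\sqrt{a_0 + a_1 y^2}}, \tag{43} \]
so that the $\mathcal{F}$ defined by (27) equals $\mathcal{F}_{a_0,1}$. Then we have, by the Markov property of our ARCH(1)-process, and the symmetry of $f$, that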
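\begin{align*}
f_{X_{n+1}}(x) &= 2 \mathcal{F}_{a_0,a_1}(f_{X_n})(x) \\
&= 2 \mathcal{F}_{a_0,a_1}\!\left( \frac{1}{\sigma_1 a_1^{(n-1)/2}}\, \varphi_n\!\left( \frac{y}{\sigma_1 a_1^{(n-1)/2}} \right) \right)(x) \\
&= \frac{2}{\sigma_1 a_1^{n/2}}\, \mathcal{F}_{\sigma_1^{-1} a_1^{-n/2} a_0,\, 1}(\varphi_n)\!\left( \frac{x}{\sigma_1 a_1^{n/2}} \right),
\end{align*}
by an easy change of variables in the integral defining $\mathcal{F}_{a_0,a_1}$. Put
\[ \varphi_{n+1} := \mathcal{F}_{\sigma_1^{-1} a_1^{-n/2} a_0,\, 1}(\varphi_n). \]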
Then by lemma 4.4, applied to $v = \varphi_n$, we find that
\[ \varphi_{n+1}(x) = \sum_{\nu=0}^{n} c_{\nu;n+1}\, \frac{(\log x)^\nu}{x^{\kappa_0}} + r_{n+1}(x), \]
with $r_{n+1}$ satisfying (29), uniformly in $\max(\sigma_1^{-1} a_1^{-1/2} a_0, \cdots, \sigma_1^{-1} a_1^{-n/2} a_0)$, and uniformly bounded coefficients $c_{\nu;n+1}$ for these values of the parameters. Moreover, $c_{n;n+1} = 2 c\, c_{n-1;n}/n$ and therefore, since $c_{0;1} = c_\varepsilon$, $c_{n-1;n} = 2^{n-1} c_\varepsilon^n/(n-1)!$. QED
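The recursion for the leading coefficients at the end of this proof can be checked mechanically. The following minimal Python sketch iterates $c_{n;n+1} = 2 c_\varepsilon c_{n-1;n}/n$ from $c_{0;1} = c_\varepsilon$ and compares the result with the closed form $2^{n-1} c_\varepsilon^n/(n-1)!$. The function names and the use of exact rational arithmetic are illustrative choices made here, not part of the paper.

```python
from fractions import Fraction
from math import factorial


def leading_coefficient_recursive(n, c_eps):
    """Return c_{n-1;n} by iterating c_{m;m+1} = 2 * c_eps * c_{m-1;m} / m."""
    c_eps = Fraction(c_eps)
    coeff = c_eps  # starting value c_{0;1} = c_eps
    for m in range(1, n):
        coeff = 2 * c_eps * coeff / m  # one induction step: c_{m;m+1}
    return coeff


def leading_coefficient_closed_form(n, c_eps):
    """Return the closed form 2^(n-1) * c_eps^n / (n-1)!."""
    c_eps = Fraction(c_eps)
    return Fraction(2) ** (n - 1) * c_eps ** n / factorial(n - 1)
```

For instance, with the arbitrary value `c_eps = Fraction(3, 7)` the two functions agree for every `n` tried, which is just the induction step of the proof carried out numerically.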
5. Asymptotic behavior of the non-stationary VaR
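For the computation of the $\alpha \to 0$ asymptotics of the tail-dependence function $\lambda^{x_0}_{X_{n+p}|X_n}(\alpha)$ we need to know the asymptotic behavior of the quantile functions $q^{x_0}_{X_n}(\alpha) = F^{x_0,\leftarrow}_{X_n}(\alpha)$. This can be computed from corollary 4.2:
Lemma 5.1. As $\alpha \to 0$,
\[ q^{x_0}_{X_n}(\alpha) \simeq \sigma_1 a_1^{(n-1)/2} \left( \frac{C_n}{\alpha} \right)^{1/\kappa} \left( \frac{\log(\alpha^{-1} C_n)}{\kappa} \right)^{(n-1)/\kappa}. \tag{44} \]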
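Proof. Since it is not true in general that the map $F \to F^{\leftarrow}$ is continuous with respect to the uniform topologies, we proceed cautiously.
First of all, since $F^{x_0}_{X_n}(x) = \Phi_n(x/\sigma_1 a_1^{(n-1)/2})$, we immediately have that
\[ q^{x_0}_{X_n}(\alpha) = \sigma_1 a_1^{(n-1)/2}\, \Phi_n^{\leftarrow}(\alpha). \tag{45} \]
From corollary 4.2 we see that for any $\varepsilon > 0$ we can find a positive real number $R(\varepsilon)$ such that if $x < -R(\varepsilon)$, then
\[ (C_n - \varepsilon)\, \frac{(\log |x|)^{n-1}}{|x|^{\kappa}} \leq \Phi_n(x) \leq (C_n + \varepsilon)\, \frac{(\log |x|)^{n-1}}{|x|^{\kappa}}. \]
Call the two functions on the left and right of this inequality $\Phi_n^{\pm\varepsilon}$. Since these are strictly increasing, and since $\Phi_n^{-\varepsilon}(x) \leq \Phi_n(x) \leq \Phi_n^{\varepsilon}(x)$, we will have that
\[ (\Phi_n^{\varepsilon})^{-1}(\alpha) \leq \Phi_n^{\leftarrow}(\alpha) \leq (\Phi_n^{-\varepsilon})^{-1}(\alpha), \]
at least for those $\alpha$ for which the biggest of these three numbers is $\leq -R(\varepsilon)$. (By theorem 4.1, $\Phi_n'(x) = \varphi_n(x)$ is strictly positive if $x$ is sufficiently small, so that we could have replaced $\Phi_n^{\leftarrow}$ by its ordinary inverse also.) To compute $(\Phi_n^{\varepsilon})^{-1}(\alpha)$ we use the following elementary lemma.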
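Lemma 5.2. Let $a > 0$ and let $x = x(a)$ be the (unique) positive solution to
\[ \frac{(\log x)^{n-1}}{x^{\kappa}} = a. \]
Then
\[ x(a) = a^{-1/\kappa} \left( \frac{\log a^{-1}}{\kappa} \right)^{(n-1)/\kappa} g(a), \]
where $g(a) \to 1$ as $a \to 0$.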
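The statement of lemma 5.2 is easy to probe numerically. The sketch below is an illustration added here, not the paper's method: it works with $y = \log x$, so that the equation becomes $y - \frac{n-1}{\kappa}\log y = A$ with $A = \log(a^{-1})/\kappa$, solves it by a plain fixed-point iteration, and returns the correction factor $g(a)$. It assumes $a$ is small enough that $A > 1$ and that the iteration is a contraction (that is, $y > (n-1)/\kappa$); the function names and tolerances are hypothetical.

```python
import math


def log_large_root(a, n, kappa, tol=1e-13, max_iter=500):
    """Return y = log x for the large root x of (log x)**(n-1) / x**kappa = a.

    In terms of y the equation reads y - c*log(y) = A, with c = (n-1)/kappa
    and A = log(1/a)/kappa; it is solved here by fixed-point iteration.
    """
    c = (n - 1) / kappa
    A = math.log(1.0 / a) / kappa
    y = A + c * math.log(A)  # leading-order guess suggested by the lemma
    for _ in range(max_iter):
        y_next = A + c * math.log(y)
        if abs(y_next - y) <= tol * abs(y_next):
            return y_next
        y = y_next
    return y


def correction_factor(a, n, kappa):
    """Return g(a) = x(a) / (a**(-1/kappa) * (log(1/a)/kappa)**((n-1)/kappa))."""
    c = (n - 1) / kappa
    A = math.log(1.0 / a) / kappa
    # log g(a) = y - A - c*log(A); working in logs avoids overflow for tiny a
    return math.exp(log_large_root(a, n, kappa) - A - c * math.log(A))
```

Evaluating `correction_factor(a, 3, 4.0)` for decreasing `a` shows $g(a)$ approaching 1 from above, but only at the slow rate $\log A / A$ that appears in the bound derived in the proof.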
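Proof. Put $\log x = y$, $x = x(a)$. Then $y$ has to solve
\[ y - \frac{n-1}{\kappa} \log y = \log(a^{-1/\kappa}) =: A. \]
If we try a solution of the form $y = A + \frac{n-1}{\kappa} \log A + y'$, then $y' = y'(A)$ will have to solve:
\[ y' = \frac{n-1}{\kappa} \log\left( 1 + \frac{n-1}{\kappa}\, \frac{\log A + y'}{A} \right) \leq \left( \frac{n-1}{\kappa} \right)^2 \frac{\log A + y'}{A}. \]
Hence we find that (see footnote 1)
\[ 0 < y' \leq \frac{\left( \frac{n-1}{\kappa} \right)^2 \log A / A}{1 - \left( \frac{n-1}{\kappa} \right)^2 A^{-1}} \leq 2 \left( \frac{n-1}{\kappa} \right)^2 \frac{\log A}{A}, \]
provided that $A \geq 2\kappa^{-2}(n-1)^2$. Exponentiating, and putting $g(a) := \exp y'(A)$, we find that
\[ x(a) = a^{-1/\kappa} \left( \frac{\log a^{-1}}{\kappa} \right)^{(n-1)/\kappa} g(a), \]
where
\[ 1 \leq g(a) \leq \exp\left( 2(n-1)^2 \log A / \kappa^2 A \right), \tag{46} \]
if $\log a^{-1/\kappa} \geq 2\kappa^{-2}(n-1)^2$. Hence $g(a) \to 1$ as $a \to 0$. QED
Footnote 1: observe that if $y_0 := A + (n-1) \log A/\kappa$, then $y_0 - (n-1) \log y_0/\kappa \leq A$, so that $y'$ will have to be positive.
Returning to the proof of lemma 5.1, it follows that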
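\[ (\Phi_n^{\pm\varepsilon})^{-1}(\alpha) = \left( \frac{C_n \pm \varepsilon}{\alpha} \right)^{1/\kappa} \left( \frac{\log(\alpha^{-1}(C_n \pm \varepsilon))}{\kappa} \right)^{(n-1)/\kappa} g_{\pm\varepsilon}(\alpha), \]
with $g_{\pm\varepsilon}(\alpha) \to 1$ as $\alpha \to 0$, uniformly in $0 \leq \varepsilon \leq \varepsilon_0 < C_n$ (as a glance at (46) shows). Putting
\[ Q_n(\alpha) = \left( \frac{C_n}{\alpha} \right)^{1/\kappa} \left( \frac{\log(\alpha^{-1} C_n)}{\kappa} \right)^{(n-1)/\kappa}, \]
we see that
\[ \left( \frac{C_n - \varepsilon}{C_n} \right)^{1/\kappa} \left( \frac{\log(\alpha^{-1}(C_n - \varepsilon))}{\log(\alpha^{-1} C_n)} \right)^{(n-1)/\kappa} g_{-\varepsilon}(\alpha) \leq \frac{\Phi_n^{\leftarrow}(\alpha)}{Q_n(\alpha)} \leq \left( \frac{C_n + \varepsilon}{C_n} \right)^{1/\kappa} \left( \frac{\log(\alpha^{-1}(C_n + \varepsilon))}{\log(\alpha^{-1} C_n)} \right)^{(n-1)/\kappa} g_{\varepsilon}(\alpha). \]
Letting $\alpha \to 0$, we conclude that for all sufficiently small $\varepsilon > 0$,
\[ \left( 1 - \frac{\varepsilon}{C_n} \right)^{1/\kappa} \leq \liminf_{\alpha \to 0} \frac{\Phi_n^{\leftarrow}(\alpha)}{Q_n(\alpha)} \leq \limsup_{\alpha \to 0} \frac{\Phi_n^{\leftarrow}(\alpha)}{Q_n(\alpha)} \leq \left( 1 + \frac{\varepsilon}{C_n} \right)^{1/\kappa}, \]
and the lemma follows by letting $\varepsilon \to 0$. QED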
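To get a feeling for the size of the quantile scale $Q_n(\alpha)$ appearing in this proof, one can simply evaluate it. The sketch below is a direct transcription of the formula for $Q_n$; the constant $C_n$ comes from corollary 4.2, which is not reproduced in this section, so it is passed in as an argument, and the function name is an assumption made here. It requires $\alpha < C_n$ so that the logarithm is positive.

```python
import math


def quantile_scale(alpha, C_n, n, kappa):
    """Leading-order size Q_n(alpha) = (C_n/alpha)^(1/kappa) * (log(C_n/alpha)/kappa)^((n-1)/kappa)."""
    lam = C_n / alpha  # the quantity written alpha^{-1} C_n in the text
    return lam ** (1.0 / kappa) * (math.log(lam) / kappa) ** ((n - 1) / kappa)
```

Comparing `quantile_scale(alpha, C, n + p, kappa)` with `quantile_scale(alpha, C, n, kappa)` for decreasing `alpha` illustrates that the ratio of the scales at lags $n+p$ and $n$ grows without bound, but only like a power of $\log \alpha^{-1}$.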
6. Asymptotics of the non-stationary tail-dependence function
The asymptotic behavior of the conditional lower tail dependence function can be determined quite precisely, and turns out to be universal for the class of ARCH(1)'s whose innovations $(\varepsilon_n)_n$ satisfy condition 2.3 with a given $\kappa$ and $c$, in the sense that it does not depend on either $a_0$, $a_1$ or the initial value $x_0$, nor on the further details of $f$:
Theorem 6.1. Let $(X_n)_{n \in \mathbb{N}}$ be an ARCH(1) with a.s. initial value $X_0 = x_0$ and $a_1 > 0$ strictly positive. Suppose that $\varepsilon_n$ has a twice differentiable symmetric probability density satisfying condition 2.3. Then for all $n, p \geq 1$,
\[ \lambda^{x_0}_{X_{n+p}|X_n}(\alpha) \simeq \gamma_{n,p}\, \frac{(\log(\log \alpha^{-1}))^{p}}{(\log \alpha^{-1})^{p}}, \quad \alpha \to 0, \tag{47} \]
where
\[ \gamma_{n,p} = \frac{2^{2n+2p-3}}{p!\,(n-1)!\,(n+p-1)!}\, \frac{c^{2n+2p}}{\kappa^2}. \tag{48} \]
In particular, $\lambda^{x_0}_{X_{n+p}|X_n} = \lim_{\alpha \to 0} \lambda^{x_0}_{X_{n+p}|X_n}(\alpha) = 0$.
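The rate in (47) decays very slowly, which is easiest to appreciate by evaluating it. The following sketch transcribes (47) and (48) as printed; it is only the leading-order term, so it need not be a valid probability at moderate $\alpha$, and the function names and the sample parameter values used with it are assumptions made for illustration.

```python
import math


def gamma_np(n, p, c, kappa):
    """Constant gamma_{n,p} of (48)."""
    numerator = 2.0 ** (2 * n + 2 * p - 3) * c ** (2 * n + 2 * p)
    denominator = (math.factorial(p) * math.factorial(n - 1)
                   * math.factorial(n + p - 1) * kappa ** 2)
    return numerator / denominator


def tail_dependence_leading_term(alpha, n, p, c, kappa):
    """Right-hand side of (47): gamma_{n,p} * (log log(1/alpha) / log(1/alpha))**p."""
    t = math.log(1.0 / alpha)  # requires alpha < 1/e so that log(t) > 0
    return gamma_np(n, p, c, kappa) * (math.log(t) / t) ** p
```

Evaluating `tail_dependence_leading_term` at, say, `alpha = 1e-3, 1e-6, 1e-12` shows a decay that is logarithmic in `alpha`, in contrast with the value `alpha` itself that would be obtained for independent variables.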
Remarks 6.2. (i) Observe that the conditional lower tail dependence function (47) tends to 0 at an exponentially slower rate than for independent random variables. For values of $a_1$ such that the ARCH(1) has a stationary solution, the different conclusions of theorems 2.2 and 6.1 may appear strange, in view of the geometric ergodicity of the process (cf. [4] and its references). The explanation is that, in case of a stationary process, $q_{X_n}(\alpha) = q_{X_\infty}(\alpha) = q_{X_{n+p}}(\alpha)$ for all $n$ and $p$, while for a conditional ARCH(1) and $n, p \geq 1$, $q_{X_{n+p}}(\alpha)/q_{X_n}(\alpha) \to \infty$ as $\alpha \to 0$ (although slowly), as follows from lemma 5.1.
(ii) The results of [9] can be used to derive the asymptotics of λx0
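Proof of theorem 6.1. To simplify notations, we will write $q_n(\alpha)$ for $q^{x_0}_{X_n}(\alpha)$, and we will generally leave off the super-index $x_0$. Since the joint probability density of $(X_{n+p}, X_n)$ is a product of the conditional density of $X_{n+p}|X_n$ and the density of $X_n$, we see as before that
\begin{align*}
P(X_{n+p} \leq q_{n+p}(\alpha) \mid X_n \leq q_n(\alpha))
&= \frac{1}{\alpha} \int_{-\infty}^{q_n(\alpha)} F_{X_{n+p}}(q_{n+p}(\alpha) \mid X_n = y)\, f_{X_n}(y)\, dy \\
&= \frac{1}{\alpha} \int_{-\infty}^{q_n(\alpha)} \Phi_p\!\left( \frac{q_{n+p}(\alpha)}{a_1^{(p-1)/2} \sqrt{a_0 + a_1 y^2}} \right) \varphi_n\!\left( \frac{y}{\sigma_1 a_1^{(n-1)/2}} \right) \frac{dy}{\sigma_1 a_1^{(n-1)/2}} \\
&= \frac{1}{\alpha} \int_1^{\infty} \Phi_p\!\left( -\frac{q_{n+p}(\alpha)/q_n(\alpha)}{a_1^{(p-1)/2} \sqrt{a_0 + a_1 y^2}} \right) \varphi_n\!\left( \frac{q_n(\alpha)\, y}{\sigma_1 a_1^{(n-1)/2}} \right) \frac{q_n(\alpha)}{\sigma_1 a_1^{(n-1)/2}}\, dy,
\end{align*}
where we have used (24) for $F_{X_{n+p}}(\cdot \mid X_n = y)$, with $n$ instead of $p$ and $X_n = y$ instead of $X_0 = x_0$. Since $q_n(\alpha)/\sigma_1 a_1^{(n-1)/2} = \Phi_n^{\leftarrow}(\alpha)$, we have that
\[ \frac{q_{n+p}(\alpha)/q_n(\alpha)}{a_1^{(p-1)/2} \sqrt{a_0 + a_1 y^2}} = \frac{\Phi_{n+p}^{\leftarrow}(\alpha)/\Phi_n^{\leftarrow}(\alpha)}{\sqrt{\tilde{a}_0 + y^2}}, \]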
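where we have put $\tilde{a}_0 := a_1^{-1} a_0$. In what follows we will, without further comment, replace various quantities by their asymptotic equivalents: this can be made rigorous by standard estimates. First of all, by theorem 4.1,
\[ P(X_{n+p} \leq q_{n+p}(\alpha) \mid X_n \leq q_n(\alpha)) \simeq \frac{c_{n-1;n}}{\alpha}\, (\Phi_n^{\leftarrow}(\alpha))^{-\kappa} \int_1^{\infty} \Phi_p\!\left( \frac{\Phi_{n+p}^{\leftarrow}(\alpha)/\Phi_n^{\leftarrow}(\alpha)}{\sqrt{\tilde{a}_0 + y^2}} \right) \frac{(\log \Phi_n^{\leftarrow}(\alpha)\, y)^{n-1}}{y^{\kappa+1}}\, dy. \tag{49} \]
By lemma 5.1,
\[ (\Phi_n^{\leftarrow}(\alpha))^{-\kappa} \simeq \frac{\alpha}{C_n} \left( \frac{\log \alpha^{-1} C_n}{\kappa} \right)^{-(n-1)}, \]
where the $\alpha$ will cancel the $1/\alpha$ in front of (49). To simplify notations, we put $\lambda := \alpha^{-1} C_n$ and $\Lambda := \log \Phi_n^{\leftarrow}(\alpha) \simeq \kappa^{-1} \log \lambda$. Then, using lemma 5.1 once more, we see that
\[ \Phi_{n+p}^{\leftarrow}(\alpha)/\Phi_n^{\leftarrow}(\alpha) \simeq \gamma (\log \lambda)^{p/\kappa}, \quad \gamma = \gamma_{n,p} := (C_n^{-1} C_{n+p})^{1/\kappa}, \]
and (49) equals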
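\[ \frac{c_{n-1;n}}{C_n} \left( \frac{\kappa}{\log \lambda} \right)^{n-1} \sum_{\nu=0}^{n-1} \binom{n-1}{\nu}\, \Lambda^{n-1-\nu} \int_1^{\infty} \Phi_p\!\left( -\frac{\gamma (\log \lambda)^{p/\kappa}}{\sqrt{\tilde{a}_0 + y^2}} \right) \frac{(\log y)^{\nu}}{y^{\kappa}}\, \frac{dy}{y}. \tag{50} \]
If we make the change of variables $z = \sqrt{\tilde{a}_0 + y^2}$, we recognize the integrals as being the Mellin convolution, evaluated in $\gamma (\log \lambda)^{p/\kappa}$, of $\Phi_p(z)\, 1_{(-\infty,0)}$ with
\[ g_\nu(z) := \frac{(\log \sqrt{z^2 - \tilde{a}_0})^{\nu}}{(z^2 - \tilde{a}_0)^{(\kappa+1)/2}}\, \frac{z}{\sqrt{z^2 - \tilde{a}_0}}\, 1_{[\tilde{a}_0 + 1, \infty)}(z), \]
$\mathrm{Re}\, s < 0\}$, $0 < \eta < 1$, with a unique pole of order $p$ in $s = -\kappa$ and principal part $\sum_{\nu=0}^{p} a_{-\nu} (s + \kappa)^{-\nu}$, with $a_{-p} = (p-1)!\, C_p$. The Mellin transform of $g_\nu$ equals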
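\begin{align*}
g_\nu^\#(s) &= \int_{\tilde{a}_0 + 1}^{\infty} \frac{(\log y)^{\nu}}{y^{\kappa}}\, (y^2 + \tilde{a}_0)^{-s/2}\, \frac{dy}{y} \\
&= \int_1^{\infty} \frac{(\log y)^{\nu}}{y^{\kappa}}\, y^{-s} \left( 1 + \frac{\tilde{a}_0}{y^2} \right)^{s/2} \frac{dy}{y} \\
&= \frac{\nu!}{(s + \kappa)^{\nu+1}} + H(s),
\end{align*}
where
\[ H(s) = \int_1^{\infty} \left( \left( 1 + \frac{\tilde{a}_0}{y^2} \right)^{s/2} - 1 \right) \frac{(\log y)^{\nu}}{y^{\kappa+s+1}}\, dy \]
is holomorphic on $\{-\kappa - 1 < \mathrm{Re}\, s < 0\}$, as follows easily from the asymptotic behavior of $(1 + \tilde{a}_0 y^{-2})^{s/2} - 1$ as $y \to \infty$. Using the arguments of section 3, one finds that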
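\begin{align*}
\Lambda^{n-1-\nu} \cdot \int_1^{\infty} \Phi_p\!\left( -\frac{\gamma (\log \lambda)^{p/\kappa}}{\sqrt{\tilde{a}_0 + y^2}} \right) \frac{(\log y)^{\nu}}{y^{\kappa}}\, \frac{dy}{y}
&\simeq \Lambda^{n-1-\nu} \cdot \mathrm{Const}_\nu \cdot \frac{\left( \log(\gamma (\log \lambda)^{p/\kappa}) \right)^{p+\nu}}{\gamma^{\kappa} (\log \lambda)^{p}} \\
&= O\!\left( (\log \lambda)^{n-1-\nu} \cdot \frac{(\log(\log \lambda))^{p+\nu}}{(\log \lambda)^{p}} \right).
\end{align*}
It follows that the dominant term in (50) is the one with $\nu = 0$. Computing the constant $\mathrm{Const}_0$, which gives $C_p/p$, we find that (49) is asymptotically equivalent to
\[ \frac{c_{n-1;n}\, C_p}{p\, C_n} \left( \frac{\kappa}{\log \lambda} \right)^{n-1} \Lambda^{n-1}\, \frac{(\log \log \lambda)^{p}}{(C_{n+p}/C_n)(\log \lambda)^{p}} \simeq \frac{c_{n-1;n}\, C_p\, C_{p+n}}{p}\, \frac{(\log \log \alpha^{-1})^{p}}{(\log \alpha^{-1})^{p}}. \]
This proves theorem 6.1. QED
Proof of theorem 2.4. Immediate from (47). QED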
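Proof of theorem 2.6. We may use lemma 3.3 and corollary 3.4, with $P$, $q_n(\alpha)$ and $\lambda^{\psi}_{X|Y}$ replaced by $P^{x_0}$, $q^{x_0}_n(\alpha)$, and $\lambda^{\psi,x_0}_{X_{n+p}|X_n}$. By lemma 5.1, (44), we see that, modulo an immaterial constant,
\[ \frac{q^{x_0}_{n+p}(\psi(\alpha))}{q^{x_0}_n(\alpha)} \simeq \left( \frac{\alpha}{\psi(\alpha)} \right)^{1/\kappa} \frac{\left( \log \psi(\alpha)^{-1} C_{n+p} \right)^{(n+p-1)/\kappa}}{\left( \log \alpha^{-1} C_n \right)^{(n-1)/\kappa}}, \]
for $\alpha \to 0$. If $\psi(\alpha) \to 0$ and $\alpha (\log \psi(\alpha)^{-1})^{p}/\psi(\alpha) \to 0$, then certainly $\alpha/\psi(\alpha) \to 0$ and consequently $\log \psi(\alpha)^{-1} = O(\log \alpha^{-1})$ (using that $\psi(\alpha) < 1$ for sufficiently small $\alpha$). It follows that the expression on the right hand side will be dominated by a constant times $\left( \alpha (\log \psi(\alpha)^{-1})^{p}/\psi(\alpha) \right)^{1/\kappa}$ and hence tends to 0 with $\alpha$. QED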
7. Generalized lower tail dependence functions at non-zero α
Consider a stationary ARCH(1) $(X_n)_{n \geq 0}$ and let
\[ \lambda^{\psi}(\alpha) := P\left( X_{n+p} < q_{X_{n+p}}(\psi(\alpha)) \mid X_n < q_{X_n}(\alpha) \right), \tag{51} \]
conditions on $\psi$. To see what happens at a small but non-zero $\alpha$ we have computed some $\lambda^{\psi}(\alpha, p)$ using Monte Carlo simulations. What follows is not intended as a complete simulation study into the behavior of these functions, but as a simple exploratory investigation. Its main aim is to illustrate that our theorems are relevant at values of $\alpha$ equal to 0.01 or 0.05, which are the values typically used in risk management. We have not, in particular, tried to improve the precision by using any variance reduction or importance sampling techniques, leaving these for future work. We will also limit ourselves to $\psi(\alpha) = \sqrt{\alpha}$, as a typical example of a function $\psi$ satisfying the conditions in theorems 2.6 and 2.7.
Note that what matters at non-zero $\alpha$ is not the conditional tail probability (51) itself, but rather its difference with $\psi(\alpha)$, the tail probability for the independent case. Remembering that we took $\psi(\alpha) = \sqrt{\alpha}$, we therefore introduce the (generalized) excess lower tail probability
\[ e(\alpha, p) := \lambda^{\sqrt{\cdot}}(\alpha, p) - \sqrt{\alpha}. \tag{52} \]
We first look at stationary processes. Taking $a_0 = 0.001$, we have simulated 250 ARCH(1)'s of length $10^4$ with Student $t_4$ innovations $\varepsilon_n$ and $a_1$ ranging from 0 to 1.6 with step-size 0.1. We note that such an ARCH(1) is strictly stationary if $a_1 < \sqrt{e} \simeq 1.6489$. For each Monte Carlo run we computed the empirical conditional probabilities (51) and the difference (52), and took their averages over the different runs, to be denoted by $\hat{\lambda}^{\sqrt{\cdot}}(\alpha, p)$, $\hat{e}(\alpha, p)$, as well as their standard deviations, to assess the significance of the estimated values. The results, rounded off to two significant figures, are given in the following table.
a1   λ̂^{√·}(0.05,1)   ê(0.05,1)   std   λ̂^{√·}(0.01,1)   ê(0.01,1)   std
0 0.23 0 0.02 0.1 0 0.03
0.1 0.29 0.06 0.02 0.23 0.13 0.04
0.2 0.32 0.1 0.02 0.3 0.2 0.05
0.3 0.35 0.13 0.02 0.34 0.24 0.05
0.4 0.37 0.15 0.02 0.38 0.28 0.05
0.5 0.39 0.17 0.02 0.40 0.30 0.05
0.6 0.41 0.19 0.02 0.43 0.33 0.05
0.7 0.43 0.20 0.02 0.44 0.34 0.05
0.8 0.44 0.22 0.02 0.45 0.35 0.05
0.9 0.45 0.23 0.02 0.47 0.37 0.05
1.0 0.47 0.24 0.02 0.48 0.38 0.05
1.1 0.47 0.25 0.02 0.49 0.39 0.05
1.2 0.48 0.26 0.02 0.49 0.39 0.05
1.3 0.49 0.26 0.02 0.49 0.39 0.05
1.4 0.49 0.27 0.02 0.5 0.4 0.05
1.5 0.49 0.27 0.02 0.49 0.39 0.05
1.6 0.49 0.27 0.02 0.5 0.4 0.05
Table 7.1 - Stationary tail probabilities as function of a1.
The table shows for example that for a value of $a_1 = 0.5$ there is a close to 40% chance that $X_{n+1}$ will fall below its $\sqrt{0.05} \simeq 22\%$ quantile, given that $X_n$ has exceeded its 95% VaR. This conditional probability is about the same for the 99% VaR, and in that case about 0.30 higher, in absolute terms, than what it would have been had successive returns been independent. The effect becomes more pronounced with increasing values of $a_1$.
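A single Monte Carlo run of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the tables: the innovations are raw Student $t_4$ draws from NumPy (whether the paper rescales them is not stated in this section), the quantiles in (51) are replaced by empirical quantiles of the simulated path, a burn-in period is discarded, and all function names and default values are hypothetical.

```python
import math

import numpy as np


def simulate_arch1(n_steps, a0, a1, df=4, burn_in=1000, rng=None):
    """Simulate X_n = sqrt(a0 + a1 * X_{n-1}^2) * eps_n with Student-t innovations."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_t(df, size=n_steps + burn_in)  # assumption: raw t_df draws, not rescaled
    x = np.empty(n_steps + burn_in)
    prev = 0.0
    for i in range(n_steps + burn_in):
        prev = math.sqrt(a0 + a1 * prev * prev) * eps[i]
        x[i] = prev
    return x[burn_in:]  # drop the burn-in so the retained path is close to stationarity


def excess_tail_probability(x, alpha, p=1):
    """Empirical versions of (51) and (52) with psi(alpha) = sqrt(alpha)."""
    psi = math.sqrt(alpha)
    q_alpha = np.quantile(x, alpha)  # empirical alpha-quantile of the path
    q_psi = np.quantile(x, psi)      # empirical psi(alpha)-quantile of the path
    violated = x[:-p] < q_alpha      # times n at which X_n is below its alpha-quantile
    lam_hat = float(np.mean(x[p:][violated] < q_psi))
    return lam_hat, lam_hat - psi
```

Averaging the second return value over many independent paths, and recording the standard deviation across paths, gives estimates of the type reported in Table 7.1; for $a_1 = 0$ the excess should be close to 0, since the process is then i.i.d.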
p   ê(p) (std), a1 = 0.5   ê(p) (std), a1 = 1   ê(p) (std), a1 = 1.5
1 0.31 (0.05) 0.38 (0.05) 0.40 (0.05)
2 0.20 (0.05) 0.35 (0.06) 0.39 (0.05)
3 0.13 (0.05) 0.30 (0.06) 0.38 (0.05)
4 0.07 (0.04) 0.26 (0.06) 0.37 (0.06)
5 0.05 (0.04) 0.22 (0.06) 0.36 (0.05)
6 0.03 (0.04) 0.19 (0.06) 0.35 (0.06)
7 0.02 (0.03) 0.16 (0.06) 0.34 (0.06)
8 0.01 (0.04) 0.14 (0.06) 0.32 (0.06)
9 0.01 (0.03) 0.11 (0.05) 0.31 (0.07)
10 0 (0.03) 0.10 (0.05) 0.29 (0.07)
Table 7.2 - Stationary excess tail probabilities as function of the lag p.
As was to be expected, the excess tail probabilities decrease with $p$. This decrease is slower the larger $a_1$ is (and therefore the more pronounced the ARCH-effect): for $a_1 = 0.5$, our estimate of $e(0.01, p)$ is no longer significantly different from 0 from $p = 6$ onwards, while for $a_1 = 1.5$, we found for example that $\hat{e}(0.01, p = 20) = 0.16\ (0.08)$, which indicates a long persistence of return shocks at time 1.
We have repeated these computations for non-stationary ARCH(1)'s with a.s. initial value $X_0 = x_0$. Let
\[ \lambda^{\sqrt{\cdot};x_0}(\alpha; k, k+p) := P^{x_0}\left( X_{k+p} < q_{X_{k+p}}(\sqrt{\alpha}) \mid X_k < q_{X_k}(\alpha) \right) \tag{53} \]
and $e^{x_0}(\alpha; k, k+p) = \lambda^{\sqrt{\cdot};x_0}(\alpha; k, k+p) - \sqrt{\alpha}$. We will limit ourselves to $k = 1$, since this would seem to be the most relevant case for day-to-day risk-management: one would want to know the after-effects of a possible value-at-risk violation tomorrow. Note that for large values of $k$ it is to be expected that $e^{x_0}(\alpha; k, k+p) \simeq e(\alpha, p)$, at least for such values of $a_1$ for which the ARCH(1) has a unique (in distribution) stationary solution.
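For the non-stationary quantity (53) the natural Monte Carlo estimate is cross-sectional: one simulates many independent short paths started at the same $x_0$ and takes quantiles across paths. The sketch below illustrates this under the same assumptions as before (raw Student $t_4$ innovations from NumPy, empirical quantiles); it is not the paper's code, and the function name, the default number of paths and the seed handling are hypothetical.

```python
import numpy as np


def conditional_excess(alpha, k, p, a0, a1, x0, n_paths=200_000, df=4, rng=None):
    """Estimate e^{x0}(alpha; k, k+p) of (53) from independent paths started at x0."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.full(n_paths, float(x0))  # every path starts at X_0 = x0
    x_k = None
    for step in range(1, k + p + 1):
        # one ARCH(1) step for all paths at once
        x = np.sqrt(a0 + a1 * x * x) * rng.standard_t(df, size=n_paths)
        if step == k:
            x_k = x.copy()  # keep the cross-section at time k
    q_k = np.quantile(x_k, alpha)          # alpha-quantile of X_k across paths
    q_kp = np.quantile(x, np.sqrt(alpha))  # sqrt(alpha)-quantile of X_{k+p} across paths
    violated = x_k < q_k
    lam_hat = float(np.mean(x[violated] < q_kp))
    return lam_hat - float(np.sqrt(alpha))
```

With `k = 1` and `p = 1` this corresponds to the quantity $e^{x_0}(\alpha; 1, 2)$ studied below; repeating the call with different seeds gives an idea of the Monte Carlo error.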
We estimated $e^{x_0}(\alpha; 1, 2)$ for $a_1$'s between 0 and 2, with $x_0 = 1$ and $a_0 = 0.001$ and $\varepsilon \sim t_4$ as before. It turns out that $\hat{e}^{x_0}(\alpha; 1, 2)$ now increases very rapidly for $a_1$'s between 0 and 0.1, and then remains practically constant:
a1   ê^{x0=1}(0.05; 1,2)   std   ê^{x0=1}(0.01; 1,2)   std
0 0 0.02 0 0.03
0.01 0.06 0.02 0.12 0.04
0.02 0.11 0.02 0.2 0.05
0.03 0.14 0.02 0.23 0.05
0.04 0.16 0.02 0.25 0.05
0.05 0.17 0.02 0.26 0.05
0.06 0.18 0.02 0.28 0.04
0.07 0.18 0.02 0.27 0.04
0.08 0.19 0.02 0.28 0.05
0.09 0.19 0.02 0.28 0.05
0.1 0.19 0.02 0.28 0.05
Table 7.3 - Non-stationary excess tail probabilities as function of a1.
or $\hat{\pi}^{x_0}(0.01; 1, 2) = \hat{e}^{x_0}(0.01; 1, 2) + 0.1$ come as close to the $\alpha \to 0$-limit of 0.5 as in the stationary case, for the range of $a_1$'s considered.
We next look at the dependence of $e_1(1, 1+p) := e^{x_0=1}(\alpha; 1, 1+p)$ on the lag $p$, for $\alpha = 0.01$ and $a_1 = 0.05, 0.1, 0.2$ and $0.5$; as before, the number in brackets is the standard deviation of our Monte Carlo estimate.
p \ ê_1(1,1+p)   a1 = 0.05     a1 = 0.1      a1 = 0.2      a1 = 0.5
1                0.27 (0.05)   0.28 (0.05)   0.29 (0.05)   0.29 (0.05)
2                0.1 (0.04)    0.17 (0.04)   0.21 (0.04)   0.22 (0.05)
3                0.01 (0.03)   0.07 (0.04)   0.14 (0.04)   0.18 (0.05)
4                0.01 (0.03)   0.02 (0.03)   0.07 (0.04)   0.15 (0.04)
Table 7.4 - Non-stationary excess tail probabilities as function of the lag.
The excess probabilities start at roughly the same level, but their decay with increasing $p$ is slower the larger $a_1$ is (for $a_1 = 0.05$ they are already indistinguishable from 0, within the precision of our MC calculations, from $p = 3$ onwards).
We finally take a look at the dependence on $x_0$. The next table gives an example, again with $\alpha = 0.01$, $k = 1$ and lag $p = 1$, and for three values of $a_1$:
x0 \ ê^{x0}(1,2)   a1 = 0.05     a1 = 0.1      a1 = 0.15
0                  0.07 (0.04)   0.11 (0.04)   0.14 (0.04)
0.1                0.1 (0.04)    0.16 (0.04)   0.18 (0.05)
0.2                0.14 (0.04)   0.21 (0.05)   0.23 (0.05)
0.3                0.18 (0.04)   0.24 (0.05)   0.26 (0.05)
0.4                0.20 (0.05)   0.25 (0.05)   0.27 (0.05)
0.5                0.22 (0.05)   0.26 (0.05)   0.27 (0.05)
0.6                0.24 (0.05)   0.28 (0.05)   0.28 (0.05)
0.7                0.25 (0.05)   0.28 (0.05)   0.28 (0.05)
0.8                0.25 (0.05)   0.28 (0.05)   0.28 (0.05)
0.9                0.26 (0.05)   0.28 (0.05)   0.28 (0.05)
1                  0.27 (0.05)   0.28 (0.05)   0.29 (0.05)
Table 7.5 - Non-stationary excess tail probabilities as function of x0.
For small $x_0$ there is a notable effect on the size of the excess probability, with initially a roughly linear increase which flattens out for large values of $x_0$.
Appendix A. Asymptotic expansions and Mellin transforms
We recall some basic facts regarding the Mellin transform, and its relation with asymptotic expansions of integrals. If $u: [0, \infty) \to \mathbb{R}$ is a locally integrable function which is bounded near 0, and has polynomial decay $|u(x)| \leq C|x|^{-k}$, then its Mellin transform $u^\#(s)$ is defined by
\[ u^\#(s) = \int_0^{\infty} u(x)\, x^{-s-1}\, dx, \quad \mathrm{Re}\, s < 0. \tag{54} \]
The integral is absolutely convergent, and defines a holomorphic function on $\{s = \sigma + i\eta \in \mathbb{C} : -k < \sigma < 0\}$, which is bounded on each sub-strip $\{-k + \varepsilon < \sigma < -\varepsilon\}$, $\varepsilon > 0$.
If $u^\#(s)$ is integrable on the line $(\sigma - i\infty, \sigma + i\infty)$, $-k < \sigma < 0$, then we can recover $u$ from $u^\#$ using Mellin's inversion formula:
\[ u(x) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} u^\#(s)\, x^{s}\, ds. \tag{55} \]
Integrability and other decay-properties of $u^\#(s)$ will generally follow from smoothness properties of $u$, using integration by parts. A very useful property of the Mellin transform is that it turns convolution on the multiplicative group $\mathbb{R}_{>0}$ into a product: if
\[ u * v(x) := \int_0^{\infty} u(y)\, v(y^{-1} x)\, \frac{dy}{y}, \tag{56} \]
then
\[ (u * v)^\#(s) = u^\#(s)\, v^\#(s), \tag{57} \]
whenever both sides make sense.
The (classical) connection between the Mellin transform of $u$ and the asymptotics of $u(x)$ for $x \to \infty$ is based on the following observation: suppose that, under the above hypothesis, $u^\#(s)$, $s = \sigma + i\eta$, extends to a meromorphic function on a slightly larger strip $\{-k' < \sigma < 0\}$ (where $k' > k$), and has a single simple pole in $s = -k \in \mathbb{R}_{<0}$. Then by shifting the integration path of (55) from $\mathrm{Re}\, s = \sigma$ to $\mathrm{Re}\, s = -k - \varepsilon$, $\varepsilon < k' - k$, and formally applying Cauchy's residue theorem, we obtain that
\begin{align*}
u(x) &= \mathrm{Res}_{s=-k}\, u^\#(s)\, x^{s} + (2\pi i)^{-1} \int_{\mathrm{Re}(s) = -k-\varepsilon} u^\#(s)\, x^{s}\, ds \\
&= c\, x^{-k} + O(x^{-k-\varepsilon}),
\end{align*}
with
\[ c = \mathrm{Res}_{s=-k}\, u^\#(s) = \lim_{s \to -k} (s + k)\, u^\#(s). \]
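This can easily be made rigorous under the condition that $u^\#(s)$ is integrable on the line $\mathrm{Re}(s) = -k - \varepsilon$ and that $|u^\#(\sigma + i\eta)| \to 0$ as $\eta \to \pm\infty$, uniformly for $\sigma \in [-k - \varepsilon, k + \varepsilon]$. Multiple poles can be handled similarly, but will introduce logarithmic terms: if $u^\#(s)$ has a pole of order $p$ in $s = -k$, then
\begin{align*}
\mathrm{Res}_{s=-k}\, u^\#(s)\, x^{s} &= \lim_{s \to -k} \frac{1}{(p-1)!}\, \frac{d^{p-1}}{ds^{p-1}} \left( (s + k)^{p}\, u^\#(s)\, x^{s} \right) \tag{58} \\
&= \sum_{\nu=0}^{p-1} c_\nu (\log x)^{\nu} x^{-k}.
\end{align*}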
The coefficients $c_\nu$ can be computed by noting that if
\[ u^\#(s) = \frac{a_{-p}}{(s+k)^{p}} + \frac{a_{-(p-1)}}{(s+k)^{p-1}} + \cdots + \frac{a_{-1}}{s+k} + a_0 + a_1 (s+k) + \cdots \]
is the Laurent expansion of $u^\#(s)$ around $s = -k$, then the residue in (58) is the coefficient of $(s+k)^{-1}$ in the Laurent expansion of the product $u^\#(s)\, x^{s} = x^{-k}\, u^\#(s)\, e^{(s+k)\log x}$. This gives
\[ c_\nu = \frac{a_{-(\nu+1)}}{\nu!}. \tag{59} \]
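As a quick sanity check of (59), consider the simplest function with a logarithmic term. The example below is an illustration added here, with $u$ chosen for convenience; it is not taken from the paper.

```latex
% Worked example for (59): u(x) = (\log x) x^{-k} for x >= 1 and u(x) = 0 for x < 1.
\begin{align*}
u^\#(s) &= \int_1^{\infty} (\log x)\, x^{-k-s-1}\, dx = \frac{1}{(s+k)^2},
  \qquad \mathrm{Re}\, s > -k, \\
% Laurent coefficients at s = -k: a_{-2} = 1, a_{-1} = 0, so by (59)
c_1 &= \frac{a_{-2}}{1!} = 1, \qquad c_0 = \frac{a_{-1}}{0!} = 0, \\
% which agrees with the direct residue computation
\mathrm{Res}_{s=-k}\, \frac{x^{s}}{(s+k)^2}
  &= \left. \frac{d}{ds}\, x^{s} \right|_{s=-k} = (\log x)\, x^{-k}.
\end{align*}
```

The double pole of $u^\#$ at $s = -k$ thus reproduces exactly the term $(\log x)\, x^{-k}$ one started from, with no lower-order contribution.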
In case there is more than one pole, one simply adds up the contributions of the individual poles to the asymptotics. We summarize this discussion in the following lemma:
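Lemma A.1. (i) Suppose that, for suitable $p$, $c_\nu \in \mathbb{C}$ and $\eta > 0$, we have
\[ u(x) = \sum_{\nu=0}^{p-1} c_\nu (\log x)^{\nu} x^{-k} + O(x^{-k-\eta}), \quad x \to \infty. \tag{60} \]
Then $u^\#$ is of the form
\[ u^\#(s) = \sum_{j=1}^{p} a_{-j} (s+k)^{-j} + h(s), \tag{61} \]
with $h = h(s)$ holomorphic on $-k - \eta < \mathrm{Re}\, s < 0$.
(ii) Conversely, suppose that $u^\#(s)$ is given by (61), with $t \to u^\#(\sigma + it)$ integrable, and $|u^\#(\sigma + it)| \to 0$ as $t \to \pm\infty$, uniformly for $\sigma$ in compact subsets of $(-k - \eta, 0)$. Then we have that, for all $\varepsilon > 0$, $u(x)$ has the truncated asymptotic expansion (60), with $\eta$ replaced by $\eta - \varepsilon$.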
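Proof. (i) This follows from
\[ u^\#(s) = \sum_{\nu=0}^{p-1} \int_1^{\infty} c_\nu\, \frac{(\log x)^{\nu}}{x^{k+s+1}}\, dx + h(s), \]
with $h(s)$ holomorphic on $-k - \eta < \mathrm{Re}\, s < 0$, and
\[ \int_1^{\infty} \frac{(\log x)^{\nu}}{x^{k+1}}\, x^{-s}\, dx = (-1)^{\nu} \left( \frac{d}{ds} \right)^{\nu} \int_1^{\infty} \frac{dx}{x^{k+s+1}} = \frac{\nu!}{(s+k)^{\nu+1}}. \]
(ii) By shifting the integration path, as explained above. QED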
References
[1] Acerbi, C., Tasche, D., 2002. On the coherence of expected shortfall, J. of Banking and Finance, 26, 1487-1503.
[2] Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., 1999. Coherent measures of risk, Math. Finance, 9, 203-228.
[3] Bank for International Settlements website http://www.bis.org/bcbs/
[4] Basrak, B., Davis, R. A., Mikosch, T., 2002. Regular variation of GARCH processes, Stoch. Process. Appl., 99, 95-115.
[5] Berkowitz, J., O'Brien, J., 2002. How accurate are value-at-risk models at commercial banks? J. Finance LVII, 1093-1111.
[6] Bingham, N. H., Goldie, C. M., Teugels, J. L., 1987. Regular Variation, Cambridge University Press, Cambridge.
[7] Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity, J. of Econometrics 31, 307-325.
[8] Bollerslev, T., Engle, R. F., Nelson, D. B., 1994. ARCH models, in: Engle, R. F., McFadden, D., eds., Handbook of Econometrics, vol. 4, North Holland, Amsterdam.
[9] Brummelhuis, R., Guégan, D., 2001. Multi-period conditional distribution functions for conditionally normal GARCH(1, 1)-models, Univ. of Reims preprint; to appear in J. Appl. Probability, 2005.
[10] Campbell, J. Y., Lo, A. W., MacKinlay, A. C., 1997. The Econometrics of Financial Markets, Princeton University Press, Princeton, New Jersey.
[11] Embrechts, P., Klüppelberg, C., Mikosch, T., 1997. Modelling Extremal Events for Insurance and Finance, Springer, Berlin.
[12] Embrechts, P., McNeil, A., Straumann, D. Correlation and dependence in risk management: properties and pitfalls. In: Risk Management: Value at Risk and Beyond, ed. M.A.H. Dempster, Cambridge University Press, Cambridge, pp. 176-223.
[13] Engle, R. F., 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation, Econometrica 50, 987-1007.
[14] Engle, R. F. (Editor), 2000. ARCH - Selected Readings, Advanced Texts in Econometrics, Oxford University Press.
[15] Frey, R., McNeil, A., 2000. Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach, J. Empirical Finance, 7, 271-300.
[16] Goldie, C. M., 1991. Implicit renewal theory and tails of solutions of random equations, Ann. Appl. Prob. 1, 126-166.
[18] Kesten, H., 1973. Random difference equations and renewal theory for products of random variables, Acta Math. 131, 207-248.
[19] Mandelbrot, B., 1963. The variation of certain speculative prices, J. Business, 36, 394-419.
[20] Nelson, D. B., 1990. Stationarity and persistence in the GARCH(1, 1) model, Econometric Theory, 6, 318-334.
[21] Rockafellar, R. T., Uryasev, S., 2002. Conditional value-at-risk for general loss distributions, J. Banking Finance, 26, 1443-1471.