Alexander J. McNeil
Departement Mathematik
ETH Zentrum
CH-8092 Zurich
Tel: +41 1 632 61 62
Fax: +41 1 632 15 23
May 17, 1999
Abstract
We providean overview oftherole of extremevalue theory(EVT)inrisk
man-agement (RM), as a method formodellingand measuring extreme risks. We
con-centrate onthepeaks-over-threshold(POT) modeland emphasizethegeneralityof
this approach. Wherever the tail of a loss distribution is of interest, whether for
market, credit, operationalor insurance risks, the POT method provides a simple
toolforestimatingmeasuresoftailrisk. InparticularweshowhowthePOTmethod
maybeembeddedinastochasticvolatilityframeworktodeliverusefulestimates of
Value-at-Risk (VaR) and expected shortfall,a coherent alternative to theVaR,for
marketrisks. Furthertopicsofinterest,includingmultivariateextremes,modelsfor
stress losses and software forEVT,arealso discussed.
1 A General Introduction to Extreme Risk
Extreme event riskispresent inallareasof riskmanagement. Whether weare concerned
with market, credit, operational or insurance risk, one of the greatest challenges to the
riskmanageristoimplementriskmanagementmodelswhichallowforrarebut damaging
events,and permit the measurement of their consequences.
This paper may be motivated by any numberof concrete risk management problems.
Inmarketrisk,wemightbeconcernedwiththe daytodaydeterminationofthe V
alue-at-Risk (VaR)for the losses we incur ona tradingbook due to adverse market movements.
In credit or operational risk management our goal might be the determination of the
risk capitalwerequire asacushionagainstirregularlosses fromcreditdowngradings and
defaults or unforeseen operationalproblems.
Alongside these nancial risks, it is also worth considering insurance risks; the
in-surance world has considerable experience inthe management of extreme risk and many
methodswhichwemightnowrecognize asbelongingtoextremevalue theory have along
AlexanderMcNeilisSwissReResearchFellowatETHZurichandgratefullyacknowledgesthe
reservesfor productswhichoerprotectionagainst catastrophiclosses,suchas
excess-of-loss (XL)reinsurance treaties concluded withprimary insurers.
Whatever the typeof riskwe are considering our approach toitsmanagementwillbe
similar in this paper. We will attempt to model it in such away that the possibility of
an extreme outcome is addressed. Using our modelwe willattempt to measure the risk
with a measurement which provides information about the extreme outcome. In these
activities extreme value theory (EVT) willprovidethe toolswe require.
1.1 Modelling Extreme Risks
The standard mathematicalapproach tomodellingrisks uses the languageof probability
theory. Risks are random variables, mapping unforeseen future states of the world into
valuesrepresentingprotsand losses. Theserisksmaybeconsideredindividually,orseen
aspartofastochasticprocesswherepresentrisksdependonpreviousrisks. Thepotential
valuesofariskhaveaprobabilitydistributionwhichwewillneverobserveexactlyalthough
past losses due to similar risks, where available, may provide partial information about
that distribution. Extreme events occur when a risk takes values from the tail of its
distribution.
We develop a model for a risk by selecting a particular probability distribution. We
mayhaveestimatedthisdistributionthrough statisticalanalysisofempiricaldata. Inthis
case EVT is a tool which attempts to provide us with the best possible estimate of the
tailarea of the distribution. However, even in the absence of useful historicaldata, EVT
provides guidance on the kind of distribution we should select so that extreme risks are
handled conservatively.
1.2 Measuring Extreme Risks
For our purposes, measuring a risk means summarising its distribution with a number
known asa risk measure. At the simplestlevel,we mightcalculatethe mean orvariance
of a risk. These measure aspects of the risk but do not provide much information about
the extreme risk. In this paper we will concentrate on two measures which attempt to
describe the tail of a loss distribution - VaR and expected shortfall. We shall adopt the
convention that a loss is a positive number and a prot is a negative number. EVT is
most naturallydevelopedasatheory oflargelosses, ratherthanatheory ofsmallprots.
VaRisahighquantileofthedistributionoflosses,typicallythe95thor99thpercentile.
It providesa kind of upper bound for a loss that is only exceeded ona smallproportion
of occasions. Itissometimesreferredtoasacondencelevel,althoughthis isamisnomer
whichis at odds with standard statisticalusage.
In recent papers Artzner, Delbaen, Eber & Heath (1997) have criticized VaR as a
measure of riskon two grounds. Firstthey showthat VaR isnot necessarilysubadditive
so that, in their terminology, VaR is not a coherent risk measure. There are cases where
a portfolio can be split into sub-portfolios such that the sum of the VaR corresponding
to the sub-portfolios is smaller than the VaR of the total portfolio. This may cause
problems if the risk-management system of anancial institution isbased on VaR-limits
for individual books. Moreover, VaRtells usnothing about the potentialsize of the loss
that exceeds it.
and is coherent according totheir denition.
1.3 Extreme Value Theory
The approach to EVT in this paper follows most closely Embrechts, Kluppelberg &
Mikosch (1997); other recent texts on EVT include Reiss & Thomas (1997) and
Beir-lant, Teugels & Vynckier (1996). All of these texts emphasize applications of the theory
in insurance and nance although much of the original impetus for the development of
the methodscame from hydrology.
Broadlyspeaking,therearetwoprincipalkindsofmodelforextremevalues. Theoldest
groupofmodelsaretheblockmaximamodels;thesearemodelsforthelargestobservations
collected from large samples of identically distributed observations. For example, if we
record dailyorhourly losses and protsfrom tradingaparticular instrumentorgroup of
instruments, the block maxima method provides a modelwhich may be appropriate for
the quarterly or annual maximum of such values. We see a possible role for this method
inthedenition andanalysisofstresslosses(McNeil1998)and willreturntothis subject
in Section4.1.
Amoremoderngroupofmodelsare thepeaks-over-threshold(POT)models;theseare
models for all large observations which exceed a high threshold. The POT models are
generally considered to be the most useful for practical applications, due to their more
eÆcient useof the(often limited)dataonextremevalues. This paperwillconcentrateon
such models.
Within the POT class of models one may further distinguish two styles of
analy-sis. There are the semi-parametric models built around the Hill estimator and its
rela-tives(Beirlantetal.1996,Danielsson,Hartmann&deVries 1998)andthefully
paramet-ric models based on the generalized Pareto distribution or GPD (Embrechts, Resnick &
Samorodnitsky1998). There islittleto pickand choose between these approaches -both
are theoreticallyjustiedand empiricallyusefulwhenusedcorrectly. Wefavour thelatter
style of analysis for reasons of simplicity - both of exposition and implementation. One
obtains simpleparametric formulae for measures ofextreme risk forwhichit isrelatively
easy to give estimates of statistical error using the techniques of maximum likelihood
inference.
The GPD will thus be the main tool we describe in this paper. It is simplyanother
probability distribution but for purposes of risk management it should be considered as
equally important as (if not more importantthan) the Normal distribution. The tails of
the Normal distribution are toothin to addressthe extremeloss.
We willnot describe the Hill estimatorapproach inthis paper; we referthe readerto
the references above and alsoto Danielsson& de Vries (1997).
2 General Theory
LetX
1 ;X
2
;::: be identicallydistributedrandomvariableswith unknown underlying
dis-tribution function F(x) = PfX
i
xg. (We work with distribution functions and not
densities.) The interpretationof these randomrisks isleft tothe reader. They mightbe:
Daily(negative)returns onnancialasset or portfolio{ losses and prots
Catastrophic insurance claims
Credit losses
Moreover, they might represent risks which we can directly observe or they might also
represent risks which we are forced tosimulate in some Monte Carlo procedure, because
of the impracticality of obtaining data. There are situations where, despite simulating
fromaknownstochasticmodel,thecomplexityofthesystem issuchthatwedonotknow
exactly what the lossdistribution F is.
Weavoidassumingindependence, whichfor certainof theabove interpretations
(par-ticularly marketreturns) is well-known tobeunrealistic.
2.1 Measures of Extreme Risk
Mathematically we dene our measures of extreme risk in terms of the loss distribution
F. Let1>q0:95 say. Value-at-Risk (VaR) isthe qthquantile of the distribution F
VaR
q =F
1
(q);
where F 1
isthe inverse of F,and expectedshortfall isthe expected losssize, given that
VaRis exceeded
ES
q
=E[X jX >VaR
q ]:
Like F itself these are theoretical quantities which we will never know. Our goalin risk
measurement isestimates d
VaR
q and
c
ES
q
of thesemeasures and in this chapterweobtain
twoexplicit formulae (6)and (10). The reader whowishes toavoid mathematicaltheory
may skim through the remaining sections of this chapter, pausing only to observe that
the formulae inquestion are simple.
2.2 Generalized Pareto Distribution
The GPD is a two parameter distributionwith distribution function
G
; (x)=
8
<
:
1 (1+x=) 1=
6=0;
1 exp ( x=) =0;
where >0,and where x0when 0and 0x = when <0.
Thisdistributionisgeneralizedinthesensethatitsubsumescertainotherdistributions
underacommonparametricform. isthe importantshapeparameterofthe distribution
and is an additionalscaling parameter. If >0 then G
;
is a reparametrized version
of the ordinaryParetodistribution, which has along history inactuarialmathematicsas
a modelfor large losses; = 0 corresponds to the exponential distribution and < 0 is
known asa Pareto type II distribution.
The rst case is the most relevant for risk management purposes since the GPD is
heavy-tailed when > 0. Whereas the normal distribution has moments of all orders, a
heavy-tailed distributiondoes not possess a complete set of moments. In the case of the
GPD with>0wendthat E[X k
fourth moment.
Certain types of large claims data in insurance typically suggest an innite second
moment; similarly econometricians might claim that certain market returns indicate a
distribution with innite fourth moment. The normal distribution cannot model these
phenomena but the GPD is used to capture precisely this kind of behaviour, as we shall
explain in the next sections. To make matters concrete we take a popular insurance
example. Our data consist of 2156 large industrial re insurance claims from Denmark
covering the years 1980 to 1990. The reader may visualize these losses in any way he or
she wishes.
2.3 Estimating Excess Distributions
The distribution of excesseslosses over ahigh threshold uis dened tobe
F
u
(y)=P fX uyjX >ug; (1)
for 0y<x
0
u where x
0
1 is the right endpoint of F, tobeexplained below. The
excess distribution represents the probability that a loss exceeds the threshold u by at
most anamount y, given the informationthat it exceeds the threshold. It is very useful
to observe that it can be writtenin terms of the underlyingF as
F
u (y)=
F(y+u) F(u)
1 F(u)
: (2)
Mostlywewould assumeour underlyingF isa distributionwithaninniteright
end-point,i.e.itallows thepossibilityofarbitrarilylargelosses, even ifitattributes negligible
probability tounreasonably large outcomes, e.g. the normal ort{distributions. But itis
also conceivable, in certain applications, that F could have a nite right endpoint. An
example is the beta distribution on the interval [0;1] which attributes zero probability
to outcomes larger than 1 and which might be used, for example, as the distribution of
credit losses expressed asa proportionof exposure.
Thefollowinglimittheorem isakey resultinEVTandexplainsthe importanceof the
GPD.
Theorem 1 Foralargeclassof underlyingdistributionswecanndafunction(u)such
that
lim
u!x0
sup
0y<x
0 u
jF
u
(y) G
;(u)
(y)j=0:
That is,foralarge class ofunderlyingdistributionsF, asthethreshold uis progressively
raised, the excess distribution F
u
converges to a generalized Pareto. The theorem is of
course not mathematically complete, because we fail to say exactly what we mean by a
large class of underlying distributions. For this paper it is suÆcient to know that the
class contains allthe commoncontinuous distributions of statistics and actuarialscience
(normal, lognormal, 2
, t, F, gamma,exponential,uniform, beta, etc.).
In the sense of the above theorem, the GPD is the natural model for the unknown
excess distributionabove suÆcientlyhigh thresholds,and this factisthe essential insight
on which our entire method is built. Our model for a risk X
i
to be exactlyGPD for some and
F
u
(y)=G
;
(y): (3)
AssumingwehaverealisationsofX
1 ;X
2
;::: weuse statisticstomakethe modelmore
precise by choosing a sensible u and estimating and . Supposing that N
u
out of a
total ofn data pointsexceed thethreshold, the GPDis ttedtothe N
u
excesses by some
statistical tting method to obtain estimates ^
and ^
. We favour maximum likelihood
estimation(MLE)oftheseparameters,wheretheparametervaluesarechosentomaximize
the joint probabilitydensity of the observations. This is the most generaltting method
instatisticsanditalsoallowsustogiveestimates ofstatisticalerror (standarderrors)for
the parameter estimates.
Choiceofthe threshold isbasicallyacompromisebetween choosingasuÆciently high
threshold so that the asymptotic theorem can be considered to be essentially exact and
choosing a suÆciently low threshold so that we have suÆcient material for estimation of
the parameters. For further informationonthis data-analytic issue see McNeil(1997).
Forour demonstrationdata, we take a threshold at10 (millionKrone). This reduces
our n=2156 losses to N
u
=109 threshold exceedances. On the basis of these data and
are estimated to be 0.50 and 7.0; the value of shows the heavy-tailedness of the
data and suggests a good explanatory model may have an innite variance. In Figure 1
the estimated GPD model for the excess distribution is shown as a smooth curve. The
empiricaldistributionofthe109 extremevaluesisshownby points;itisevidentthe GPD
modelts these excess losses well.
2.4 Estimating Tails of Distributions
By setting x = u+y and combining expressions (2) and (3) we see that our model can
also bewritten as
F(x)=(1 F(u))G
;
(x u)+F(u); (4)
for x>u. Thisformulashows that wemay moveeasilytoaninterpretationof the model
in terms of the tailof the underlying distributionF(x) for x>u.
Ouraimistouse (4)toconstruct atail estimatorand the onlyadditionalelementwe
require to dothis is anestimate of F(u). Forthis purpose we take the obvious empirical
estimator(n N
u
)=n. That is,we use the methodof historicalsimulation (HS).
An immediate question is, why do we not use the HS method to estimate the whole
tail of F(x) (i.e. for all x u)? This is because historical simulation is a poor method
in the tail of the distribution where data become sparse. In setting a threshold atu we
are judging that we have suÆcient observations exceeding u to enable a reasonable HS
estimate of F(u), but for higher levels the historicalmethod would be too unreliable.
Putting our HS estimate of F(u) and our maximum likelihood estimates of the
pa-rameters of the GPD together we arrive atthe tail estimator
^
F(x)=1 N
u
n
1+ ^
x u
^
1= ^
; (5)
itisimportanttoobservethatthisestimatorisonlyvalidforx>u. Thisestimatecanbe
bestunderstoodinthe situationwhenthe datamayalsobeassumed independent oronly
weakly dependent.
ForourdemonstrationdatatheHSestimateofF(u)is0.95((2156 109)=2156)sothat
ourthresholdispositioned(approximately)atthe95thsamplepercentile. Combiningthis
with our parametric model for the excess distribution we obtain the tail estimateshown
inFigure2. Inthis gurethey-axisactuallyindicatesthetailprobabilities1 F(x). The
topleftcornerofthegraphshowsthatthethresholdof10correspondstoatailprobability
of 0.05(asestimatedby HS).The pointsagainrepresentthe109 largelossesandthe solid
curve shows howthe tailestimationformulaallowsextrapolationintothe areawhere the
data becomea sparse and unreliable guide totheir unknown parent distribution.
2.5 Estimating VaR
For a given probability q > F(u) the VaR estimate is calculated by inverting the tail
estimation formula(5) toget
d
VaR
q
=u+ ^
^
n
N
u
(1 q)
^
1 !
: (6)
Instandardstatisticallanguagethisisaquantileestimate,wherethequantileisan
un-knownparameterofanunknownunderlyingdistribution. Itispossibletogiveacondence
interval for d
VaR
q
using a method known as prole likelihood; this yields an asymptotic
interval in which we have condence that VaR lies. The asymmetric interval reects a
fundamental asymmetry in the problem of estimating a high quantile for heavy-tailed
data: it iseasier to bound the intervalbelowthan to bound itabove.
In Figure 3 we estimate VaR
0:99
to be 27.3. The vertical dotted line intersects with
the tailestimateat the point(27:3;0:01) and allows the VaR estimateto beread o the
x-axis. The dotted curve is a tool to enable the calculation of a condence interval for
the VaR.The secondy-axison theright ofthe graphis acondence scale(not aquantile
scale). The horizontal dotted line corresponds to 95% condence; the x-coordinates of
the two points where the dotted curve intersects the horizontal line are the boundaries
of the 95% condence interval (23:3;33:1). Two things should be observed: we obtain a
wider 99% condence interval by dropping the horizontal line down to the value 99 on
the condence axis; the interval isasymmetric asdesired.
2.6 Estimating ES
Expected shortfallis related toVaRby
ES
q
=VaR
q
+E[X VaR
q
jX >VaR
q
]; (7)
where the second term is simply the mean of the excess distribution F
VaR
q
(y) over the
threshold VaR
q
. Our model for the excess distribution above the threshold u (3) has
a nice stability property. If we take any higher threshold, such as VaR
q
for q > F(u),
then the excess distribution abovethe higherthreshold isalsoGPD with the sameshape
parameter, but adierentscaling. Itis easilyshown that aconsequence of themodel(3)
is that
F
VaRq
(y)=G
;+(VaRq u)
VaR.Withthis modelwecan calculatemanycharacteristicsofthe lossesbeyond VaR.By
notingthat(provided<1)themeanofthedistributionin(8)is(+(VaR
q
u))=(1 ),
we can calculatethe expected shortfall. We nd that
ES
q
VaR
q =
1
1
+
u
(1 )VaR
q
: (9)
It isworth examining this ratio a littlemoreclosely inthe case where the underlying
distribution has aninniterightendpoint. In this case the ratio islargely determinedby
thefactor1=(1 ). Thesecondtermontherighthandsideof(9)becomesnegligiblysmall
as the probabilityq gets nearer and nearer to 1. This asymptoticobservation underlines
the importance of the shape parameter in tail estimation. It determines how our two
risk measures dier inthe extreme regions of the loss distribution.
Expected shortfall is estimated by substituting data-based estimates for everything
whichis unknown in (9)to obtain
c
ES
q =
d
VaR
q
1 ^
+
^
^
u
1 ^
: (10)
For our demonstration data 1=(1 ^
) 2:0 and ( ^
^
u)=(1 ^
) 4:0. Essentially
c
ES
q
is obtained from d
VaR
q
by doublingit. Our estimateis58.2 and wehavemarked this
withasecondverticallineinFigure4. Againusingtheprolelikelihoodmethodweshow
how anestimate of the 95% condence intervalfor c
ES
q
can beadded (41.6,154). Clearly
the uncertainty about the value of our coherent risk measure is large, but this is to be
expected with such heavy-taileddata. The prudent riskmanager should be aware of the
magnitude of his uncertainty about extreme phenomena.
3 Extreme Market Risk
In the marketrisk interpretationof our randomvariables
X
t
= (logS
t
logS
t 1 ) (S
t 1 S
t )=S
t 1
; (11)
representsthe lossonaportfoliooftradedassetsondayt, whereS
t
istheclosingvalue of
the portfolioonthatday. Wechangetosubscript ttoemphasizethe temporalindexingof
our risks. As shown above,the lossmaybedened asarelativeorlogarithmicdierence,
both denitions givingvery similarvalues.
In calculating daily VaR estimates for such risks, there is now a general recognition
thatthecalculationshouldtakeintoaccountvolatilityofmarketinstruments. Anextreme
value in aperiodof high volatility appears less extreme than the same value ina period
of lowvolatility. Various authorshave acknowledged the need to scale VaR estimates by
current volatility in some way (see, for example, Hull & White (1998)). Any approach
which achieves this we will call a dynamic risk measurement procedure. In this chapter
we focusonhow thedynamic measurementof marketriskscan befurther enhanced with
EVT totakeintoaccount the extreme risk overand abovethe volatility risk.
Most market return series show a great deal of common structure. This suggests
that more sophisticated modelling is both possible and necessary; it is not suÆcient
serialcorrelationofabsoluteorsquaredreturnsishigh;returnsshowvolatilityclustering{
the tendency oflargevaluestobefollowedbyotherlarge values,althoughnotnecessarily
of the same sign.
3.1 Stochastic Volatility Models
The most popularmodels for this phenomenon are the stochasticvolatility(SV) models,
whichtake the form
X
t =
t +
t Z
t
; (12)
where
t
is the volatility of the return on day t and
t
is the expected return. These
values are considered to depend in a deterministic way on the past history of returns.
The randomness in the model comes through the random variables Z
t
, which are the
noise variablesor the innovationsof the process.
Weassumethat thenoise variablesZ
t
are independentwithanidenticalunknown
dis-tributionF
Z
(z). (By convention we assumethis distributionhas mean zero and variance
1, so that
t
is directly interpretable as the volatility of X
t
.) Although the structure of
the model causes the X
t
to be dependent we assume that the modelis such that the X
t
are identicallydistributed withunknown distribution functionF
X
(x). In the languageof
time series we assume that X
t
isa stationary process.
Models which t into this framework include the ARCH/GARCH family. A simple
example is
t
= X
t 1
; (13)
2
t
=
0 +
1 (X
t 1
t 1 )
2
+
2
t 1 ;
with
0 ,
1
, > 0, +
1
< 1 and jj < 1. This is an autoregressive process with
GARCH(1,1) errors and with a suitably chosen noise distribution this is a model which
mimics many features of real nancialreturn series.
3.2 Dynamic Risk Management
Suppose we have followed daily market movements over a period of time and we nd
ourselves at the close of day t. In dynamic risk management we are interested in the
conditional return distribution
F
Xt+1+:::+X
t+k jFt
(x); (14)
where the symbol F
t
represents the history of the process X
t
up to and includingday t.
In looking at this distribution we ask, what is the distribution of returns over the next
k 1days, given the presentmarketbackground? This isthe issue indailyVaR(orES)
calculation.
This view can be contrasted with static risk management where we are interested in
the unconditional or stationary distribution F
X
(x) (or F
X
1 +:::+X
k
(x) for a k-day return).
Here we takea complementaryviewand askquestions like,howlargeisa 100 daylossin
general? What is the magnitude of a 5-year loss?
We redene our risk measures slightly tobe quantiles and expected shortfalls for the
distribution (14) and we introduce the notation VaR t
q
(k) and ES t
q
(k). The subscript t
shows that these are dynamic measures designed for calculation at the close of day t; k
The structure of the model(12) means that the dynamic measures take simple formsfor
a 1 day horizon.
VaR t
q
=
t+1 +
t+1
VaR(Z)
q
(15)
ES t
q
=
t+1 +
t+1 ES(Z)
q ;
where VaR(Z)
q
denotes the qth quantile of a noise variable Z
i
and ES(Z)
q
is the
corre-sponding expected shortfall.
The simplest approaches to estimating a dynamic VaR make the assumption that
F
Z
(z) is a known standard distribution, typically the normal distribution. In this case
VaR(Z)
q
is easily calculated. To estimate the dynamic measure a procedure is required
to estimate tomorrow's expected return
t+1
and tomorrow's volatility
t+1
. Several
approaches are available for forecasting the mean and volatility of SV models. Two
possibilities are the exponentially weighted moving average model (EWMA) as used in
Riskmetrics or GARCH modelling. The problem with the assumption of conditional
normality is that this tends to lead to an underestimation of the dynamic measures.
Empiricalanalyses suggestthe conditionaldistributionof appropriateSV modelsfor real
data is oftenheavier-tailed than the normaldistribution.
The trick as far as augmenting the dynamic procedure with EVT is concerned, is to
apply it to the random variables Z
t
rather than X
t
. In the EVT approach (McNeil &
Frey 1998) we avoid assumingany particular form for F
Z
(z); instead we apply the GPD
tailestimationproceduretothis distribution. Weassumethatabovesomehigh threshold
u the excess distribution is exactlyGPD. The problemwith the statistical estimation of
this modelis that the Z
t
variablescannot be directly observed, but this is solved by the
followingtwo stage approach.
Suppose atthe close ofday t weconsider a timewindow containing thelastn returns
X
t n+1
;:::;X
t .
1. AGARCH-typestochasticvolatilitymodel,typicallyanARmodelwithGARCH
er-rors,isttedtothehistoricaldatabypseudomaximumlikelihood(PML).Fromthis
model the so-calledresidualsare extracted. Ifthe modelistenablethese can be
re-gardedasrealisationsoftheunobserved,independentnoisevariablesZ
t n+1
;::: ;Z
t .
The GARCH-type modelis used tocalculate 1-steppredictions of
t+1 and
t+1 .
2. EVT is applied to the residuals. Forsome choice of threshold the GPD method is
used toestimateVaR(Z)
q
and ES(Z)
q
as outlinedinthe previouschapter. The risk
measures are calculated using equations(15).
3.4 Backtesting
The procedureabove, whichwe termdynamic orconditional EVT, isa successfulway of
adaptingEVTtothe special taskof dailymarketrisk measurement. This can beveried
by backtesting the method onhistoricalreturn series.
Figure5 shows a dynamicVaR estimateusing the conditional EVT method fordaily
losses on the DAX index. At the close of every day the method is applied to the last
1000 data points using a threshold u set at the 90th sample percentile of the residuals.
Volatilityand expected returnforecasts arebased onanAR(1)modelwith GARCH(1,1)
static EVT as inChapter 2. This changes onlygradually (as extreme observations drop
occasionallyfromthe back of the movingdata window).
A VaR estimation method is backtested by comparing the estimates with the actual
losses observed on the next day. A VaR violation occurs when the actual loss exceeds
the estimate. Various dynamic(and static)methodsofVaR estimationcan be compared
by counting violations; tests of the violation counts based on the binomial distribution
can show when a systematicunderestimation or overestimation of VaR seems to be
tak-ing place. It is also possible to devise backtests which compare dynamic ES estimates
with actual incurred losses exceeding the VaR on days when VaR violation takes place
(see McNeil&Frey(1998) for details).
S&P DAX
Lengthof Test 7414 5146
VaR
0:95
Expected violations 371 257
DynamicEVT violations 366 (0.41) 258 (0.49)
Dynamicnormal violations 384 (0.25) 238 (0.11)
StaticEVT violations 402 (0.05) 266 (0.30)
VaR
0:99
Expected violations 74 51
DynamicEVT violations 73(0.48) 55 (0.33)
Dynamicnormal violations 104 (0.00) 74 (0.00)
StaticEVT violations 86(0.10) 59 (0.16)
VaR
0:995
Expected violations 37 26
DynamicEVT violations 43(0.18) 24 (0.42)
Dynamicnormal violations 63(0.00) 44 (0.00)
StaticEVT violations 50(0.02) 36 (0.03)
Table 1: Some VaR backtesting results for two major indices. Values in brackets are
p-valuesforastatisticaltest ofthe success of themethod;values smallerthan 0.05 indicate
failure.
McNeiland Frey compare conditional EVTwith other dynamic approaches whichdo
not explicitly model the tail risk associated with the innovation distribution. In
par-ticular, they compare the approach with methods which assume normally distributed or
t{distributedinnovations(whichtheylabelthe dynamicnormalanddynamictmethods).
They alsocompare dynamic with static EVT.
TheVaRviolationsrelatingtotheDAXdatainFigure5are showninFigure6forthe
dynamicEVT,dynamicnormalandstaticEVTmethods,thesebeingdenotedrespectively
by the circular,triangular and square plottingsymbols. It is apparentthat although the
dynamic normal estimate reacts to volatility, itis violated more oftenthan the dynamic
EVT estimate; it is alsoclear that the static EVT estimate tends to be violated several
times inarowinperiodsof high volatilitybecauseitis unabletoreact swiftlyenoughto
the changingvolatility. Theseobservations areborneout by theresults inTable1, which
is a sampleof the backtesting results in McNeil&Frey(1998).
Dynamic EVT is in general the best method for estimating VaR
q
for q 0:95.
(Dynamic t is an eective simple alternative, if returns are not too asymmetric.)
For q0:99, the dynamic normalmethodis not goodenough.
The dynamic normalmethod isuseless for estimatingES t
q
, even when q=0:95. To
estimate expected shortfall adynamic procedurehas tobe enhanced with EVT.
Itis worth understanding inmoredetail why EVTisparticularlynecessary for
calcu-lating expected shortfall estimates. In a stochastic volatility modelthe ratio ES t
q =VaR
t
q
is essentially given by ES(Z)
q
=VaR(Z)
q
, the equivalent ratio for the noise distribution.
We have already observed in (9) that this ratio is largely determined by the weight of
the tail of the distribution F
Z
(z) as summarized by the parameter of a suitable GPD
approximation.
WehavetabulatedsomevaluesforthisratioinTable2inthecasewhenF
Z
(z)admitsa
GPDtailapproximationwith=0:22(thethresholdbeingsetatu=1:2with =0:57).
The ratio is compared with the equivalent ratio fora normal innovationdistribution.
q 0.95 0.99 0.995 q !1
GPD tail 1.52 1.42 1.39 1.29
Normal 1.25 1.15 1.12 1.00
Table 2: ESto VaRratios under two models for the noise distribution.
Clearly the ratios are smaller for the normal distribution. If we erroneously assume
conditional normality in our models, not only dowe tend to underestimate VaR, but we
also underestimate the ES/VaR ratio. Our error for ES is magnied due to this double
underestimation.
3.5 Multiple day horizons
Formultipleday horizons(k>1)wedonot havethesimple expressions fordynamicrisk
measures which we had in (15). Explicit estimation of the risk measures is diÆcult and
it is attractive to want to use a simple scaling rule, like the famous square root of time
rule, toturn one dayVaRintok-dayVaR.Unfortunately,square rootof timeisdesigned
for the case when returns are normally distributed and is not appropriate for the kind
of SV model driven by heavy-tailednoise that we consider realistic. Nor is itnecessarily
appropriateforscalingdynamicriskmeasures,whereonemightimaginecurrentvolatility
should be taken intoaccount.
It is possible to adopt a Monte Carlo approach to estimating dynamic risk measures
for longer time horizons. Possible future paths for the SV model of the returns may be
simulated and possible k-day losses calculated.
To calculate a single future path on day t we could proceed as follows. The noise
distribution is modelled with a composite model consisting of GPD tail estimates for
both tails and a simple empirical(i.e. historical simulation)estimate based onthe model
residuals in the centre. k independent values Z
t+1
;::: ;Z
t+k
are simulated from this
model(for detailsof the necessary randomnumbergeneratorsee McNeil&Frey (1998)).
Using the noise values and the current estimated volatilityfrom the tted GARCH-type
model, future values of the return process X
t+1
;::: ;X
t+k
distribution of the k-day loss(14).
By repeating this process many times to obtain a sample of values (perhaps 1000)
from the target distribution and then applying the GPD tail estimation procedure to
these simulated data, reasonable estimates of the risk measures may beobtained.
Such simulationresultscan thenbeused toexamine the natureofthe impliedscaling
law. McNeilandFreyconduct suchanexperimentand suggestthat forhorizonsupto50
days VaR t
q
(k) typically obeysa powerscaling law of the form
VaR t
q
(k)=VaR t
q k
t
;
where
t
depends on the current volatility. Their results are summarized in Table 3.
In their experiment square root of time scaling (
t
= 0:5) is appropriate on days when
estimated volatility ishigh; otherwise a a largerscaling exponent issuggested.
q 0.95 0.99
low volatility 0.65 0.65
average volatility 0.60 0.59
high volatility 0.48 0.47
Table 3: Typical scaling exponents for multiple day horizons. Low, average and high
volatilities are taken to be the 5th, 50th and 95th percentiles of estimated historical
volatilitiesrespectively.
4 Other Issues
In this sectionwe providebriefer noteson some other relevant topics inEVT.
4.1 Block Maxima Models for Stress Losses
For a more complete understanding of EVT we should be aware of the block maxima
models. Although less useful than the threshold models, these models are not without
practicalrelevance and could be used to provide estimates of stress losses.
Theorem1isnotreallyamathematicalresultasitpresentlystands. Wecouldmakeit
mathematically complete by saying that distributions which admit the asymptotic GPD
model for their excess distribution are precisely those distributions in the maximum
do-main of attraction of an extreme value distribution. To understand this statement we
must rst dene the generalized extremevalue distribution (GEV).
The distributionfunction of the GEV is given by
H
(x)=
exp ( (1+x) 1=
) 6=0;
exp ( e x
) =0;
where 1 +x > 0 and is the shape parameter. As in the case of the GPD, this
parametric form subsumes distributions which are known by other names. When > 0
the distribution isknown as Frechet; when =0 itis aGumbeldistribution;when <0
1
X
2
, ::: are independent identically distributed losses with distribution function F as
earlieranddenethemaximumofablockofnobservationstobeM
n
=max(X
1
;:::;X
n ).
Supposeitispossibletondsequencesofnumbersa
n
>0andb
n
suchthatthedistribution
of (M
n b
n )=a
n
, converges tosome limitingdistributionH asthe block size increases. If
this occurs F is said to be inthe maximumdomain of attraction of H. Tobeabsolutely
technically correctwe should assumethis limitisa non-degenerate (reasonably behaved)
distribution. We also note that the assumption of independent losses is by no means
important for the result that now follows and can be dropped if some additional minor
technical conditions are fullled.
Theorem 2 If F is in the maximum domain of attraction of a non-degenerate H then
this limit must be an extreme value distribution of the form H(x) = H
((x )=), for
some , and >0.
This result is known as the Fisher-Tippett Theorem and occupies ananalogous position
with respect to the study of maxima as the famous central limit theorem holds for the
study of sums or averages. Fisher-Tippett essentially says that the GEV is the only
possiblelimiting distributionfor (normalized) block maxima.
If the of the limiting GEV is strictly positive, F is said to be in the maximum
domain of attraction of the Frechet. Distributions in this class include the Pareto, t,
Burr, loggammaandCauchy distributions. If=0then F isinthe maximumdomainof
attraction of the Gumbel;examples are the normal,lognormaland gammadistributions.
If < 0 then F is in the maximum domain of attraction of the Weibull; examples are
the uniform and beta distributions. Distributions inthese three classes are precisely the
distributions for which excess distributions converge toa GPD limit.
Toimplementananalysisofstresslossesbasedonthislimitingmodelforblockmaxima
werequirealotofdata,sincewemustdeneblocksandreducethesedatatoblockmaxima
only. Suppose, forthesakeofillustration,thatwehavedaily(negative)returndatawhich
we divide into k large blocks of essentially equalsize; for example, we might take yearly
or semesterly blocks. Let M (j)
n
= max(X (j)
1 ;X
(j)
2
;::: ;X (j)
n
) be the maximum of the n
observations in block j. Usingthe methodof maximum likelihoodwetthe GEVto the
block maxima data M (1)
n
;::: ;M (k)
n
. That is we assume that our block size is suÆciently
large so that the limitingresult of Theorem 2may be taken asapproximatelyexact.
Suppose that we t a GEV model H
^
;^;^
to semesterly maxima of daily negative
re-turns. Thenaquantileofthisdistributionisastressloss. H 1
^
;^;^
(0:95)givesthemagnitude
ofdailylosslevelwemightexpect toreachevery20semestersor10years. Thisstressloss
is known asthe 20semester return leveland can beconsidered asakindof unconditional
quantileestimateforthe unknown underlyingdistributionF. In Figure7weshowthe20
semester return level for daily negative returns on the DAX index; the return level itself
is marked by asolidlineand anasymmetric95%condence intervalismarked by dotted
lines. In 23 years of data 4 observations exceed the point estimate; these 4 observations
occurin3dierentsemesters. Inafullanalysiswewould ofcoursetryaseriesofdierent
block sizes and compare results. See Embrechts etal. (1997)for amore detailed
descrip-tion of both the theoryand practice ofblock maximamodellingusing theFisher-Tippett
So far we have been concerned with univariate EVT. We have modelled the tails of
univariate distributions and estimated associated risk measures. In fact, there is also a
multivariate extreme value theory (MEVT) and this can be used to model the tails of
multivariate distributions in a theoretically supported way. In a sense MEVT is about
studying the dependence structure of extreme events, as we shall nowexplain.
Consider the random vector X = (X
1
;:::;X
d )
0
which represents losses of d dierent
kindsmeasured atthe same pointintime. Weassumetheselosses have jointdistribution
F(x
1
;:::;x
d
) = PfX
1 x
1
;::: ;X
d x
d
g and that individual losses have continuous
marginal distributions F
i
(x) = PfX
i
xg. It has been shown by Sklar (see Nelsen
(1999)) that every jointdistribution can be writtenas
F(x
1
;:::;x
d
)=C(F
1 (x
1
);::: ;F
d (x
d ));
for a unique function C that is known as the copula of F. A copula may be thought
of in two equivalent ways: as a function (with some technical restrictions) that maps
values inthe unit hypercube tovalues inthe unit interval;as a multivariatedistribution
function with standard uniform marginal distributions. The copula C does not change
under (strictly)increasingtransformationsof thelosses X
1
;::: ;X
d
and itmakessenseto
interpret C asthe dependence structure ofX orF, asthe followingsimple illustrationin
d=2 dimensions shows.
We take the marginal distributions to be standard univariate normal distributions
F
1 = F
2
= . We can then choose any copula C (i.e. any bivariate distribution with
uniform marginals)and apply ittothese marginalstoobtainbivariate distributionswith
normalmarginals. Forone particular choice ofC,whichwe calltheGaussiancopula and
denoteC Ga
,weobtainthestandard bivariatenormaldistributionwithcorrelation. The
Gaussian copula does not have a simple closed form and must be written as a double
integral - consult Embrechts, McNeil & Straumann (1999) for more details. Another
interesting copulais the Gumbel copulawhich doeshave asimple closed form,
C Gu
(v
1 ;v
2
)=exp
n
( logv
1 )
1=
+( logv
2 )
1= o
; 0< 1: (16)
Figure8shows thebivariatedistributionswhicharisewhenweapplythetwocopulasC Ga
0:7
and C Gu
0:5
to standard normal marginals. The left-hand picture is the standard bivariate
normal with correlation 70%; the right-hand picture is a bivariate distribution with
ap-proximatelyequal correlationbut the tendency togenerate extreme values of X
1
and X
2
simultaneously. It is,in this sense, a more dangerous distribution for risk managers. On
the basis of correlation, these distributions cannot be dierentiated but they obviously
have entirely dierent dependence structures. The bivariate normal has rather weak tail
dependence; the normal-Gumbeldistribution has pronounced tail dependence. For more
examples of parametric copulasconsult Nelsen (1999) orJoe(1997).
OnewayofunderstandingMEVTisasthe studyofcopulaswhichariseinthelimiting
multivariate distribution of componentwise block maxima. What do we mean by this?
SupposewehaveafamilyofrandomvectorsX
1 ;X
2
;::: representingd-dimensionallosses
at dierent points in time, where X
i = (X
i1
;::: ;X
id )
0
. A simple interpretation might
be that they represent daily (negative) returns for d instruments. As for the univariate
discussion ofblockmaxima, weassumethat lossesatdierentpointsintime are
indepen-dent. Thisassumption simpliesthe statement of the result,but can again be relaxedto
We denethe vectorof componentwiseblockmaxima tobe=(M
1n
;::: ;M
dn
) where
M
jn
= max(X
1j
;:::;X
nj
) is the block maximum of the jth component for a block of
size n observations. Now consider the vector of normalized block maxima given by
((M
1n b
1n )=a
1n
;::: ;(M
dn b
dn )=a
dn )
0
,wherea
jn
>0andb
jn
arenormalizingsequences
asinSection4.1. Ifthisvector converges indistribution toanon-degenerate limiting
dis-tributionthen this limitmust have the form
C
H
1
x
1
1
1
;::: ;H
d
x
d
d
d
;
for some values of the parameters
j ,
j
and
j
and some copula C. It must have this
form because of univariate EVT. Each marginaldistribution of the limitingmultivariate
distribution must be aGEV, aswe learnedin Theorem 2.
MEVTcharacterizes the copulasC which may arise inthis limit-the so-calledMEV
copulas. ItturnsoutthatthelimitingcopulasmustsatisfyC(u t
1
;::: ;u t
d )=C
t
(u
1
;::: ;u
d )
for t >0. There is nosingle parametric family which contains allthe MEV copulas, but
certain parametric copulas are consistent with the above condition and might therefore
beregarded as natural models for the dependence structure of extreme observations.
In two dimensions the Gumbel copula (16) is an example of an MEV copula; it is
moreover a versatile copula. If the parameter is 1 then C Gu
1 (v
1 ;v
2 ) = v
1 v
2
and this
copulamodelsindependenceofthecomponentsofarandomvector(X
1 ;X
2 )
0
. If 2(0;1)
then the Gumbel copula models dependence between X
1
and X
2
. As decreases the
dependence becomesstrongeruntilavalue =0correspondstoperfect dependenceofX
1
and X
2
; this means X
2
=T(X
1
) for some strictly increasing function T. For < 1 the
Gumbelcopula shows taildependence - thetendency ofextreme values tooccurtogether
as observed in Figure8. For more detailssee Embrechts etal.(1999).
The Gumbel copula can be used to build tail models in two dimensions as follows.
Suppose two risk factors (X
1 ;X
2 )
0
have an unknown joint distribution F and marginals
F
1
and F
2
so that, for some copula C, F(x
1 ;x
2
) = C(F
1 (x
1 );F
2 (x
2
)). Assume that we
have n pairs of data points fromthis distribution. Usingthe univariate POT methodwe
model the tails of the two marginal distributions by picking high thresholds u
1
and u
2
and using tail estimators of the form(5) toobtain
^
F
i
(x)=1 N
u
i
n
1+ ^
i
x u
i
^
i
1= ^
i
;x>u
i
; i=1;2:
Wemodelthe dependence structure of observations exceeding these thresholds using the
Gumbel copulaC Gu
^
for someestimated value ^
of the dependence parameter . We put
tail models and dependence structure together to obtaina modelfor the jointtail of F.
^
F(x
1 ;x
2 )=C
Gu
^
^
F
1 (x
1 );
^
F
2 (x
2 )
;x
1 >u
1 ; x
2 >u
2 :
The estimateof the dependence parameter can bedeterminedbymaximum likelihood,
either in a second stage after the parameters of the tail estimators have been estimated
or in a single stage estimation procedure where all parameters are estimated together.
Forfurther detailsofthese statisticalmatterssee Smith(1994). Forfurtherdetailsof the
theory consultJoe (1997).
This is perhaps the simplest bivariate POT model one can devise and it could be
a small number of dimensions. If we are interested in only a few risk factors and are
particularly concerned that joint extreme values may occur, we can use such models to
get useful descriptions of the joint tail. In very high dimensions there are simply too
manyparameters toestimateandtoomanydierenttailsof themultivariatedistribution
to worry about - the so-called curse of dimensionality. In such situations collapsing the
problemtoaunivariateproblemby consideringawholeportfolioofassets asasinglerisk
and collecting data ona portfoliolevel seems more realistic.
4.3 Software for EVT
We are aware of two software systems for EVT. EVIS (Extreme Values In S-Plus) is a
suite of free S-Plus functions for EVT developed at ETHZurich. To use these functions
it is necessary to have S-Plus, either for UNIXor Windows.
The functions provide assistance with four activities: exploring data to get a feel
for the heaviness of tails; implementing the POT method as described in Section 2;
im-plementing analyses of block maxima as described in Section 4.1; implementing a more
advanced form of the POT method known as the point process approach. The EVIS
functions provide simple templates which an S-Plus user could develop and incorporate
into a customized risk management system. In particular EVIS combines easily with
the extensive S-Plus time series functions orwith the S+GARCH module. This permits
dynamic risk measurement asdescribed inSection 3.
XTREMES is commercial software developed by Rolf Reiss and Michael Thomas at
the University of Siegen in Germany. It is designed to run as a self-contained program
under Windows (NT, 95,3.1). Fordidacticpurposes this programisverysuccessful; itis
particularlyhelpful for understanding the dierentsorts of extreme value modelling that
are possible and seeing how the models relate to each other. However, a user wanting to
adapt XTREMES for risk management purposes will need to learn and use the
Pascal-likeintegratedprogramminglanguageXPLthatcomeswithXTREMES.Thestand-alone
nature of XTREMES means that the user does not have access to the extensive libraries
of pre-programmed functions that packages likeS-Plus oer.
EVIS maybedownloaded overthe internetathttp://www.math.ethz.ch/ mcn eil.
InformationonXTREMEScanbefoundathttp://www.xtremes.math.un i-si egen .de.
5 Conclusion
EVT is here to stay as a technique in the risk manager's toolkit. We have argued in
this paperthat whenever tailsof probability distributions are of interest, it is naturalto
consider applying the theoretically supported methods of EVT. Methods based around
assumptions of normal distributionsare likely to underestimate tailrisk. Methods based
onhistoricalsimulationcan onlyprovideveryimprecise estimatesof tailrisk. EVTisthe
most scientic approach to an inherently diÆcult problem- predicting the size of a rare
event.
Wehavegivenoneverygeneralandeasilyimplementablemethod,theparametricPOT
methodof Section2,and indicatedhow this methodmay be adapted tomorespecialised
risk management problems such as the management of market risks. The reader who
wishes to learn more is encouraged to turn to textbooks like Embrechts et al. (1997)
& Stroughair (1999) for some commonsense caveats to uncritical use of EVT. We hope,
however, that all risk managers are persuaded that EVT will have an important role to
play in the development of sound risk management systems for the future.
References
Artzner, P., Delbaen, F., Eber, J. & Heath, D. (1997), `Thinking coherently', RISK
10(11), 68{71.
Beirlant,J.,Teugels,J.&Vynckier,P.(1996),Practicalanalysisofextremevalues,Leuven
University Press, Leuven.
Danielsson, J. & de Vries, C. (1997), `Tail index and quantile estimation with very high
frequency data', Journal of EmpiricalFinance 4,241{257.
Danielsson, J., Hartmann, P. & de Vries, C. (1998), `The cost of conservatism', RISK
11(1), 101{103.
Diebold, F., Schuermann, T. & Stroughair, J. (1999), Pitfalls and opportunities in the
use of extreme value theory in risk management, in `Advances in Computational
Finance', Kluwer AcademicPublishers, Amsterdam. To appear.
Embrechts, P., Kluppelberg, C. & Mikosch, T. (1997), Modelling extremal events for
insurance and nance, Springer, Berlin.
Embrechts, P., McNeil,A. & Straumann,D. (1999),`Correlation and dependency inrisk
management: properties and pitfalls', preprint,ETH Zurich.
Embrechts, P., Resnick, S. & Samorodnitsky, G. (1998), `Living on the edge', RISK
Magazine 11(1), 96{100.
Hull,J. &White, A.(1998),`Incorporatingvolatility updatingintothe historical
simula-tion methodfor value atrisk', Journal of Risk 1(1).
Joe,H.(1997),Multivariate ModelsandDependenceConcepts,Chapman&Hall,London.
McNeil,A. (1997),`Estimating the tailsof lossseverity distributionsusing extremevalue
theory', ASTIN Bulletin27, 117{137.
McNeil, A. (1998),`Historyrepeating', Risk 11(1),99.
McNeil,A.&Frey,R.(1998),`Estimation oftail-relatedriskmeasures forheteroscedastic
nancial time series: an extreme value approach', preprint,ETH Zurich.
Nelsen, R. B. (1999), An Introductionto Copulas, Springer, New York.
Reiss, R.&Thomas, M.(1997),StatisticalAnalysisof ExtremeValues,Birkhauser,Basel.
Smith, R. (1994), Multivariate threshold methods, in J. Galambos, ed., `Extreme Value
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
•••
• ••
•••
•••
•••
• ••
• • •
• ••
• ••
•
• •
•
10
50
100
0.0
0.2
0.4
0.6
0.8
1.0
x (on log scale)
Fu(x-u)
Figure1: Modellingthe excess distribution.
•••••••••••••••••••••••••
•••••••••••••••••••
•••••••••••••••
•••••••••••
•••••••••
••• ••••
•••••
••• •
•••
• •
••
•
•
•
•
•
•
•
10
50
100
0.00005
0.00050
0.00500
0.05000
x (on log scale)
1-F(x) (on log scale)
•••••••••••••••••••••••••
•••••••••••••••••••
•••••••••••••••
•••••••••••
•••••••••
••• ••••
•••••
••• •
•••
• •
••
•
•
•
•
•
•
•
10
50
100
0.00005
0.00050
0.00500
0.05000
x (on log scale)
1-F(x) (on log scale)
99
95
Figure3: EstimatingVaR.
•••••••••••••••••••••••••
•••••••••••••••••••
•••••••••••••••
•••••••••••
•••••••••
••• ••••
•••••
••• •
•••
• •
••
•
•
•
•
•
•
•
10
50
100
0.00005
0.00050
0.00500
0.05000
x (on log scale)
1-F(x) (on log scale)
99
95
99
95
Time
-0.05
0.0
0.05
0.10
0.15
01.10.87
01.04.88
01.10.88
01.04.89
01.10.89
01.04.90
Figure 5: Dynamic VaR using EVTfor the DAX index.
Time
-0.05
0.0
0.05
0.10
0.15
01.10.87
01.04.88
01.10.88
01.04.89
01.10.89
01.04.90
Time
-0.05
0.0
0.05
0.10
02.01.73
02.01.77
02.01.81
02.01.85
02.01.89
02.01.93
Figure 7: 20 semester return level for the DAX index (negative returns) is marked by
solid line; dotted linegives95% condence interval.
•
Figure 8: 10000 simulated data from two bivariate distributions with standard normal