Importance sampling for jump processes and applications to finance


Laetitia Badouraly Kassim, Jérôme Lelong, Imane Loumrhari

To cite this version:

Laetitia Badouraly Kassim, Jérôme Lelong, Imane Loumrhari. Importance sampling for jump processes and applications to finance. Journal of Computational Finance, Incisive media Ltd, 2016, 19 (2), pp.109-139. <hal-00842362>

HAL Id: hal-00842362

https://hal.archives-ouvertes.fr/hal-00842362

Submitted on 8 Jul 2013



Laetitia Badouraly Kassim∗ Jérôme Lelong∗ Imane Loumrhari∗ July 8, 2013

Abstract

Adaptive importance sampling techniques are widely known for the Gaussian setting of Brownian driven diffusions. In this work, we want to extend them to jump processes. Our approach relies on a change of the jump intensity combined with the standard exponential tilting for the Brownian motion. The free parameters of our framework are optimized using sample average approximation techniques. We illustrate the efficiency of our method on the valuation of financial derivatives in several jump models.

Keywords: Importance sampling; sample average approximation; adaptive Monte Carlo methods.

1 Introduction

Lévy models have become quite popular in finance over the last decade. Vanilla options are easily and efficiently priced using the Fast Fourier Transform approach developed by Carr et al. (1999), but things become far more delicate for exotic options, for which Monte Carlo often turns out to be the only viable numerical approach. This is even more true for high dimensional products. In this work, we propose an adaptive Monte Carlo method based on importance sampling for computing the expectation of a function of a Lévy process. As explained by Kiessling and Tempone (2011), when resorting to Monte Carlo approaches, infinite activity Lévy processes are often approximated by finite activity processes, which can always be represented as the sum of a continuous diffusion (i.e. driven by a Brownian motion) and a compound Poisson process. We therefore concentrate on such jump diffusions with a Brownian driven part and a jump part written as a compound Poisson process, or possibly the sum of independent compound Poisson processes in the multidimensional case.

We consider a mixed Gaussian and Poisson framework in which we would like to devise an adaptive Monte Carlo method based on importance sampling. Let $G = (G_1, \dots, G_d)$ be a standard normal random vector in $\mathbb{R}^d$ and $N^\mu = (N_1^{\mu_1}, \dots, N_p^{\mu_p})$ a vector of $p$ independent Poisson random variables with parameters $\mu = (\mu_1, \dots, \mu_p)$. We assume that $G$ and $N^\mu$ are independent. We focus on the computation of

\[ \mathcal{E} = \mathbb{E}[f(G, N^\mu)] \tag{1.1} \]

∗ Univ. Grenoble Alpes, Laboratoire Jean Kuntzmann, BP 53, 38041 Grenoble Cédex 9, FRANCE, e-mail:

[email protected], [email protected], [email protected]. This project was supported by the Finance for Energy Market Research Centre, www.fime-lab.org.


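where $f : \mathbb{R}^d \times \mathbb{N}^p \to \mathbb{R}$ satisfies $\mathbb{E}[|f(G, N^\mu)|] < \infty$.

Lemma 1.1. For any measurable function $h : \mathbb{R}^d \times \mathbb{N}^p \to \mathbb{R}$ either nonnegative or such that $\mathbb{E}[|h(G, N^\mu)|] < \infty$, one has, for all $\theta \in \mathbb{R}^d$ and $\lambda \in (\mathbb{R}_+^*)^p$,

\[ \mathbb{E}[h(G, N^\mu)] = \mathbb{E}\left[ h(G+\theta, N^\lambda)\, e^{-\theta\cdot G - \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\lambda_i}} \right] \tag{1.2} \]

where $N^\lambda$ is a vector of $p$ independent Poisson random variables with parameters $\lambda = (\lambda_1, \dots, \lambda_p)$.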

The proof of this lemma relies on elementary changes of variables. Lemma 1.1 enables us to introduce some extra degrees of freedom in the computation of $\mathcal{E}$. When the expectation $\mathcal{E}$ is computed using a Monte Carlo method, the central limit theorem advises using the representation of $f(G, N^\mu)$ with the smallest possible variance, which is achieved by choosing the parameters $(\theta, \lambda)$ which minimize the variance of

\[ f(G+\theta, N^\lambda)\, e^{-\theta\cdot G - \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\lambda_i}}. \]

This raises several questions which are investigated in the paper. Does this variance admit a unique minimizer? If so, how can it be computed numerically, and how can we make the most of it in a subsequent Monte Carlo computation?
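As a sketch of the elementary changes of variables behind Lemma 1.1, consider the one-dimensional case $d = p = 1$ with $h$ nonnegative and depending on one variable at a time; the general case follows coordinate by coordinate from independence. The symbol $\varphi$ for the standard normal density is our notation, not the paper's.

```latex
\begin{align*}
\mathbb{E}[h(G)]
  &= \int_{\mathbb{R}} h(g)\,\varphi(g)\,dg
   = \int_{\mathbb{R}} h(g+\theta)\,\varphi(g+\theta)\,dg \\
  &= \int_{\mathbb{R}} h(g+\theta)\,e^{-\theta g-\frac{\theta^2}{2}}\,\varphi(g)\,dg
   = \mathbb{E}\Big[h(G+\theta)\,e^{-\theta G-\frac{\theta^2}{2}}\Big], \\[4pt]
\mathbb{E}[h(N^\mu)]
  &= \sum_{n\ge 0} h(n)\,e^{-\mu}\frac{\mu^n}{n!}
   = \sum_{n\ge 0} h(n)\,e^{\lambda-\mu}\Big(\frac{\mu}{\lambda}\Big)^{n} e^{-\lambda}\frac{\lambda^n}{n!}
   = \mathbb{E}\Big[h(N^\lambda)\,e^{\lambda-\mu}\Big(\frac{\mu}{\lambda}\Big)^{N^\lambda}\Big].
\end{align*}
```

Applying the two identities successively, and using the independence of the Gaussian and Poisson components, gives the weight appearing in Equation (1.2).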

These questions are quite natural in the context of Monte Carlo computations and have already been widely discussed in the pure Gaussian framework. The first applications to option pricing of adaptive Monte Carlo methods based on importance sampling go back to the papers of Arouna (Winter 2003/04, 2004). These papers were based on a change of mean for the Gaussian random vectors, and the optimal parameter was searched for by using a stochastic approximation algorithm with random truncations. This approach was later further investigated by Lapeyre and Lelong (2011), who proposed a more general framework for devising adaptive Monte Carlo methods using stochastic approximation, which is known to be a little tricky to fine tune in practical applications. To circumvent the delicate behaviour of stochastic approximation, Jourdain and Lelong (2009) proposed to resort to sample average approximation instead, which basically relies on deterministic optimization techniques. An alternative to random truncations was studied by Lemaire and Pagès (2010), who managed to modify the initial problem in order to apply the more standard Robbins-Monro algorithm. Not only did they apply this to the Gaussian framework, but they also considered a few examples of Lévy processes relying on the Esscher transform to introduce a free parameter. The idea of using the Esscher transform was also extensively investigated by Kawai (2007, 2008a,b).

In this work, we want to understand how the jump intensity of a Lévy process can be modified to reduce the variance. First, we explain the parametric importance sampling transformation we use for the Gaussian and Poisson parts. Then, in Section 2, we prove that this transformation leads to a convex optimization problem and we study the properties of the optimal parameter estimator. In Section 3, we explain how to use this estimator in a Monte Carlo method. We prove that this approach satisfies an adaptive strong law of large numbers and a central limit theorem with optimal limiting variance. Finally, in Section 4, we apply our methodology to option pricing with jump processes.


Notations.

• We encode any element of $\mathbb{R}^m$ as a column vector.

• If $x \in \mathbb{R}^m$, $x^*$ is a row vector. We use the "$*$" notation to denote the transpose operator for vectors and matrices.

• If $x, y \in \mathbb{R}^m$, $x \cdot y$ denotes the scalar product of $x$ and $y$, and the associated norm is denoted by $|\cdot|$.

• If $x \in \mathbb{R}^m$, $\mathrm{diag}_m(x)$ is the matrix with diagonal elements given by the vector $x$ and all off-diagonal elements equal to zero.

• The matrix $I_m$ denotes the identity matrix in dimension $m$.

• If $x \in \mathbb{R}^m$, we define $d_0(x) = \min_{1 \le i \le m} |x_i|$, which is the distance between $x$ and the set $\{y \in \mathbb{R}^m : \prod_{i=1}^m y_i = 0\}$.

• We say that a random vector $X$ with values in $\mathbb{R}^m$ has Poisson distribution with parameter $\mu \in \mathbb{R}^m$ if the $X_i$ are independent and have Poisson distribution with parameter $\mu_i$.

2 Computing the optimal importance sampling parameters

2.1 Properties of the variance

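Thanks to Lemma 1.1, the expectation $\mathcal{E}$ can be written

\[ \mathcal{E} = \mathbb{E}\left[ f(G+\theta, N^\lambda)\, e^{-\theta\cdot G - \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\lambda_i}} \right], \quad \forall \theta \in \mathbb{R}^d,\ \lambda \in (\mathbb{R}_+^*)^p. \]

Note that for the particular choice $\theta = 0$ and $\lambda = \mu$, we recover Equation (1.1).

The convergence rate of a Monte Carlo estimator of $\mathcal{E}$ based on this new representation is governed by the variance of $f(G+\theta, N^\lambda)\, e^{-\theta\cdot G - \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\lambda_i}}$, which can be written in the form $v(\theta, \lambda) - \mathcal{E}^2$ where

\[ v(\theta, \lambda) = \mathbb{E}\left[ f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right]. \tag{2.1} \]

This expression of $v$ is easily obtained by applying Lemma 1.1 to the function $h(g, n) = f(g+\theta, n)^2\, e^{-2\theta\cdot g - |\theta|^2} \prod_{i=1}^p e^{2(\lambda_i - \mu_i)} \left(\frac{\mu_i}{\lambda_i}\right)^{2 n_i}$. Applying the change of measure backward after computing the variance enables us to write the variance in a form which does not involve the parameters $\theta$ and $\lambda$ in the arguments of the function $f$. This remark is of prime importance, as it is the basis of the following key result stating the strong convexity of $v$.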
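To make the role of $v$ concrete, here is a worked example under an assumption of ours that is not taken from the paper: $d = p = 1$ and $f(g, n) = e^{a g}\, b^{n}$ with constants $a \in \mathbb{R}$ and $b > 0$. Both $\mathcal{E}$ and $v$ from Equation (2.1) are then available in closed form, using $\mathbb{E}[e^{cG}] = e^{c^2/2}$ and $\mathbb{E}[r^{N^\mu}] = e^{\mu(r-1)}$.

```latex
\begin{align*}
\mathcal{E} &= \mathbb{E}[e^{aG}]\;\mathbb{E}[b^{N^\mu}] = e^{\frac{a^2}{2} + \mu(b-1)}, \\
v(\theta,\lambda)
  &= e^{\frac{\theta^2}{2}}\,\mathbb{E}\big[e^{(2a-\theta)G}\big]\;
     e^{\lambda-\mu}\,\mathbb{E}\Big[\Big(\frac{b^2\mu}{\lambda}\Big)^{N^\mu}\Big]
   = \exp\Big(\theta^2 - 2a\theta + 2a^2 + \lambda - 2\mu + \frac{\mu^2 b^2}{\lambda}\Big), \\
(\theta_\star,\lambda_\star) &= (a,\ \mu b), \qquad
v(\theta_\star,\lambda_\star) = e^{a^2 + 2\mu(b-1)} = \mathcal{E}^2 .
\end{align*}
```

In this toy case the variance $v - \mathcal{E}^2$ vanishes at the optimum, whereas the crude choice $(\theta, \lambda) = (0, \mu)$ gives $v(0, \mu) - \mathcal{E}^2 = e^{2a^2 + \mu(b^2-1)} - e^{a^2 + 2\mu(b-1)}$.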

Proposition 2.1. Assume that

(A1) i. $\exists (n_1, \dots, n_p) \in (\mathbb{N}^*)^p$ s.t. $\mathbb{P}(|f(G, (n_1, \dots, n_p))| > 0) > 0$

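Then, the function $v$ is infinitely continuously differentiable and strongly convex; moreover, its gradient vectors are given by

\[ \nabla_\theta v(\theta, \lambda) = \mathbb{E}\left[ (\theta - G)\, f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] \tag{2.2} \]

\[ \nabla_\lambda v(\theta, \lambda) = \mathbb{E}\left[ a(N^\mu, \lambda)\, f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] \tag{2.3} \]

where the vector $a(N^\mu, \lambda) = \left(1 - \frac{N_1^{\mu_1}}{\lambda_1}, \dots, 1 - \frac{N_p^{\mu_p}}{\lambda_p}\right)^*$. The second derivatives are given by

\[ \nabla^2_{\theta,\theta} v(\theta, \lambda) = \mathbb{E}\left[ \big(I_d + (\theta - G)(\theta - G)^*\big)\, f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] \tag{2.4} \]

\[ \nabla^2_{\theta,\lambda} v(\theta, \lambda) = \mathbb{E}\left[ (\theta - G)\, a(N^\mu, \lambda)^*\, f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] \tag{2.5} \]

\[ \nabla^2_{\lambda,\lambda} v(\theta, \lambda) = \mathbb{E}\left[ \big(D + a(N^\mu, \lambda)\, a(N^\mu, \lambda)^*\big)\, f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] \tag{2.6} \]

where the diagonal matrix $D$ is defined by $D = \mathrm{diag}_p\left(\frac{N_1^{\mu_1}}{\lambda_1^2}, \dots, \frac{N_p^{\mu_p}}{\lambda_p^2}\right)$.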

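Proof. Let us define the function $F : \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{N}^p \times (\mathbb{R}_+^*)^p \to \mathbb{R}$ by

\[ F(g, \theta, n, \lambda) = f(g, n)^2\, e^{-\theta\cdot g + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{n_i}. \tag{2.7} \]

For any values of $(g, n)$, the function $(\theta, \lambda) \mapsto F(g, \theta, n, \lambda)$ is infinitely continuously differentiable. For all $0 < m < M$,

\[ \sup_{|(\theta,\lambda)| \le M,\ m < d_0(\lambda)} |\partial_{\theta_j} F(G, \theta, N^\mu, \lambda)| \le \big(M + e^{G_j} + e^{-G_j}\big)\, f(G, N^\mu)^2\, e^{M^2/2 + pM} \prod_{k=1}^d \big(e^{M G_k} + e^{-M G_k}\big) \prod_{i=1}^p e^{-\mu_i} \left(\frac{\mu_i}{m}\right)^{N_i^{\mu_i}} \tag{2.8} \]

where the right hand side is integrable because, by Hölder's inequality and Assumption (A1-ii), we have for all $(\theta, \lambda) \in \mathbb{R}^d \times \mathbb{R}^p$, $\mathbb{E}\big(f(G, N^\mu)^2\, e^{\theta\cdot G + \lambda\cdot N^\mu}\big) < \infty$. Hence, Lebesgue's theorem ensures that $v$ is continuously differentiable w.r.t. $\theta$ and that $\nabla_\theta v$ is given by Equation (2.2).

We proceed similarly for the derivative w.r.t. $\lambda$ by using the following upper bound

\[ \sup_{|(\theta,\lambda)| \le M,\ m < d_0(\lambda)} |\partial_{\lambda_j} F(G, \theta, N^\mu, \lambda)| \le \big(1 + e^{N_j^{\mu_j}/m} + e^{-N_j^{\mu_j}/m}\big)\, f(G, N^\mu)^2\, e^{M^2/2 + pM} \prod_{k=1}^d \big(e^{M G_k} + e^{-M G_k}\big) \prod_{i=1}^p e^{-\mu_i} \left(\frac{\mu_i}{m}\right)^{N_i^{\mu_i}}. \tag{2.9} \]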


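Higher order differentiability properties are obtained by similar arguments. In particular, the Hessian matrix can be written with the help of the function $F$ as

\[ \nabla^2 v(\theta, \lambda) = \mathbb{E}\left[ F(G, \theta, N^\mu, \lambda) \begin{pmatrix} (\theta - G)(\theta - G)^* & (\theta - G)\, a(N^\mu, \lambda)^* \\ a(N^\mu, \lambda)(\theta - G)^* & a(N^\mu, \lambda)\, a(N^\mu, \lambda)^* \end{pmatrix} + F(G, \theta, N^\mu, \lambda) \begin{pmatrix} I_d & 0 \\ 0 & D \end{pmatrix} \right]. \]

Note that

\[ \begin{pmatrix} (\theta - G)(\theta - G)^* & (\theta - G)\, a(N^\mu, \lambda)^* \\ a(N^\mu, \lambda)(\theta - G)^* & a(N^\mu, \lambda)\, a(N^\mu, \lambda)^* \end{pmatrix} = \begin{pmatrix} \theta - G \\ a(N^\mu, \lambda) \end{pmatrix} \begin{pmatrix} \theta - G \\ a(N^\mu, \lambda) \end{pmatrix}^*. \]

Hence the first part of the Hessian is positive semidefinite, being the expectation of nonnegative multiples of rank one matrices. For the second part,

\[ \mathbb{E}\left[ F(G, \theta, N^\mu, \lambda) \begin{pmatrix} I_d & 0 \\ 0 & D \end{pmatrix} \right] \ge \mathbb{E}\big[ F(G, \theta, N^\mu, \lambda)\, \mathbf{1}_{\{N^\mu = (n_1, \dots, n_p)\}} \big]\; \mathrm{diag}\left(I_d, \frac{n_1}{\lambda_1^2}, \dots, \frac{n_p}{\lambda_p^2}\right). \]

Moreover,

\[ \begin{aligned} \mathbb{E}\big[ F(G, \theta, N^\mu, \lambda)\, \mathbf{1}_{\{N^\mu = (n_1, \dots, n_p)\}} \big] &\ge \mathbb{E}\left[ f(G, (n_1, \dots, n_p))^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \right] \prod_{i=1}^p e^{n_i - 2\mu_i} \left(\frac{\mu_i^2}{n_i}\right)^{n_i} \frac{1}{n_i!} \\ &\ge \mathbb{E}\left[ f(G, (n_1, \dots, n_p))^2\, e^{-\theta\cdot G} \right] \mathbb{E}\left[ e^{\theta\cdot G} \right] \prod_{i=1}^p e^{n_i - 2\mu_i} \left(\frac{\mu_i^2}{n_i}\right)^{n_i} \frac{1}{n_i!} \\ &\ge \mathbb{E}\big[ |f(G, (n_1, \dots, n_p))| \big]^2 \prod_{i=1}^p e^{n_i - 2\mu_i} \left(\frac{\mu_i^2}{n_i}\right)^{n_i} \frac{1}{n_i!}, \end{aligned} \]

where the first inequality minimizes $e^{\lambda_i} \lambda_i^{-n_i}$ over $\lambda_i > 0$ and the last one is the Cauchy-Schwarz inequality.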

Thanks to Condition (A1-i), this lower bound is strictly positive. Hence, the Hessian matrix is uniformly bounded from below, which yields the strong convexity of $v$.

As a consequence, the function $v$ admits a unique minimizer $(\theta_\star, \lambda_\star)$ defined by $\nabla_\theta v(\theta_\star, \lambda_\star) = \nabla_\lambda v(\theta_\star, \lambda_\star) = 0$. The characterization of $(\theta_\star, \lambda_\star)$ as the unique minimizer of a strongly convex function is very appealing, but there is no hope of computing the gradient of $v$ in closed form, so we need to resort to some kind of approximation before running the optimization step. Before studying the possible ways of approximating the optimal parameter, let us note that it is of dimension $d + p$, which can become very large, in particular when the variables $G$ and $N^\mu$ come from the discretization of a jump diffusion process. In many situations, it is advisable to reduce the dimension of the space in which the optimization problem is solved.

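Reducing the dimension of the optimization problem. Let $0 < d' \le d$ and $0 < p' \le p$ be the reduced dimensions. Instead of searching for the best importance sampling parameter $(\theta, \lambda)$ in the whole space $\mathbb{R}^d \times (\mathbb{R}_+^*)^p$, we consider the subspace $\{(A\vartheta, B\lambda) : \vartheta \in \mathbb{R}^{d'},\ \lambda \in (\mathbb{R}_+^*)^{p'}\}$, where $A \in \mathbb{R}^{d \times d'}$ is a matrix with rank $d' \le d$ and $B \in (\mathbb{R}_+^*)^{p \times p'}$ a matrix with rank $p' \le p$. Note that since all the coefficients of $B$ are non negative, for all $\lambda \in (\mathbb{R}_+^*)^{p'}$, $B\lambda \in (\mathbb{R}_+^*)^p$; actually, it is easily seen that the image of $(\mathbb{R}_+^*)^{p'}$ through $B$ is isomorphic to $(\mathbb{R}_+^*)^{p'}$.

For such matrices $A$ and $B$, we introduce the function $v^{A,B} : \mathbb{R}^{d'} \times (\mathbb{R}_+^*)^{p'} \to \mathbb{R}$ defined by

\[ v^{A,B}(\vartheta, \lambda) = v(A\vartheta, B\lambda). \tag{2.10} \]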

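The function $v^{A,B}$ inherits the regularity and convexity properties of $v$. Hence, from Proposition 2.1, we know that $v^{A,B}$ is infinitely continuously differentiable and strongly convex. As a consequence, there exists a unique pair of minimizers $(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$ such that $v^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B}) = \inf_{\vartheta \in \mathbb{R}^{d'},\, \lambda \in (\mathbb{R}_+^*)^{p'}} v^{A,B}(\vartheta, \lambda)$. We can also deduce the gradient vector of $v^{A,B}$

\[ \nabla v^{A,B}(\vartheta, \lambda) = \begin{pmatrix} A^* \nabla_\theta v(A\vartheta, B\lambda) \\ B^* \nabla_\lambda v(A\vartheta, B\lambda) \end{pmatrix} \]

and its Hessian matrix

\[ \nabla^2 v^{A,B}(\vartheta, \lambda) = \mathbb{E}\left[ F(G, A\vartheta, N^\mu, B\lambda) \begin{pmatrix} A^* (A\vartheta - G)(A\vartheta - G)^* A & A^* (A\vartheta - G)\, a(N^\mu, B\lambda)^* B \\ B^* a(N^\mu, B\lambda)(A\vartheta - G)^* A & B^* a a^*(N^\mu, B\lambda) B \end{pmatrix} + F(G, A\vartheta, N^\mu, B\lambda) \begin{pmatrix} A^* A & 0 \\ 0 & B^* D B \end{pmatrix} \right] \]

where the function $F$ is defined by Equation (2.7). For the particular choices $A = I_d$, $B = I_p$, $d = d'$ and $p = p'$, the functions $v^{I_d, I_p}$ and $v$ coincide.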

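The Esscher transform as a way to reduce the dimension. Consider a two dimensional process $(X_t)_{t \le T}$ of the form $X_t = (W_t, \tilde N_t^{\tilde\mu})$ where $W$ is a real Brownian motion and $\tilde N^{\tilde\mu}$ is a Poisson process with intensity $\tilde\mu$. The Esscher transform applied to $X$ yields that for any nonnegative function $h$, we have the following equality for all $\alpha \in \mathbb{R}$, $\tilde\lambda \in \mathbb{R}_+^*$,

\[ \mathbb{E}\big[ h\big((W_t, \tilde N_t^{\tilde\mu}),\ t \le T\big) \big] = \mathbb{E}\left[ h\big((W_t + \alpha t, \tilde N_t^{\tilde\lambda}),\ t \le T\big)\, e^{-\alpha W_T - \frac{|\alpha|^2 T}{2}}\, e^{T(\tilde\lambda - \tilde\mu)} \left(\frac{\tilde\mu}{\tilde\lambda}\right)^{\tilde N_T^{\tilde\lambda}} \right]. \]

Let $0 = t_0 < \dots < t_p = T$ be a time grid of $[0, T]$. If we consider the vector $G$ (resp. $N^\mu$) as the increments of $W$ (resp. $\tilde N^{\tilde\mu}$) on the grid, we can recover a particular form of Equation (1.2) with $A, B \in \mathbb{R}^p$ given by

\[ A = \big(\sqrt{t_1}, \sqrt{t_2 - t_1}, \dots, \sqrt{t_p - t_{p-1}}\big)^*; \qquad B = (t_1, t_2 - t_1, \dots, t_p - t_{p-1})^*. \]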
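The construction of $A$ and $B$ from a time grid can be sketched as follows. This is a minimal illustration, assuming $d = p$ equal to the number of time steps and $d' = p' = 1$; the function names `esscher_directions` and `full_parameters` are ours and do not come from the paper.

```python
import math


def esscher_directions(grid):
    """Return the vectors A and B associated with a time grid 0 = t_0 < ... < t_p = T.

    A_i = sqrt(t_i - t_{i-1}) multiplies the Brownian drift parameter alpha,
    B_i = t_i - t_{i-1} multiplies the jump intensity.
    """
    if len(grid) < 2 or grid[0] != 0.0:
        raise ValueError("grid must start at 0 and contain at least two dates")
    steps = [t1 - t0 for t0, t1 in zip(grid, grid[1:])]
    if any(s <= 0.0 for s in steps):
        raise ValueError("grid must be strictly increasing")
    return [math.sqrt(s) for s in steps], steps


def full_parameters(grid, alpha, lam_tilde):
    """Map the reduced pair (alpha, lam_tilde) to (theta, lambda) = (A alpha, B lam_tilde)."""
    A, B = esscher_directions(grid)
    return [alpha * a for a in A], [lam_tilde * b for b in B]


# Hypothetical grid with three unequal steps on [0, 1].
grid = [0.0, 0.25, 0.5, 1.0]
A, B = esscher_directions(grid)
theta, lam = full_parameters(grid, alpha=0.3, lam_tilde=2.0)
```

With this choice the optimization runs over two scalars only, whatever the number of time steps, and the squared entries of `A` sum to the horizon $T$.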

2.2 Tracking the optimal importance sampling parameter

The optimal importance sampling parameter $(\theta_\star, \lambda_\star)$ can be characterized as the unique zero of an expectation, which is the typical framework for applying stochastic approximation. In particular, we could use the algorithm introduced by Chen and Zhu (1986); we refer to Lelong (2008, 2011) for a study of the convergence and asymptotic behaviour of these algorithms. The use of stochastic approximation for devising adaptive importance sampling methods was investigated in depth in a recent survey by Lapeyre and Lelong (2011), who highlighted the difficulties in making those algorithms converge in practice.

In this work, we adopt a totally different point of view often called sample average approximation, which basically consists in first replacing expectations by sample averages and then using deterministic optimization techniques on these empirical means. This approach was studied in the Gaussian framework by Jourdain and Lelong (2009) and proved to be very efficient.


Let $(G^j)_{j \ge 1}$ be a sequence of $d$-dimensional independent and identically distributed standard normal random variables. We also introduce $(N^{\mu,j})_{j \ge 1}$, a sequence of $p$-dimensional independent and identically distributed random vectors following the law of $N^\mu$, i.e. the components of the vectors are independent and Poisson distributed with parameter $\mu$.

For $n \ge 1$, we introduce the sample average approximation of the function $v^{A,B}$ defined by

\[ v_n^{A,B}(\vartheta, \lambda) = \frac{1}{n} \sum_{j=1}^n f(G^j, N^{\mu,j})^2\, e^{-A\vartheta\cdot G^j + \frac{|A\vartheta|^2}{2}} \prod_{i=1}^p e^{(B\lambda)_i - \mu_i} \left(\frac{\mu_i}{(B\lambda)_i}\right)^{N_i^{\mu_i, j}}. \tag{2.11} \]

For $n$ large enough, $f(G^j, N^{\mu,j}) \ne 0$ for some index $j \in \{1, \dots, n\}$, and the approximation $v_n^{A,B}$ is also strongly convex and hence admits a unique minimizer $(\vartheta_n^{A,B}, \lambda_n^{A,B})$ defined by

\[ v_n^{A,B}(\vartheta_n^{A,B}, \lambda_n^{A,B}) = \inf_{\vartheta \in \mathbb{R}^{d'},\ \lambda \in (\mathbb{R}_+^*)^{p'}} v_n^{A,B}(\vartheta, \lambda). \]

Proposition 2.2. Under Assumption (A1), the sequence of random functions $(v_n^{A,B})_n$ converges a.s. locally uniformly to the continuous function $v^{A,B}$.

To prove this result, we use the uniform strong law of large numbers recalled hereafter, see for instance Rubinstein and Shapiro (1993, Lemma A1). This result is also a consequence of the strong law of large numbers in Banach spaces Ledoux and Talagrand (1991, Corollary 7.10, page 189).

Lemma 2.3. Let $(X_i)_{i \ge 1}$ be a sequence of i.i.d. $\mathbb{R}^m$-valued random vectors, $E$ an open set of $\mathbb{R}^d$ and $h : E \times \mathbb{R}^m \to \mathbb{R}$ be a measurable function. Assume that

• a.s., $\chi \in E \mapsto h(\chi, X_1)$ is continuous,

• for all compact sets $K$ of $\mathbb{R}^d$ such that $K \subset E$, $\mathbb{E}\big[\sup_{\chi \in K} |h(\chi, X_1)|\big] < +\infty$.

Then, a.s. the sequence of random functions $\chi \in K \mapsto \frac{1}{n} \sum_{i=1}^n h(\chi, X_i)$ converges locally uniformly to the continuous function $\chi \in E \mapsto \mathbb{E}(h(\chi, X_1))$.

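Proof of Proposition 2.2. It is sufficient to prove the result for $v_n$ and it will hold for $v_n^{A,B}$. Let $M > m > 0$. For all $(\theta, \lambda)$ such that $|(\theta, \lambda)| \le M$ and $d_0(\lambda) > m$, we have

\[ f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \le f(G, N^\mu)^2 \prod_{k=1}^d \big(e^{-M G_k} + e^{M G_k}\big)\, e^{\frac{M^2}{2}} \prod_{i=1}^p e^{M - \mu_i} \left(\frac{\mu_i}{m}\right)^{N_i^{\mu_i}}. \]

The r.h.s. is integrable by (A1) and Hölder's inequality; hence, we can apply Lemma 2.3.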

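Proposition 2.4. Under Assumption (A1), the pair $(\vartheta_n^{A,B}, \lambda_n^{A,B})$ converges a.s. to $(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$ as $n \to +\infty$. Moreover, if

(A2) $\exists \delta > 0$, $\mathbb{E}\big[|f(G, N^\mu)|^{4+\delta}\big] < \infty$,

then $\sqrt{n}\big((\vartheta_n^{A,B}, \lambda_n^{A,B}) - (\vartheta_\star^{A,B}, \lambda_\star^{A,B})\big)$ converges in law to the normal distribution $\mathcal{N}_{d'+p'}(0, \Gamma)$ where

\[ \Gamma = \big[\nabla^2 v^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B})\big]^{-1}\, \mathrm{Cov}\big(\nabla F(G, A\vartheta_\star^{A,B}, N^\mu, B\lambda_\star^{A,B})\big)\, \big[\nabla^2 v^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B})\big]^{-1} \]

with the function $F$ defined by Equation (2.7) and its gradient computed w.r.t. the reduced parameters $(\vartheta, \lambda)$.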


Condition (A2) ensures that the covariance matrix $\mathrm{Cov}\big(\nabla F(G, A\vartheta_\star^{A,B}, N^\mu, B\lambda_\star^{A,B})\big)$ exists. The non singularity of the matrix $\nabla^2 v^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$ is guaranteed by the strict convexity of $v$.

By combining Propositions 2.2 and 2.4, we can state the following result.

Corollary 2.5. Under Assumption (A1), $v_n^{A,B}(\vartheta_n^{A,B}, \lambda_n^{A,B})$ converges a.s. to $v^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$ as $n \to +\infty$.

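Proof of Proposition 2.4. Let $\varepsilon > 0$. We define a compact neighbourhood $\mathcal{V}_\varepsilon$ of $(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$

\[ \mathcal{V}_\varepsilon \overset{\text{def}}{=} \big\{ (\vartheta, \lambda) \in \mathbb{R}^{d'} \times \mathbb{R}^{p'} : |(\vartheta, \lambda) - (\vartheta_\star^{A,B}, \lambda_\star^{A,B})| \le \varepsilon \big\}. \tag{2.12} \]

In the following, we assume that $\varepsilon$ is small enough, so that $\mathcal{V}_\varepsilon$ is included in $\mathbb{R}^{d'} \times (\mathbb{R}_+^*)^{p'}$. By the strict convexity and the continuity of $v^{A,B}$,

\[ \alpha \overset{\text{def}}{=} \inf_{(\vartheta, \lambda) \in \mathcal{V}_\varepsilon^c} v^{A,B}(\vartheta, \lambda) - v^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B}) > 0. \]

The local uniform convergence of $v_n^{A,B}$ to $v^{A,B}$ ensures that for some $n_\alpha$ sufficiently large,

\[ \forall n \ge n_\alpha,\ \forall (\vartheta, \lambda) \in \mathcal{V}_\varepsilon, \quad |v_n^{A,B}(\vartheta, \lambda) - v^{A,B}(\vartheta, \lambda)| \le \frac{\alpha}{3}. \tag{2.13} \]

For $n \ge n_\alpha$ and $(\vartheta, \lambda) \notin \mathcal{V}_\varepsilon$, we define $(\vartheta_\varepsilon^{A,B}, \lambda_\varepsilon^{A,B}) \in \mathcal{V}_\varepsilon$, which can be written as a convex combination of $(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$ and $(\vartheta, \lambda)$:

\[ (\vartheta_\varepsilon^{A,B}, \lambda_\varepsilon^{A,B}) \overset{\text{def}}{=} \left( \vartheta_\star^{A,B} + \varepsilon\, \frac{\vartheta - \vartheta_\star^{A,B}}{|(\vartheta - \vartheta_\star^{A,B}, \lambda - \lambda_\star^{A,B})|},\ \lambda_\star^{A,B} + \varepsilon\, \frac{\lambda - \lambda_\star^{A,B}}{|(\vartheta - \vartheta_\star^{A,B}, \lambda - \lambda_\star^{A,B})|} \right). \]

We deduce, using the convexity of $v_n^{A,B}$ for the first inequality and Equation (2.13) for the second one,

\[ \begin{aligned} v_n^{A,B}(\vartheta, \lambda) - v_n^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B}) &\ge \frac{|(\vartheta - \vartheta_\star^{A,B}, \lambda - \lambda_\star^{A,B})|}{\varepsilon} \Big[ v_n^{A,B}(\vartheta_\varepsilon^{A,B}, \lambda_\varepsilon^{A,B}) - v_n^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B}) \Big] \\ &\ge v^{A,B}(\vartheta_\varepsilon^{A,B}, \lambda_\varepsilon^{A,B}) - v^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B}) - \frac{2\alpha}{3} \\ &\ge \frac{\alpha}{3}. \end{aligned} \]

The optimality of $(\vartheta_n^{A,B}, \lambda_n^{A,B})$ yields that $v_n^{A,B}(\vartheta_n^{A,B}, \lambda_n^{A,B}) \le v_n^{A,B}(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$. So, we conclude that $(\vartheta_n^{A,B}, \lambda_n^{A,B}) \in \mathcal{V}_\varepsilon$ for $n \ge n_\alpha$. Therefore, $(\vartheta_n^{A,B}, \lambda_n^{A,B})$ converges a.s. to $(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$.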

We have seen in the proof of Proposition 2.1 that $\mathbb{E}\big[\sup_{|(\theta,\lambda)| \le M,\ m < d_0(\lambda)} |\nabla F(G, \theta, N^\mu, \lambda)|\big] < \infty$, see Equations (2.8) and (2.9). Similarly, we can show that $\mathbb{E}\big[\sup_{|(\theta,\lambda)| \le M,\ m < d_0(\lambda)} |\nabla^2 F(G, \theta, N^\mu, \lambda)|\big] < \infty$. The central limit theorem governing the convergence of the pair $(\vartheta_n^{A,B}, \lambda_n^{A,B})$ to the pair $(\vartheta_\star^{A,B}, \lambda_\star^{A,B})$ can be deduced


3 Adaptive Monte Carlo

In this section, we assume that we have at hand a sequence of optimal solutions $(\vartheta_n^{A,B}, \lambda_n^{A,B})$ and want to devise an adaptive Monte Carlo method taking advantage of the knowledge of these parameters through the use of Equation (1.2). In a previous work dedicated to the Gaussian framework, Jourdain and Lelong (2009), we had used the same samples for approximating $v$ by $v_n$ and then for building a Monte Carlo estimator of $\mathcal{E}$ involving $\theta_n$. This was possible because a normal random vector $X$ with mean vector $\theta$ naturally writes as $X = \theta + G$ where $G$ is a standard normal random vector.

No such simple relation exists for the Poisson distribution to link a Poisson random variable with parameter $\mu$ to one with parameter $\lambda$. Hence, it is not worth trying to reuse, for the Monte Carlo estimator based on Equation (1.2), the same Poisson random samples as those involved in $v_n$. We therefore suggest the following two-stage algorithm.

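Algorithm 3.1.

First stage: Generate a sequence $(G^j)_{j=1,\dots,m}$ of i.i.d. random vectors following the standard normal distribution in $\mathbb{R}^d$ and a sequence $(N^{\mu,j} = (N_1^{\mu_1,j}, \dots, N_p^{\mu_p,j}))_{j=1,\dots,m}$ of i.i.d. Poisson random vectors with parameter $\mu$. Define

\[ v_m^{A,B}(\vartheta, \lambda) = \frac{1}{m} \sum_{j=1}^m f(G^j, N^{\mu,j})^2\, e^{-A\vartheta\cdot G^j + \frac{|A\vartheta|^2}{2}} \prod_{i=1}^p e^{(B\lambda)_i - \mu_i} \left(\frac{\mu_i}{(B\lambda)_i}\right)^{N_i^{\mu_i, j}}. \tag{3.1} \]

Compute

\[ (\vartheta_m, \lambda_m) = \arg\min_{(\vartheta, \lambda) \in \mathbb{R}^{d'} \times (\mathbb{R}_+^*)^{p'}} v_m^{A,B}(\vartheta, \lambda). \]

Second stage: Generate a sequence $(\bar G^j)_{j=1,\dots,n}$ of i.i.d. random vectors following the standard normal distribution in $\mathbb{R}^d$ and a sequence $(\bar N^j = (\bar N_1^j, \dots, \bar N_p^j))_{j=1,\dots,n}$ of i.i.d. Poisson random vectors with parameter $B\lambda_m$. Conditionally on $\lambda_m$, these two sequences are assumed to be independent of the sequences $(G^j)_{j=1,\dots,m}$ and $(N^{\mu,j})_{j=1,\dots,m}$. Define

\[ M_{n,m}^{A,B} = \frac{1}{n} \sum_{j=1}^n f(\bar G^j + A\vartheta_m, \bar N^j)\, e^{-A\vartheta_m\cdot \bar G^j - \frac{|A\vartheta_m|^2}{2}} \prod_{i=1}^p e^{(B\lambda_m)_i - \mu_i} \left(\frac{\mu_i}{(B\lambda_m)_i}\right)^{\bar N_i^j}. \tag{3.2} \]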
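The two stages can be sketched in the scalar case $d = p = 1$ with $A = B = 1$. This is only an illustration under assumptions of ours: the payoff $f(g, k) = e^{a g} b^{k}$ is hypothetical (it has a known expectation and a zero-variance optimum at $(\theta, \lambda) = (a, \mu b)$), the first stage uses a plain Newton iteration on $v_m$ with the halving rule to keep the intensity positive, and the Poisson sampler is a basic CDF inversion suited to moderate intensities. All function names are ours.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) sample by inverting the CDF (fine for moderate lam)."""
    u, k, p = rng.random(), 0, math.exp(-lam)
    cdf = p
    while u > cdf and p > 0.0:
        k += 1
        p *= lam / k
        cdf += p
    return k


def first_stage(f, mu, m, rng, max_iter=50, tol=1e-10):
    """Minimise the sample average v_m of Equation (3.1) with Newton steps."""
    G = [rng.gauss(0.0, 1.0) for _ in range(m)]
    N = [poisson(rng, mu) for _ in range(m)]
    f2 = [f(g, k) ** 2 for g, k in zip(G, N)]  # f is evaluated once per sample
    theta, lam = 0.0, mu  # start from the crude Monte Carlo parameters
    for _ in range(max_iter):
        g_t = g_l = h_tt = h_tl = h_ll = 0.0
        for g, k, w in zip(G, N, f2):
            F = w * math.exp(-theta * g + 0.5 * theta ** 2 + lam - mu) * (mu / lam) ** k
            a = 1.0 - k / lam
            g_t += (theta - g) * F                  # Equation (2.2)
            g_l += a * F                            # Equation (2.3)
            h_tt += (1.0 + (theta - g) ** 2) * F    # Equation (2.4)
            h_tl += (theta - g) * a * F             # Equation (2.5)
            h_ll += (k / lam ** 2 + a * a) * F      # Equation (2.6)
        det = h_tt * h_ll - h_tl * h_tl
        d_t = -(h_ll * g_t - h_tl * g_l) / det      # solve the 2x2 Newton system
        d_l = -(h_tt * g_l - h_tl * g_t) / det
        theta += d_t
        lam = lam + d_l if lam + d_l > 0.0 else lam / 2.0  # keep the intensity positive
        if abs(d_t) + abs(d_l) < tol:
            break
    return theta, lam


def second_stage(f, mu, theta, lam, n, rng):
    """Importance sampling estimator of Equation (3.2) and its sample variance."""
    total = total_sq = 0.0
    for _ in range(n):
        g, k = rng.gauss(0.0, 1.0), poisson(rng, lam)
        weight = math.exp(-theta * g - 0.5 * theta ** 2 + lam - mu) * (mu / lam) ** k
        x = f(g + theta, k) * weight
        total += x
        total_sq += x * x
    mean = total / n
    return mean, total_sq / n - mean * mean


a, b, mu = 0.5, 1.2, 2.0                      # hypothetical payoff parameters
payoff = lambda g, k: math.exp(a * g) * b ** k
rng = random.Random(0)
theta_m, lam_m = first_stage(payoff, mu, 4000, rng)
estimate, sample_var = second_stage(payoff, mu, theta_m, lam_m, 4000, rng)
exact = math.exp(0.5 * a * a + mu * (b - 1.0))  # closed-form expectation of the payoff
```

The second stage draws fresh samples from the same generator after the first stage has finished, which mirrors the conditional independence required between the two stages.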

3.1 Strong law of large numbers and central limit theorem

The conditional independence between the two stages combined with Lemma 1.1 immediately shows that for any fixed $m, n$, the estimator $M_{n,m}^{A,B}$ is unbiased, i.e. $\mathbb{E}[M_{n,m}^{A,B}] = \mathcal{E}$. Conditionally on $(G^j, N^j)_{j=1,\dots,m}$, the terms involved in the sum of Equation (3.2) are i.i.d., hence the standard strong law of large numbers combined with Lemma 1.1 yields that $\lim_{n \to +\infty} M_{n,m}^{A,B} = \mathbb{E}[f(G, N^\mu)]$ a.s. Similarly, the central limit theorem applies and we can state the following result.

Proposition 3.2. For any fixed $m$, $M_{n,m}^{A,B}$ converges a.s. to $\mathbb{E}[f(G, N^\mu)]$ as $n$ goes to infinity and moreover $\sqrt{n}\big(M_{n,m}^{A,B} - \mathbb{E}[f(G, N^\mu)]\big)$ converges in law, as $n \to +\infty$, to N(0, v^{A,B}(ϑ


This result is not fully satisfactory since, from a practical point of view, we would like to know the limiting behaviour of the estimator $M_{n,m(n)}^{A,B}$ where $m(n)$ is a function of $n$ tending to infinity with $n$. To investigate the asymptotic behaviour when $m$ and $n$ tend to infinity together, it is convenient to rewrite $M_{n,m(n)}^{A,B}$ using an auxiliary sequence of random variables. We introduce a sequence $(\bar U_i^j)_{1 \le i \le p,\ j \ge 1}$ of i.i.d. random variables following the uniform distribution on $[0, 1]$ and independent of all the other random variables used so far. If we define

\[ \tilde N_i^j(\lambda) = \sum_{k=0}^\infty k\, \mathbf{1}_{\{P(\lambda_i; k) \le \bar U_i^j < P(\lambda_i; k+1)\}} \quad \text{for all } 1 \le i \le p,\ 1 \le j, \]

where $P(\lambda, \cdot)$ is the cumulative distribution function of the Poisson distribution with parameter $\lambda$, then $(\bar N^j)_{j=1,\dots,n} \overset{\text{Law}}{=} (\tilde N^j(\lambda_{m(n)}))_{j=1,\dots,n}$. Since for all $k \in \mathbb{N}$, the function $\lambda \in \mathbb{R}_+^* \mapsto P(\lambda, k)$ is continuous and decreasing, we get that $\lim_{n \to \infty} \tilde N^j(\lambda_{m(n)}) = \tilde N^j(\lambda_\star)$ a.s. and, for all $\lambda \le \lambda'$, $\tilde N^j(\lambda) \le \tilde N^j(\lambda')$ where the ordering has to be understood component wise. We define

\[ \tilde M_n(\theta, \lambda) = \frac{1}{n} \sum_{j=1}^n f(\bar G^j + \theta, \tilde N^j(\lambda))\, e^{-\theta\cdot \bar G^j - \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{\tilde N_i^j(\lambda)}. \]

It is obvious that $M_{n,m(n)}^{A,B} \overset{\text{Law}}{=} \tilde M_n(A\vartheta_{m(n)}, B\lambda_{m(n)})$.
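The coupling of Poisson variables through a common uniform variable can be sketched as follows. This is a minimal illustration for a single component, assuming the convention that $P(\lambda; k)$ is the probability of being strictly below $k$, so that the returned value is the smallest $k$ whose cumulative probability exceeds $u$; the function name `coupled_poisson` is ours.

```python
import math


def coupled_poisson(u, lam):
    """Invert the Poisson(lam) CDF at u: smallest k with P(N <= k) > u."""
    k, p = 0, math.exp(-lam)
    cdf = p
    while cdf <= u and p > 0.0:
        k += 1
        p *= lam / k
        cdf += p
    return k


# The same uniforms are reused for every intensity, which couples the samples.
u_values = [i / 20 for i in range(20)]        # fixed points in [0, 1)
intensities = [0.5, 1.0, 2.0, 3.0, 5.0]       # increasing intensities
paths = [[coupled_poisson(u, lam) for lam in intensities] for u in u_values]
```

For a fixed uniform draw, each row of `paths` is nondecreasing in the intensity, and a small perturbation of the intensity leaves the sample unchanged, which is the mechanism behind the a.s. convergence of $\tilde N^j(\lambda_{m(n)})$.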

Theorem 3.3. Let $m(n)$ be an increasing function of $n$ tending to infinity. Then, under Assumptions (A1) and (A2), $M_{n,m(n)}^{A,B}$ converges a.s. to $\mathbb{E}[f(G, N^\mu)]$ as $n$ goes to infinity.

It is actually sufficient to prove the result for $A$ and $B$ being identity matrices. For the sake of clear notation, when $A = I_d$ and $B = I_p$, we write $M_{n,m(n)}$ instead of $M_{n,m(n)}^{A,B}$.

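Proof. We have already seen that $\mathbb{E}[M_{n,m}] = \mathcal{E}$. Thanks to the independence of the samples used in the two stages of the algorithm, conditionally on $((G^j, N^j), j \ge 1)$, $M_{n,m}$ can be written as a sum of i.i.d. random variables. We introduce the $\sigma$-algebra $\mathcal{G} = \sigma((G^j, N^j), j \ge 1)$. We define for all $m, j \ge 1$

\[ X_{m,j} = f(\bar G^j + \theta_m, \bar N^j)\, e^{-\theta_m\cdot \bar G^j - \frac{|\theta_m|^2}{2}} \prod_{i=1}^p e^{(\lambda_m)_i - \mu_i} \left(\frac{\mu_i}{(\lambda_m)_i}\right)^{\bar N_i^j}. \]

Note that conditionally on $\mathcal{G}$, the sequence $(X_{m,j})_{j \ge 1}$ is i.i.d. for any fixed $m \ge 1$. For a fixed $\varepsilon > 0$, we recall the definition of $\mathcal{V}_\varepsilon$

\[ \mathcal{V}_\varepsilon \overset{\text{def}}{=} \big\{ (\theta, \lambda) \in \mathbb{R}^d \times \mathbb{R}^p : |(\theta, \lambda) - (\theta_\star, \lambda_\star)| \le \varepsilon \big\}. \]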


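all $m, n \ge 1$,

\[ \begin{aligned} \mathbb{E}\Big[ (M_{n,m} - \mathcal{E})^2\, \mathbf{1}_{\{(\theta_m, \lambda_m) \in \mathcal{V}_\varepsilon\}} \Big] &= \mathbb{E}\left[ \mathbb{E}\left[ \Big( \frac{1}{n} \sum_{i=1}^n (X_{m,i} - \mathcal{E}) \Big)^2 \,\Big|\, \mathcal{G} \right] \mathbf{1}_{\{(\theta_m, \lambda_m) \in \mathcal{V}_\varepsilon\}} \right] \\ &\le \frac{1}{n}\, \mathbb{E}\Big[ \mathbb{E}\big[ (X_{m,i} - \mathcal{E})^2 \,\big|\, \mathcal{G} \big]\, \mathbf{1}_{\{(\theta_m, \lambda_m) \in \mathcal{V}_\varepsilon\}} \Big] \\ &\le \frac{1}{n}\, \mathbb{E}\Big[ v(\theta_m, \lambda_m)\, \mathbf{1}_{\{(\theta_m, \lambda_m) \in \mathcal{V}_\varepsilon\}} \Big] \\ &\le \frac{1}{n} \Big( \sup_{(\theta, \lambda) \in \mathcal{V}_\varepsilon} v(\theta, \lambda) - \mathcal{E}^2 \Big) \le \frac{c}{n}. \end{aligned} \tag{3.3} \]

We deduce from the Borel-Cantelli Lemma that for any increasing function $\rho : \mathbb{N} \to \mathbb{N}$, $(M_{n^2, \rho(n)} - \mathcal{E})\, \mathbf{1}_{\{(\theta_{\rho(n)}, \lambda_{\rho(n)}) \in \mathcal{V}_\varepsilon\}}$ tends to 0 a.s.

To prove that $(M_{n,m(n)} - \mathcal{E})\, \mathbf{1}_{\{(\theta_{m(n)}, \lambda_{m(n)}) \in \mathcal{V}_\varepsilon\}}$ converges to zero a.s., we mimic the proof of the classical strong law of large numbers.

Let $n \in \mathbb{N}^*$; we define $k = \lfloor \sqrt{n} \rfloor$, so that $k^2 \le n < (k+1)^2$. Then

\[ \begin{aligned} M_{n,m(n)} - \mathcal{E} &= \frac{1}{n} \sum_{i=1}^{k^2} (X_{m(n),i} - \mathcal{E}) + \frac{1}{n} \sum_{i=k^2+1}^{n} (X_{m(n),i} - \mathcal{E}), \\ |M_{n,m(n)} - \mathcal{E}| &\le \left| \frac{1}{k^2} \sum_{i=1}^{k^2} (X_{m(n),i} - \mathcal{E}) \right| + \left| \frac{1}{n} \sum_{i=k^2+1}^{n} (X_{m(n),i} - \mathcal{E}) \right|. \end{aligned} \tag{3.4} \]

Using Equation (3.3),

\[ \mathbb{E}\left[ \Big( \frac{1}{k^2} \sum_{i=1}^{k^2} (X_{m(n),i} - \mathcal{E}) \Big)^2\, \mathbf{1}_{\{(\theta_{m(n)}, \lambda_{m(n)}) \in \mathcal{V}_\varepsilon\}} \right] \le \frac{c}{k^2}. \]

Hence, we easily deduce from the Borel-Cantelli Lemma that $\frac{1}{k^2} \sum_{i=1}^{k^2} (X_{m(n),i} - \mathcal{E})\, \mathbf{1}_{\{(\theta_{m(n)}, \lambda_{m(n)}) \in \mathcal{V}_\varepsilon\}}$ tends to 0 a.s. when $k$ goes to infinity, i.e. when $n$ goes to infinity. A computation similar to that of Equation (3.3), combined with $n - k^2 \le 2\sqrt{n}$, leads to

\[ \mathbb{E}\left[ \Big( \frac{1}{n} \sum_{i=k^2+1}^{n} (X_{m(n),i} - \mathcal{E}) \Big)^2\, \mathbf{1}_{\{(\theta_{m(n)}, \lambda_{m(n)}) \in \mathcal{V}_\varepsilon\}} \right] \le \frac{n - k^2}{n^2} \Big( \sup_{(\theta, \lambda) \in \mathcal{V}_\varepsilon} v(\theta, \lambda) - \mathcal{E}^2 \Big) \le \frac{c}{n^{3/2}}. \]

Hence, the Borel-Cantelli Lemma yields that $\frac{1}{n} \sum_{i=k^2+1}^{n} (X_{m(n),i} - \mathcal{E})\, \mathbf{1}_{\{(\theta_{m(n)}, \lambda_{m(n)}) \in \mathcal{V}_\varepsilon\}} \to 0$ a.s. when $n$ goes to infinity.

Eventually, we have proved that $(M_{n,m(n)} - \mathcal{E})\, \mathbf{1}_{\{(\theta_{m(n)}, \lambda_{m(n)}) \in \mathcal{V}_\varepsilon\}}$ converges to zero a.s. Since $(\theta_{m(n)}, \lambda_{m(n)}) \to (\theta_\star, \lambda_\star)$ a.s., we deduce that $M_{n,m(n)} \to \mathcal{E}$ a.s. when $n$ goes to infinity.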

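Theorem 3.4. Let $m(n)$ be an integer valued function of $n$ increasing to infinity with $n$ and such that $m(n) \sim n^\beta$ for some $\beta > 0$. Assume that

ii. there exists a compact neighbourhood $\mathcal{V}$ of $(\vartheta_\star, \lambda_\star)$ included in $\mathbb{R}^{d'} \times (\mathbb{R}_+^*)^{p'}$ and $\eta > 0$ such that $\mathbb{E}\big[\sup_{(\vartheta, \lambda) \in \mathcal{V}} |f(\bar G + A\vartheta, \tilde N^1(B\lambda))|^{2(1+\eta)}\big] < \infty$.

Then, under Assumptions (A1) and (A2),

\[ \sqrt{n}\big( \tilde M_n(A\vartheta_{m(n)}, B\lambda_{m(n)}) - \mathbb{E}[f(G, N^\mu)] \big) \xrightarrow[n \to +\infty]{\text{law}} \mathcal{N}\big(0, v^{A,B}(\vartheta_\star, \lambda_\star)\big). \]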

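Proof. It is actually sufficient to prove the result for $A$ and $B$ being identity matrices.

\[ \sqrt{n}\big( \tilde M_n(\theta_{m(n)}, \lambda_{m(n)}) - \mathcal{E} \big) = \sqrt{n}\big( \tilde M_n(\theta_\star, \lambda_\star) - \mathcal{E} \big) + \sqrt{n}\big( \tilde M_n(\theta_{m(n)}, \lambda_{m(n)}) - \tilde M_n(\theta_\star, \lambda_\star) \big). \]

From the standard central limit theorem, $\sqrt{n}\big( \tilde M_n(\theta_\star, \lambda_\star) - \mathcal{E} \big) \xrightarrow[n \to +\infty]{\text{law}} \mathcal{N}(0, v(\theta_\star, \lambda_\star))$. Therefore, it is sufficient to prove that $\sqrt{n}\big( \tilde M_n(\theta_{m(n)}, \lambda_{m(n)}) - \tilde M_n(\theta_\star, \lambda_\star) \big) \xrightarrow[n \to +\infty]{\text{Pr}} 0$. Let $\varepsilon > 0$ and $0 < \alpha < \beta/2$.

\[ \begin{aligned} \mathbb{P}\Big( \sqrt{n}\, \big| \tilde M_n(\theta_{m(n)}, \lambda_{m(n)}) - \tilde M_n(\theta_\star, \lambda_\star) \big| > \varepsilon \Big) \le{} & \mathbb{P}\Big( n^\alpha \big| (\theta_{m(n)}, \lambda_{m(n)}) - (\theta_\star, \lambda_\star) \big| > 1 \Big) \\ & + \frac{n}{\varepsilon^2}\, \mathbb{E}\Big[ \big| \tilde M_n(\theta_{m(n)}, \lambda_{m(n)}) - \tilde M_n(\theta_\star, \lambda_\star) \big|^2\, \mathbf{1}_{\{|(\theta_{m(n)}, \lambda_{m(n)}) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}\}} \Big]. \end{aligned} \]

Note that $n^\alpha \sim m(n)^{\alpha/\beta}$ with $\alpha/\beta < 1/2$; hence we deduce from Proposition 2.4 that $\mathbb{P}\big( n^\alpha |(\theta_{m(n)}, \lambda_{m(n)}) - (\theta_\star, \lambda_\star)| > 1 \big) \to 0$. We define

\[ Q(\theta, \lambda) = e^{-\theta\cdot \bar G^1 - \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{\tilde N_i^1(\lambda)}. \]

Conditionally on $(\theta_{m(n)}, \lambda_{m(n)})$, $\tilde M_n(\theta_{m(n)}, \lambda_{m(n)})$ can be written as a sum of i.i.d. random variables, so that

\[ \begin{aligned} & n\, \mathbb{E}\Big[ \big| \tilde M_n(\theta_{m(n)}, \lambda_{m(n)}) - \tilde M_n(\theta_\star, \lambda_\star) \big|^2\, \mathbf{1}_{\{|(\theta_{m(n)}, \lambda_{m(n)}) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}\}} \Big] \\ & \quad = \mathbb{E}\Big[ \big| f(\bar G^1 + \theta_\star, \tilde N^1(\lambda_\star))\, Q(\theta_\star, \lambda_\star) - f(\bar G^1 + \theta_{m(n)}, \tilde N^1(\lambda_{m(n)}))\, Q(\theta_{m(n)}, \lambda_{m(n)}) \big|^2\, \mathbf{1}_{\{|(\theta_{m(n)}, \lambda_{m(n)}) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}\}} \Big]. \end{aligned} \tag{3.5} \]

Thanks to the convergence of $\tilde N^1(\lambda_{m(n)})$, $Q(\theta_{m(n)}, \lambda_{m(n)})$ converges a.s. to $Q(\theta_\star, \lambda_\star)$ when $n$ goes to infinity. Since for $n$ large enough, $\tilde N^1(\lambda_{m(n)}) = \tilde N^1(\lambda_\star)$, the continuity of $f$ with respect to its first argument enables us to prove that $f(\bar G^1 + \theta_{m(n)}, \tilde N^1(\lambda_{m(n)}))$ converges a.s. to $f(\bar G^1 + \theta_\star, \tilde N^1(\lambda_\star))$. Hence, the absolute value inside the expectation tends to zero a.s. We need to bound the term inside the expectation by an integrable random variable in order to apply the dominated convergence theorem, which yields the result.

\[ \begin{aligned} & \big| f(\bar G^1 + \theta_\star, \tilde N^1(\lambda_\star))\, Q(\theta_\star, \lambda_\star) - f(\bar G^1 + \theta_{m(n)}, \tilde N^1(\lambda_{m(n)}))\, Q(\theta_{m(n)}, \lambda_{m(n)}) \big|^2\, \mathbf{1}_{\{|(\theta_{m(n)}, \lambda_{m(n)}) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}\}} \\ & \quad \le 4 \sup_{|(\theta, \lambda) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}} \big| f(\bar G^1 + \theta, \tilde N^1(\lambda)) \big|^2\, Q^2(\theta, \lambda). \end{aligned} \]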


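For $n$ large enough, $\{|(\theta, \lambda) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}\} \subset \mathcal{V}$. Moreover, there exist $m > 0$ and $M > 0$ such that $\mathcal{V} \subset \{|\theta| \le M,\ |\lambda| \le M \text{ and } d_0(\lambda) \ge m\}$. Hence,

\[ \begin{aligned} \sup_{|(\theta, \lambda) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}} \big| f(\bar G^1 + \theta, \tilde N^1(\lambda)) \big|\, Q(\theta, \lambda) &\le \sup_{(\theta, \lambda) \in \mathcal{V}} \big| f(\bar G^1 + \theta, \tilde N^1(\lambda)) \big|\, e^{pM} \prod_{i=1}^d \big( e^{-M \bar G_i^1} + e^{M \bar G_i^1} \big) \prod_{i=1}^p \left( \Big(\frac{\mu_i}{m}\Big)^{\tilde N_i^1(m)} + \Big(\frac{\mu_i}{m}\Big)^{\tilde N_i^1(M)} \right) \\ &\le \sum_{\substack{\sigma \in \{-M, M\}^d \\ \nu \in \{m, M\}^p}} \sup_{(\theta, \lambda) \in \mathcal{V}} \big| f(\bar G^1 + \theta, \tilde N^1(\lambda)) \big|\, e^{pM} e^{\sigma\cdot \bar G^1} \prod_{i=1}^p \Big(\frac{\mu_i}{m}\Big)^{\tilde N_i^1(\nu_i)}. \end{aligned} \]

Then, using Hölder's inequality we get

\[ \begin{aligned} & \mathbb{E}\left[ \sum_{\substack{\sigma \in \{-M, M\}^d \\ \nu \in \{m, M\}^p}} \sup_{(\theta, \lambda) \in \mathcal{V}} \big| f(\bar G^1 + \theta, \tilde N^1(\lambda)) \big|^2 \left( e^{pM} e^{\sigma\cdot \bar G^1} \prod_{i=1}^p \Big(\frac{\mu_i}{m}\Big)^{\tilde N_i^1(\nu_i)} \right)^2 \right] \\ & \quad \le \sum_{\substack{\sigma \in \{-M, M\}^d \\ \nu \in \{m, M\}^p}} \mathbb{E}\left[ \sup_{(\theta, \lambda) \in \mathcal{V}} \big| f(\bar G^1 + \theta, \tilde N^1(\lambda)) \big|^{2(1+\eta)} \right]^{\frac{1}{1+\eta}} \mathbb{E}\left[ \left( e^{pM} e^{\sigma\cdot \bar G^1} \prod_{i=1}^p \Big(\frac{\mu_i}{m}\Big)^{\tilde N_i^1(\nu_i)} \right)^{2 + \frac{2}{\eta}} \right]^{\frac{\eta}{1+\eta}}. \end{aligned} \]

Since we have assumed that $\mathbb{E}\big[ \sup_{(\theta, \lambda) \in \mathcal{V}} |f(\bar G^1 + \theta, \tilde N^1(\lambda))|^{2(1+\eta)} \big] < \infty$, the random variables

\[ \big| f(\bar G^1 + \theta_\star, \tilde N^1(\lambda_\star))\, Q(\theta_\star, \lambda_\star) - f(\bar G^1 + \theta_{m(n)}, \tilde N^1(\lambda_{m(n)}))\, Q(\theta_{m(n)}, \lambda_{m(n)}) \big|^2\, \mathbf{1}_{\{|(\theta_{m(n)}, \lambda_{m(n)}) - (\theta_\star, \lambda_\star)| \le n^{-\alpha}\}} \]

are uniformly bounded w.r.t. $n$ by an integrable random variable. Hence, the left hand side of Equation (3.5) tends to zero, which completes the proof that $\sqrt{n}\big( \tilde M_n(\theta_{m(n)}, \lambda_{m(n)}) - \tilde M_n(\theta_\star, \lambda_\star) \big) \xrightarrow[n \to +\infty]{\text{Pr}} 0$.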

3.2 Practical implementation

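The difficult part of Algorithm 3.1 is the numerical computation of the minimizing pair $(\theta_m, \lambda_m)$. The efficiency of the optimization algorithm depends very much on the magnitude of the smallest eigenvalue of $\nabla^2 v$. From the end of the proof of Proposition 2.1, we can deduce that the smallest eigenvalue of $\nabla^2 v$ is larger than

\[ \mathbb{E}\big[ F(G, \theta, N^\mu, \lambda)\, \mathbf{1}_{\{N^\mu = (n_1, \dots, n_p)\}} \big]\, \min\left(1, \frac{n_1}{\lambda_1^2}, \dots, \frac{n_p}{\lambda_p^2}\right). \]

This lower bound depends on the function $f$ whereas we would rather find a uniform lower bound. Hence, we advise rewriting $\nabla v$ as

\[ \nabla v(\theta, \lambda) = \mathbb{E}\left[ \begin{pmatrix} \theta \\ \mathbf{1}_p \end{pmatrix} f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] - \mathbb{E}\left[ \begin{pmatrix} G \\ \frac{N^\mu}{\lambda} \end{pmatrix} f(G, N^\mu)^2\, e^{-\theta\cdot G + \frac{|\theta|^2}{2}} \prod_{i=1}^p e^{\lambda_i - \mu_i} \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] \]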


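where $\frac{N^\mu}{\lambda} = \left( \frac{N_1^{\mu_1}}{\lambda_1}, \dots, \frac{N_p^{\mu_p}}{\lambda_p} \right)^*$. Hence, $(\theta_\star, \lambda_\star)$ can be seen as the root of

\[ \nabla u(\theta, \lambda) = \begin{pmatrix} \theta \\ \mathbf{1}_p \end{pmatrix} - \frac{ \mathbb{E}\left[ \begin{pmatrix} G \\ \frac{N^\mu}{\lambda} \end{pmatrix} f(G, N^\mu)^2\, e^{-\theta\cdot G} \prod_{i=1}^p \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] }{ \mathbb{E}\left[ f(G, N^\mu)^2\, e^{-\theta\cdot G} \prod_{i=1}^p \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right] } \]

with $u(\theta, \lambda) = \frac{|\theta|^2}{2} + \sum_{i=1}^p \lambda_i + \log \mathbb{E}\left[ f(G, N^\mu)^2\, e^{-\theta\cdot G} \prod_{i=1}^p \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}} \right]$. Writing $\Pi = f(G, N^\mu)^2\, e^{-\theta\cdot G} \prod_{i=1}^p \left(\frac{\mu_i}{\lambda_i}\right)^{N_i^{\mu_i}}$ for the common integrand, the Hessian matrix of $u$ is given by

\[ \nabla^2 u(\theta, \lambda) = \begin{pmatrix} I_d & 0 \\ 0 & \dfrac{\mathbb{E}[D\, \Pi]}{\mathbb{E}[\Pi]} \end{pmatrix} + \frac{ \mathbb{E}\left[ \begin{pmatrix} G \\ \frac{N^\mu}{\lambda} \end{pmatrix} \begin{pmatrix} G \\ \frac{N^\mu}{\lambda} \end{pmatrix}^* \Pi \right] }{ \mathbb{E}[\Pi] } - \frac{ \mathbb{E}\left[ \begin{pmatrix} G \\ \frac{N^\mu}{\lambda} \end{pmatrix} \Pi \right] \mathbb{E}\left[ \begin{pmatrix} G \\ \frac{N^\mu}{\lambda} \end{pmatrix} \Pi \right]^* }{ \mathbb{E}[\Pi]^2 } \]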

where we recall that the diagonal matrix D is defined by D = diag_p

Nµ1 1 λ2 1 , . . . , Npµp λ2 p . The Cauchy Schwartz inequality yields that the last two terms in the expression of _∇2u form a positive semi definite matrix. The first part of the Hessian is a positive definite matrix with smallest eigenvalue larger than

min     1, 1 λ2 j E Nµi i f(G, Nµ)2e−θ·G Qp i=1 _µ i λi N_iµi E f(G, Nµ₎2_e−θ·GQp i=1 µi λi N_iµi     = min     1,µj λ3_j E f(G, Nµ₊_e j)2e−θ·GQpi=1 _µ i λi N_iµi E f(G, Nµ₎2_e−θ·GQp i=1 µi λi N_iµi    

where the equality comes from Stein's formula for Poisson random variables and $e_j$ denotes the $j$-th element of the canonical basis. When the function $f$ is increasing with respect to each component of its second argument, we come up with the following lower bound, which is independent of the function $f$:

$$\min\left(1,\frac{\mu_j}{\lambda_j^3}\right).$$
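Stein's formula for a Poisson random variable $N$ with intensity $\mu$ states that $\mathbb{E}[N g(N)]=\mu\,\mathbb{E}[g(N+1)]$. The following Python sketch checks this identity numerically by truncating the Poisson series. The helper `poisson_expect`, the test function `g` and the parameter values are illustrative assumptions and do not come from the numerical experiments of the paper.

```python
import math

def poisson_expect(g, mu, n_max=200):
    """Approximate E[g(N)] for N ~ Poisson(mu) by truncating the series at n_max."""
    total, pmf = 0.0, math.exp(-mu)  # pmf holds P(N = n), starting at n = 0
    for n in range(n_max + 1):
        total += g(n) * pmf
        pmf *= mu / (n + 1)          # recurrence P(N = n + 1) = P(N = n) * mu / (n + 1)
    return total

# Illustrative (hypothetical) choices: a squared payoff times a likelihood-ratio weight.
mu, lam = 1.5, 2.5
g = lambda n: (1.0 + n) ** 2 * (mu / lam) ** n

lhs = poisson_expect(lambda n: n * g(n), mu)       # E[N g(N)]
rhs = mu * poisson_expect(lambda n: g(n + 1), mu)  # mu E[g(N + 1)]
```

Both sides agree up to the truncation error, which is negligible for moderate intensities.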


Our numerical experiments advocate the use of u instead of v to speed up the computation of (θ⋆, λ⋆).
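To make the role of $u$ concrete, here is a minimal Python sketch of its sample average counterpart, under the assumption that $u_n$ is obtained from $u$ by replacing each expectation with an empirical mean over $n$ samples $(G_j, N_j)$. The function name `u_grad_hess` and its argument layout are hypothetical. The weights are normalised after subtracting the largest exponent, which is a numerical safeguard of ours rather than something prescribed by the text.

```python
import numpy as np

def u_grad_hess(theta, lam, G, N, f2, mu):
    """Sample-average u_n with its gradient and Hessian (sketch, assumed definition).

    G: (n, d) Gaussian samples, N: (n, p) Poisson samples with intensities mu,
    f2: (n,) precomputed values f(G_j, N_j)**2.
    """
    d = theta.size
    # Log of exp(-theta.G_j) * prod_i (mu_i / lam_i)^{N_{j,i}}
    logw = -G @ theta + N @ np.log(mu / lam)
    shift = logw.max()
    w = f2 * np.exp(logw - shift)
    pw = w / w.sum()                      # normalised weights
    Z = np.hstack([G, N / lam])           # rows are (G_j, N_j / lam)
    m = pw @ Z                            # weighted mean of Z
    u = 0.5 * theta @ theta + lam.sum() + shift + np.log(w.sum() / len(f2))
    grad = np.concatenate([theta, np.ones_like(lam)]) - m
    H = (Z * pw[:, None]).T @ Z - np.outer(m, m)   # weighted covariance of Z
    H[:d, :d] += np.eye(d)                         # identity block
    H[d:, d:] += np.diag(pw @ (N / lam**2))        # weighted mean of D
    return u, grad, H
```

Since `f2` does not depend on $(\theta,\lambda)$, it can be computed once and reused for every evaluation.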

Using this new expression, we implement Algorithm 1 to construct an approximation $x_n^k$ of $(\theta_n,\lambda_n)$. Since $u_n$ is strongly convex, for any fixed $n$, $x_n^k$ converges to $(\theta_n,\lambda_n)$ when $k$ goes to infinity. The direction of descent $d_n^k$ at step $k$ should be computed as the solution of a linear system. There is no point in computing the inverse of $\nabla^2u_n(x_n^k)$, which would be computationally much more expensive.

Remarks on the implementation: From a practical point of view, $\varepsilon$ should be chosen reasonably small, $\varepsilon\approx10^{-6}$. This algorithm converges very quickly and, in most cases, fewer than 5 iterations are enough to get a very accurate estimate of $(\theta_n,\lambda_n)$, actually within the $\varepsilon$-error.

Since the points at which the function $f$ is evaluated remain constant through the iterations of Newton's algorithm, the values $f^2(G_j,N_j)$ for $j=1,\dots,n$ should be precomputed before starting the optimization algorithm, which considerably speeds up the whole process. The Hessian matrix of our problem is easily tractable, so there is no point in using quasi-Newton methods.

Algorithm 1 Projected Newton's algorithm

Choose an initial value $x_n^0\in\mathbb{R}^{d+p}$. Set $k=0$.
while $|\nabla u_n(x_n^k)|>\varepsilon$ do
    1. Compute $d_n^k$ such that $(\nabla^2u_n(x_n^k))\,d_n^k=-\nabla u_n(x_n^k)$
    2. $x_n^{k+1/2}=x_n^k+d_n^k$
       for $i=1:d+p$ do
           if $x_n^{k+1/2}(i)>0$ then
               $x_n^{k+1}(i)=x_n^{k+1/2}(i)$
           else
               $x_n^{k+1}(i)=x_n^k(i)/2$
           end if
       end for
    3. $k=k+1$
end while
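The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the numerical experiments: the function name `projected_newton`, the callable `grad_hess` returning the pair (gradient, Hessian), the iteration cap `max_iter` and the optional `constrained` index set are all hypothetical. By default the projection is applied to every coordinate, as written in Algorithm 1.

```python
import numpy as np

def projected_newton(grad_hess, x0, eps=1e-6, max_iter=50, constrained=None):
    """Projected Newton iteration following the steps of Algorithm 1 (sketch)."""
    x = np.array(x0, dtype=float)
    # Coordinates kept positive; all of them by default, as in Algorithm 1.
    idx = np.arange(x.size) if constrained is None else np.asarray(constrained)
    for _ in range(max_iter):
        g, H = grad_hess(x)
        if np.linalg.norm(g) <= eps:
            break
        d = np.linalg.solve(H, -g)      # descent direction from a linear solve, no inverse
        x_half = x + d
        x_next = x_half.copy()
        bad = idx[x_half[idx] <= 0]     # coordinates that left the positive half-line
        x_next[bad] = x[bad] / 2        # halve the previous value instead
        x = x_next
    return x
```

The direction is obtained with `np.linalg.solve`, in line with the remark that inverting the Hessian explicitly is unnecessary.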

4 Application to jump processes in finance

We will apply our methodology to two different classes of jump processes: jump diffusion processes and stochastic volatility processes with jumps. In the latter case, the volatility itself may also jump.

We consider a filtered probability space $(\Omega,\mathcal{A},(\mathcal{F}_t)_{0\le t\le T},\mathbb{P})$ with a finite time horizon $T>0$ and $I$ financial assets. We define on this space a Brownian motion $W$ with values in $\mathbb{R}^I$ and $I+1$ independent Poisson processes $(N^1,\dots,N^{I+1})$ with constant intensities $\mu^1,\dots,\mu^{I+1}$. We also consider $I+1$ independent sequences $(Y_j^i)_{j\ge1}$ for $i=1,\dots,I+1$ of i.i.d. real valued random variables with common law denoted by $Y^i$ in the following. The Poisson processes, the Brownian motions and the sequences $(Y_j^i)_j$ are supposed to be independent of each other.

Actually, we are interested in considering the compound Poisson process associated to the Poisson process $N^i$ and to the jump sequences $Y^i$ for $i=1,\dots,I+1$.


4.1 Jump diffusion processes

In this class of models, we assume that the log-prices evolve according to the following equation

$$X_t^i=\left(\beta^i-\frac{(\sigma^i)^2}{2}\right)t+\sigma^iL_iW_t+\sum_{j=1}^{N_t^i}Y_j^i+\sum_{j=1}^{N_t^{I+1}}Y_j^{I+1}\qquad(4.1)$$

where $\beta=(\beta^1,\dots,\beta^I)^*$ is the drift vector and $\sigma=(\sigma^1,\dots,\sigma^I)^*$ the volatility vector. The row vectors $L_i$ are such that the matrix $L=(L_1;\dots;L_I)$ ensures that $\Gamma=LL^*$ is a symmetric positive definite matrix with unit diagonal elements. The matrix $\Gamma$ embeds the covariance structure of the continuous part of the model. We have also chosen to account for the possibility of simultaneous jumps, which explains the extra jump term $\sum_{j=1}^{N_t^{I+1}}Y_j^{I+1}$ common to all underlying assets. This common jump term embeds the systemic risk of the market.
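As an illustration of Equation (4.1), the following Python sketch draws one sample of the log-price vector at a fixed time $T$. It is only a sketch under assumptions: the function name `simulate_log_prices` and the callable `jump_sampler(k, rng)`, which is expected to return an array of `k` jump sizes, are hypothetical, and the same jump law is used for every component for simplicity.

```python
import numpy as np

def simulate_log_prices(beta, sigma, L, mu, T, jump_sampler, rng):
    """One draw of (X_T^1, ..., X_T^I) following Equation (4.1) (sketch)."""
    beta, sigma = np.asarray(beta, dtype=float), np.asarray(sigma, dtype=float)
    I = sigma.size
    W = np.sqrt(T) * rng.standard_normal(I)      # Brownian motion at time T
    counts = rng.poisson(np.asarray(mu) * T)     # I + 1 independent Poisson counts
    # Sum of the jumps of each of the I + 1 compound Poisson processes
    jumps = np.array([jump_sampler(k, rng).sum() for k in counts])
    # Own jumps plus the common jump term shared by all assets
    return (beta - 0.5 * sigma**2) * T + sigma * (L @ W) + jumps[:I] + jumps[I]
```

For instance, `jump_sampler = lambda k, rng: rng.normal(0.0, 0.1, k)` gives normally distributed jump sizes; the numerical values here are arbitrary.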

From Equation (4.1), we deduce that the prices at time $t$, $S_t^i=S_0^i\,e^{X_t^i}$, are given by

$$S_t^i=S_0^i\exp\left\{\left(\beta^i-\frac{(\sigma^i)^2}{2}\right)t+\sigma^iL_iW_t\right\}\prod_{j=1}^{N_t^i}e^{Y_j^i}\prod_{j=1}^{N_t^{I+1}}e^{Y_j^{I+1}}$$

which corresponds for each asset to a one-dimensional Merton model with intensity $\mu^i+\mu^{I+1}$ when the $Y_j^i$ are normally distributed.

As we assumed that $\mathbb{P}$ was the martingale measure associated to the risk free rate $r>0$, supposed to be deterministic, the processes $(e^{-rt}S_t)_t$ must be martingales under $\mathbb{P}$. This martingale condition imposes that for every $i=1,\dots,I$,

$$\beta^i=r-\left(\mu^i\,\mathbb{E}[Y^i]+\mu^{I+1}\,\mathbb{E}[Y^{I+1}]\right).$$

In the following, $\beta^i$ will always stand for this quantity.

Remark 4.1. In the one-dimensional case, i.e. when $I=1$, we only consider a single compound Poisson process as the systemic risk jump term becomes irrelevant. Hence, the log-price in dimension one will follow

$$X_t=\left(\beta-\frac{\sigma^2}{2}\right)t+\sigma W_t+\sum_{j=1}^{N_t}Y_j.$$

For the sake of clarity, we will not treat the one-dimensional case separately in the following, even though the practical one-dimensional implementation relies on a single Poisson process. So, we will always consider that the Poisson process has values in $\mathbb{R}^{I+1}$.

In the numerical examples, we will need to discretize the multidimensional price process on a time grid $0=t_0<t_1<\dots<t_J=T$. We will assume that this time grid is regular and given by $t_j=\frac{jT}{J}$, $j=0,\dots,J$. Just to fix our notations, we consider that the Brownian (resp. Poisson) increments are stored as a column vector with size $I\times J$ (resp. $(I+1)\times J$).

$$\begin{pmatrix}W_{t_1}\\ W_{t_2}\\ \vdots\\ W_{t_{J-1}}\\ W_{t_J}\end{pmatrix}=\begin{pmatrix}\sqrt{t_1}\,\mathrm{Id}&0&0&\dots&0\\ \sqrt{t_1}\,\mathrm{Id}&\sqrt{t_2-t_1}\,\mathrm{Id}&0&\dots&0\\ \vdots&\ddots&\ddots&\ddots&\vdots\\ \vdots&\ddots&\ddots&\sqrt{t_{J-1}-t_{J-2}}\,\mathrm{Id}&0\\ \sqrt{t_1}\,\mathrm{Id}&\sqrt{t_2-t_1}\,\mathrm{Id}&\dots&\sqrt{t_{J-1}-t_{J-2}}\,\mathrm{Id}&\sqrt{t_J-t_{J-1}}\,\mathrm{Id}\end{pmatrix}G,$$

where $G$ is a standard normal random vector in $\mathbb{R}^{I\times J}$ and $\mathrm{Id}$ is the identity matrix in dimension $I\times I$. The Poisson process is discretized in a similar way.
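In practice, multiplying by this block lower triangular matrix amounts to a cumulative sum of scaled Gaussian blocks. The Python sketch below illustrates this; the function name `brownian_path` and the convention that $G$ is stored block by block, $(G_1,\dots,G_J)$ with each $G_j\in\mathbb{R}^I$, are assumptions made for the example.

```python
import numpy as np

def brownian_path(G, times, I):
    """Map a standard normal vector G of size I*J to (W_{t_1}, ..., W_{t_J})."""
    times = np.asarray(times, dtype=float)
    dt = np.diff(np.concatenate([[0.0], times]))   # t_j - t_{j-1}, with t_0 = 0
    # Row j of the reshaped array is the j-th Gaussian block, scaled by sqrt(dt_j)
    increments = np.sqrt(dt)[:, None] * np.asarray(G).reshape(len(times), I)
    return np.cumsum(increments, axis=0)           # row j holds W_{t_{j+1}} in R^I
```

The cumulative sum costs $O(IJ)$ operations, whereas forming the full matrix would cost $O(I^2J^2)$.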


The Merton jump diffusion model. The Merton model corresponds to the particular choice of a normal distribution for the variables $(Y^i)$, $Y^i\sim\mathcal{N}(\alpha,\delta)$ where $\alpha\in\mathbb{R}$ and $\delta>0$. In this framework, the jump sizes in the price follow a log-normal distribution.

The Kou model. In the Kou model (Kou, 2002), the variables $Y^i$ follow an asymmetric exponential distribution with density

$$p^i\,\mu_+^i\,e^{-\mu_+^ix}\,\mathbf{1}_{\{x>0\}}+(1-p^i)\,\mu_-^i\,e^{\mu_-^ix}\,\mathbf{1}_{\{x<0\}}$$

where $p^i\in[0,1]$ is the probability of a positive jump for the $i$-th component and the parameters $\mu_+^i>0$, $\mu_-^i>0$ govern the decay of each exponential part.
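Sampling from this density is straightforward: a jump is positive with probability $p$ and exponentially distributed with rate $\mu_+$, otherwise it is the opposite of an exponential variable with rate $\mu_-$. A minimal Python sketch for one component follows; the function name `sample_kou_jumps` and the parameter values used in any call are hypothetical.

```python
import numpy as np

def sample_kou_jumps(n, p, mu_plus, mu_minus, rng):
    """Draw n jump sizes from the asymmetric exponential density of the Kou model."""
    up = rng.random(n) < p                            # True for a positive jump
    size_up = rng.exponential(1.0 / mu_plus, n)       # NumPy takes the scale 1 / rate
    size_down = rng.exponential(1.0 / mu_minus, n)
    return np.where(up, size_up, -size_down)
```

The mean of the jump size is $p/\mu_+-(1-p)/\mu_-$, which provides a quick check of a sampler.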

4.2 Stochastic volatility models with jumps

In this section, we consider the stochastic volatility type model developed by Barndorff-Nielsen and Shephard (2001b,a) in which the volatility process is a non-Gaussian Ornstein-Uhlenbeck process driven by a compound Poisson process.

We consider that the log-prices satisfy for $i=1,\dots,I$

$$dX_t^i=\left(a^i-\frac{\sigma_t^i}{2}\right)dt+\sqrt{\sigma_{t-}^i}\,dW_t^i+\psi^i\,dZ_{\kappa^it}^i+\psi^{I+1}\,dZ_{\kappa^{I+1}t}^{I+1}$$

where $a\in\mathbb{R}^I$, $\psi\in\mathbb{R}^{I+1}$ has non-positive components which account for the positive leverage effect, $Z$ is an $(I+1)$-dimensional Lévy process defined by $Z_t^i=\sum_{k=1}^{N_t^i}Y_k^i$ for $i=1,\dots,I+1$, and the squared volatility process $(\sigma_t)_t$ is a Lévy driven Ornstein-Uhlenbeck process

$$d\sigma_t^i=-(\kappa^i+\kappa^{I+1})\,\sigma_t^i\,dt+dZ_{\kappa^it}^i+dZ_{\kappa^{I+1}t}^{I+1}.$$

For the squared volatility process to remain positive, we assume that the components of $Z$ only jump upward, which means that the random variables $Y_j^i$ are non-negative.
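Between jumps, the squared volatility decays exponentially at rate $\kappa^i+\kappa^{I+1}$, so that $\sigma_t^i=e^{-(\kappa^i+\kappa^{I+1})t}\sigma_0^i+\sum_{\tau_k\le t}e^{-(\kappa^i+\kappa^{I+1})(t-\tau_k)}\,\Delta_k$, where $\tau_k$ and $\Delta_k\ge0$ denote the jump times and jump sizes of the two driving processes pooled together. The Python sketch below evaluates this expression for given jumps; the function name `squared_vol` and its arguments are hypothetical, and the jump times are assumed to be already expressed on the time scale of $t$.

```python
import numpy as np

def squared_vol(t, sigma0, decay, jump_times, jump_sizes):
    """Squared volatility at time t for given upward jumps (sketch).

    decay stands for kappa^i + kappa^{I+1}; jump_sizes must be non-negative.
    """
    jump_times = np.asarray(jump_times, dtype=float)
    jump_sizes = np.asarray(jump_sizes, dtype=float)
    past = jump_times <= t                         # jumps that occurred before t
    kicks = jump_sizes[past] * np.exp(-decay * (t - jump_times[past]))
    return sigma0 * np.exp(-decay * t) + kicks.sum()
```

With non-negative jump sizes, the returned value is bounded below by $\sigma_0^i e^{-(\kappa^i+\kappa^{I+1})t}>0$, which illustrates why upward jumps keep the process positive.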

More specifically, the jump sequence $Y^i$ is i.i.d. following the exponential distribution with parameter $\beta^i>0$ for $i=1,\dots,I+1$. The drift vector $a$ is chosen such that the discounted prices are martingales under $\mathbb{P}$. Hence, a straightforward computation shows that we need to set

$$a^i=r-\frac{\psi^i\,\kappa^i\,\mu^i}{\beta^i-\psi^i}-\frac{\psi^{I+1}\,\kappa^{I+1}\,\mu^{I+1}}{\beta^{I+1}-\psi^{I+1}},\qquad\text{for } i=1,\dots,I$$

to ensure the martingale property of $(e^{-rt}\exp X_t)_t$.
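Each correction term in this drift has the form $\kappa\mu\,(\mathbb{E}[e^{\psi Y}]-1)$, since $\mathbb{E}[e^{\psi Y}]=\beta/(\beta-\psi)$ when $Y$ is exponentially distributed with parameter $\beta$ and $\psi<\beta$. The following Python sketch evaluates the drift vector from the model parameters; the function name `bns_drift` and the convention that each parameter array has length $I+1$, with the common jump process stored last, are assumptions made for the example.

```python
import numpy as np

def bns_drift(r, psi, kappa, mu, beta):
    """Drift vector (a^1, ..., a^I) from the martingale condition (sketch).

    psi, kappa, mu, beta have length I + 1; the last entry refers to the
    common jump process.
    """
    psi, kappa, mu, beta = (np.asarray(v, dtype=float) for v in (psi, kappa, mu, beta))
    comp = psi * kappa * mu / (beta - psi)   # one compensator term per jump process
    return r - comp[:-1] - comp[-1]
```

Since the components of $\psi$ are non-positive, each compensator term is non-positive and the resulting drift is at least $r$.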

As in the section on jump diffusion models, the extra Poisson process giving rise to the term $dZ^{I+1}$ in the dynamics of $X$ and $\sigma$ accounts for modelling a systemic risk. When $Z^{I+1}$ jumps, all the volatilities and possibly all the assets (when there is a leverage effect) jump together. This parametrization of multi-dimensional stochastic volatility models with jumps corresponds to Section 5.3 of Barndorff-Nielsen and Stelzer (2013). Adding this extra jump process only makes sense in a multi-dimensional framework, hence we write the one-dimensional model using the previous equations but without the terms involving the index $I+1$.

In the following, we compare the efficiencies of several different approaches based on the theoretical part of the paper in the context of option pricing with jumps. The problem always boils down to computing the expectation of a function of a jump diffusion process.