3.3 Choice of the regularization parameter
3.3.1 A posteriori parameter choice rules for attainable calibration problems
In this subsection we make the following standing assumptions:
1. The prior Lévy process corresponds to an arbitrage-free model with jumps bounded from above by $B$: $P \in \mathcal{L}_{NA} \cap \mathcal{L}_B^+$ (this implies the existence of a minimal entropy martingale measure $Q^*$).
2. There exists a solution $Q^+$ of problem (2.11) with data $C_M$ (minimum entropy least squares solution) such that $I(Q^+|P) < \infty$ and $\|C^{Q^+} - C_M\|_w = 0$ (the data is attainable by an exp-Lévy model).
3. There exists $\delta_0$ such that
\[ \varepsilon_{\max} := \inf_{\delta \le \delta_0} \|C^{Q^*} - C_M^\delta\|_w^2 > \delta_0^2. \tag{3.11} \]
Remark 3.1. In condition (3.11), $\delta_0$ can be seen as the highest possible noise level of all data sets that we consider. If, for some $\delta$, $\|C^{Q^*} - C_M^\delta\|_w < \delta_0$, then either $Q^*$ is already sufficiently close to the solution, or the noise level is so high that the data is completely blurred and does not allow one to construct a better approximation of the true solution than $Q^*$.
Let $Q_\alpha^\delta$ denote a solution of the regularized problem (2.27) with data $C_M^\delta$ and regularization parameter $\alpha$. The function
\[ \varepsilon_\delta(\alpha) := \|C^{Q_\alpha^\delta} - C_M^\delta\|_w^2 \]
is called the discrepancy function of the calibration problem (2.27). Note that since this problem can have many solutions, $\varepsilon_\delta(\alpha)$ is a priori a multivalued function. Given two constants $c_1$ and $c_2$ satisfying
\[ 1 < c_1 \le c_2 < \frac{\varepsilon_{\max}}{\delta_0^2}, \tag{3.12} \]
the discrepancy principle can be stated as follows:
Discrepancy principle. For a given noise level $\delta$, choose $\alpha > 0$ that satisfies
\[ \delta^2 < c_1\delta^2 \le \varepsilon_\delta(\alpha) \le c_2\delta^2. \tag{3.13} \]
If, for a given α, the discrepancy function has several possible values, the above inequalities must be satisfied by each one of them.
The intuition behind this principle is as follows. We would like to find a solution $Q$ of the equation $C^Q = C_M$. Since the error level in the data is of order $\delta$, this is the best precision that we can ask for in this context, so it does not make sense to calibrate the noisy data $C_M^\delta$ with a precision higher than $\delta$. Therefore, we try to solve $\|C^{Q_\alpha^\delta} - C_M^\delta\|_w^2 \le \delta^2$. In order to gain stability we must sacrifice some precision compared to $\delta$; we therefore choose a constant $c$ slightly greater than 1 ($1 \lesssim c$), for example $c = 1.1$, and look for $Q_\alpha^\delta$ in the level set
\[ \|C^{Q_\alpha^\delta} - C_M^\delta\|_w^2 \le c\delta^2. \tag{3.14} \]
Since, on the other hand, by increasing precision we decrease the stability, the highest stability is obtained when the inequality in (3.14) is replaced by equality, and we obtain
\[ \varepsilon_\delta(\alpha) \equiv \|C^{Q_\alpha^\delta} - C_M^\delta\|_w^2 = c\delta^2. \]
To make the numerical solution of the equation easier, we do not impose a strict value of the discrepancy function but allow it to lie between two bounds, obtaining (3.13).
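To illustrate how a condition of the form (3.13) might be solved in practice, here is a minimal sketch of a bisection on a logarithmic scale in $\alpha$. It is not the calibration algorithm of this chapter: the discrepancy function is passed in as a hypothetical callable `eps`, assumed non-decreasing, and the toy `eps` used to exercise it comes from a diagonal linear Tikhonov problem with made-up singular values and data, not from problem (2.27). The function name, default constants and bracket are likewise assumptions.

```python
import math


def choose_alpha_discrepancy(eps, delta, c1=1.1, c2=1.3,
                             lo=1e-12, hi=1e12, max_iter=200):
    """Search for alpha with c1*delta^2 <= eps(alpha) <= c2*delta^2.

    Assumes eps is non-decreasing in alpha. Returns None when no such
    alpha is found, which can happen if eps jumps over the target band.
    """
    low_target, high_target = c1 * delta**2, c2 * delta**2
    if eps(lo) > high_target or eps(hi) < low_target:
        return None  # the bracket [lo, hi] cannot contain a solution
    for _ in range(max_iter):
        mid = math.sqrt(lo * hi)  # geometric midpoint: bisection in log(alpha)
        value = eps(mid)
        if value < low_target:
            lo = mid   # fit is too precise: increase regularization
        elif value > high_target:
            hi = mid   # fit is too loose: decrease regularization
        else:
            return mid
    return None


def tikhonov_discrepancy(singular_values, data):
    """Toy discrepancy of min_x sum_i (s_i*x_i - y_i)^2 + alpha*sum_i x_i^2.

    The minimizer is x_i = s_i*y_i/(s_i^2 + alpha), so the i-th residual
    equals alpha*y_i/(s_i^2 + alpha).
    """
    def eps(alpha):
        return sum((alpha * y / (s * s + alpha)) ** 2
                   for s, y in zip(singular_values, data))
    return eps


# Hypothetical toy inputs, chosen only for illustration.
delta = 0.05
eps = tikhonov_discrepancy([1.0, 0.5, 0.1, 0.01], [1.0, 0.8, 0.3, 0.05])
alpha = choose_alpha_discrepancy(eps, delta)
print(alpha, eps(alpha) / delta**2)
```

Allowing a band between $c_1\delta^2$ and $c_2\delta^2$ rather than a single target value is what lets the loop stop after finitely many steps when the discrepancy function is continuous; when it is not, the sketch simply reports failure by returning `None`.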
Supposing that an $\alpha$ solving (3.13) exists, it is easy to prove the convergence of the regularized solutions to the minimal entropy least squares solution, when the regularization parameter is chosen using the discrepancy principle.
Proposition 3.3. Suppose that the hypotheses 1–3 of page 103 are satisfied and let $c_1$ and $c_2$ be as in (3.12). Let $\{C_M^{\delta_k}\}_{k\ge1}$ be a sequence of data sets such that $\|C_M - C_M^{\delta_k}\|_w < \delta_k$ and $\delta_k \to 0$.
Then the sequence $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$, where $Q_{\alpha_k}^{\delta_k}$ is a solution of problem (2.27) with data $C_M^{\delta_k}$, prior $P$ and regularization parameter $\alpha_k$ chosen according to the discrepancy principle, has a weakly convergent subsequence. The limit of every such subsequence of $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$ is a MELSS with data $C_M$ and prior $P$.
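Proof. Using the optimality of $Q_{\alpha_k}^{\delta_k}$, we can write:
\[ \varepsilon_{\delta_k}(\alpha_k) + \alpha_k I(Q_{\alpha_k}^{\delta_k}|P) \le \|C^{Q^+} - C_M^{\delta_k}\|_w^2 + \alpha_k I(Q^+|P) \le \delta_k^2 + \alpha_k I(Q^+|P). \]
Since the discrepancy principle (3.13) gives $\varepsilon_{\delta_k}(\alpha_k) \ge c_1\delta_k^2 > \delta_k^2$, this implies that
\[ I(Q_{\alpha_k}^{\delta_k}|P) \le I(Q^+|P). \tag{3.15} \]
By Lemma 2.10, the sequence $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$ is tight and therefore, by Prohorov’s theorem and Lemma 2.4, relatively weakly compact in $\mathcal{M} \cap \mathcal{L}_B^+$.
Choose a subsequence of $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$, converging weakly to a limit $Q$ and denoted, to simplify notation, again by $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$. Inequality (3.13) and the triangle inequality yield:
\[ \|C^{Q_{\alpha_k}^{\delta_k}} - C_M\|_w \le \|C^{Q_{\alpha_k}^{\delta_k}} - C_M^{\delta_k}\|_w + \delta_k \le \delta_k(1 + \sqrt{c_2}) \xrightarrow[k\to\infty]{} 0. \]
By Lemma 2.2 this means that $\|C^Q - C_M\|_w = 0$ and therefore $Q$ is a solution. By weak lower semicontinuity of $I$ (cf. Corollary 2.1) and using (3.15),
\[ I(Q|P) \le \liminf_k I(Q_{\alpha_k}^{\delta_k}|P) \le \limsup_k I(Q_{\alpha_k}^{\delta_k}|P) \le I(Q^+|P), \]
which means that $Q$ is a MELSS.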
The discrepancy principle performs well for regularizing linear operators but may fail in nonlinear problems like (2.27), because Equation (3.13) may have no solution due to discontinuity of the discrepancy function $\varepsilon_\delta(\alpha)$. Examples of nonlinear ill-posed problems to which the discrepancy principle cannot be applied are given in [95] and [50]. However, in our numerical tests (see Section 3.6) we have always been able to find a solution to (3.13).
We will now give a simple sufficient condition, adapted from [94], under which (3.13) admits a solution.
Proposition 3.4. Suppose that the hypotheses 1–3 of page 103 are satisfied and let $c_1$ and $c_2$ satisfy (3.12). If $\varepsilon_\delta(\alpha)$ is a single-valued function then there exists an $\alpha$ satisfying (3.13).
This proposition is a direct consequence of the following lemma.
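Lemma 3.5. The function $\varepsilon_\delta(\alpha)$ is non-decreasing and satisfies the following limit relations:
\[ \lim_{\alpha\downarrow 0} \varepsilon_\delta(\alpha) \le \delta^2, \qquad \lim_{\alpha\to\infty} \varepsilon_\delta(\alpha) = \|C^{Q^*} - C_M^\delta\|_w^2. \]
If, at some point $\alpha > 0$, $\varepsilon_\delta(\alpha)$ is single-valued, then it is continuous at this point.
The function
\[ J_\delta(\alpha) := \|C^{Q_\alpha^\delta} - C_M^\delta\|_w^2 + \alpha I(Q_\alpha^\delta|P) \]
is non-decreasing, continuous, and satisfies the following limit relations:
\[ \lim_{\alpha\downarrow 0} J_\delta(\alpha) \le \delta^2, \qquad \lim_{\alpha\to\infty} J_\delta(\alpha) \ge \|C^{Q^*} - C_M^\delta\|_w^2. \]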
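Proof. Let $\gamma_\delta(\alpha) := I(Q_\alpha^\delta|P)$ and let $0 < \alpha_1 < \alpha_2$. By the optimality of $Q_{\alpha_1}^\delta$ and $Q_{\alpha_2}^\delta$ we have:
\[ \varepsilon_\delta(\alpha_1) + \alpha_1\gamma_\delta(\alpha_1) \le \varepsilon_\delta(\alpha_2) + \alpha_1\gamma_\delta(\alpha_2), \]
\[ \varepsilon_\delta(\alpha_2) + \alpha_2\gamma_\delta(\alpha_2) \le \varepsilon_\delta(\alpha_1) + \alpha_2\gamma_\delta(\alpha_1), \]
and therefore
\[ \varepsilon_\delta(\alpha_2) - \varepsilon_\delta(\alpha_1) \ge \alpha_1(\gamma_\delta(\alpha_1) - \gamma_\delta(\alpha_2)), \]
\[ \varepsilon_\delta(\alpha_2) - \varepsilon_\delta(\alpha_1) \le \alpha_2(\gamma_\delta(\alpha_1) - \gamma_\delta(\alpha_2)), \]
which implies that $\varepsilon_\delta(\alpha_2) \ge \varepsilon_\delta(\alpha_1)$ and $\gamma_\delta(\alpha_1) \ge \gamma_\delta(\alpha_2)$. To prove the first limit relation for $\varepsilon_\delta(\alpha)$, observe that for all $\alpha > 0$,
\[ \varepsilon_\delta(\alpha) \le \|C^{Q^+} - C_M^\delta\|_w^2 + \alpha I(Q^+|P) \le \delta^2 + \alpha I(Q^+|P) \xrightarrow[\alpha\to 0]{} \delta^2. \]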
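To prove the second limit relation for $\varepsilon_\delta(\alpha)$, one can write, using the optimality of $Q_\alpha^\delta$:
\[ \varepsilon_\delta(\alpha) + \alpha\gamma_\delta(\alpha) \le \|C^{Q^*} - C_M^\delta\|_w^2 + \alpha I(Q^*|P). \]
From [30, Theorem 2.2],
\[ \gamma_\delta(\alpha) \ge I(Q_\alpha^\delta|Q^*) + I(Q^*|P). \tag{3.16} \]
Therefore
\[ I(Q_\alpha^\delta|Q^*) \le \frac{\|C^{Q^*} - C_M^\delta\|_w^2}{\alpha} \xrightarrow[\alpha\to\infty]{} 0. \]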
Using the inequality
\[ |P - Q| \le \sqrt{2I(P|Q)}, \tag{3.17} \]
where $|P - Q|$ denotes the total variation distance (see [30, Equation (2.3)]), this implies that $Q_\alpha^\delta$ converges to $Q^*$ in total variation distance (and therefore also weakly) as $\alpha$ goes to infinity. The limit relation now follows from Lemma 2.2.
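As a quick numerical sanity check of an inequality of the form (3.17), the following sketch evaluates both sides for two discrete distributions. It assumes that the total variation distance is normalized as the $L^1$ distance $\sum_i |p_i - q_i|$ and that the relative entropy uses natural logarithms; the two probability vectors are made up for illustration.

```python
import math


def l1_distance(p, q):
    """Total variation distance in the L1 normalization: sum_i |p_i - q_i|."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def relative_entropy(p, q):
    """I(P|Q) = sum_i p_i*log(p_i/q_i), with the convention 0*log(0) = 0.

    Returns infinity when P puts mass where Q has none.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


# Hypothetical discrete distributions on three points.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
lhs = l1_distance(p, q)
rhs = math.sqrt(2.0 * relative_entropy(p, q))
print(lhs, rhs)
```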
To prove the continuity of $\varepsilon_\delta(\alpha)$, let $\{\alpha_n\}$ be a sequence of positive numbers converging to $\alpha > 0$. By the optimality of $Q_{\alpha_n}^\delta$, $I(Q_{\alpha_n}^\delta|P)$ is bounded and one can choose a subsequence of $\{Q_{\alpha_n}^\delta\}$, converging weakly toward some measure $Q_0$, and denoted, to simplify notation, again by $\{Q_{\alpha_n}^\delta\}_{n\ge1}$. We now need to prove that $Q_0$ is a solution of the calibration problem with regularization parameter $\alpha$. By weak continuity of the pricing error (Lemma 2.2) and weak lower semicontinuity of the relative entropy (Lemma 2.11), we have for any other measure $Q$:
\[ \|C^{Q_0} - C_M^\delta\|_w^2 + \alpha I(Q_0|P) \le \liminf_n \{\|C^{Q_{\alpha_n}^\delta} - C_M^\delta\|_w^2 + \alpha I(Q_{\alpha_n}^\delta|P)\} \]
\[ = \liminf_n \{\|C^{Q_{\alpha_n}^\delta} - C_M^\delta\|_w^2 + \alpha_n I(Q_{\alpha_n}^\delta|P) + (\alpha - \alpha_n) I(Q_{\alpha_n}^\delta|P)\} \]
\[ \le \liminf_n \{\|C^{Q} - C_M^\delta\|_w^2 + \alpha_n I(Q|P) + (\alpha - \alpha_n) I(Q_{\alpha_n}^\delta|P)\} \]
\[ = \|C^{Q} - C_M^\delta\|_w^2 + \alpha I(Q|P). \]
Therefore, the sequence $\{\varepsilon_\delta(\alpha_n)\}$ converges to one of the possible values of $\varepsilon_\delta(\alpha)$. If this function is single-valued at $\alpha$, this means that every subsequence of the original sequence $\{\varepsilon_\delta(\alpha_n)\}$ has a further subsequence that converges toward $\varepsilon_\delta(\alpha)$, and therefore the original sequence also converges toward $\varepsilon_\delta(\alpha)$.
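The fact that $J_\delta(\alpha)$ is nondecreasing as a function of $\alpha$ is trivial. To show the continuity, observe that for $0 < \alpha_1 < \alpha_2$,
\[ J_\delta(\alpha_2) - J_\delta(\alpha_1) \le \varepsilon_\delta(\alpha_1) + \alpha_2\gamma_\delta(\alpha_1) - \varepsilon_\delta(\alpha_1) - \alpha_1\gamma_\delta(\alpha_1) = (\alpha_2 - \alpha_1)\gamma_\delta(\alpha_1), \]
and the right-hand side tends to zero as $\alpha_2 - \alpha_1 \to 0$, because $\gamma_\delta$ is non-increasing and hence bounded on every interval $[a, \infty)$ with $a > 0$.
The limit relations for $J_\delta(\alpha)$ follow from the relations for $\varepsilon_\delta(\alpha)$.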
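The monotonicity statements of Lemma 3.5 can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: a diagonal linear least-squares problem with a quadratic penalty stands in for the calibration problem (2.27), so the penalty $\sum_i x_i^2$ plays the role of the relative entropy, and the singular values, data and grid of $\alpha$ values are made up.

```python
def make_toy(singular_values, data):
    """Closed-form eps, gamma and J for min_x sum_i (s_i*x_i - y_i)^2 + alpha*sum_i x_i^2.

    The minimizer is x_i = s_i*y_i/(s_i^2 + alpha).
    """
    pairs = list(zip(singular_values, data))

    def eps(alpha):    # squared residual of the regularized solution
        return sum((alpha * y / (s * s + alpha)) ** 2 for s, y in pairs)

    def gamma(alpha):  # penalty term, the analogue of I(Q_alpha | P)
        return sum((s * y / (s * s + alpha)) ** 2 for s, y in pairs)

    def J(alpha):      # optimal value of the regularized functional
        return eps(alpha) + alpha * gamma(alpha)

    return eps, gamma, J


def is_non_decreasing(values, tol=1e-12):
    """True if each value is at least the previous one, up to rounding."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


# Hypothetical toy inputs.
eps, gamma, J = make_toy([1.0, 0.5, 0.1, 0.01], [1.0, 0.8, 0.3, 0.05])
alphas = [10.0 ** k for k in range(-6, 7)]
eps_values = [eps(a) for a in alphas]
gamma_values = [gamma(a) for a in alphas]
J_values = [J(a) for a in alphas]

print(is_non_decreasing(eps_values))          # discrepancy grows with alpha
print(is_non_decreasing(gamma_values[::-1]))  # penalty shrinks with alpha
print(is_non_decreasing(J_values))            # optimal value grows with alpha
```

The same grid can be used to check the increment bound from the continuity argument, $J_\delta(\alpha_2) - J_\delta(\alpha_1) \le (\alpha_2 - \alpha_1)\gamma_\delta(\alpha_1)$, for consecutive grid points.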
We will now present an alternative a posteriori parameter choice rule, which reduces to the discrepancy principle when inequality (3.13) has a solution but also works when this is not the case. However, if the parameter is chosen according to the alternative rule, the sequence of regularized solutions does not necessarily converge to a minimum entropy solution as in Proposition 3.3 but to a solution with bounded entropy (see Proposition 3.7). Our treatment partly follows [50] where this parameter choice rule is applied to Tikhonov regularization.
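Alternative principle. For a given noise level $\delta$, if there exists $\alpha > 0$ that satisfies
\[ c_1\delta^2 \le \varepsilon_\delta(\alpha) \le c_2\delta^2, \tag{3.18} \]
choose one such $\alpha$; otherwise, choose an $\alpha > 0$ that satisfies
\[ \varepsilon_\delta(\alpha) \le c_1\delta^2, \qquad J_\delta(\alpha) \ge c_2\delta^2. \tag{3.19} \]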
Proposition 3.6. Suppose that the hypotheses 1–3 of page 103 are satisfied and let $c_1$ and $c_2$ be as in (3.12). Then there exists $\alpha > 0$ satisfying either (3.18) or (3.19).
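Proof. Suppose that (3.18) does not admit a solution. We need to prove that there exists $\alpha > 0$ satisfying (3.19). Let
\[ B := \{\alpha > 0 : \varepsilon_\delta(\alpha) \le c_1\delta^2\} \quad\text{and}\quad U := \{\alpha > 0 : \varepsilon_\delta(\alpha) > c_2\delta^2\}. \]
The limit relations of Lemma 3.5 imply that both sets are nonempty. Moreover, since we have assumed that (3.18) does not admit a solution, necessarily $\sup B = \inf U$. Let $\alpha^* := \sup B \equiv \inf U$. Now we need to show that
\[ J_\delta(\alpha^*) > c_2\delta^2. \tag{3.20} \]
By continuity of $J_\delta(\alpha)$,
\[ J_\delta(\alpha^*) \ge c_2\delta^2 + \alpha^* \lim_{\alpha\downarrow\alpha^*} \gamma_\delta(\alpha). \]
If $\lim_{\alpha\downarrow\alpha^*} \gamma_\delta(\alpha) > 0$ then (3.20) holds. Otherwise, from (3.16), $P$ is the minimal entropy martingale measure and (3.17) implies that $Q_\alpha^\delta \Rightarrow P$ as $\alpha \downarrow \alpha^*$. Therefore, by Lemma 2.2, (3.11) and (3.12),
\[ J_\delta(\alpha^*) = \lim_{\alpha\downarrow\alpha^*} \varepsilon_\delta(\alpha) = \|C^{Q^*} - C_M^\delta\|_w^2 > c_2\delta^2, \]
so (3.20) holds in this case as well.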
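By continuity of $J_\delta(\alpha)$, there exists $\Delta > 0$ such that $J_\delta(\alpha^* - \Delta) > c_2\delta^2$. However, since $\alpha^* = \sup B$ and $\varepsilon_\delta(\alpha)$ is nondecreasing, necessarily $\varepsilon_\delta(\alpha^* - \Delta) \le c_1\delta^2$. Therefore, $\alpha^* - \Delta$ is a solution of (3.19).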
Remark 3.2. If $c_1 < c_2$, one can show, along the lines of the above proof, that there exists not just a single $\alpha$ that satisfies either (3.18) or (3.19) but an interval of nonzero length $(\alpha_1, \alpha_2)$, such that each point inside this interval satisfies one of the two conditions. From the computational viewpoint this means that a feasible $\alpha$ can be found by bisection with a finite number of iterations.
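The bisection mentioned in Remark 3.2 could look roughly as follows. This is a sketch under stated assumptions rather than the procedure used in the numerical tests: `eps` and `J` are hypothetical callables for $\varepsilon_\delta$ and $J_\delta$, the starting bracket is assumed to satisfy $\varepsilon_\delta(\texttt{lo}) \le c_1\delta^2$ and $\varepsilon_\delta(\texttt{hi}) > c_2\delta^2$, and the toy problem minimizes over just two made-up candidates, each described by a (misfit, entropy) pair, so that the discrepancy function has a jump.

```python
import math


def make_finite_problem(candidates):
    """Toy regularized problem over finitely many (misfit, entropy) pairs."""
    def solve(alpha):
        # Candidate minimizing misfit + alpha * entropy (first one on ties).
        return min(candidates, key=lambda c: c[0] + alpha * c[1])

    def eps(alpha):
        return solve(alpha)[0]

    def J(alpha):
        misfit, entropy = solve(alpha)
        return misfit + alpha * entropy

    return eps, J


def choose_alpha_alternative(eps, J, delta, c1, c2, lo, hi, max_iter=200):
    """Return ("3.18", alpha) or ("3.19", alpha), or None if the search fails.

    Assumes eps is non-decreasing, eps(lo) <= c1*delta^2 and eps(hi) > c2*delta^2.
    """
    low_target, high_target = c1 * delta**2, c2 * delta**2
    for _ in range(max_iter):
        mid = math.sqrt(lo * hi)  # bisection in log(alpha)
        value = eps(mid)
        if low_target <= value <= high_target:
            return "3.18", mid
        if value < low_target:
            if J(mid) >= high_target:
                return "3.19", mid
            lo = mid  # still too far below the jump: move right
        else:
            hi = mid  # discrepancy too large: move left
    return None


# Hypothetical candidates: a good fit with high entropy, a poor fit with zero entropy.
eps, J = make_finite_problem([(0.5, 1.5), (2.0, 0.0)])
delta, c1, c2 = 1.0, 1.1, 1.3
result = choose_alpha_alternative(eps, J, delta, c1, c2, lo=1e-4, hi=1e2)
print(result)
```

In this toy problem the discrepancy takes only the values 0.5 and 2.0, so (3.18) has no solution and the search ends with an $\alpha$ satisfying (3.19).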
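Proposition 3.7. Suppose that the hypotheses 1–3 of page 103 are satisfied and that $c_1$ and $c_2$ are chosen according to (3.12). Let $\{C_M^{\delta_k}\}_{k\ge1}$ be a sequence of data sets such that $\|C_M - C_M^{\delta_k}\|_w < \delta_k$ and $\delta_k \to 0$.
Then the sequence $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$, where $Q_{\alpha_k}^{\delta_k}$ is a solution of problem (2.27) with data $C_M^{\delta_k}$, prior $P$ and regularization parameter $\alpha_k$ chosen according to the alternative principle, has a weakly convergent subsequence. The limit $Q_0$ of every such subsequence of $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$ satisfies
\[ \|C^{Q_0} - C_M\|_w = 0, \qquad I(Q_0|P) \le \frac{c_2}{c_2 - 1} I(Q^+|P). \]
Proof. Using the optimality of $Q_{\alpha_k}^{\delta_k}$, we can write:
\[ \varepsilon_{\delta_k}(\alpha_k) + \alpha_k I(Q_{\alpha_k}^{\delta_k}|P) \le \|C^{Q^+} - C_M^{\delta_k}\|_w^2 + \alpha_k I(Q^+|P) \le \delta_k^2 + \alpha_k I(Q^+|P). \tag{3.21} \]
If $\alpha_k$ satisfies (3.18), the above implies that
\[ I(Q_{\alpha_k}^{\delta_k}|P) \le I(Q^+|P); \]
otherwise (3.19) entails that
\[ \varepsilon_{\delta_k}(\alpha_k) + \alpha_k I(Q_{\alpha_k}^{\delta_k}|P) \ge c_2\delta_k^2, \]
and together with (3.21) this gives
\[ I(Q_{\alpha_k}^{\delta_k}|P) \le \frac{c_2}{c_2 - 1} I(Q^+|P). \]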
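The step that combines (3.19) with (3.21) can be spelled out as follows; this is a reconstruction of the omitted algebra, using only the two inequalities just stated, $c_2 > 1$ and $\alpha_k > 0$.

```latex
% Chain the lower bound from (3.19) with the upper bound (3.21):
\[
c_2\delta_k^2 \le \varepsilon_{\delta_k}(\alpha_k) + \alpha_k I(Q_{\alpha_k}^{\delta_k}|P)
\le \delta_k^2 + \alpha_k I(Q^+|P)
\quad\Longrightarrow\quad
\delta_k^2 \le \frac{\alpha_k}{c_2 - 1}\, I(Q^+|P).
\]
% Drop the nonnegative discrepancy term in (3.21) and insert the bound on \delta_k^2:
\[
\alpha_k I(Q_{\alpha_k}^{\delta_k}|P)
\le \delta_k^2 + \alpha_k I(Q^+|P)
\le \Bigl(\frac{1}{c_2 - 1} + 1\Bigr)\alpha_k I(Q^+|P)
= \frac{c_2}{c_2 - 1}\,\alpha_k I(Q^+|P).
\]
% Dividing by \alpha_k > 0 gives the stated entropy bound.
```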
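In both cases $I(Q_{\alpha_k}^{\delta_k}|P)$ is uniformly bounded, which means, by Lemma 2.10, that the sequence $\{Q_{\alpha_k}^{\delta_k}\}_{k\ge1}$ is tight and therefore, by Prohorov’s theorem, relatively weakly compact. The rest of the proof follows the lines of the proof of Proposition 3.3.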
3.3.2 A posteriori parameter choice rule for non-attainable calibration prob-