Anglers’ Fishing Problem

(1)

Munich Personal RePEc Archive

Anglers’ Fishing Problem

Karpowicz, Anna and Szajowski, Krzysztof

Bank Zachodni WBK, Rynek 9/11, 50-950, Wrocław, Poland,

Institute of Mathematics and Computer Science, Wybrzeże

Wyspiańskiego 27, 50-370, Wrocław, Poland

2 December 2010

Online at

https://mpra.ub.uni-muenchen.de/41800/

(2)

Anglers’ fishing problem

Anna Karpowicz and Krzysztof Szajowski

AbstractThe considered model will be formulated as related to ”the fishing prob-lem” even if the other applications of it are much more obvious. The angler goes fishing. He uses various techniques and he has at most two fishing rods. He buys a fishing ticket for a fixed time. The fishes are caught with the use of different methods according to the renewal processes. The fishes’ value and the inter arrival times are given by the sequences of independent, identically distributed (i.i.d.) random vari-ables with the known distribution functions. It forms the marked renewal–reward process. The angler’s measure of satisfaction is given by the difference between the utility function, depending on the value of the fishes caught, and the cost function connected with the time of fishing. In this way, the angler’s relative opinion about the methods of fishing is modelled. The angler’s aim is to have as much satisfaction as possible and additionally he has to leave the lake before a fixed moment. Therefore his goal is to find two optimal stopping times in order to maximize his satisfaction. At the first moment, he changes the technique of fishinge.g.by excluding one rod and intensifying on the rest. Next, he decides when he should stop the expedition. These stopping times have to be shorter than the fixed time of fishing. The dynamic programming methods have been used to find these two optimal stopping times and to specify the expected satisfaction of the angler at these times.

Key words: stopping time, optimal stopping, dynamic programming, semi-Markov process, marked renewal process, renewal–reward process, infinitesimal generator, fishing problem, bilateral approach, stopping game

AMS 2010 Subject Classifications:60G40, 60K99, 90A46

Anna Karpowicz

Bank Zachodni WBK, Rynek 9/11, 50-950 Wrocław, Poland e-mail:[email protected]

Correspondence to: Krzysztof Szajowski

Institute of Mathematics and Computer Sci., Wybrze˙ze Wyspia´nskiego 27, 50-370 Wrocław, Poland

e-mail:[email protected]

(3)

1 Introduction

Before we start the analysis of the double optimal stopping problem (cf. idea of multiple stopping for stochastic sequences in Haggstrom [8], Nikolaev [16]) for the marked renewal process related to the angler behavior, let us present the so called ”fishing problem”. One of the first authors who considered the basic version of this problem was Starr [19] and further generalizations were done by Starr and Woodroofe [21], Starr, Wardrop and Woodroofe [20], Kramer, Starr et al. [14]. The detailed review of the papers related to the ”fishing problem” was presented by Fer-guson [7]. The simple formulation of the fishing problem, where the angler changes the fishing place or technique before leaving the fishing place, has been done by Karpowicz [12].We extend the problem to a more advanced model by taking into account the various techniques of fishing used the same time (the parallel renewal– reward processes or the multivariate renewal–reward process). It is motivated by the natural, more precise models of the known, real applications of the fishing problem. The typical process of software testing consists of checking subroutines. At the be-ginning many kinds of bugs are being searched. The consecutive stopping times are moments when the expert stops general testing of modules and starts checking the most important, dangerous type of errors. Similarly, in proof reading, it is natural to look for typographic and grammar errors at the same time. Next, we are looking for language mistakes.

As various works are done by different groups of experts, it is natural that we would compete with each other. If in the first period work is meant for one group and the second period needs other experts, then they can be players of a game be-tween them. In this case the proposed solution is to find the Nash equilibrium where strategies of players are the stopping times.

The applied techniques of modelling and finding the optimal solution are similar to those used in the formulation and solution of the optimal stopping problem for the risk process. Both models are based on the methodology explicated by Boshuizen and Gouweleeuw [1]. The background mathematics for further reading are mono-graphs by Br´emaud [3], Davis [4] and Shiryaev [18]. The optimal stopping problems for the risk process are subject of consideration in papers by Jensen [10], Ferenstein and Sieroci´nski [6], Muciek [15]. A similar problem for the risk process having dis-ruption (i.e.when the probability structure of the considered process is changed at one momentθ) has been analyzed by Ferenstein and Pasternak–Winiarski [5]. The model of the last paper brings to mind the change of fishing methods considered here, however it should be made by a decision maker, not the type of the environ-ment.

(4)

formulation a version of the problem for a detailed solution will be chosen. However, the solution is presented as the scalable procedure dependent on parameters which depends on various circumstances. It is not difficult to adopt a solution to wide range of natural cases.

1.1 Single Angler’s expedition

The angler goes fishing. He buys a fishing ticket for a fixed timet0which gives him

the right to use at most two rods. The total cost of fishing depends on real time of each equipment usage and the number of rods used simultaneously. He starts fishing with two rods up to the moments. The effect on each rod can be modelled by the renewal processes{Ni(t),t≥0}, whereNi(t)is the number of fishes caught on the

rodi,i∈A:={1,2}during the timet. Let us combine them together to the marked renewal process. The usage of thei-th rod by the timetgenerates costci:[0,t0]→

ℜ(when the rod is used simultaneously with other rods it will be denoted by the index dependent on the set of rods,e.g.a,ca

i) and the reward represented byi.i.d.

random variablesX₁{i},X₂{i}, . . .(the value of the fishes caught on the i-th rod) with cumulative distribution functionHi1. The streams of two kinds of fishes are mutually

independent and they are independent of the sequence of random moments when the fishes have been caught. The 2-vector process−→N(t) = (N1(t),N2(t)),t≥0, can be

represented also by a sequence of random variablesTntaking values in[0,∞]such

that

T0=0,

Tn<∞ ⇒ Tn<Tn+1, (1)

forn∈N_{, and a sequence of}_{A-valued random variables}_z_n_forn∈N_{∪ {}₀_} _(see Ch. II Br´emaud [3], Jacobsen [9]). The random variableTndenotes the moment of

catching then-th fish (T0=0) of any kind and the random variableznindicates to

which kind then-th fish belongs. The processesNi(t)can be defined by the sequence

{(Tn,zn)}∞n=0as:

Ni(t) =

∞

∑

n=1

I_{_T

n≤t}I{zn=i}. (2)

Both the 2-variate process−→N(t)and the double sequence{(Tn,zn)}∞n=0are called

2-variate renewal process. The optimal stopping problems for the compound risk process based on 2-variate renewal processwas considered by Szajowski [22].

Let us define, fori∈Aandk∈N_{, the sequence}

n{₀i} =0,

n{_ki₊}₁=inf{n>n{_ki}:zn=i}

(3)

1_{The following convention is used in all the paper:}−→_x _{= (}_x

(5)

and putT_k{i}=T

n{_ki}. Let us define random variablesS

{i}

n =Tn{i}−T_n{₋i}₁assuming that

they arei.i.d.with continuous, cumulative distribution functionFi(t) =P(S{ni}≤t)

and the conditional distribution function F_is(t) =P(Sn{i}≤t|S{ni}≥s). In the

sec-tion2.1the alternative representation of the 2-variate renewal process will be pro-posed. There is also mild extension of the model in which the stream of events after some moment changes to another stream of events.

Remark 1.In various procedures it is needed to localize the events in a group of the renewal processes. LetCbe the set of indices related to such a group. The sequence

{nC_k}∞_k₌₀such thatnC₀ =0,nC_k₊₁:=inf{n>nC_k :zn∈C}has an obvious meaning.

Analogously,nC(t):=inf{n:Tn>t,zn∈C}.

Let i,j∈A. The angler’s satisfaction measure (the net reward) at the period a from the rod i is the difference between the utility function ga_i :[0,∞)2_×_A_×

ℜ+→[0,Ga_i]which can be interpreted as the reward from the i-th rod when the last success was on rod jand, additionally, it is dependent on the value of the fishes caught, the moment of results’ evaluation, and the cost functionca_i :[0,t0]→[0,Cia]

reflecting the cost of duration of the angler’s expedition. We assume thatga_i andca_i are continuous and bounded, additionallyca_i are differentiable. Each fishing method evaluation is based on different utility functions and cost functions. In this way, the angler’s relative opinion about them is modelled.

The angler can change his method of fishing at the momentsand decide to use only one rod. It could be one of the rods used up to the momentsor the other one. Even though the rod used aftersis the one chosen from the ones used beforesits effectiveness could be different before and afters. After momentsthe modelling process is the renewal–reward one with the stream ofi.i.d.random variablesXn{3}at

the momentsTn{3}(i.e. appearing according to the renewal processN3(t)). Following

these arguments, the mathematical model of catching fishes, and their value afters, could (and in practice should) be different from those for the rods used befores. The reason for reduction of the number of rods could be their better effectiveness. The value of the fishes which have been caught up to timet, if the change of the fishing technology took place at the times, is given by

M_ts=

_∑

i∈A Ni(s∧t)

∑

n=1

Xn{i}+

N3((t−s)+)

∑

n=1

Xn{3}=Ms∧t+

N3((t−s)+)

∑

n=1

Xn{3},

whereMt{i}=∑ Ni(t)

n=1X

{i}

n , andMt =∑2i=1M

{i}

t We denote

− →

Mt = (Mt{1},M{

2}

t ). Let

Z(s,t)denote the angler’s pay-off for stopping at timet(the end of the expedition) if the change of the fishing method took place at times. The natural filtration related to the double indexed processZ(s,t)isFs

t =σ{0≤u≤s≤v≤t:Z(u,v)}.If the

effect of extending the expedition aftersis described bygb_j:ℜ+2×A×[0,t0]×ℜ×

[0,t0]→[0,Gbj], j∈B, minus the additional cost of timecbj(·), wherecbj :[0,t0]→

[0,Cb

j](when card(B) =1 then index jwill be abandoned, alsocb=∑j∈Bcbj will

(6)

Z(s,t) =         

ga(−→Mt,zN(t),t)−ca(t) ift<s≤t0,

ga(−→Ms,zN(s),s)−ca(s)

+gb(−→Ms,zN(s),s,Mts,t)−cb(t−s) ifs≤t≤t0,

−C ift0<t.

(4)

where the function ca(t),ga(−→m,i,t) and the constantC can be taken as follows: ca(t) =∑2_i₌₁ca_i(t),ga(−→Ms,j,t) =∑2i=1gai(

− →

Mt,j,t),C=Ca1+C2a+Cb. With the

no-tationwb(−→m,i,s,m,e t) =wa(−→m,i,s) +gb(→−m,i,s,m,e t)−cb(t−s)andwa(−→m,i,t) =

ga(−→m,i,t)−ca(t), formula (4) is reduced to:

Z(s,t) =Z{zN(t)}₍_s,t₎_I

{t<s≤t0}+Z

{z_N₍_s₎}₍_s,t₎

I{s≤t},

where

Z{i}(s,t) =I{t<s≤t0}w a₍−→_M

t,i,t) +I{s≤t≤t0}w b₍−→_M

s,i,s,Mts,t)−I{t0<t}C.

1.2 The competitive fishing

Whenthe methods of fishing are operating by separated anglersthen the stopping random field can be built based on the structure of the marked renewal–reward pro-cess as a model of the competitive expedition results. One possible definition of pay-off is based on the assumption that each player has his own account related to the exploration of the fishery. The states of the accounts depend on who forces the first stop for changing technique, under which circumstances and what techniques they choose. The first stopping moment, the minimum of stopping moments chosen by the players, is after the moment of the event (catching fish)Tnby the rodznand

the reward functions depend on the type of fishing which gives recent fish (i.e. j, where j=zn). The player’s pay-offwia(−→m,j,t) =gai(−→m,j,t)−cai(t). The part of

the pay-off which depends on the second chosen moment, which stops the expedi-tion, is different for the player who forces the change of fishing methods (the leader) by himself and the other for the opponent. The leader is the responsible angler for determining the expedition deadline.

Lets assume for a while that thei-th player, i=1,2, will take the rod of the opponent and gives his rod to him.It is not a crucial assumption anyway and the method of fishing after the change can be different from both available before the considered moment.The method of treatment of the case without this assumption will be explained later (see page9), when the behavior of the player in the second part of the expedition will be formulated. Define the function

˜

wb_i(−→m,j,s,k,m,te ) =w˜a_i(−→m,j,s) +g˜b_i(−→m,j,s,k,m,e t)−cb(t−s)

(7)

(the denotation −kis used for a complimentary rod or player who has decided, which is appropriate). It describes the case when the player deciding to change the method chooses the perspective technique of fishing as the first one. Presumably he will explore the best methods with improvements and the second angler will use the rod which is not used by the leader. The pay-off of the players, wheni-th is the one who forces the first stop, has the following form:

Zi(j,s,t) =I{t≤s≤t0}g˜ a i(

− →

Mt,j,t) +I{s<t≤t0}w˜ b i(

− →

Ms,i,s,−i,Mts,t)−I{t0<t}C(5) Z−i(j,s,t) =I{t≤s≤t0}g˜

a

−i(

− →

Mt,j,t) +I{s<t≤t0}w˜ b

−i(

− →

Ms,i,s,i,Mts,t)−I{t0<t}C.(6)

In the above pay–offs it is assumed that the final stop can be declared at any moment. The change of techniques declaration each player makes just after an event at his rod (the catching fish at his rod) as long as on the opponent’s rod there is no event. The details of the strategy sets and the solution concept are formulated in the further parts of the paper.

The extension considered here is motivated by the natural, more precise mod-els of the known real applications of the fishing problem. The typical process of software testing consists of checking subroutines. Various types of bugs can be dis-covered. Each problem with subroutines generates the cost of a bug removal and increases the value of the software. It depends on the types of the bug found. The preliminary testing requires various types of experts. The stable version of subrou-tines can be kept by less educated computer scientists. The consecutive stopping times are moments when the expert of the defined class stops testing one module and the another tester starts checking. Similarly as in the proof reading.

2 The optimization problem and a two person game

2.1 Filtrations and Markov moments

Let the sequences of pairs {(Tn,zn)}∞_n₌0 be 2-variate renewal process (A-marked

renewal process) defined on(Ω,F_,P). According to the denotation of the previ-ous section there are three renewal processes{Tn{i}}∞_n₌₀,i=1,2,3, and denoted by

Tn=T_N{_zzn}

n(Tn). There are also three renewal–rewarded processes{(T

{i}

n ,Xn{i})}∞n=0,

i=1,2,3 . By convention let us denoteXn=X_N{zn}

zn(Tn). The followingσ-field

gener-ated by the history of theA-marked renewal processes are defined

F_t₌FA

t =σ(X0,T0,z0. . . ,XN(t),TN(t),zN(t)), (7)

fort≥0. Thisσ-field can be defined as

FA

t =σ{(

− →

(8)

Definition 1.LetT _{be a set of stopping times with respect to}_σ_-fields_{F_t_}_,_t_≥_0,

defined by (7). The restricted sets of stopping times are

T_n_,_K₌_{_τ_∈T _: _τ_≥_0,_T_n_≤_τ_≤_T_K_} ₍₈₎

forn∈N_,n<Kare subsets ofT_{. The elements of}T_n_,_K _{are denoted}_τ_n_,_K _.

The stopping timesτ∈T _{have nice representation which will be helpful in the}

solu-tion of the optimal stopping problems for the renewal processes (see Br´emaud [3]). The crucial role in our subsequent considerations plays such a representation. The following lemma is for the unrestricted stopping times.

Lemma 1.If τ∈T then there exist Rn∈Mes(Fn) such that the condition τ∧

Tn+1= (Tn+Rn)∧Tn+1on{τ≥Tn}a.s. is fulfilled.

Various restrictions in the class of admissible stopping times will change this repre-sentation. Some examples of subclasses ofT _{are formulated here (see Lemma}_1).

Only a few of them are used in the optimization problems investigated in the paper (see page9, Corollary1).

Let F_s_,_t ₌_σ₍FA

s ,X

{3}

0 ,T

{3}

0 , . . . ,X

{3}

N3((t−s))+,T

{3}

N3((t−s)+)) be theσ-field gener-ated by all events up to time t if the switch at time s from 2-variate renewal process to another renewal process took place. For simplicity of notation we set2

F_n{i}_:₌F

Tn{i} ,

F_n_:₌F_T n,F

s

n:=F_s_,_T{3}

n . Let

Mes(F_n₎_(Mes₍F_n{i}₎_{) denote the}

set of non-negative and F_n ₍F_n{i}_{)-measurable random variables. From now on,} T _andTs _{stands for the sets of stopping times with respect to}_σ_-fields_F

s and

{F_s_,_t_,0_≤s≤t}, respectively. Furthermore, we can define forn∈N_andn≤Kthe sets

1. T{i}

n,K ={τ∈T :τ≥0, T

{i}

n ≤τ≤TK};

2. T_n{i}₌_{_τ_∈T _:_τ_≥Tn{i}};

3. T¯{i,A{−i}}

n,K ={τ∈T :τ≥0, T

{i}

n ≤τ≤TK,∀kτ∈/[TA

−i

k ,TA

−i

k+1 ∨T

{i}

n{i}₍_TA−i

k )

]}

whereA{−i}_:₌_A_{\ {}_i_}_,_TA−i

k :=min{j∈A{−i}_}{T{j}

n{j}₍_T{i}

k )

};

4. T¯_n{i}₌_{_τ_∈T _:_τ_≥Tn{i},∀kτ∈/[TA

−i

k ,TA

−i

k+1 ∨T

{i}

n{i}₍_TA−i

k )

]};

5. Ts

n,K={τ∈Ts: 0≤s≤τ,T

{3}

n ≤τ≤TK}.

The stopping timesτ∈T{i}_and_τ_∈_T_¯{i}_{can also be represented in the way shown}

in Lemma1.

Lemma 2.Let the index i∈Abe chosen and fixed.

2_{For the optimization problem there are two epochs: before the first stop, where there are some}

(9)

1. For everyτ∈T{i}_{and n}_∈_N_{there exist R}{i}

n ∈Mes(Fn{i})such thatτ∧T_n{₊i}₁=

(Tn{i}+R{ni})∧Tn{+i}1on{τ{i}≥T

{i}

n }a.s. is fulfilled.

2. Ifτ∈T¯{i}_{and n}_∈_N_{there exist R}{i}

n ∈Mes(Fn{i})such that the conditionτ∧

T_n{₊i}₁= (Tn{i}+R{ni})∧T_n{₊i}₁on{τ≥Tn{i}}a.s. is fulfilled.

Obviously the angler wants to have as much satisfaction as possible and he has to leave the lake before the fixed moment. Therefore, his goal is to find two optimal stopping timesτa∗ _and_τb∗_{so that the expected gain is maximized}

EZ(τa∗,τb∗) = sup

τa_∈T

sup

τb_∈Tτa

EZ(τa,τb), (9)

whereτa∗ _{corresponds to the moment, when he eventually should change the two}

rods to the more effective one and τb∗_{, when he should stop fishing. These}

stop-ping moments should appear before the fixed time of fishingt0. The processZ(s,t)

is piecewise-deterministic and belongs to the class of semi-Markov processes. The optimal stopping of similar semi-Markov processes was studied by Boshuizen and Gouweleeuw [1] and the multivariate point process by Boshuizen [2]. Here the structure of multivariate processes is discovered and their importance for the model is shown. We use the dynamic programming methods to find these two optimal stopping times and to specify the expected satisfaction of the angler. The way of the solution is similar to the methods used by Karpowicz and Szajowski [13], Karpow-icz [12] and Szajowski [22]. Let us first observe that by the properties of conditional expectation we have

EZ(τa∗_,_τb∗_{) =}

sup

τa_∈T

E{E h

Z(τa_,_τb∗₎_|_F τa

i

}= sup

τa_∈T

EJ(τa₎_,

where

J(s) =EhZ(s,τb∗₎_|_F s

i

=ess sup

τb_∈Ts

EhZ(s,τb₎_|_F s

i

. (10)

Therefore, in order to findτa∗ _and_τb∗_{, we have to calculate}_J₍_s₎_{first. The process}

J(s)corresponds to the value of the revenue function in one stopping problem if the observation starts at the moments.

2.2 Anglers’ games

Based on the consideration of the section1.2a version of competitive fishing is formulated here. There are two anglers, each using one method of fishing at the beginning of an expedition and an additional fishing period after a certain moment by another method up to the moment chosen by a certain rule. The random field which is the model of payoffs in such a case is given by (5) and (6). The final segment starts at the moment when one of the anglers wants it. Letτi∈T¯{i},i=A,

(10)

time segment which is stopped at momentσ determined by one angler (let us call thema leader). The payoffs of the players are

ψi(τ1,τ2) =Zi(zN(τ1∧τ2),τ1∧τ2,σ τ1∧τ2₎_I

{τ16=τ2} (11)

+Zi(zN(τ1∧τ2)∧zN(τ1∧τ2),τ1∧τ2,σ τ1∧τ2₎_I

{τ1=τ2}.

Assignment of the leader in the caseτ1=τ2is arbitrary but defined. The aim is to find a pair(τ⋆

1,τ2⋆)of stopping times such that fori∈ {1,2}we have

Eψi(τi⋆,τ−⋆i)≥Eψi(τi,τ−⋆i). (12)

The optimization problem of the angler and the game between two anglers will involve the construction of the optimal second stopping moment.

3 Construction of the optimal second stopping time

In this section, we will find the solution of one stopping problem defined by (10). We will first solve the problem for the fixed number of fishes caught, next we will consider the case with the infinite stream of fishes caught. In this section we fix s- the moment when the change took place and m=Ms - the mass of the fishes

at the time s. Taking into account various models of fishing after the first stop it is needed to admit various models of stream of events. Assume that the moments of successive fishes catching after the first stop are Tn{3} and the times between

the events are i.i.d. with continuous, cumulative distribution function F(t)with the density function f(t)and the fishes value represented byi.i.d.random variables with distribution function H(t)(for conveniences this part of expedition is modelled by the renewal process denoted(Tn{3},Xn{3})).

3.1 Fixed number of fishes caught

In this subsection we are looking for the optimal stopping timeτb∗

0,K:=τKb

∗

EhZ(s,τKb

∗

)|F_si₌_{ess sup}

τb

K∈T0s,K

EhZ(s,τKb)|Fs i

, (13)

wheres≥0 is a fixed time when the position was changed andKis the maximum number of fishes which can be caught. Let us define

Γs

n,K= ess sup τb

n,K∈Tns,K

EhZ(s,τb n,K)|Fns

i

=EhZ(s,τb∗ n,K)|Fns

i

(11)

and observe thatΓs

K,K=Z(s,T

{3}

K ). In the subsequent considerations we will use the

representation of stopping time formulated in Lemma1and2. The exact form of the stopping strategies are given in the following corollary.

Corollary 1.Let i∈A. Ifτa_∈_T{i}_,_τb_∈_Ts_{, then there exist R}a

n∈Mes(F

{i}

n )and

Rb_n∈Mes(Fs

n)respectively, such that for conditionsτa∧T

{i}

n+1= (T

{i}

n +Ran)∧T

{i}

n+1

on{τa_≥_T{i}

n }a.s. andτb∧T_n{₊3}₁= (Tn{3}+Ran)∧T

{3}

n+1on{τa≥s∧T

{3}

n }a.s. are

valid.

Now we can derive the dynamic programming equations satisfied by Γs n,K. To

simplify the notation we can writeMt=Mtsfort≤s,Mb

{1}

n =MT1

n,M

s n=Ms

Tn{3}

and

¯Fi=1−Fi. The payoff functions are simplified here to ˆga(m) =ga(m1,m2,i,t)I{m1+m2=m}(m1,m2), ˆ

gb(m) =gb(m1,m2,i,s,m,te )I{me−m1−m2=m}

Lemma 3.Let s≥0be the moment of changing fishery. For n=K−1,K−2, . . . ,0

Γs

K,K =Z(s,T

{3}

K ),

Γs

n,K =ess supRb

n∈Mes(Fns)ϑn,K(Ms,s,M

s n,T

{3}

n ,Rbn)a.s.,

(15)

where

ϑn,K(m,s,m,e t,r) =I{t≤t0}

¯

F(r)[I{r≤t0−t}wˆ

b₍_m,_s,_m,_e _t₊_r₎₋_CI

{r>t0−t}]

+E

I

{S{_n3₊}₁≤r}Γ

s n+1,K|Fns

−CI{t>t0}

and there exists Rb_n⋆∈Mes(Fs

n)such that

Γs

n,K=ϑn,K(Ms,s,Mns,T

{3}

n ,Rbn

⋆

)a.s., (16)

τb∗ n,K=

(

τb∗

n+1,K if Rbn

∗ ≥S{_n3₊}₁, Tn{3}+Rbn

∗

if Rb n

∗

<S{_n3₊}₁, (17)

τb∗ K,K=T

{3}

K andwˆb(m,s,m,e t) =wˆa(m,s) +gˆb(me−m)−cb(t−s)wherewˆa(m,t) =

ˆ

ga(m)−ca(t).

Remark 2.Let{Rb_n∗}K n=1,Rb

∗

K =0, be a sequence ofFns–measurable random

vari-ables,n=1,2, . . . ,K, andη⋆s

n,K=K∧inf{i≥n:Rbi

⋆

<S{_i₊3}₁}. ThenΓs n,K=E

h

Z(s,τb∗ n,K)|Fns

i

forn≤K−1, whereτb∗ n,K=Tη⋆s

n,K+R

b⋆ η⋆s

n,K.

PROOF OF REMARK. 2. It is a consequence of an optimal choiceRb_n⋆in (15).

(12)

PROOF OF LEMMA.3First observe that the form of theΓs

n,Kfor the caseT

{3}

n >t0

is obvious from (4) and (14). Let us assume (15) and (16) forn+1,n+2, . . . ,K. For anyτ∈Ts

n,K(i.e.τ≥T

{3}

n we have{τ<T_n{₊3}1}={τ∧T

{3}

n+1<T

{3}

n+1}={T

{3}

n +Rbn<

T_n{₊3}₁}.It implies

{τ<T_n{₊3}₁}={S{_n3₊}₁>Rb_n}, {τ≥T_n{₊3}₁}={S_n{3₊}₁≤Rb_n}. (18)

Suppose thatT_K{3₋}₁≤t0and take anyτKb−1,K∈TKs−1,K. According to (18) and the

properties of conditional expectation

E[Z(s,τ)|Fs

n] =E

I_{

S{_n3₊}₁≤Rb

n}

E[Z(s,τ∨T_n{₊3}₁)|Fs

n+1]|Fn

+E

I

{S{_n3₊}₁>Rb

n}

Z(s,τ∧T_n{₊3}₁)|Fs

n

=I_{_Rb

n≤t0−Tn}¯F(Rn)wˆ

b₍_M

s,s,Mns,T

{3}

n +Rbn)

+E

I

{S{_n3₊}₁≤Rb

n}

E[Z(s,τ∨T_n{₊3}₁)|Fs

n+1|Fns

.

Letσ∈Tb

n+1. For everyτ∈Tnswe have

τ= (

σ ifRb_n≥S{_n3₊}₁, Tn{3}+Rbn ifRbn<S

{3}

n+1.

We have

E[Z(s,τ)|Fs

n] =E

I_{

S{_n3₊}₁≤Rb

n}

E[Z(s,σ)|Fs

n+1]|Fn

+I_{_Rb

n≤t0−Tn}¯F(R

b

n)wˆb(Ms,s,Mns,T

{3}

n +Rbn)

≤ sup

R∈Mes(Fs n)

{E

I_{

S{_n3₊}₁≤R}Γ

s n+1,K|Fn

+I{R≤t0−Tn}¯F(R)wˆ

b₍_M

s,s,Mns,T

{3}

n +R)}=E[Z(s,τn⋆,K)|Fns]

It follows sup_τ_∈Ts

nE[Z(s,τ)| Fs

n]≤E[Z(s,τn⋆,K)|Fns]≤supτ∈Tb

n E[Z(s,τ)| Fs

n]where

the last inequality is becauseτ_n⋆_,_K∈Ts

n,K. We apply the induction hypothesis, which

completes the proof.

z

Lemma 4.Γs n,K=γ

s,Ms

K−n(Mns,T

{3}

n )for n=K, . . . ,0, where the sequence of functions

(13)

γ₀s,m(m,te ) =I{t≤t0}wˆ b₍_m,_s,

e

m,t)−CI{t>t0},

γs_j,m(m,te ) =I{t≤t0}sup r≥0

κb

γs_j,₋m₁(m,s,m,t,e r)−CI{t>t0}, (19)

where

κ_δb(m,s,m,te ,r) =F¯(r)[I{r≤t0−t}wˆ b₍_m,_s,

e

m,t+r)−CI{r>t0−t}]

+ Z r

0

dF(z) Z ∞

0 δ(e

m+x,t+z)dH(x).

PROOF OF LEMMA. 4. Since the case fort>t0is obvious let us assume that

Tn{3}≤t0forn∈ {0, . . . ,K−1}. Let us notice that according to Lemma3we obtain Γs

K,K =γ s,Ms

0 (MKs,T

{3}

K ), thus the proposition is satisfied forn=K. Letn=K−1

then Lemma3and the induction hypothesis leads to

Γs

K−1,K = ess sup Rb

K−1∈Mes(Fs,K−1)

¯F(Rb_K₋₁)[I

{Rb

K−1≤t0−TK{−3}1} ˆ

wb(Ms,s,MKs−1,T

{3}

K−1+R

b K−1)

−CI_{

Rb_K₋₁>t0−TK{3−}1}

] +E

I_{

S{_K3}≤Rb_K₋₁}γ

s,Ms

0 (M

s K,T

{3}

K )|Fs,K−1

a.s.,

where Ms

K =MKs−1+X

{3}

K , T

{3}

K =T

{3}

K−1+S

{3}

K and the random variables X

{3}

K

andS{_K3}are independent ofF_s_,_K₋₁_{. Moreover}_Rb

K−1,MKs−1andT

{3}

K−1areFs,K−1

-measurable. It follows

ΓKs−1,K = ess sup Rb

K−1∈Mes(Fs,K−1)

¯F(Rb_K₋₁)[I_{

Rb

K−1≤t0−T{ 3}

K−1} ˆ

wb(Ms,s,MKs−1,T

{3}

K−1+R

b K−1)

−CI

{Rb_K₋₁>t0−TK{3−}1}

] + Z Rb_K₋₁

0

dF(z) Z ∞

0 γs,Ms

0 (M

s

K−1+x,T

{3}

K−1+z)dH(x)

=γs,Ms

1 (M

s K−1,T

{3}

K−1)a.s.

Letn∈ {1, . . . ,K−1}and suppose thatΓs n,K=γ

s,Ms

K−n(Mns,T

{3}

n ). Similarly like before,

we conclude by Lemma3and induction hypothesis that

Γs

n−1,K = ess sup Rb

n−1∈Mes(Fns−1)

¯F(Rb_n₋₁)[I

{Rb

n−1≤t0−Tn{−3}1} ˆ

wb(Ms,s,Mns−1,T

{3}

n−1+R

b n−1)

−CI_{

Rb_n₋₁>t0−Tn{−3}1}

] + Z Rb_n₋₁

0 dF(

s) Z ∞

0 γs,Ms

K−n(M s

n−1+x,T

{3}

n−1+s)dH(x)

a.s.

thereforeΓs n−1,K=γ

s,Ms

K−(n−1)(Mns−1,T

{3}

n−1).

(14)

From now on we will useαito denote the hazard rate of the distributionFi(i.e.

αi=fi/¯Fi) and to shorten notation we set∆•(a) =Egˆ•(a+X{i})−gˆ•(a), where

•can beaorb.

Remark 3.The sequence of functionsγs_j,mcan be expressed as:

γ₀s,m(m,te ) =I{t≤t0}wˆ b₍_m,s,

e

m,t)−CI{t>t0},

γs_j,m(m,te ) =I{t≤t0}

ˆ

wb(m,s,m,te ) +yb_j(me−m,t−s,t0−t)

−CI{t>t0}

andyb_j(a,b,c)is given recursively as follows

yb₀(a,b,c) =0 yb_j(a,b,c) = max

0≤r≤cφ b yb

j−1(a,b,c, r),

where φb

δ(a,b,c,r) = Rr

0¯F(z){α(z)

∆b₍_a_{) +}_E_δ₍_a₊_X{3}_,_b₊_z,c₋_z₎₋_cb′₍_b₊

z)}dz, and F is thec.d.f.ofS{3}(α(t)is the hazard rate of the distribution ofS{3}).

PROOF OF REMARK.3Clearly

Z r

0

dF(s) Z ∞

0

γs_j,₋m₁(me+x,t+s)dH(x) =EhI_{_S{3}_≤_r_}γs_j,₋m₁(me+X{3},t+S{3})

i

,

whereX{3}has thec.d.f.H. Since F is continuous andκb

γs_j,₋m₁(m,s,m,e t,r)is bounded

and continuous fort∈R+_{\ {}t0}, the supremum in (19) can be changed into

maxi-mum. Letr>t0−tthen

κ_γbs,m

j−1(m,s,m,t,e r) =E

h

I_{_S{3}_≤_t 0−t}γ

s,m

j−1(me+X{

3}_,t₊_S{3}₎i₋_C_¯F₍_t

0−t)

≤EhI_{_S{3}_≤_t 0−t}γ

s,m j−1(me+X

{3}_,t₊_S{3}₎i₊_¯F₍_t

0−t)wˆb(m,s,m,e t0)

=κ_γbs,m

j−1(m,s,m,e t,t0− t).

The above calculations cause thatγs_j,m(m,te ) =I{t≤t0}max0≤r≤t0−tϕj(m,s,m,e t,r)− CI{t>t0}, whereϕj(m,s,m,t,e r) =¯F(r)wˆ

b₍_m,_s,_m,_e _t₊_r₎₊_Eh_I

{S{3}_≤_r_}γs_j₋,m₁(me+X{3},t+S{3})

i

.

Obviously forS{3}≤randr≤t0−twe haveS{3}≤t0therefore we can consider the

casest≤t0andt>t0separately. Lett≤t0thenγ₀s,m(m,te ) =wˆb(m,s,m,e t)and the

(15)

ϕj+1(m,s,m,t,e r) = ¯F(r)wˆb(m,s,m,te +r) +E

h

I_{_S{3}_≤_r_}γs_j,m(me+X{3},t+S{3})

i

=gˆa(m)−ca(s) +¯F(r)hgˆb(me−m)−cb(t−s+r)i

+ Z r

0 f(

z){Egˆb(me−m+X{3})−cb(t−s+z)

+Eyb_j(me−m+X{3},t−s+z,t0−t−z)}dz.

It is clear that for anyaandb

¯F(r)hgˆb(a)−cb(b+r)i=gˆb(a)−cb(b)

−

Z r

0

{f(z)hgˆb(a)−cb(b+z)i+¯F(z)cb′(b+z)}dz,

therefore

ϕj+1(m,s,m,t,e r) =wˆb(m,s,m,te ) +

Z r

0 ¯F(

z){α(z)[∆b(me−m)

+Eyb_j(me−m+X{3},t−s+z,t0−t−z)]−cb

′

(t−s+z)}dz,

which proves the theorem. The case fort>t0is trivial.

Following the methods of Ferenstein and Sieroci´nski [6], we find the second op-timal stopping time. LetB₌_B_([_0,∞)×[0,t0]×[0,t0])be the space of all bounded,

continuous functions with the norm kδk=sup_a_,_b_,_c|δ(a,b,c)|. It is easy to check that B_{with the norm supremum is complete space. The operator} Φb_:_B_→_B _is

defined by

(Φb_δ₎₍_a,_b,c_{) =} _max

0≤r≤cφ b

δ(a,b,c,r). (20)

Let us observe thatyb_j(a,b,c) = (Φb_yb

j−1)(a,b,c). Remark3now implies that there

exists a functionrb_j∗(a,b,c)such thatyb_j(a,b,c) =φb

yb_j₋₁(a,b,c,r b∗

j(a,b,c))and this

gives

γs_j,m(m,te ) =I{t≤t0}

ˆ

wb(m,s,m,te )

+φ_ybb

j−1(e

m−m,t−s,t0−t,rb

∗

j (me−m,t−s,t0−t))

−CI{t>t0}.

The consequence of the foregoing considerations is the theorem, which determines the optimal stopping timesτb∗

n,K in the following manner:

Theorem 1.Let Rb_i∗=rb_K∗₋_i(M_is−Ms,T_i{3}−s,t0−T_i{3})for i=0,1, . . . ,K

more-overηs

n,K=K∧inf{i≥n:Rbi

∗

<S{_i₊3}₁}, then the stopping timeτb∗ n,K=T

{3}

ηs n,K+R

b∗ ηs n,K

is optimal in the classTs

n,KandΓns,K=E h

Z(s,τb∗ n,K)|Fns

i

(16)

3.2 Infinite number of fishes caught

The task is now to find the functionJ(s)and the stopping timeτb∗_{, which is optimal}

in the class Ts_{. In order to get the solution of one stopping problem for infinite}

number of fishes caught it is necessary to put the restriction F(t0)<1.

Lemma 5.If F(t0)<1then the operatorΦb:B→Bdefined by (20) is a

contrac-tion.

PROOF OF LEMMA.5. Letδi∈Bfori∈ {1,2}. There existsρisuch that(Φbδi)(a,b,c) =

φb

δi(a,b,c,ρi). We thus get

(Φb_δ1₎₍_a,_b,_c₎₋₍_Φb_δ2₎₍_a,_b,_c_{) =}_φb

δ1(a,b,c,ρ1)−φ b

δ2(a,b,c,ρ2)

≤φ_δb

1(a,b,c,ρ1)−φ b

δ2(a,b,c,ρ1)

= Z ρ1

0

dF(z) Z ∞

0

[δ1−δ2](a+x,b+z,c−s)dH(x)

≤

Z ρ1

0

dF(z) Z ∞

0

sup

a,b,c

|[δ1−δ2](a,b,c)|dH(x)

≤F(c)kδ1−δ2k ≤F(t0)kδ1−δ2k ≤Ckδ1−δ2k,

where 0≤C<1. Similarly, like as before, (Φb_δ2₎₍_a,b,_c₎₋₍_Φb_δ1₎₍_a,b,_c₎_≤ Ckδ2−δ1k. Finally we conclude thatΦb_δ1₋_Φb_δ2_≤_C_k_δ1₋_δ2_k_which

com-pletes the proof.

z

Applying Remark3, Lemma5and the fixed point theorem we conclude

Remark 4.There existsyb∈B_{such that}yb=Φb_yb_{and lim}

K→∞kybK−ybk=0.

According to the above remark, yb is the uniform limit of yb_K, when K tends to infinity, which implies thatybis measurable andγs,m₌_lim

K→∞γ_Ks,mis given by

γs,m(m,te ) =I{t≤t0}

h

ˆ

wb(m,s,m,e t) +yb(me−m,t−s,t0−t)

i

−CI{t>t0}. (21)

We can now calculate the optimal strategy and the expected gain after changing the place.

Theorem 2.If F(t0)<1and has the density function f, then

(i)for n∈N the limitτb⋆

n =limK→∞τb ∗

n,K a.s. exists and τb ⋆

n ≤t0 is an optimal

stopping rule in the setTs_{∩ {}_τ_≥_T{3} n },

(ii)EZ(s,τb⋆ n )|Fns

=γs,m₍_Ms n,T

{3}

n )a.s.

PROOF. (i) Let us first prove the existence ofτ_nb⋆. By the definition ofΓ_ns_,_K₊₁we

(17)

Γs

n,K+1= ess sup

τ∈Ts n,K+1

E[Z(s,τ)|Fs

n] =ess sup τ∈Ts

n,K

E[Z(s,τ)|Fs

n]∨ ess sup τ∈Ts

K,K+1

E[Z(s,τ)|Fs

n]

=EhZ(s,τb∗ n,K)|Fns

i

∨E[Z(s,σ∗)|Fs

n]

thus we observe thatτb∗

n,K+1 is equal toτb

∗

n,K or σ∗, whereτb

∗

n,K ∈Tns,K andσ∗∈

Ts

K,K+1respectively. It follows thatτb

∗

n,K+1≥τb

∗

n,K which implies that the sequence

τb∗

n,K is nondecreasing with respect to K. Moreover Rbi

∗

≤t0−Ti{3} for all i∈

{0, . . . ,K}thusτb∗

n,K≤t0and thereforeτb

⋆

n ≤t0exists.

Let us now look at the process ξs₍_t_{) = (}_t_,Ms

t,V(t)), wheresis fixed andV(t) =

t−T_N{3} 3(t).ξ

s₍_t₎_{is Markov process with the state space}_[_s,t

0]×[m,∞)×[0,∞). In a

general case the infinitesimal operator forξs_{is given by}

Aps,m(t,m,ve ) = ∂

∂tp

s,m₍_t,_m,_e _v_{) +} ∂

∂vp

s,m₍_t,_m,_e _v₎

+α(v)

Z

R+

ps,m(t,x,0)dH(x)−ps,m(t,m,ve )

,

whereps,m(t,m,e v):[0,∞)×[0,∞)×[0,∞)→R_{is continuous, bounded, measurable} with bounded left-hand derivatives with respect tot andv(see [1] and [17]). Let us notice that fort≥sthe processZ(s,t)can be expressed asZ(s,t) =ps,m(ξs₍_t₎₎_,

where

ps,m(ξs₍_t_{)) =}

ˆ

ga(Ms)−ca(s) +gˆb(Mts−Ms)−cb(t−s) ifs≤t≤t0,

−C ift0<t.

It follows easily that in our caseAps,m(t,m,e v) =0 fort0<tand

Aps,m(t,m,ve ) =α(v)[Egˆb(me+X{3}−m)−gˆb(me−m)]−cb′(t−s) (22)

fors≤t≤t0. The processps,m(ξs(t))−ps,m(ξs(s))−

Rt

s(Aps,m)(ξs(z))dzis a

mar-tingale with respect toσ{ξs₍_z₎_,_z_≤_t_}_{which is the same as}_Fs

t. This can be found

in [4]. Sinceτb∗

n,K ≤t0, applying the Dynkin’s formula we obtain

Ehps,m(ξs(τnb,∗K))|Fns i

−ps,m(ξs(Tn{3})) =E "_Z

τb∗

n,K

Tn{3}

(Aps,m)(ξs(z))dz|Fs

n #

a.s.

(23) From (22) we conclude that

Z τb∗

n,K

Tn{3}

(Aps,m)(ξs₍_z₎₎_dz_{= [}_E_g_ˆb₍_Ms

n+X{3}−m)−gˆb(Mns−m)] Z τb∗

n,K

Tn{3}

α(z−Tn{3})dz

−

Z τb∗

n,K

Tn{3}

(18)

Moreover let us check that

Z τb∗

n,K

Tn{3}

α(z−Tn{3})dz ≤ 1 ¯F(t0)

Z τb∗

n,K

Tn{3}

f(z−Tn{3})dz≤

1 ¯F(t0)

<∞,

Z τb∗

n,K

Tn{3}

cb′(z−s)dz

= cb(τb∗

n,K−s)−cb(T

{3}

n −s) <∞,

Egˆb(M_ns+X{3}−m)−gˆb(M_ns−m)<∞,

where the two last inequalities result from the fact that the functions ˆgbandcbare bounded. On account of the above observation we can use the dominated conver-gence theorem and

lim

K→∞E

"_Z τb∗

n,K

Tn{3}

n #

=E

"_Z τb⋆

n

Tn{3}

n #

. (24)

Sinceτb⋆

n ≤t0applying the Dynkin’s formula to the left side of (24) we conclude

that

E "_Z

τb⋆

n

Tn{3}

(Aps,m)(ξs₍_z₎₎_dz_|_Fs n #

=Ehps,m(ξs₍_τb⋆ n ))|Fns

i

−ps,m(ξs₍_T{3}

n )) a.s.

(25) Combining (23), (24) and (25) we can see that

lim

K→∞E

h

ps,m(ξs₍_τb∗ n,K))|Fns

i

=Ehps,m(ξs₍_τb∗ n ))|Fns

i

, (26)

hence limK→∞E

h

Z(s,τb∗ n,K)|Fns

i

=EZ(s,τb∗ n )|Fns

. We next prove the optimality

ofτb n

∗

in the classTs_{∩ {}_τb n ≥T

{3}

n }. Letτ∈Ts∩ {τnb≥T

{3}

n }and it is clear that

τ∧T_K{3}∈Ts

n,K. Asτb

∗

n,Kis optimal in the classTns,K we have

lim

K→∞E

h

ps,m(ξs₍_τb∗ n,K))|Fns

i

≥ lim

K→∞E

h

ps,m(ξs₍_τ_∧_T{3}

K ))|Fns i

. (27)

From (26) and (27) we conclude thatEps,m(ξs₍_τb∗ n ))|Fns

≥E[ps,m(ξs₍_τ₎₎_|_Fs n]

for any stopping timeτ∈Ts_{∩ {}_τ_≥_T{3}

n }, which implies thatτb

∗

n is optimal in this

class.

(ii) Lemma4and (26) lead toE h

Z(s,τb n

∗

)|Fs n i

=γs,Ms₍_Ms

n,T

{3}

n ).

The remainder of this section will be devoted to the proof of the left-hand differ-entiability of the functionγs,m₍_m,_s₎_{with respect to}_{s. This property is necessary to}

construct the first optimal stopping time. First, let us briefly denoteδ(0,0,c)∈B_by ¯

(19)

Lemma 6.Let ν¯(c) =Φb_δ¯₍_c₎_, _δ¯₍_c₎_∈B and δ¯′

+(c)

≤A1 for c∈[0,t0) then

ν¯′

+(c) ≤A2.

PROOF OF LEMMA. 6. Let us first observe that the derivative ¯ν₊′(c)exists because ¯

ν(c) =max0≤r≤cφ¯b(c,r), where ¯φb(c,r)is differentiable with respect tocandr.

Fixh∈(0,t0−c)and define ¯δ1(c) =δ¯(c+h)∈Band ¯δ2(c) =δ¯(c)∈B. Obviously,

kΦb_δ¯

1−Φbδ¯2k ≥ |Φbδ¯1(c)−Φbδ¯2(c)|=|Φbδ¯(c+h)−Φbδ¯(c)|and on the other

side using Taylor’s formula for right-hand derivatives we obtain

_δ1¯ ₋_δ2¯ ₌_sup

c

_δ¯₍_c₊_h₎₋_δ¯₍_c₎_≤_h_sup

c _δ¯′

+(c)

+|o(h)|.

From the above and Remark8it follows that

−C

sup

c _δ¯′

+(c)

+|o(h)|

h

≤ν¯(c+h)−ν¯(c)

h ≤C

sup

c _δ¯′

+(c)

+|o(h)|

h

and lettingh→0+givesν¯′

+(c)

≤CA1=A2.

z

The significance of Lemma6is such that the function ¯y(t0−s)has bounded

left-hand derivative with respect tosfors∈(0,t0]. The important consequence of this

fact is the following

Remark 5.The functionγs,m_{can be expressed as}_γs,m₍_m,s_{) =}_I

{s≤t0}u(m,s)−CI{s>t0}, whereu(m,s) =gˆa(m)−ca(s) +gˆb(0)−cb(0) +y¯b(t0−s)is continuous, bounded,

measurable with the bounded left-hand derivatives with respect tos.

At the end of this section, we determine the conditional value function of the second optimal stopping problem. According to (10), Theorem2and Remark5we have

J(s) =EhZ(s,τb∗₎_|_F s

i

=γs,Ms₍_M

s,s)a.s. (28)

4 Construction of the optimal first stopping time

In this section, we formulate the solution of the double stopping problem. On the first epoch of the expedition the admissible strategies (stopping times) depend on the formulation of the problem. For the optimization problem the most natural are the stopping times fromT _{(see the relevant problem considered in Szajowski [}₂₂_]).

However, when the bilateral problem is considered the natural class of admissible strategies depends on who uses the strategy. It should beT{i}_{for the}_{i-th player.}

Here the optimization problem with restriction to the strategies from theT{1}_{at the}

first epoch is investigated.

Let us first notice that the functionu(m,s)has a similar properties to the function ˆ

(20)

define againΓn,K =ess supτa_∈T_n_,_KE[J(τa)|Fn], n=K, . . . ,1,0,which fulfills the

following representation

Lemma 7.Γn,K =γK−n(Mbn{1},Tn{1})for n=K, . . . ,0,where the sequence of

func-tionsγjcan be expressed as:

γ0(m,s) =I{s≤t0}u(m,s)−CI{s>t0},

γj(m,s) =I{s≤t0}

u(m,s) +ya_j(m,s,t0−s)

−CI{s>t0}

and ya_j(a,b,c)is given recursively as follows:

ya₀(a,b,c) =0 ya_j(a,b,c) = max

0≤r≤cφ a ya

j−1(a,b,c,r) where

φ_δa(a,b,c,r) = Z r

0

¯ F1(z)

n

α1(z)h∆a(a) +Eδ(a+x{1},b+z,c−z)i

−(y¯b′₋(c−z) +ca′(b+z))odz.

Lemma7corresponds to the combination of Lemma4and Remark3from Subsec-tion3.1. Let the operatorΦa_:_B_→_B_{be defined by}

(Φaδ)(a,b,c) = max

0≤r≤cφ a

δ(a,b,c,r). (29)

Lemma7implies that there exists a functionr∗₁_,_j(a,b,c)such that

γj(m,s) =I{s≤t0}

u(m,s) +φa

ya_j₋₁(m,s,t0−s,r

∗

1,j(m,s,t0−s))

−CI_{s>t0}.

We can now state the analogue of Theorem1.

Theorem 3.Let Ra_i∗ =r_Ka∗₋_i(Mi,T{

1}

i ,t0−T{ 1}

i )andηn,K=K∧inf{i≥n:Rai

∗_<

S{_i₊1}₁}, thenτa∗ n,K=T

{1}

ηn,K+R

a∗

ηn,Kis optimal in the classTn,KandΓn,K=E

h

J(τa∗ n,K)|Fn

i

.

The following results may be proved in much the same way as in Section3.

Lemma 8.If F1(t0)<1then the operatorΦa:B→Bdefined by (29) is a

contrac-tion.

Remark 6.There existsya∈B_{such that}ya=Φa_ya_{and lim}

K→∞kya_K−yak=0. The above remark implies thatγ=limK→∞γK is given by

γ(m,s) =I{s≤t0}[u(m,s) +y a₍_m,_s,t

0−s)]−CI{s>t0}. (30)