
2.4 Model and Estimation Strategy


Arrival Process: Finally, recall that the mean number of times that callers contacted the call center within our dataset is 1.81, with a standard deviation of 1.90. The high standard deviation implies that callers differ in how frequently they contact the call center. Hence, to capture any differences among the segments in contact frequency, we characterize the caller arrival process for callers in each segment. To do this, we assume that the interarrival time of a caller from segment $s$ is exponentially distributed with a rate parameter denoted by $\lambda_s \in \Lambda$. Hathaway et al. (2019a) took a similar approach by modeling the caller interarrival process under a latent class framework.

2.4.2 Estimation Strategy

We estimate our model in three steps. First, we estimate the callers' expected waiting time to receive service in the online queue ($E[W_n \mid A, m, t]$) separately for each combination of period ($t \in \{1, \dots, T\}$), message ($A \in L$), and time-of-day interval ($m = 1$ for day, and $m = 2$ for evening), and the expected waiting time to receive a callback ($E[W_f \mid A, m, t = 0]$) separately for each combination of message and time-of-day interval. Second, we estimate the probability of callers being available to answer a callback ($p^{A,m}_{ans}$) separately for each combination of message and time-of-day interval directly from the data. Third, using our estimates of $E[W_n \mid A, m, t]$, $E[W_f \mid A, m, t = 0]$, and $p^{A,m}_{ans}$, we estimate the caller parameters via the maximum likelihood approach.

[Footnote] We assume that the idiosyncratic shocks follow the standard Gumbel distribution with a location parameter of 0 and a scale parameter of 1.

Step 1 - Estimate Expected Waiting Times: To estimate the expected waiting time to receive service in the online queue, we first estimate the cumulative distribution function of the number of periods required to receive service in the online queue for each combination of time-of-day interval ($m$) and message ($A$). We denote this function by $F_{n,A,m}(\cdot)$ and estimate it using the Kaplan-Meier estimator (Kaplan and Meier, 1958), which accounts for censored observations due to abandonment. We denote by $f_{n,A,m}(\cdot)$ the corresponding probability mass function. Given $F_{n,A,m}(\cdot)$ and $f_{n,A,m}(\cdot)$, we estimate the expected waiting time to receive service in the online queue for each combination of period ($t$), time-of-day interval ($m$), and message ($A$) using the following formula:

$$E[W_n \mid A, m, t] = \sum_{t'=t+1}^{T} \frac{f_{n,A,m}(t') \cdot t'}{1 - F_{n,A,m}(t)} - t.$$

Note that the above formula is the expected value of the distribution $F_{n,A,m}$ truncated at the caller's current period, minus the current period, which represents the expected waiting time to receive service at period $t$. For the expected waiting time to receive a callback, we calculate the average waiting time for each combination of message and time-of-day interval directly from the data.[10]
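The Step 1 calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `kaplan_meier_pmf` and `expected_wait`, the input layout (one exit period and one served/abandoned flag per call), and the tie convention (abandonments in a period are still counted as at risk in that period) are all hypothetical choices. The sketch also does not redistribute any probability mass left over when the estimated distribution does not reach 1 by period T.

```python
from collections import Counter


def kaplan_meier_pmf(durations, served, T):
    """Kaplan-Meier estimate of the periods-to-service distribution.

    durations[k]: period (1..T) in which call k left the online queue.
    served[k]:    True if the call was served, False if the caller
                  abandoned (a right-censored observation).
    Returns (f, F): lists indexed 0..T with the estimated pmf and cdf.
    """
    events = Counter(d for d, s in zip(durations, served) if s)
    exits = Counter(durations)
    at_risk = len(durations)
    surv = 1.0
    f = [0.0] * (T + 1)
    F = [0.0] * (T + 1)
    for t in range(1, T + 1):
        new_surv = surv * (1.0 - events[t] / at_risk) if at_risk > 0 else surv
        f[t] = surv - new_surv      # mass placed on service in period t
        surv = new_surv
        F[t] = 1.0 - surv
        at_risk -= exits[t]         # served and abandoned calls both leave
    return f, F


def expected_wait(f, F, t, T):
    """Expected remaining wait at period t: sum_{u>t} f(u)*u / (1 - F(t)) - t."""
    tail = 1.0 - F[t]
    if tail <= 0.0:
        return 0.0                  # no remaining mass beyond period t
    return sum(f[u] * u for u in range(t + 1, T + 1)) / tail - t


# Hypothetical toy data: four calls, one abandonment in period 2.
f, F = kaplan_meier_pmf([1, 2, 2, 3], [True, False, True, True], T=3)
wait_at_start = expected_wait(f, F, t=0, T=3)
```

In practice one would run this once per (message, time-of-day interval) pair and then evaluate `expected_wait` for every period t.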

[10] In this case we do not require the Kaplan-Meier estimator since we observe how long it takes for a callback to arrive regardless of whether the caller answered the callback, i.e., the data is not censored.

Step 2 - Estimate Probability of Answering Callback: Recall from our model that callers who accept a callback offer do not answer the callback if they are unavailable when the callback arrives. Furthermore, from (2.3), callers consider their probability of answering the callback ($p^{A,m}_{ans}$) when determining the utility of accepting a callback offer. Denote by $p^{A,m}_{ar}(t)$ the observed probability from the data that a callback offered under message $A$ during time-of-day interval $m$ will arrive in period $t$, and by $p^{m}_{av}(t)$ the observed probability from the data that a caller who receives a callback in period $t$ during time-of-day interval $m$ will answer the callback. Then the probability of callers answering the callback under a given message and time-of-day interval is given by

$$p^{A,m}_{ans} = \sum_{t=1}^{T} p^{A,m}_{ar}(t) \cdot p^{m}_{av}(t),$$

which is obtained by multiplying the probability that the callback arrives in a given period by the probability that the caller is available in that period and summing over all periods.
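As a small worked illustration of this sum, the following Python sketch computes the answer probability for one message and time-of-day interval. The function name `prob_answer` and the three-period probabilities are hypothetical and chosen only for illustration; they are not taken from the data.

```python
def prob_answer(p_arrive, p_available):
    """Sum over periods of P(callback arrives in t) * P(caller answers in t).

    p_arrive[k] and p_available[k] both refer to period k + 1.
    """
    if len(p_arrive) != len(p_available):
        raise ValueError("both inputs must cover the same periods")
    return sum(a * v for a, v in zip(p_arrive, p_available))


# Hypothetical values for a three-period example.
p_arrive = [0.5, 0.3, 0.2]      # callback arrival probabilities by period
p_available = [0.9, 0.8, 0.5]   # answer probabilities by period
p_ans = prob_answer(p_arrive, p_available)  # 0.45 + 0.24 + 0.10
```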

Step 3 - Maximum Likelihood Estimation of Caller Parameters: Using our estimates of $E[W_n \mid A, m, t]$, $E[W_f \mid A, m, t = 0]$, and $p^{A,m}_{ans}$, the observed decisions of the callers, and the observed number of calls initiated by each caller, we estimate the model parameters $(\Theta, \Lambda)$ via maximum likelihood. Recall that the probability of caller $i$ from segment $s$ choosing action $d_{ijt}$ in period $t$ of call $j$ is denoted by $P_{ijt}(d_{ijt}; q_{ij}, A_{ij}, b_{ij}, s, \Theta)$. Also, let $\tau_{ij}$ be the final decision period of caller $i$ during call $j$. Then the likelihood of observing all of the callback and abandonment decisions of caller $i$ from segment $s$, which we denote by $l_i(s; \Theta)$, is given by

$$l_i(s; \Theta) = \prod_{j=1}^{J_i} \prod_{t=1}^{\tau_{ij}} P_{ijt}(d_{ijt}; q_{ij}, A_{ij}, b_{ij}, s, \Theta).$$

We next give the likelihood of observing the number of calls initiated by each caller, given their segment. Denote by $\eta_i$ the number of calls initiated by caller $i$ in the dataset, which is given by $J_i$ for callers who dialed at least once within the dataset, and by zero for callers who did not initiate any calls. For callers who dialed at least once, denote by $\gamma_{ij}$ the time between the end of call $j-1$ and the beginning of call $j$ of caller $i$. We remark that for the first and last call of each caller, $\gamma_{ij}$ is right-censored, as we observe neither the ending time of the call that preceded the caller's first call in the dataset nor the commencement time of the call that follows the caller's final call in the dataset. Next, denote by $\Gamma_i$ the total time that caller $i$ spends between calls in the dataset, which is given by the sum of $\gamma_{ij}$ over all the calls of caller $i$ in the dataset for callers who initiated at least one call, and by the span of the entire dataset for callers who initiated no calls. Lastly, let $\wp_i(s; \Lambda, \eta_i, \Gamma_i)$ be the probability of observing $\eta_i$ calls from caller $i$ during the interval $\Gamma_i$, given that caller $i$ is from segment $s$. Then, given our assumption that the time between calls is exponentially distributed, the number of calls initiated by caller $i$ from segment $s$ is Poisson distributed, where the expected number of calls is equal to $\lambda_s \Gamma_i$. Thus, the probability of observing the number of calls initiated by caller $i$ from segment $s$ is given by

$$\wp_i(s; \Lambda, \eta_i, \Gamma_i) = \frac{(\lambda_s \Gamma_i)^{\eta_i} \cdot \exp(-\lambda_s \Gamma_i)}{\eta_i!}. \tag{2.5}$$
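For numerical work, (2.5) is usually easier to evaluate on the log scale, since the factorial and the power can overflow for large call counts. The sketch below is one hedged way to do so; the function name `log_poisson_calls` and the example numbers are hypothetical.

```python
import math


def log_poisson_calls(eta, lam, gamma):
    """Log of (2.5): probability of eta calls given rate lam and exposure gamma.

    Assumes lam > 0 and gamma > 0, so the Poisson mean lam * gamma is positive.
    """
    mean = lam * gamma
    # log(mean^eta * exp(-mean) / eta!), using lgamma(eta + 1) = log(eta!)
    return eta * math.log(mean) - mean - math.lgamma(eta + 1)


# Hypothetical example: rate 0.5 calls per unit time, 4 time units, 2 calls.
prob_two_calls = math.exp(log_poisson_calls(2, 0.5, 4.0))
# A caller with no calls contributes exp(-lam * gamma).
prob_no_calls = math.exp(log_poisson_calls(0, 0.5, 4.0))
```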

We lastly calculate the likelihood of observing all of the callers' callback and abandonment decisions along with the number of calls they initiated in the dataset, which we denote by $L(\Theta)$ and refer to as the likelihood function. The likelihood function is given by

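$$L(\Theta) = \underbrace{\prod_{i=1}^{N_1} \left( \sum_{s=1}^{S} \pi_s \cdot l_i(s; \Theta) \cdot \wp_i(s; \Lambda, \eta_i, \Gamma_i) \right)}_{\text{Term 1}} \cdot \underbrace{\prod_{i=N_1+1}^{N_2} \left( \sum_{s=1}^{S} \pi_s \cdot \wp_i(s; \Lambda, 0, \Gamma_i) \right)}_{\text{Term 2}}.$$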

For reference, term 1 is the likelihood of observing the callback and abandonment decisions and the number of calls initiated by callers who initiated at least one call during the span of the data. Term 2 is the likelihood of observing no calls for the callers who did not initiate any calls during the span of the data. We maximize the log of the likelihood function over the model parameters subject to the constraints below:

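$$\max_{\Theta} \; \log L(\Theta) = \sum_{i=1}^{N_1} \log \left( \sum_{s=1}^{S} \pi_s \cdot l_i(s; \Theta) \cdot \wp_i(s; \Lambda, \eta_i, \Gamma_i) \right) + \sum_{i=N_1+1}^{N_2} \log \left( \sum_{s=1}^{S} \pi_s \cdot \wp_i(s; \Lambda, 0, \Gamma_i) \right),$$

$$\text{subject to} \quad \forall s: \lambda_s > 0, \quad \pi_s \ge 0, \quad \sum_{s=1}^{S} \pi_s = 1.$$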
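To make the structure of the log-likelihood concrete, the following Python sketch evaluates it for given parameter values. It is an illustration under assumptions rather than the estimation code used here: the names `logsumexp` and `mixture_loglik` are hypothetical, the per-caller decision log-likelihoods $\log l_i(s; \Theta)$ are assumed to be precomputed and passed in, callers with no calls are handled by setting their count to zero and their decision log-likelihood to 0, and every $\pi_s$ is assumed strictly positive so that its logarithm exists. The inner sum over segments is computed with the log-sum-exp trick to avoid underflow when the products of choice probabilities are very small.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def mixture_loglik(pi, lam, log_li, eta, gamma):
    """Latent-class log-likelihood summed over all callers.

    pi[s], lam[s]: segment share and arrival rate of segment s (pi[s] > 0).
    log_li[i][s]:  log l_i(s; Theta); use 0.0 for callers with no calls.
    eta[i]:        number of calls of caller i (0 for no calls).
    gamma[i]:      total time caller i spends between calls.
    """
    total = 0.0
    for i in range(len(eta)):
        terms = []
        for s in range(len(pi)):
            mean = lam[s] * gamma[i]
            log_poisson = eta[i] * math.log(mean) - mean - math.lgamma(eta[i] + 1)
            terms.append(math.log(pi[s]) + log_li[i][s] + log_poisson)
        total += logsumexp(terms)   # log of the sum over segments
    return total


# Hypothetical two-segment example with one active caller and one silent caller.
value = mixture_loglik(
    pi=[0.4, 0.6],
    lam=[0.5, 1.0],
    log_li=[[math.log(0.1), math.log(0.2)], [0.0, 0.0]],
    eta=[2, 0],
    gamma=[4.0, 10.0],
)
```

A function of this shape could be handed to a numerical optimizer after reparameterizing the constraints, for example with log-rates and a softmax over segment shares; that reparameterization is a common choice and is not described in the text.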

We assume that callers who wait in the online queue make decisions every ten seconds.[11] Since our data is collected every second, we round caller waiting times upward and abandonment times downward. We assume that callers know they will be answered within 450 periods, i.e., $T = 450$. Thus, we run our estimation only on calls with waiting times that are no greater than 4500 seconds; this eliminates only 0.03% of the data. We also assume that there are two segments of callers.[12] We maximize the likelihood function from 100 random starting points and select the solution with the highest likelihood. To obtain standard errors, we perform nonparametric bootstrapping (Horowitz, 2001). Finally, in A.4 we formally discuss identification of our model and what sources of variation in the data allow us to identify the caller parameters.
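The discretization described above can be sketched as follows. This is one reading of the rounding rule, stated as an assumption: waits that end in service are rounded up to the next ten-second period, and waits that end in abandonment are rounded down. The names `PERIOD_SECONDS`, `to_periods`, and `keep_call` are hypothetical.

```python
import math

PERIOD_SECONDS = 10   # callers are assumed to decide every ten seconds
T = 450               # maximum number of periods


def to_periods(wait_seconds, abandoned):
    """Convert a wait in seconds to decision periods.

    Assumed rule: round down for abandoned calls, round up for served calls.
    """
    if abandoned:
        return math.floor(wait_seconds / PERIOD_SECONDS)
    return math.ceil(wait_seconds / PERIOD_SECONDS)


def keep_call(wait_seconds):
    """Keep only calls whose wait is no greater than T periods (4500 seconds)."""
    return wait_seconds <= T * PERIOD_SECONDS


# Hypothetical examples: a served call after 25 s and an abandonment after 25 s.
served_periods = to_periods(25, abandoned=False)    # 3
abandoned_periods = to_periods(25, abandoned=True)  # 2
```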

2.5 Estimation Results and Model Validation

In document Hathaway_unc_0153D_18409.pdf (Page 50-54)