The Estimation Algorithm - Matsumoto_unc_0153D

The model parameters can be estimated using Simulated Maximum Likelihood (SML). The estimation procedure we propose is based on the EM Algorithm. First, we will present the estimation procedure, and then we will discuss the advantages of using the EM Algorithm over Simulated Maximum Likelihood.

Denote the parameters to be estimated as θ. Then the parameters that maximize the log likelihood function will also maximize the following equation:

Equation 2.14 is the expected conditional likelihood of the joint probability of the individual’s choices and the unobserved heterogeneity, where the expectation is over the distribution q, which is the conditional probability distribution of the unobserved heterogeneity. This expected likelihood function serves as the basis of the estimation procedure.
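As a sketch consistent with this description, and with the additive separability discussed in the M step, the objective can be written in the form below. The notation is an assumption rather than a quotation of equation (2.14): $L_n$ stands for individual $n$'s choice likelihood, $f$ for the density of the unobserved heterogeneity $(\mu, \Delta)$, and $q$ for its conditional distribution.

```latex
\sum_{n=1}^{N} \mathbb{E}_{q}\Big[ \log L_n(\mu_n, \Delta_n; \theta)
  + \log f(\mu_n, \Delta_n \mid \theta) \Big]
```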

The estimation method in this section extends the method of Arcidiacono and Miller (2011). Their method incorporates unobserved heterogeneity as a finite mixture distribution over unobserved types using the Expectation Maximization (EM) algorithm. We extend this approach to allow for any continuous distribution of unobserved heterogeneity by using simulation methods. The estimation procedure uses a CCP representation of the individual’s value function in the maximization step of a Simulated EM (SEM) algorithm. An individual’s “type” corresponds to a draw from the distribution of unobserved heterogeneity.

Each step of the estimation procedure will be covered in detail after an overview of the entire process. The algorithm begins with initial guesses for the parameters and iterates over the following steps:

1. E-step, part 1: Use the current parameter values to update q.

2. E-step, part 2: Update the Conditional Choice Probabilities (CCPs) using the current parameter values.

3. M-step: Update the parameter estimates by maximizing the simulated likelihood function using the updated CCPs and values of q.

The process terminates when the parameter estimates converge. Since the EM algorithm is an iterative procedure, the maximization step (or M-step) must be performed on each iteration. The SML estimator only requires maximizing the likelihood function a single time. Despite the iterative nature of the estimation, the EM algorithm can still yield considerable computational savings. The use of CCPs in the maximization step can require far fewer computations than full solution methods. Also, the EM algorithm reintroduces additive separability in the maximization step. When the likelihood function is additively separable, it may be possible to estimate the parameters sequentially.
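The control flow of the three steps can be sketched as follows. This is only a minimal skeleton under stated assumptions: the function name `simulated_em` and the three callables are hypothetical, and the toy stand-ins at the bottom are unrelated to the model (they only make the loop runnable). Convergence is checked here by the largest absolute change in the parameter vector, which is one common choice among several.

```python
import numpy as np

def simulated_em(theta0, update_q, update_ccps, m_step, tol=1e-8, max_iter=500):
    """Iterate the two E-step parts and the M-step until theta stops changing.

    All three callables are hypothetical placeholders supplied by the user.
    """
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    for iteration in range(1, max_iter + 1):
        q = update_q(theta)                                 # E-step, part 1: weights q
        ccps = update_ccps(theta, q)                        # E-step, part 2: CCPs
        theta_new = np.atleast_1d(m_step(theta, q, ccps))   # M-step
        if np.max(np.abs(theta_new - theta)) < tol:
            return theta_new, iteration
        theta = theta_new
    raise RuntimeError("parameter estimates did not converge")

# Toy stand-ins (NOT the model): the implied update is theta -> 0.5 * theta + 1,
# a contraction whose fixed point is theta = 2.
theta_hat, n_iter = simulated_em(
    theta0=[0.0],
    update_q=lambda theta: 0.5 * theta,
    update_ccps=lambda theta, q: None,
    m_step=lambda theta, q, ccps: q + 1.0,
)
```

In an actual application the M-step callable would contain the likelihood maximization, which is where most of the computation per iteration occurs.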

2.4.1 The E Step

The Expectation step uses the prior iteration parameter estimates, $\hat{\theta} = (\alpha, \bar{\mu}, \Sigma, \sigma_\xi)$, to update the conditional probability distribution of the unobserved variable, q, and the CCPs, $\hat{P}$. The probability distribution of the unobserved variables and the CCPs are functions of µ, which can take an infinite number of values. Each individual, however, can only take on a finite number of values corresponding to the draws needed to simulate the likelihood function. Therefore, these functions only need to be evaluated at a finite number of points. Denote the iteration number with a superscript.

The first step is to use the prior iteration estimates of the population distribution parameters to update the individual parameter values:

$$\mu_n^{m,s+1} = \bar{\mu}^s + \mathrm{chol}(\Sigma^s)\,\eta_n^m, \quad \text{for } m = 1, \ldots, M \tag{2.16}$$

where $\{\eta_n^m\}_{m=1}^{M}$ are size $J$ vector draws from the standard normal distribution, and $\mathrm{chol}(\Sigma^s)$ is the lower triangular Cholesky decomposition of the population variance matrix. The values of the sequences of signal draws are updated similarly using the current iteration estimate of the standard deviation of the signal. The following equation updates q:

$$q^{m,s+1}(\mu_n, \Delta_n) = \frac{L_n(\mu_n^m, \Delta_n^m)}{\frac{1}{M}\sum_{m'=1}^{M} L_n(\mu_n^{m'}, \Delta_n^{m'})} \tag{2.17}$$
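Equations (2.16) and (2.17) translate into a few array operations. The sketch below assumes NumPy, hypothetical function names (`draw_types`, `update_q`), standard normal draws that are generated once and reused across iterations, and a placeholder likelihood in place of the model's $L_n$. The signal draws are omitted for brevity.

```python
import numpy as np

def draw_types(mu_bar, sigma, eta):
    """Equation (2.16): mu_n^m = mu_bar + chol(Sigma) @ eta_n^m.

    eta has shape (N, M, J); the result has the same shape.
    """
    chol = np.linalg.cholesky(sigma)   # lower triangular, chol @ chol.T == sigma
    return mu_bar + eta @ chol.T       # row-vector form of chol @ eta_n^m

def update_q(lik):
    """Equation (2.17): divide each L_n at draw m by its average over draws.

    lik has shape (N, M), with lik[n, m] the likelihood of individual n at draw m.
    """
    return lik / lik.mean(axis=1, keepdims=True)

rng = np.random.default_rng(0)
N, M, J = 3, 200, 2
eta = rng.standard_normal((N, M, J))          # fixed standard normal draws
mu_bar = np.array([1.0, -0.5])                # illustrative values only
sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
mu = draw_types(mu_bar, sigma, eta)

# Placeholder likelihood (an assumption, NOT the model's L_n).
lik = np.exp(-0.5 * np.sum((mu - 1.0) ** 2, axis=2))
q = update_q(lik)
```

By construction the weights average to one across the $M$ draws of each individual. With long panels the likelihoods can underflow, so in practice the ratio is often computed from log likelihoods.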

The CCPs are updated as a weighted multinomial logit of the outcome on a flexible polynomial of the state variables where the weights are q. Alternatively, the CCPs can be updated using the structure of the model. The model can be used to calculate the probability of a given choice at different points in the state space and interpolation methods can be used to generate estimates of the CCPs at other points in the state space.
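The first approach, a weighted multinomial logit on a polynomial of the state, might look like the sketch below. It assumes NumPy and SciPy, a single scalar state variable, and hypothetical helper names. Each row is taken to be one observation evaluated at one draw, with its q as the weight. The choices and weights in the example are random placeholders, not data from the model.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def polynomial_features(x, degree=2):
    """Columns 1, x, x^2, ..., x^degree for a scalar state x."""
    return np.vander(x, degree + 1, increasing=True)

def fit_weighted_mnl(X, y, w, n_choices):
    """Maximize sum_r w_r * log P(y_r | X_r); choice 0 is the base alternative."""
    R, K = X.shape

    def neg_loglik(beta_flat):
        beta = beta_flat.reshape(K, n_choices - 1)
        v = np.column_stack([np.zeros(R), X @ beta])     # utility index per choice
        logp = v - logsumexp(v, axis=1, keepdims=True)   # log choice probabilities
        return -np.sum(w * logp[np.arange(R), y])

    res = minimize(neg_loglik, np.zeros(K * (n_choices - 1)), method="BFGS")
    return res.x.reshape(K, n_choices - 1)

def predict_ccps(X, beta):
    """Fitted choice probabilities at the rows of X."""
    v = np.column_stack([np.zeros(X.shape[0]), X @ beta])
    return np.exp(v - logsumexp(v, axis=1, keepdims=True))

rng = np.random.default_rng(0)
state = rng.uniform(-1.0, 1.0, size=300)
X = polynomial_features(state, degree=2)
y = rng.integers(0, 3, size=300)        # placeholder choices
w = rng.uniform(0.5, 1.5, size=300)     # placeholder weights standing in for q
beta_hat = fit_weighted_mnl(X, y, w, n_choices=3)
ccps = predict_ccps(X, beta_hat)
```

With an intercept only, the fitted probabilities reduce to the q-weighted choice frequencies, which is a convenient check on an implementation.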

2.4.2 The M Step

The maximization step uses the updated CCPs and the updated q’s to maximize the simulated version of equation (2.14). The updated parameter estimates are:

$$\theta^{s+1} = \arg\max_{\theta} \sum_{n=1}^{N} \frac{1}{M} \sum_{m=1}^{M} q_n^{m,s+1} \log\!\big(L_n(\theta, \hat{P}^{s+1})\big) \tag{2.18}$$
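The objective in (2.18) is a q-weighted sum of log likelihoods, so the M step can be handed to a generic optimizer. The sketch below assumes SciPy's `minimize` and a hypothetical callable `log_lik_fn` that returns $\log L_n$ at every draw with the CCPs held fixed. The toy likelihood (a normal outcome with a mean shifted by the draw) is an assumption chosen because its weighted maximizer has a closed form to compare against. It is not the model's choice likelihood.

```python
import numpy as np
from scipy.optimize import minimize

def m_step(theta0, q, log_lik_fn):
    """Maximize sum_n (1/M) sum_m q[n, m] * log L_n(theta) at draw m.

    log_lik_fn(theta) must return an (N, M) array of log likelihoods.
    """
    M = q.shape[1]

    def neg_objective(theta):
        return -np.sum(q * log_lik_fn(theta)) / M

    return minimize(neg_objective, theta0, method="BFGS").x

rng = np.random.default_rng(1)
N, M = 50, 20
mu_draws = rng.standard_normal((N, M))      # placeholder heterogeneity draws
y = 2.0 + rng.standard_normal(N)            # placeholder outcomes
q = rng.uniform(0.5, 1.5, size=(N, M))      # placeholder weights
q = q / q.mean(axis=1, keepdims=True)       # average to one per individual

def toy_log_lik(theta):
    # Log density of y_n ~ Normal(theta + mu_n^m, 1), up to a constant.
    return -0.5 * (y[:, None] - theta[0] - mu_draws) ** 2

theta_next = m_step(np.zeros(1), q, toy_log_lik)
# For this toy likelihood the maximizer is a weighted mean of y_n - mu_n^m.
closed_form = np.sum(q * (y[:, None] - mu_draws)) / np.sum(q)
```

Because q and the CCPs are fixed inside the M step, the optimizer never has to re-solve the model or re-integrate over the heterogeneity while it searches over θ.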

It is important to note that equation (2.14) is additively separable in the likelihood of the choice ($L_n$) and the likelihood of the unobserved heterogeneity ($f(\mu, \Delta \mid \theta)$). The latter term can be used to update the parameters of the distribution of unobserved variables. The updated distribution parameters are the ML estimates of the mean and variance of the multivariate normal distribution, which simply become the mean and covariance matrix of the sample ($\{\mu_n, \Delta_n\}_{n=1}^{N}$) with weights q. The closed form solution for the updated distribution parameters follows Train (2007). Additionally, the EM Algorithm introduces additive separability into the choice likelihood, $L_n$, which could allow for sequential estimation of the other model parameters (Arcidiacono and Jones 2003). Finally, it is possible to use an alternative version of the EM algorithm that replaces the full maximization in the M step with a single iteration of an optimization procedure. This variant of the EM algorithm is known as the Generalized EM (GEM) algorithm. Using the GEM variant requires more iterations, but can substantially reduce the computation required for each iteration. Full maximization in the M step can be computationally intensive, particularly if the optimization procedure uses numerical gradients.
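The closed form update of the distribution parameters amounts to a weighted sample mean and covariance. The sketch below assumes NumPy and a hypothetical function name. It covers only the µ draws (the signal draws are left out), and the draws and weights are random placeholders.

```python
import numpy as np

def update_distribution(mu, q):
    """Weighted ML estimates of the mean and covariance of a normal.

    mu has shape (N, M, J) (draws); q has shape (N, M) (weights).
    """
    J = mu.shape[-1]
    x = mu.reshape(-1, J)                 # stack all (n, m) draws as rows
    w = q.reshape(-1) / q.sum()           # normalize weights to sum to one
    mean = w @ x                          # weighted mean, shape (J,)
    centered = x - mean
    cov = (centered * w[:, None]).T @ centered   # weighted covariance, (J, J)
    return mean, cov

rng = np.random.default_rng(2)
mu = rng.standard_normal((5, 40, 2))      # placeholder draws
q = rng.uniform(0.5, 1.5, size=(5, 40))   # placeholder weights
q = q / q.mean(axis=1, keepdims=True)     # average to one per individual
mu_bar_next, sigma_next = update_distribution(mu, q)
```

No numerical optimization is needed for these parameters, which is one source of the computational savings from separability.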
