
7 Introducing heterogeneity using an agent-based approach

7.1 Obtaining samples from parameter distributions

In this section, parameter distribution samples are obtained using two different methods: Bayesian sampling and the jack-knife approach. Both methods are elaborated upon below.

7.1.1 Bayesian sampling

The Bayesian approach combines prior information (i.e. what is expected or believed) with the information obtained from the collected data according to the following conditional probability formula (Bolstad, 2007):

$$ g(\theta \mid y) = \frac{f(y \mid \theta)\, g(\theta)}{f(y)} $$

where $\theta$ is the unknown parameter, $g(\theta \mid y)$ is the posterior distribution, $f(y \mid \theta)$ is the sampling density of the data (i.e. the likelihood), $g(\theta)$ is the prior distribution and $f(y)$ is the distribution of the observed data (i.e. the normalizing constant).

For most problems the posterior distribution is difficult or impossible to compute analytically. Generalized linear models, such as the binary logistic regression model used in this research, are one of these cases (MathWorks, 2014). Fortunately, Bayesian estimates of the model parameters can be obtained from their posterior distributions using the Markov Chain Monte Carlo (MCMC) slice algorithm as implemented in the Matlab software. This algorithm generates random samples from the posterior distribution based on the initial value of the sampling sequence (i.e. the initial model coefficients as provided in chapter 6), a prior distribution (assumed to be flat, i.e. $g(\theta) \propto 1$, which means no prior information is known) and the sampling density of the data (i.e. the likelihood, assumed to be the binomial likelihood $\prod_{n} p_n^{y_n} (1-p_n)^{1-y_n}$, where $p_n$ is the probability that the choice behavior of interest (i.e. inertial choice or compromising choice) occurs in choice situation $n$ and $y_n$ indicates whether it was observed).
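Written out for this model, the density the slice algorithm draws from can be sketched as follows (a minimal, illustrative sketch only: the names logposterior, X, y and beta are placeholders rather than the actual implementation used in this research, and the flat prior contributes only a constant that can be omitted from the log-posterior):

```matlab
% Unnormalized log-posterior of the binary logistic model.
% X:    n-by-k matrix of observed attributes (including a column of ones for the constant)
% y:    n-by-1 vector of observed choices (1 = choice behavior of interest, 0 = otherwise)
% beta: 1-by-k row vector of model coefficients
logposterior = @(beta, X, y) ...
    sum(y .* (X * beta') - log(1 + exp(X * beta')));  % Bernoulli log-likelihood
% With the flat prior g(beta) proportional to 1, the log-prior is constant,
% so the log-posterior is proportional to the log-likelihood above.
```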

The MCMC slice algorithm does not generate independent simulated distribution samples. Instead, each simulated sample depends on its immediate predecessor: for each current simulated solution the algorithm evaluates a solution within its neighborhood (the width of which is set to 10 in Matlab by default) and, based on its probability of occurrence under the posterior distribution, adopts this neighbor as the next simulation. In the end, each solution is represented within the simulations according to its frequency of occurrence under the posterior distribution.

As a result of this dependency between subsequent simulations, it may take a while before the effect of the initial values of the sampling sequence disappears and the Markov Chain reaches a stationary state. Therefore, the first distribution samples (350 for the inertia model and 250 for the compromising model; these are called the burn-in rates) are not used. Subsequently, a total of 1000 distribution samples is obtained for the inertia model, while for the compromising model 500 distribution samples are sufficient. In order to obtain (nearly) independent samples, only 1 simulated sample out of every 2000 simulations is retained for the inertia model and only 1 out of every 1500 simulations for the compromising model (these are called the thinning rates). This prevents retaining distribution samples that are close to each other in the Markov Chain and therefore dependent on each other. Appendix C elaborates on how these values are set.
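As an illustration, such samples could be drawn with the slicesample function of Matlab's Statistics Toolbox, assuming this is the implementation of the slice algorithm described above (a minimal sketch under that assumption; beta0, logposterior, X and y are illustrative placeholders and the settings shown are those of the inertia model):

```matlab
% Draw 1000 posterior samples for the inertia model: discard the first 350
% simulations (burn-in), keep 1 out of every 2000 simulations (thinning),
% and use the default neighborhood width of 10.
beta0   = initial_coefficients;            % initial values of the sampling sequence (chapter 6)
samples = slicesample(beta0, 1000, ...
    'logpdf', @(b) logposterior(b, X, y), ...
    'burnin', 350, ...                     % 250 for the compromising model
    'thin',   2000, ...                    % 1500 for the compromising model
    'width',  10);                         % Matlab default neighborhood width
```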

Based on the obtained distribution samples, the posterior distributions are approximated by cumulative distribution functions. These approximate posterior distributions are shown in figure 25. The posterior distributions of some of the coefficients show a wide dispersion, while others do not. This dispersion represents the heterogeneity of the population with respect to these attributes as derived from the dataset used. A small dispersion indicates high homogeneity in the extent to which a specific attribute affects the choices of each individual within the population, while a wide dispersion indicates that individuals weigh that attribute quite differently from each other. Apparently, the population is homogeneous in weighting personality traits, driving years, travel time and familiarity, while it is more heterogeneous in weighting the time of day, ethnicity, gender and education, and in its preference to exhibit a certain choice strategy, as captured by the model constants.
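The cumulative distribution functions themselves can be obtained directly from the retained samples, for instance as in the following sketch (illustrative only; samples denotes the matrix of retained distribution samples, one column per coefficient):

```matlab
% Approximate the posterior of each coefficient by its empirical cumulative
% distribution function, one curve per coefficient (cf. figure 25).
figure; hold on;
for k = 1:size(samples, 2)
    [F, xk] = ecdf(samples(:, k));   % empirical CDF of coefficient k
    stairs(xk, F);
end
xlabel('Value of coefficient'); ylabel('Cumulative probability');
```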

Figure 25: Approximate posterior distributions of the coefficients of a) the inertia model and b) the compromising model (cumulative distributions of the posteriors based on the Bayesian samples; cumulative probability plotted against the value of each coefficient).


7.1.2 Jack-knife approach

The jack-knife approach was already used for model validation earlier in this report (chapter 6). The basic idea is that one observation is left out of the dataset on which the model parameters are estimated using binomial regression analysis. This is repeated until every observation has been left out once. In the end, the set of all estimated parameter vectors together constitutes a sample from the parameter distributions.
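A minimal sketch of this procedure, assuming the binomial regression is carried out with Matlab's glmfit function (X and y are illustrative placeholders for the attribute matrix and the observed binary choices; glmfit adds the constant term itself):

```matlab
% Jack-knife: re-estimate the binary logistic model n times, each time leaving
% one observation out, and collect the estimated coefficient vectors.
n      = size(X, 1);
betaJK = zeros(n, size(X, 2) + 1);        % +1 for the constant added by glmfit
for i = 1:n
    keep    = true(n, 1);
    keep(i) = false;                      % leave observation i out
    betaJK(i, :) = glmfit(X(keep, :), y(keep), 'binomial', 'link', 'logit')';
end
% Together, the rows of betaJK form the jack-knife sample of the parameter distributions.
```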

7.1.3 Jack-knife versus Bayesian sampling

The main difference between the two sampling approaches is that the jack-knife approach does not generate simulated samples, so its sample size is restricted to the number of observations within the dataset, while the Bayesian approach can provide an unlimited number of samples. As a result, the jack-knife sample might not be large enough to be an accurate representation of the underlying parameter distributions, whereas the Bayesian sample can simply be enlarged in order to obtain a representative sample. This makes the Bayesian approach more flexible to use. Nevertheless, both approaches are tested in this report.
