Step three: Implementation and evaluation of Markov Chain models in modelling

CHAPTER 7 CASE STUDY: PHILLIP ISLAND

7.3 A CASE STUDY ON MODELLING THE SPATIO - TEMPORAL MOVEMENT OF

7.3.3 Step three: Implementation and evaluation of Markov Chain models in modelling

level

This step of the case study will use tourist movement data at the macro level collected from Phillip Island to model the spatio-temporal movement of tourists based on the stationary discrete MC theory. The probabilities of movement patterns will be calculated and the model will then be validated.

Model design

As defined in chapter 5, a stochastic movement process is a family of random movement states (X_{t t T})_∈ on a given probability space S and indexed by t belonging to a parameter set T.

The set T is regarded as the time sequence of the process T ={1, 2,...}. The family of random movement states (X_{t t T})_∈ can be represented as a set of attractions that a tourist visited. Table 7.1 shows the codes used for the attractions on Phillip Island.

Table 7.1 The attractions of Phillip Island

A B C D E F G H J OUT

Information

Centre Cape

Woolamai Churchill Island

Koala Conservation

Centre Rhyll

Inlet Cowes Penguin

Parade The

Nobbies Ventnor Outside the Island

The tourist movement on Phillip Island is considered as moving around nine attractions listed in table 7.1. A stationary discrete MC will be used to model the movements of these tourists between each attraction from the moment they enter the park until they have completed their visits.

The states of the chain are the nine attractions visited with an additional absorbing state which is labelled as “OUT” and signals the completion of their daily trip. The maximum number of attractions visited by each tourist on a daily trip is seven. The transition probability of the MC is the conditional probability of moving from one attraction to another. For example, using the rule of conditional probability, the transition probability of going from attraction A to _i attraction A , i_j ≠ from time n to n + 1 where the attractions belong to the above set is given j

By the assumption of stationary, the transition probabilities (7.1) and hence (7.2), will be the same for all n.

A graph tree (see figure 7.14 ) is drawn which summarises all sequences of movements by tourists and frequency of visits at each attraction as well as frequency of movements from one

attraction to the next. From Equation (7.3), the probability of any sequence of movements from the time a tourist enters the park until the time the tourist leaves it can be calculated once the transition probability matrix (see table 7.2) and the initial probabilities (see table 7.3) are obtained. These are estimated using the graph tree. Notice that there are seven destinations and six transitions displayed in the graph.

1 1 2 2 3 3 4 1

The initial probability of visiting an attraction Ai, Pr(Ai(1)), is estimated by counting the number of visits to the attraction as a first destination divided by the total number of visits to all first attractions. For example, in the case of attraction D

Pr(D(1)) = 95 0.21 451=

The estimation of (2) is also straightforward using the data contained in the graph tree. For example, to estimate the conditional probability of going from D to C, Pr(D(n) C(n + 1)) ∩ is calculated first as follows:

(1) Count the number of movements that satisfy the profile D(n) ∩ C(n + 1) for n = 1,2,…,m-1 where m is the maximum number of possible visits. For example, N(D(1) ∩ C(2)) is the number of movements where tourists began their trips at D and then next move to C.

(2) Sum these frequencies, i.e.^N-1

n=1

N(D(n) ∩ C(n + 1))

∑

(3) Repeat Step 1 and 2 for all states in S not including D and sum all these frequencies to obtain the total number of one-step movement patterns.

(4) Divide the number obtained in (2) by that of (3).

For the denominator of (7.3), the above steps are repeated but this time these steps are done for all other attractions to get the required probability.

Figure 7.14 The graph tree

Table 7.2 A transition probability matrix

Table 7.3 Initial probabilities

Pr(A) Pr(B) Pr(C) Pr(D) Pr(E) Pr(F) Pr(G) Pr(H) Pr(J) Initial Probability 0.01 0.06 0.07 0.20 0.03 0.33 0.18 0.10 0.01

The probabilities of movement patterns can be computed using equation (7.3) once the one-step transition probabilities and the distribution of the initial state have been estimated. For example, the probability of the travel pattern DÆFÆ G, i.e. the KCC (D) is the first attraction visited, followed by Cowes (F) with the Penguin Parade (G) being the last attraction is calculated using table 7.2 and table 7.3 as follows:

Pr(DFGOUT) = Pr(D) Pr(F|D) Pr(G|F) Pr(OUT|G) = 0.20 0.48 0.57 0.95 = 0.05

× × ×

In order to be able to identify the significant movement patterns from the data collected, it is essential and logical that the sequences of movements are further categorised by the number of attractions visited during each visit to the park. To do this, the probability of each pattern consisting of n visits, n = 1, 2,…,7 is simply divided by the sum of the probabilities of all patterns consisting of n attractions. For example, for n = 1, the probabilities of the events G, D and C are estimated as 0.168, 0.01 and 0.008 respectively, giving a total probability of 0.186 for a one-attraction pattern. The conditional probability of visiting a single attraction G, D or C is therefore equal to 0.168/0.186 = 0.903; 0.01/0.186 = 0.054 or 0.008/0.186 = 0.043 respectively. Thus, G is the most significant pattern for a one attraction visit to the park. This is summarized in table 7.4. In tables 7.5 to 7.9, the same process for n = 2,…,7 patterns is repeated. The most significant patterns correspond to the ones with the highest probabilities of occurring.

Table 7.4 Probability of movement patterns (n = 1) Pattern G D C Probability 0.903 0.054 0.043

Table 7.5 Probability of movement patterns (n = 2)

Pattern FG HG DG EG BG FD CB CF HD AF Probability 0.59 0.25 0.12 0.02 0.02 0 0 0 0 0

Table 7.6 Probability of movement patterns (n = 3)

Pattern FHG DFG DHG CFG HFG BHG EFG

Table 7.7 Probability of movement patterns (n = 4)

Pattern DFHG CFHG EFHG BFHG DHFG BDFG FGHG

Table 7.8 Probability of movement patterns (n = 5)

Pattern CDFHG BDFHG DFGHG DEFHG CFHFG CDHFG EDFHG BEFHG

Pattern CDFGHG BDEFHG DHCBFG DBCEFG Pattern CBDEFHG Probability 0.67 0.3 0.01 0.01 Probability 0.99

Pattern FBDJHG JEFCHG JBJBJG HGEFGF Pattern CFBDEHG Probability 0 0 0 0 Probability 0.01

Model validation

In this chapter, MC has been applied to model the spatio-temporal movement of tourists and to calculate the probabilities of various movement patterns. However, the key question now is how well are these movements modelled by MC? In other words, how close are the observed frequencies of movement patterns to those which would be expected when equation (3) is used to calculate probabilities of different patterns? In this section, The results were validated using a Chi-Square Goodness of Fit test (Devore 2004).

A data mining approach was adopted by dividing the data set, consisting of 464 records, into two parts: training data and test data. The training data will be used to fit the MC model (that is, to estimate the transition probability matrix P and initial probability vector). The expected values obtained using (4) will then be compared with the frequencies observed in the test data using a Chi- Square Goodness of Fit test. If the computed Chi-Square statistics are not statistically significant, then the Markov model is proved to fit to data.

Four hundred and sixty four records in the data set are divided by a random sampling method called EPSEM (equal probability of selection method sampling) using the STATISTIC package. Seventy percent of the data are selected first as training data, leaving the remainder as test data and then the whole process is repeated and 60% of the data as training data is selected. The objective here is to gauge the effect sampling bias has on the results obtained.

Then a set of observed frequencies O is compared with a set of expected frequencies _i E where i= 1, 2,…, k ranges over the different movement sequences associated with a fixed i

number of visits n using the Chi-Square Goodness of Fit test.

For example, two one-attraction-movement patterns are identified from the training data (70%

of the total data) C and G. their probabilities are calculated using MC model (see table 7.10), then the probability of each pattern is divided by the sum of the probabilities of these two patterns to calculate ˆp . i.e. the ˆp of pattern C is calculated as follows:

2 2 2

Table 7.10 The Chi-squared test for movement pattern n=1 when 70% of data is used for training

Pattern Probability Sum Probability ˆp Observed Chi-Square C 0.006972 0.20823 0.033482 1 0.1296995

G 0.201258 0.966518 20

In table 7.11 and table 7.12 the results are summarised based on the Chi-squared test statistics for different number of visits n and different size of training data. The p-values given in the tables indicate the degree of significance in the results. Note that a p-value of 0.05 or less signifies a significant result, i.e. the model provides a poor fit to data. As can be seen from the tables, in the majority of cases, the MC distributions provide a good fit to data. However, the fit is poor when n is large, leading one to suspect that the stationary assumption breaks down when there is a large number of visits. The more general equation (7.3) in chapter 5.2.1 should be used in such cases. This is to be expected since for longer trips, it is anticipated that the choice of destinations would depend on which stage of the trip the decision has to be made, an assumption which is ignored when one assumes stationarity.

Table 7.11 Chi-squared test statistics when 70% of data is used for training

n=1 n=2 n=3 n=4 n=5 n=6

χ2 = 0.13 χ2 = 3.27 χ2 = 3.39 χ2 = 15.19 χ2 = 4.39 χ2 = 28.67

p-value = 0.72 p-value = 0.20 p-value = 0.07 p-value = 0.004 p-value = 0.11 p-value = 0.000 (Not significant) (Not significant) (Not significant) (Significant) (Not significant) (Significant)

Table 7.12 Chi-squared test statistics when 60% of data is used for training

n=1 n=2 n=3 n=4 n=5

χ2 = 0.37 χ2 = 1.60 χ2 = 6.92 χ2 = 2.32 χ2 = 4.12

p-value = 0.83 p-value = 0.45 p-value = 0.14 p-value = 0.31 p-value = 0.04 (Not significant) (Not significant) (Not significant) (Not significant) (Significant)

7.3.4 Step four: Application of log-linear models in testing significance

In document Modelling the spatial-temporal movement of tourists (Page 119-127)