
3.5 Results


66 Time Choice Data for Public Transport Optimization

Table 3.4: Table of Summary Statistics over the full population of 515 respondents. The top part shows the different behavioral measures during the first phase of the experiment (i.e. without a crowding indicator). The middle part shows the behavioral measures for the second phase of the experiment (i.e. with a crowding indicator). The bottom part shows the distribution of change of the behavioral measures from the first to second phase for each individual.

                                                  Percentiles
Statistic       Mean  St. Dev.      Min.       25%    Median       75%      Max.
ca-1            0.34      0.19      0.00      0.20      0.33      0.46      0.92
ls-1            0.65      0.38      0.00      0.29      0.84      0.95      0.95
mtsc2-1         0.41      0.38      0.00      0.00      0.42      0.79      0.89
sp-1            0.41      0.31      0.00      0.11      0.37      0.68      1.00
rsc-1           0.41      0.31      0.00      0.12      0.40      0.67      1.00
rsca-1          0.59      0.17      0.00      0.50      0.61      0.67      1.00
avgS-1          0.56      0.08      0.30      0.51      0.56      0.62      0.79
maxstr-1        0.45      0.33      0.05      0.16      0.32      0.76      1.00
dr-1            0.52      0.15      0.11      0.42      0.53      0.63      0.79

ca-2            0.33      0.23      0.00      0.12      0.33      0.50      0.91
ls-2            0.83      0.25      0.00      0.84      0.95      0.95      0.95
mtsc2-2         0.55      0.36      0.00      0.05      0.74      0.84      0.89
sp-2            0.59      0.27      0.00      0.42      0.63      0.79      1.00
rsc-2           0.59      0.27      0.00      0.41      0.62      0.80      1.00
rsca-2          0.58      0.16      0.17      0.50      0.56      0.67      1.06
avgS-2          0.60      0.10      0.29      0.54      0.61      0.67      0.83
maxstr-2        0.28      0.24      0.05      0.11      0.21      0.37      1.00
dr-2            0.47      0.13      0.11      0.37      0.47      0.58      0.79
si-2            0.40      0.23      0.00      0.25      0.40      0.55      1.00

Φca            −0.01      0.26     −0.67     −0.18      0.00      0.19      0.75
Φls             0.18      0.36     −0.95      0.00      0.00      0.32      0.95
Φmtsc2          0.14      0.39     −0.89      0.00      0.00      0.32      0.89
Φsp             0.18      0.30     −0.95      0.00      0.16      0.37      0.95
Φrsc            0.18      0.30     −1.00    −0.004      0.16      0.36      1.00
Φrsca          −0.01      0.22     −0.72     −0.11      0.00      0.11      0.78
ΦavgS           0.04      0.10     −0.23     −0.03      0.04      0.11      0.47
Φmaxstr        −0.17      0.29     −0.89     −0.32     −0.11      0.00      0.95
Φdr            −0.05      0.17     −0.58     −0.16     −0.05      0.05      0.58

Table 3.5: Table indicating for which behavioral measures a significant impact of the manipulations (represented by the columns) and the introduction of information in the second phase (represented by the Φ values) was found. For each split of the population based on a manipulation, it is indicated whether there was a significant difference (p < 0.01) and how large the observed difference was (+ indicates an increase of less than 0.1, while ++ indicates a larger increase). The comparison is summarized for the first (1) and second (2) phase and the pair-wise differences between the two phases (Φ).

Information Disruptions Crowding

1 2 Φ 1 2 Φ 1 2 Φ
ca ++ ++ + - -
ls - - ++ +
mtsc2 - -
sp - - - - ++ ++
rsc - - - - ++ ++
rsca ++ + - - -
avgS - - + + - +
maxstr ++ - - -
dr + + - - -
si n.a. n.a. n.a. - n.a. n.a. - n.a.

that they switch more often in case the crowding is low. In the situation where crowding is reactive, it is also observed that respondents are less sensitive to information.

3.5.3 Behavioral Clusters

For the third part of our analysis, we cluster the dataset. Based on a scree plot, we found that four clusters is the best choice for both the first-phase and the second-phase data. We ran the k-means algorithm separately on the data of each phase. The cluster means and cluster sizes are reported in Table 3.7.
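The procedure described above can be illustrated with a minimal sketch. It assumes plain Lloyd-style k-means with a deterministic farthest-point initialization (the text does not state which initialization or software was used) and runs on a handful of hypothetical two-dimensional points rather than on the behavioral measures. The within-cluster sum of squares per value of k is the quantity a scree plot displays; the function names are illustrative only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_farthest(points, k):
    """Deterministic initialization (an assumption, not from the source):
    start at the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen centroid."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (centroids, labels, inertia), where
    inertia is the within-cluster sum of squared distances."""
    centroids = init_farthest(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, so the centroids are stable too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Hypothetical toy data: three tight groups of three points each.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (1.0, 1.0), (0.9, 1.0), (1.0, 0.9),
    (0.0, 1.0), (0.1, 1.0), (0.0, 0.9),
]

# Scree data: inertia for k = 1..4; the "elbow" marks the chosen k.
inertias = [kmeans(points, k)[2] for k in range(1, 5)]
for k, w in zip(range(1, 5), inertias):
    print(k, round(w, 4))
```

On real data one would normally standardize the measures first and try several initializations; both are left out here to keep the sketch short.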

Based on Table 3.7, we distinguish the following types of clusters, which we interpret as follows:

The stoic (st): The stoic has a very short period of experimentation at the beginning of the experiment and then sticks with a single choice for the remainder of the rounds. The mean of the stoic cluster in the first and second phase of the experiment is very similar. The stoics are fewer in number during the second phase of the experiment (43 against 130), which suggests that the availability of information leads to more variation in the choices made. The main reason for stoics to switch seems to be a disruption, as their dr value is a lot higher than their ca or rsc values.

Table 3.6: Transitions between clusters comparing phase 1 against phase 2

                     Phase 2
Phase 1       st      du      ms      hs
st          0.25    0.31    0.27    0.17
du          0.05    0.31    0.42    0.22
ms          0.02    0.15    0.43    0.41
hs          0.02    0.07    0.45    0.46

Table 3.7: Results of the three clustering runs. For the first and second phase, four clusters were chosen. The centroid values for each cluster are presented, as well as the number of observations within each cluster and the fraction this comprises of the entire population.

                  Clusters Phase 1                Clusters Phase 2
                st      du      ms      hs      st      du      ms      hs
ca            0.19    0.29    0.41    0.44    0.17    0.36    0.48    0.18
ls            0.04    0.78    0.83    0.93    0.06    0.85    0.89    0.94
mtsc2         0.01    0.06    0.65    0.82    0.01    0.08    0.77    0.72
sp            0.04    0.28    0.44    0.80    0.04    0.41    0.58    0.86
rsc           0.04    0.29    0.45    0.80    0.05    0.41    0.57    0.87
rsca          0.56    0.61    0.65    0.56    0.65    0.59    0.61    0.52
avgS          0.58    0.56    0.54    0.56    0.60    0.59    0.57    0.65
maxstr        0.94    0.47    0.30    0.14    0.92    0.36    0.23    0.12
dr            0.64    0.57    0.51    0.38    0.65    0.53    0.48    0.38
Size           130     110     126     149      43     105     202     165
Fraction      0.25    0.21    0.25    0.29    0.08    0.20    0.40    0.32

The dualist (du): The dualist has a short period of experimentation at the beginning of the experiment and then keeps switching between two alternative choices for the remainder of the experiment. As such, a dualist has a very low value for mtsc2, but a high value for the last switch ls. The number of dualists is comparable during the first and the second phase of the experiment. During the second phase of the experiment, the dualists are more likely to switch than during the first phase of the experiment.

The moderate switcher (ms): The final two types, the moderate and the heavy switcher, are similar in that they consider more than two alternative choices for a relatively long time (more than 75% of the rounds). Their main difference is how often they switch: during the second phase, the moderate switcher switches in 58% of the cases on average, against 86% for the heavy switcher (sp in Table 3.7). During the second phase of the experiment, the moderate switcher also has a higher crowd avoidance than the heavy switcher.

The heavy switcher (hs): The heavy switcher keeps switching between more than two alternatives for a long time. Compared to the moderate switcher, the reactive switching coefficient of the heavy switcher is larger, and thus the satisfaction score seems to be a good predictor of additional switches. It is also remarkable that during the second phase the crowd avoidance of the heavy switcher is a lot lower than that of the moderate switcher (0.18 against 0.48), while the two have similar levels during the first phase (0.44 against 0.41).

It is worthwhile to consider whether these different cluster types appear proportionally when we split the dataset according to our experimental manipulations. We have computed these appearances for both the first and second phase and have applied an exact binomial test to see whether the observed distributions are significantly different from a random distribution. The results are reported in Table 3.8.

The following observations are significant at the p < 0.01 level.

• During the first phase of the experiment, the occurrence of large disruptions leads to fewer stoics.

• During the second phase of the experiment, accurate information leads to a significantly larger number of heavy switchers, while inaccurate information leads to a significantly larger number of moderate switchers.

• During the second phase, reactive crowding leads to more heavy switchers. This suggests that respondents learn that in the setting of reactive crowding it pays off to switch more often.
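The exact binomial test mentioned above can be sketched as follows. The text does not say whether the test was one-sided or two-sided, nor which null proportion was used; this sketch assumes a one-sided lower-tail test against an even split (p = 0.5), and the function names are illustrative. The two example calls use counts taken from Table 3.8.

```python
from math import comb


def binom_lower_tail(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_split_test(count_a, count_b):
    """One-sided exact binomial p-value for a two-way split.

    Assumption (not stated in the source): under the null hypothesis each
    respondent in a cluster falls into either treatment group with
    probability 0.5, and the p-value is the probability of a smaller
    group at least as small as the one observed.
    """
    n = count_a + count_b
    return binom_lower_tail(min(count_a, count_b), n)


# Stoics in phase 1, split by the disruption manipulation (49 vs. 81).
print(exact_split_test(49, 81))

# Heavy switchers in phase 2, split by the information manipulation (141 vs. 24).
print(exact_split_test(141, 24))
```

A two-sided variant would double the tail probability for a symmetric null; which convention underlies the reported p-values should be checked against the original analysis.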


Table 3.8: Distribution of Treatments over Clusters for both phases. The p-value indicates the probability that the distribution of the two treatments within a behavioral group occurs by chance. Significant results at the p < 0.01 level are marked with an asterisk.

               Information                Disruptions                Reactive Cap.
Cluster      0     1  p-value           0     1  p-value           0     1  p-value
Phase 1
  st        73    57  0.09             49    81  0.003*           78    52  0.014
  du        60    50  0.2              58    52  0.31             53    57  0.39
  ms        64    62  0.46             75    51  0.02             56    70  0.123
  hs        68    81  0.16             73    76  0.43             63    86  0.035
Phase 2
  st        18    25  0.18             15    28  0.033            28    15  0.033
  du        45    60  0.08             53    52  0.5              64    41  0.016
  ms        61   141  9 · 10^−9*      103    99  0.42            117    85  0.014
  hs       141    24  1.1 · 10^−21*    84    81  0.45             41   124  3.4 · 10^−11*

Finally, let us consider to what extent the respondents move between clusters from the first to the second phase. For this we produce a Markov-style transition matrix: we count the number of respondents who moved from one cluster to another between the first and second phase and divide it by the total number of respondents belonging to the originating cluster in the first phase. This yields the transition matrix presented in Table 3.6.
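The counting and row normalization described above can be sketched as follows. The label lists are hypothetical toy data, not the respondents of the experiment, and the function name is illustrative.

```python
from collections import Counter

CLUSTERS = ["st", "du", "ms", "hs"]


def transition_matrix(phase1, phase2, states=CLUSTERS):
    """Row-normalized transition frequencies between cluster labels.

    phase1[i] and phase2[i] are the cluster labels of respondent i in the
    first and second phase. Entry matrix[a][b] is the share of respondents
    in cluster a during phase 1 who are in cluster b during phase 2.
    """
    counts = Counter(zip(phase1, phase2))
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in states
        }
    return matrix


# Hypothetical labels for eight respondents.
phase1 = ["st", "st", "st", "st", "du", "du", "ms", "hs"]
phase2 = ["st", "du", "ms", "ms", "du", "ms", "hs", "hs"]

matrix = transition_matrix(phase1, phase2)
for a in CLUSTERS:
    print(a, [round(matrix[a][b], 2) for b in CLUSTERS])
```

Because each row is divided by the size of the originating cluster, every non-empty row sums to one, which is also the case (up to rounding) for the rows of Table 3.6.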

First, it can be observed that the stoics in the second phase of the experiment are likely to have been stoics during the first phase as well. Furthermore, when stoics start switching more often during the second phase, they are more likely to become dualists or moderate switchers than heavy switchers. The behavioral measures that define these types are simple enough to be derived from the smart card data of commuters, which makes the findings of practical importance for public transport operators.

Dualists during the first phase are most likely to become moderate switchers or to remain dualists when crowding indicators are introduced in the second phase. Moderate switchers and heavy switchers change roles in almost equal proportions in the second phase, and they are unlikely to become dualists or stoics.
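The observation that second-phase stoics were mostly stoics in the first phase concerns the reverse direction of Table 3.6, whose rows are normalized by first-phase cluster size. A small sketch of how the rounded values in Tables 3.6 and 3.7 can be combined to check it: multiplying each row by the first-phase cluster size gives approximate flows, and dividing a flow by the implied second-phase cluster size gives the share of that cluster coming from each origin. Since the inputs are rounded to two decimals, the results are approximations only.

```python
STATES = ["st", "du", "ms", "hs"]

# Phase 1 cluster sizes (Table 3.7) and transition shares (Table 3.6);
# rows are phase 1 clusters, columns are phase 2 clusters.
size_phase1 = {"st": 130, "du": 110, "ms": 126, "hs": 149}
transitions = {
    "st": [0.25, 0.31, 0.27, 0.17],
    "du": [0.05, 0.31, 0.42, 0.22],
    "ms": [0.02, 0.15, 0.43, 0.41],
    "hs": [0.02, 0.07, 0.45, 0.46],
}

# Approximate number of respondents moving from cluster a to cluster b.
flows = {
    a: {b: size_phase1[a] * transitions[a][j] for j, b in enumerate(STATES)}
    for a in STATES
}

# Phase 2 cluster sizes implied by the flows (column sums).
implied_phase2 = {b: sum(flows[a][b] for a in STATES) for b in STATES}

# Share of each phase 2 cluster that originates from each phase 1 cluster.
origin_share = {
    b: {a: flows[a][b] / implied_phase2[b] for a in STATES} for b in STATES
}

print({b: round(v, 1) for b, v in implied_phase2.items()})
print(round(origin_share["st"]["st"], 2))
```

The implied second-phase sizes can be compared with the reported sizes in Table 3.7 (43, 105, 202, 165) as a consistency check on the two tables.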