Temporal Variations in Activity Network Using
Smart Card Data
December 25, 2019
Abstract
This study explores temporal variations in activity networks for four million pas-sengers, differentiated as workers and non-workers, using public transport based on a large-scale smart card dataset generated over 105 days in Beijing. We aim to capture their day-to-day transition and cumulative temporal expansion in activity network using transit over days, weeks, and months. Particularly, workers and non-workers are automatically identified based on their different daily routines, whose activity net-works are characterized by six features concerning space coverage, distance coverage, and frequency coverage in two ways, namely, on a per-day transition and with an ac-cumulation of days. The transition features of the networks are statistically analyzed and compared by time, while how the expansion features evolve with time are mod-eled. Results show that, on weekdays, workers are more likely to travel longer (have larger distance coverage), but cover less area (have smaller space coverage) than non-workers. While opposite patterns occur on weekends. Traveling in the ‘North-South’ direction is weakly correlated with traveling in the ‘East-West’ direction. Workers on weekdays, as well as non-workers on weekends, make longer ‘North-South’ trips.
Manhattan distance, trip count, and perimeter present a∩shape in their probability
density functions fit by the exponential distribution. The distance coverage expands faster than that of space coverage. Most passengers increase coverage of space and dis-tance when time expands (obviously no one decreases coverage over time, but some don’t change). The research enables findings on temporal load-balancing, long-term cumulative expansion in travel demands of workers and non-workers, re-balancing the distribution of existing workplace and residential location opportunities, and con-structing transit-oriented developments with mixed functions over time.
keyword: Public transport; activity space; activity network; temporal variation; transition and expansion; smart card data
1
Introduction
1
Each traveler covers a network of individual routes that serve their daily activities over
2
time, which we refer to as the ‘activity network’. For a transit user, the activity network
3
is formed based on stations visited along the transit network. This research uses a new
4
dataset to examine the temporal variations in these networks.
5
Previous studies have investigated the temporal variations of activity network, mostly
6
considering the day-to-day transition with small-scale datasets, generated by limited
sam-7
ples and/or over a short period of time (Dharmowijoyoa et al.,2017,Kang and Scott,2010,
8
Parthasarathi et al.,2015,Xianyu et al.,2017). These smaller scale studies may have biases
9
in a broader field abundant with big data, e.g., smart card data (Kieu et al., 2015, Ma
10
et al., 2013, Zhao et al.,2019), and are not sufficiently representative to explain long-term
11
variation patterns in activity network.
12
In this paper, we use smart card data generated by 3,922,131 passengers using public
13
transit over 105 days in Beijing, to empirically explore the temporal variations of
indi-14
vidual activity networks. We differentiate between workers vs. non-workers. The study
15
allows us to uncover how and to what extent passengers vary their activity networks
us-16
ing transit, over days, weeks, and months (Kim et al., 2018, Kumar and Levinson, 1995,
17
Parthasarathi et al., 2015, Susilo and Axhausen, 2014, Susilo and Kitamura, 2005, Zhou
18
and Long,2014). Accumulated expansion of activity networks, considering their growth
19
over time, is also examined to see if any substantial patterns exist, in addition to the
day-20
to-day transition, considering the daily changes. The research enables findings on
tem-21
poral load-balancing, long-term cumulative expansion in travel demands of workers and
22
non-workers, re-balancing the distribution of existing workplace and residential location
23
opportunities, and constructing transit-oriented developments with mixed functions over
24
time (J¨arv et al.,2014,Rai et al.,2007).
25
To be more specific, three major topics are investigated in this paper:
• Workers and non-workers are automatically identified based on their different daily
27
routines, and further compared to demonstrate how they differ in changing and
28
expanding activity networks.
29
• Features concerning space coverage, distance coverage and frequency coverage, are
30
extracted on a per-day basis for a passenger to describe daily transition patterns
31
in activity network. They are statistically analyzed over days, weeks, and months,
32
based on the hypotheses that every feature varies by time, and that workers and
33
non-workers differ in a systematic way.
34
• The above features are re-extracted with an accumulation of days for a passenger
35
to describe the temporal expansion patterns in activity networks. How the features
36
evolve with time are modeled to interpret how activity networks of workers and
37
non-workers expand over days, weeks, and months.
38
For the remainder of this paper, the related work on activity network analysis is
sum-39
marized in Section 2, with data and methods elaborated in Sections 3 and 4. Statistical
40
analysis is conducted in Section5, followed by discussions and conclusion given in
Sec-41
tion6.
42
2
Literature Review
43
2.1
Characterization of Activity Networks
44
Activity networks are generally characterized with three indices, namely, space coverage,
45
distance coverage, and frequency coverage.
46
Space coverage is defined by specific geometry shapes, such as ellipses (Rai et al.,2007,
47
Sch ¨onfelder and Axhausen,2003), convex hulls (Fan and Khattak,2008, Lee et al.,2016),
48
or space-time prism (Miller, 1991, Newsome et al., 1998). The attributes of these shapes,
49
such as area and perimeter for ellipses or convex hulls, or activity duration and travel
time ratio for space-time prism, were extracted as features describing activity space.
Con-51
vex hulls are more spatially approximate to the actual space coverage of individual daily
52
activity than ellipses (Chen and Dobra, 2017) by drawing a minimum boundary around
53
visited locations, but remain vulnerable to outliers (Lee et al.,2016). Space-time prisms
al-54
low modeling the continuous activity space formed by individual behavioral possibilities,
55
restricted by spatial and temporal constraints (H¨agerstrand, 1970, Kitamura et al., 2006,
56
Lenntorp,1976,Miller,1991). But they may trigger uncertainty in dealing with data with
57
limited geographical logs and experience exponential computational complexity when
58
dealing with a massive amount of data (Leung et al.,2016).
59
Coverage of the activity network has normally been quantified using travel distance
60
directly (Duncan et al., 2009, Wiehe et al., 2008, Zhou and Long, 2014). J¨arv et al. (2014)
61
measured the frequency coverage, specifically for transit users, as transit trip counts, to
62
explore its variability over space and time.
63
2.2
Temporal Variations in Activity Network
64
Many studies have explored temporal variations daily, weekly, or monthly, from the
per-65
spective of space coverage, distance coverage, and frequency coverage of individual
ac-66
tivity patterns.
67
Sch ¨onfelder and Axhausen (2004) investigated the variability of activity networks in
68
space coverage using the travel diaries generated by observations in six European cities,
69
and confirmed the existence of temporal variability in activity network. J¨arv et al. (2014)
70
also found a clear monthly variation in daily space coverage, using phone records
gener-71
ated by 1,310 subjects in 12 consecutive months. The effectiveness of sequential data was
72
validated bySch ¨onfelder and Axhausen(2002) in revealing travelers’ time-variant spatial
73
variation in a multi-centred urban structure, with the daily travel routines of 317 subjects
74
extracted over a six-week period.
Wiehe et al. (2008) investigated distance coverage variability among 15 female
ado-76
lescents by time-of-day and day-of-week, who were requested to carry GPS-enabled cell
77
phones for one week. Significant different distributions in distance coverage were found
78
with the presence of their parents or not.
79
Raux et al. (2016) employed a Sequence Alignment Method (SAM), introduced by
80
Wilson (1998) to compare frequency coverage of activity travel patterns considering the
81
embedded sequential information, and found a marginal systematic variability of
intra-82
personal daily trip numbers, based on the trip count observations of 707 individuals over
83
seven days.
84
Moiseeva et al.(2014) studied the trip and location counts of 27 individuals over eight
85
weeks, which further demonstrated the time-variant frequency coverage, while more
se-86
vere variations appeared when inter-personal variability was considered.
87
3
Data Collection
88
The smart card transaction records used in this paper were collected by automated fare
89
collection (AFC) systems in Beijing in 2015, when passengers swiped their smart cards
90
to board or alight 1. At the time, the Beijing public transit network contained 41,970 bus
91
stops and 325 subway stations connected by 763 bus routes and 16 subway lines. Fig.
92
1 displays the Beijing subway system. The dataset contains 105 days of records in July,
93
August, September, and November, after excluding traditional Chinese holidays.
94
Invalid transaction records are filtered out, such as replicated records, records
gener-95
ated by transport staff, or records whose entry time equals exit time. This cleaning process
96
dropped 3.67% of the total, leaving 769,413,168 valid records.
97
Note that raw subway data fully report passengers’ spatio-temporal information on
98
boardings and alightings, while raw bus data failed to store boarding time of bus riders,
99
Figure 1: Beijing Subway Map 2015 (Source: https://map.bjsubway.com/)
so need to be inferred based on the median alighting time generated at the same stops.
100
Details can be found inZhao et al.(2019).
101
A trip chain is defined as a series of consecutive trips made by a traveler in short time
102
gaps and close spatial distances on a daily basis to accomplish a single type of activities
103
(Kieu et al., 2015, Ma et al., 2013). Ma et al. (2013) set the thresholds of the above time
104
gap and spatial distance as 30 min and 1km, respectively, to generate a trip chain, based
105
on which, trip chains are constructed for every passenger based on their travel itineraries.
106
Travelers who made at least one trip chain by transit in five different days across every
107
month are selected to track their spatial patterns of daily activities; thus infrequent
travel-108
ers are excluded.
109
The whole data processing is implemented based on a distributed Hadoop platform
110
2.7.3, built on 8 PCs, with each having a CentOS system (1-CPU with 8 cores, 2.1 GHz
with Quad-Core, and 8GB main memory). In total, 741,163,566 trip chains generated by
112
3,922,131 cards are extracted during the study period. Each trip chain contains attributes
113
like smart card IDs, entry/exit bus stops or subway stations IDs and the coordinates, and
114
the corresponding timestamps.
115
4
Methodology
116
4.1
Identification of Workers and Non-workers
117
Workers and non-workers are separated in this study, as the patterns of their activity
net-118
work features are expected to be distinct. Previous studies have used a series of rules for
119
smart card data (Huang et al.,2019,2018,Long and Thill,2015,Ma et al.,2017,Wang and
120
Chai,2009,Wang and Xu,2010,Zhou and Long,2014,Zou et al.,2018), which we employ
121
here and summarize as follows2,
122
1 For each traveler, visited stations are identified, at which the activity duration time
123
between two consecutive trip chains, determined by the arrival time of the former
124
trip chain and the departure time of the next trip chain, is longer than six hours. This
125
temporal benchmark is set based on the statistical annual report on average
work-126
ing hours in Beijing (Huang et al., 2019). All potential workplaces of the traveler
127
compose a set of workplaces.
128
2 The visited station that has the highest frequency in the set built byRule 1, is labeled
129
as home, which, comparatively, should have a higher visiting frequency than the
130
workplace (Zhou and Long,2014).
131
3 The next most visited station, (i.e. excluding home as defined by Rule 2), visited
132
more than three days a week, is stamped as the workplace (Long and Thill, 2015).
133
2The identification of workers and non-workers is fulfilled based on the Hadoop platform 2.7.3 and Java 1.8.
The corresponding week is denoted as a working week in this paper. An implicit
134
assumption given here is that an individual who earns a living conducts work trips
135
to the same station at least three days a week. This obviously undercounts some
136
kinds of workers with jobs at variable locations (for instance construction and
cer-137
tain kinds of service and sales workers who report to varying field locations) and
138
part-time workers, and may identify some non-workers who visit the same location
139
regularly (for instance volunteers, carers of relatives, and students) as workers.
140
4 When the working weeks, identified byRule 3, exceed five weeks, one third of the
141
total, which is the 95th percentile ratio of working weeks among all smart card
hold-142
ers, we classify the traveler as a worker; otherwise, as a non-worker. Again this
143
undercounts certain kinds of workers and misidentifies some non-workers. We
an-144
ticipate this classification error is small.
145
Accordingly, 1,820,344 workers (46.41%) and 2,101,787 non-workers are detected.
146
4.2
Feature Extraction
147
Features in terms of space, distance, and frequency coverage are extracted to describe how
148
large, how long, and how frequent the activity network is for each passenger. The features
149
are measured from the perspectives of day-to-day transition and an accumulation of days,
150
which are named as ‘transition features’ and ‘expansion features’, respectively, to model
151
transition and cumulative expansion patterns in activity network for transit users.
152
The transition-based space coverage of an activity network is approximated as a
con-153
vex hull drawn along the boundaries of a daily-updated location set, which contains the
154
bus stops or subway stations visited by individualion dayj (the location set is zeroed on
155
a daily-basis.). It is characterized by four convex-hull-based features, area (Ai,j), perimeter
156
(Pi,j), vertical (Dv,i,j) and horizontal (Dh,i,j) distances of an associated convex hull (Ωi,j),
157
the latter two of which refer to the maximum distance coverage in the ‘North-South’ and
‘East-West’ directions, respectively. To characterize the cumulative expansion-based space
159
coverage, instead, the convex hull (Ωi,P
j) employs all the visited bus stops or subway
sta-160
tions by day j (the location set is NOT zeroed along the calculation, so is growing over
161
time). Its area (Ai,P
j), perimeter (Pi,P
j), vertical (Dv,i,P
j) and horizontal (Dh,i,P
j)
dis-162
tances are measured as the cumulative expansion-based features.
163
Distance coverage is quantified as the total travel distance on transit network for a
164
passenger. As the road and subway networks in Beijing, see Figure 1, are largely grids,
165
the Manhattan distance (Dm,i,j) is used to approximate the distance coverage for
passen-166
ger i on day j by summing up the Manhattan distances of all the trips in his network.
167
Frequency coverage (Ni,j) gives the number of trips generated by passengerion dayj.
168
Worker and non-worker examples are selected, shown in Figs. 2and 3, respectively,
169
to illustrate how space coverage, as well as distance coverage described by network
dis-170
tance vs. Manhattan distance, experience a day-to-day transition in seven days of a week
171
(August 10∼August 16), and illustrate how they expand in the corresponding week and
172
month in the year 2015. Only unique route choices are visualized to avoid visual clutters.
173
The effectiveness of using Manhattan distance to represent the network distance is also
174
validated, for the case of Beijing, that the relative differences between those two are about
175
0.42% for the worker and 0.97% for the non-worker in the month.
Space Coverage Network Manhattan distance distance (a) Monday Space Coverage Network Manhattan distance distance (b) Tuesday Space Coverage Network Manhattan distance distance (c) Wednesday Space Coverage Network Manhattan distance distance (d) Thursday Space Coverage Network Manhattan distance distance (e) Friday Space Coverage Network Manhattan distance distance (f) Saturday Space Coverage Network Manhattan distance distance (g) Sunday Space Coverage Network Manhattan distance distance (h) The Week Space Coverage Network Manhattan distance distance
(i) The Month
Figure 2: An Example Illustrating the Activity network of A Worker. (Red star refers to the home location. Space coverage, network distances and Manhattan distances are distinguished by blue, red, and green)
Space Coverage Network Manhattan distance distance (a) Monday Space Coverage Network Manhattan distance distance (b) Tuesday Space Coverage Network Manhattan distance distance (c) Wednesday Space Coverage Network Manhattan distance distance (d) Thursday Space Coverage Network Manhattan distance distance (e) Friday Space Coverage Network Manhattan distance distance (f) Saturday Space Coverage Network Manhattan distance distance (g) Sunday Space Coverage Network Manhattan distance distance (h) The Week Space Coverage Network Manhattan distance distance
(i) The Month
Figure 3: An Example Illustrating the Activity Network of A Non-worker. (Red star refers to the home location. Space coverage, network distances and Manhattan distances are distinguished by blue, red, and green)
5
Results
177
5.1
Description of Travel Features
178
Descriptive statistics of the travel features describing transition and cumulative statistics
179
generated up to the last day of the research period (November 30, 2015) are summarized
180
in Table1.
181
For all smart card holders, it shows that the daily average trip count is 1.81 with an area
182
coverage of8.78km2 and a perimeter of25.13km. The mean value of Manhattan distance
183
is21.99kmper day of trips. As to cumulative expansion, trip count increases up to 171 in
184
15 weeks with an average accumulated area covering 290.30km2 and having a perimeter
185
of71.15km. The aggregated Manhattan distance is1036.91kmover the four months.
186
Comparatively, workers possess higher average values in all features, especially in
187
Manhattan distance, than non-workers, which illustrates that workers travel more and
188
wider than non-workers.
189
Notably, for workers, their average vertical distance is higher than the horizontal
dis-190
tance in either daily transition or final cumulative expansion, indicating a larger coverage
191
in the ‘North-South’ direction, where more developed workplaces can be found, such as
192
the Chinese Silicon ZhongGuanCun, the mega biopharmaceutical base XiErQi, the giant
193
residence districtsTianTongYuanandYiZhuang.
194
The Pearson correlations among the travel features describing transition and
cumula-195
tive expansion patterns are examined, with the results shown in Table2.
196
A finding that area-related space coverage presents a weak correlation with distance
197
coverage is consistent with the observation that many workers (46.4%) travel only
be-198
tween home and work with limited area coverage per day in Beijing, indicating that a
199
longer travel distance is not necessarily associated with a wider coverage area of activities
200
in transition.
201
Manhattan distance (distance coverage) is highly correlated to other distance-related
space coverage features, such as, perimeter, vertical and horizontal distances, in terms
203
of daily transition, but not considering the expansion. It is probably because the
ever-204
increasing frequency coverage, which greatly accelerates the expansion of distance
cov-205
erage based on their mutual high correlation (0.82), trigger the extremely en-even
devel-206
opment of space coverage and distance coverage in expansion and transition in a long
207
term.
208
The vertical and horizontal distances are weakly correlated with each other, implying
209
that no obvious connection exists between traveling in the ‘North-South’ and the
‘East-210
West’ directions in Beijing. Except this weak correlation, the four features describing space
211
coverage exhibit higher correlations with each other in expansion than in transition due
212
to their ever-increasing strong influences with the accumulations of days.
Table 1: Descriptive Statistics of Activity Network Features Regarding to Transitions and Expansions
Statistics Manhattan Distance Trip Count Area Perimeter Vertical Distance Horizontal Distance
T E T E T E T E T E T E All Cardholders Mean 21.99 1036.91 1.81 171.56 8.78 290.30 25.13 71.15 7.98 22.79 7.57 22.25 Std Dev 17.12 1079.03 0.81 122.24 25.89 219.73 16.32 25.16 7.13 10.62 6.73 9.11 Min 0.00 0.00 0.00 16.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 50% 18.26 617.00 2.00 130.00 0.15 237.17 22.62 69.11 6.03 21.77 5.72 21.53 75% 31.03 1431.41 2.00 258.00 4.23 385.40 34.52 87.26 11.69 29.42 11.13 28.66 Max 672.09 13776.74 22.00 1480.00 1292.50 2692.35 166.38 206.49 73.86 74.76 58.71 59.22 Workers Mean 25.14 1715.15 1.91 261.41 9.07 312.39 27.32 73.95 8.77 23.94 8.11 22.86 Std Dev 17.10 1202.30 0.77 113.46 26.03 218.00 15.66 24.12 7.18 10.31 6.77 8.90 Min 0.00 0.00 0.00 18.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 50% 22.02 1441.99 2.00 264.00 0.16 262.98 25.15 72.25 7.18 23.14 6.59 22.44 75% 34.30 2348.66 2.00 348.00 4.78 411.06 36.41 89.32 12.81 30.30 11.90 29.03 Max 672.09 13776.74 22.00 1480.00 1164.60 2540.54 166.38 196.12 73.70 73.92 58.64 59.22 Non-workers Mean 15.52 449.09 1.61 93.69 8.17 271.15 20.65 68.73 6.34 21.79 6.46 21.71 Std Dev 15.22 419.98 0.85 60.63 25.58 219.42 16.72 25.79 6.74 10.79 6.51 9.26 Min 0.00 0.00 0.00 16.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00 50% 10.65 333.28 1.00 80.00 0.12 214.08 16.25 66.16 4.16 20.56 4.38 20.79 75% 21.43 563.02 2.00 118.00 3.27 360.09 29.33 85.15 8.82 28.32 9.29 28.07 Max 307.00 11762.96 22.00 1362.00 1292.50 2692.35 162.77 206.49 73.86 74.76 58.71 59.21
‘T’ and ‘E’ are the abbreviations for transition and cumulative expansion, respectively.
Table 2: Correlations of Travel Features Regarding to Transitions and Expansions
Manhattan Distance Trip Count Area Perimeter Vertical Distance Horizontal Distance
T E T E T E T E T E T E Manhattan Distance T 1.00 - - - -E 0.40 1.00 - - - -Trip Count T 0.53 0.04 1.00 - - - -E 0.04 0.82 0.04 1.00 - - - -Area T 0.42 0.21 0.45 0.04 1.00 - - - -E 0.19 0.31 0.04 0.25 0.19 1.00 - - - -Perimeter T 0.87 0.16 0.37 0.04 0.49 0.19 1.00 - - - - -E 0.21 0.33 0.04 0.23 0.19 0.92 0.19 1.00 - - - -Vertical Distance T 0.67 0.12 0.25 0.05 0.37 0.12 0.74 0.12 1.00 - - -E 0.22 0.29 0.05 0.20 0.11 0.78 0.11 0.84 0.40 1.00 - -Horizontal Distance T 0.63 0.17 0.28 0.02 0.37 0.17 0.73 0.17 0.12 0.13 1.00 -E 0.23 0.24 0.02 0.17 0.16 0.72 0.16 0.78 0.14 0.35 0.33 1.00
Values above0.6 are in bold.
‘T’ and ‘E’ are the abbreviations for transition and cumulative expansion, respectively.
5.2
Temporal Transitions in Activity Network
214
Multivariate ordinary least squares regressions (OLS) are conducted for the six features
215
describing transition patterns for both workers and non-workers under the temporal
vari-216
able, with an aim to explore temporal transition patterns of activity networks. Each
regres-217
sion model considers every feature as a dependent variable, and the temporal element as
218
an independent variable. Sunday serves as the reference group among the days, and
219
November serves as the reference group among the months. Altogether, 2 (workers or
220
non-workers)×6 (travel features) = 12 groups of OLS are conducted, with specific results
221
illustrated in Table3, where all coefficients are significant at the 0.001 level.
222
Care needs to be taken that huge cardinality of data can make insignificant variables
223
appear ‘statistically significant’ in practice (Ermagun and Levinson,2017), thus no light is
224
further thrown on significance analysis in this paper. Yet, R2, although small, is still
sig-225
nificantly different from zero, and possesses an explanatory power to explain how, and to
226
which extent, the independent variables impact dependent variables by their coefficients
227
(Huberty,1994,Neter et al.,1996).
228
Day-of-week Transition As shown in Table3, weekdays have positive effects on all the
229
other features, except area, compared to Sunday. This is not surprising, as, on one hand,
230
workers travel more and generally longer during weekdays, for work-dominant
regu-231
lar routines, many of which are two-stop-only trips, home and workplace, so cover less
232
area (the width of the activity space is about 0); while, on the other hand, workers are
233
expected to travel shorter and less often during weekends, but visit multiple stops,
possi-234
bility around home, like grocery stores, restaurants, or gyms, which expands the area of
235
activity network.
236
The largest coverage of area happens on Friday, for workers, along with a high
exten-237
sion of other features as well, due to diverse out-of-work recreational activities or
gather-238
ings taking place, to celebrate the upcoming weekends. The longest travel distance arises
Table 3: Regressions of Transition Features on Temporal Elements
Independent Dependent Variables
Variables Manhattan
Distance
Trip
Count Area Perimeter
Vertical Distance Horizontal Distance Workers Constant 22.93 1.87 9.74 26.47 8.51 7.85 Monday 2.05 0.05 -0.88 0.65 0.15 0.26 Tuesday 2.20 0.08 -0.77 0.49 0.10 0.21 Wednesday 2.23 0.06 -0.64 0.66 0.17 0.24 Thursday 2.17 0.08 -0.70 0.51 0.12 0.20 Friday 2.09 0.09 1.02 1.09 0.29 0.39 Saturday -0.22 -0.01 0.64 0.09 0.04 0.02 July 0.81 -0.08 -1.28 0.53 0.23 0.08 August 0.20 0.02 0.03 0.07 0.03 0.01 September 0.59 -0.03 -0.25 0.52 0.22 0.09 Adj. R2 0.003 0.004 0.001 0.001 0.000 0.000 Non-workers Constant 15.56 1.65 8.68 21.22 7.85 6.63 Monday -0.62 -0.02 -0.31 -1.18 -0.26 -0.31 Tuesday -1.24 -0.02 -0.32 -2.26 -0.21 -0.63 Wednesday -0.60 -0.02 -0.12 -1.42 -0.24 -0.39 Thursday -1.13 -0.01 -0.23 -2.08 -0.20 -0.60 Friday -0.81 -0.01 0.04 -0.90 0.39 -0.24 Saturday 0.36 0.05 0.58 0.02 0.02 0.02 July 2.65 -0.11 -0.79 2.72 0.08 0.72 August -0.73 -0.02 -0.80 -0.94 0.01 -0.27 September 1.28 -0.03 -0.36 1.56 0.09 0.43 Adj. R2 0.007 0.003 0.000 0.009 0.000 0.004
All coefficients are significant at the 0.001 level.
on Wednesday, when mid-week breaks for leisure activities occur, as well as on-sale events
240
or membership benefits granted by shopping malls, that might induce workers to shop.
241
For non-workers, weekends have positive effects on Manhattan distance, trip count,
242
and area, during which longer trips are generated more frequently than workers to
con-243
duct leisure activities with a wider coverage of areas. Yet, on weekdays, they experience
an area diminution due to a substantial reduction of trip distance and frequency in the first
245
four weekdays. This phenomenon occurs because non-workers, who possess more
flexi-246
bility in travel choices, don’t have to sacrifice their comfort by competing with workers in
247
a crowded and congested transport environment on weekdays.
248
Monthly Transition Compared to November, the coefficients of Manhattan distance are
249
positive in July and September, indicating a relatively longer travel distance, no matter the
250
employment status. Weather changes, e.g., comfortable temperatures in July and
Septem-251
ber (16◦C ∼ 25◦C), high temperatures in August (34◦C ∼ 40◦C) and low temperatures
252
in November (−5◦C ∼ 1◦C), might help to explain the above phenomena, but need to be
253
confirmed in future studies.
254
Similar monthly transition trends occur in other distance-related space coverage
fea-255
tures, which can be explained by their high correlations, see Table 2. Their coefficients
256
are found to have larger fluctuations in each month for non-workers, compared to
work-257
ers, which might be signified by the fact that non-workers possess more flexible travel
258
demands and can easily adapt the travel needs.
259
Workers experience a shrinkage of area in July and September, with making repeated
260
long-distance trips but at a lower frequency. For non-workers, area shrinks with a
sub-261
stantial reduction on trip count in the first three months, especially in July (coefficient
262
-0.11), compared to November. Interesting to note that, for workers in August and
work-263
ers and non-workers in November, area grows due to the generation of frequent short
264
trips around home or workplaces for diverse activities.
265
Curve Fitting Fig. 4 depicts each transition feature by its probability density function
266
(PDF) and an optimum curve, which owns the highest goodness of fit (R2) out of five
267
potential distribution models, exponential, gamma, lognormal, levy, and weibull. Fig.
268
4 presents the ones with the best fit. Besides, a two-term Kolmogorov-Smirnov test is
269
conducted to analyze the similarity between each two optimum curves in each temporal
category on weekdays and weekends, with the results shown in Table4.
271
The PDFs of Manhattan distance (Fig.4a), trip count (Fig. 4b), and perimeter (Fig. 4d),
272
present a∩-shape.
273
Specifically, the distributions of Manhattan distance for workers present significant
dif-274
ferences from non-workers at 0.001 significance level. The modes for workers lie between
275
10kmto20km, considerably higher than non-workers, which are around4km. These
cor-276
roborate the finding that workers generate a longer travel distance on average than
non-277
workers in Beijing (Zhou and Long,2014). Interesting to note that, the curve of trip count
278
for workers on weekdays differs from those of others, at 0.05 significance level, as shown
279
in Table4, which is much in line with the finding shown in Fig. 7that the average hourly
280
number of trips of workers on weekdays is dramatically higher than other types.
281
The distributions of area, as shown in Figure 4c, decline dramatically under the
in-282
fluence of the gravity theory (Fotheringham and O’Kelly, 1989), where distance induces
283
impedance. The exponential distribution well fits the PDF of area for both workers and
284
non-workers (Gonzalez et al.,2008).
285
Similarly, the vertical and horizontal distances also possess striking declining trends,
286
and are well fitted by the exponential distribution. Besides, as shown in Table4, the curves
287
of vertical and horizontal distances of non-workers significantly differ from workers on
288
weekdays.
0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0 10 20 30 40 50 60 70 80 90 100 P ro b ab il it y
Manhanttan Distance Bins (km) per Card per Day Worker-Weekday Worker-Weekend Non-worker-Weekday Non-worker-Weekend Worker-Weekday-Gamma Worker-Weekend-Gamma Non-worker-Weekday-Lognormal Non-worker-Weekend-Lognormal
(a) Manhattan Distance
0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 P ro b ab ilit y
Trip Count Bins per Card per Day Worker-Weekday Worker-Weekend Non-worker-Weekday Non-worker-Weekend Worker-Weekday-Lognormal Worker-Weekend-Lognormal Non-worker-Weekday-Lognormal Non-worker-Weekend-Lognormal (b) Trip Count 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 1 101 201 301 401 501 601 701 801 901 1001 1101 1201 P ro b ab ilit y
Area Bins (km2) per Card per Day
Worker-Weekday Worker-Weekend Non-worker-Weekday Non-worker-Weekend Worker-Weekday-Exponential Worker-Weekend-Exponential Non-worker-Weekday-Exponential Non-worker-Weekend-Exponential
(c) Areas More than two Stations
0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0 10 20 30 40 50 60 70 80 90 100 P ro b ab ilit y
Perimeter Bins (km) per Card per Day Worker-Weekday Worker-Weekend Non-worker-Weekday Non-worker-Weekend Worker-Weekday-Gamma Worker-Weekend-Gamma Non-worker-Weekday-Gamma Non-worker-Weekend-Gamma (d) Perimeter 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0 10 20 30 40 50 60 70 80 90 100 P ro b ab ilit y
Maximum Vertical Distance Bins (km) per Card per Day Worker-Weekday Worker-Weekend Non-worker-Weekday Non-worker-Weekend Worker-Weekday-Exponential Worker-Weekend-Exponential Non-worker-Weekday-Exponential Non-worker-Weekend-Exponential
(e) Vertical Distance
0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0 10 20 30 40 50 60 70 80 90 100 P ro b ab ilit y
Maximum Horizontal Distance Bins (km) per Card per Day Worker-Weekday Worker-Weekend Non-worker-Weekday Non-worker-Weekend Worker-Weekday-Exponential Worker-Weekend-Exponential Non-worker-Weekday-Exponential Non-worker-Weekend-Exponential (f) Horizontal Distance
Figure 4: Probability Distributions of Transition Features of Workers and Non-workers on
Table 4: Similarities of Pairwise Fitting Curves Describing the Distribution of the Transition Features on Weekdays and Weekends
Category Worker Non-worker Worker Non-worker
Week-day Sig Week-end Sig Week-day Sig Week-end Sig Week-day Sig Week-end Sig Week-day Sig Week-end Sig
Manhattan Distance Trip Count
Worker Weekday 1.00 - - - 1.00 - - -Weekend 0.68 1.00 - - 0.04 ∗ 1.00 - -Non-worker Weekday 0.00 ∗∗∗ 0.00 ∗∗∗ 1.00 - 0.01 ∗ 0.99 1.00 -Weekend 0.00 ∗∗∗ 0.00 ∗∗∗ 0.68 1.00 0.01 ∗ 0.82 0.99 1.00 Area>1km2 Perimeter Worker Weekday 1.00 - - - 1.00 - - -Weekend 0.92 1.00 - - 0.68 1.00 - -Non-worker Weekday 0.61 0.08 . 1.00 - 0.26 0.19 1.00 -Weekend 0.92 0.21 0.99 1.00 0.34 0.26 0.89 1.00
Vertical Distance Horizontal Distance
Worker Weekday 1.00 - - - 1.00 - -
-Weekend 0.99 1.00 - - 1.00 1.00 -
-Non-worker Weekday 0.00
∗∗∗ 0.00 ∗∗ 1.00 - 0.02 ∗ 0.04 ∗ 1.00
-Weekend 0.02 ∗ 0.03 ∗ 0.79 1.00 0.19 0.34 0.89 1.00
∗∗∗p-value<0.001,∗∗p-Value<0.01,∗p-Value<0.05,.p-Value<0.1. ‘Sig’ is short for significance
5.3
Temporal Expansions in Activity Network
290
All the Mondays from July 6th, 2015 to November 30th, 2015 are selected as a temporal
291
series to represent the continuous independent variables considering the expansion days.
292
These days are numbered from 1 to 120 with an interval of 7, referring to the Nth day of
293
accumulation. Similar expansion patterns are found for workers and non-workers, so we
294
only show the PDFs for workers over the sampled time series in Fig. 5.
295
As expected, each expansion-based feature increases by its average value when days
296
accumulate. Their PDFs present a ∩-shape and exhibit a larger skewness and longer tail
297
when time evolves.
298
How the average value of each expansion-based feature changes over time is further
299
modeled based on several potential distribution models, linear, exponential, lognormal,
300
and power. Fig. 6depicts the ones with the best fit.
301
The linear function best fits the temporal regularity of the expansion-based features
302
regarding distance coverage and frequency coverage, while the power function best
ex-303
plains space coverage. Particularly, the space-coverage-based four features, i.e., area (Fig.
304
6c), perimeter (Fig. 6d), vertical (Fig. 6e) and horizontal (Fig. 6f) distances, experience a
305
much smaller increasing trend than the distance coverage and the frequency coverage do
306
(Fig.6a). This indicates that the expansion of distance coverage is faster than that of space
307
coverage, due to the greater acceleration of frequency coverage (Fig.6b) in expansion than
308
in transition.
309
Convergent trends can be found in the space-coverage-based four features among
310
workers and non-workers, suggesting that each feature, though expanding at a fast or
311
slow speed over time, will reach a summit value. We expect there is only so much space or
312
distance someone will ever cover on the Beijing public transport system over the course of
313
a life, or at least over the life of a smart card, and this value is asymptotically approached
314
as the amount of time considered increases, as the gravity effect implies farther places will
315
be interacted with less often.
(a) Manhattan Distance (b) Trip Count
(c) Areas More than two Stations (d) Perimeter
(e) Vertical Distance (f) Horizontal Distance
y = 13.711x + 62.356 R² = 0.9995 y = 3.6024x + 8.6995 R² = 0.9976 0 200 400 600 800 1000 1200 1400 1600 1800 2000 0 20 40 60 80 100 120 140 A v er ag e Ma n h attan Dis tan ce L o g ( k m )
Expansion Day Log
Worker Non-worker
(a) Manhattan Distance
y = 2.098x + 7.4239 R² = 0.9992 y = 0.7638x - 0.4181 R² = 0.9935 1 51 101 151 201 251 301 0 20 40 60 80 100 120 140 A v er ag e T rip C o u n t L o g
Expansion Day Log
Worker Non-worker (b) Trip Count y = 25.531x0.5108 R² = 0.9747 y = 11.997x0.6419 R² = 0.9737 0 50 100 150 200 250 300 350 0 20 40 60 80 100 120 140 A v er ag e A rea L o g ( k m 2)
Expansion Day Log
Worker Non-worker
(c) Areas More than two Stations
y = 28.671x0.189 R² = 0.9424 y = 20.221x0.2461 R² = 0.927 0 10 20 30 40 50 60 70 80 0 20 40 60 80 100 120 140 A v er ag e P er im eter L o g ( k m )
Expansion Day Log
Worker Non-worker (d) Perimeter y = 9.1357x0.192 R² = 0.9412 y = 6.2375x0.2511 R² = 0.9271 0 5 10 15 20 25 30 0 20 40 60 80 100 120 140 A v er ag e V er tical D is tan ce L o g ( k m )
Expansion Day Log
Worker Non-worker
(e) Vertical Distance
y = 8.5936x0.1955 R² = 0.9438 y = 6.2577x0.2505 R² = 0.9271 0 5 10 15 20 25 0 20 40 60 80 100 120 140 A v er ag e Ho rizo n tal Dis tan ce L o g ( k m )
Expansion Day Log
Worker Non-worker
(f) Horizontal Distance
Figure 6: Modeling how the Expansion-based Features Evolve with Time for Workers and Non-workers.
6
Conclusion
317
This study explores the temporal variations in activity network in terms of space,
dis-318
tance, and frequency coverage for transit users categorized as workers and non-workers.
319
Unlike previous studies which merely focused on day-to-day transition patterns using
ther small-scale datasets with limited samples or occurring in a short period of time, our
321
research investigates the expansion patterns of activity network considering the
accumu-322
lation of days, in addition to the daily transition patterns, using a large-scale smart card
323
dataset, generated by around four million passengers over 105 days in Beijing. Six travel
324
features are extracted specifically for workers and non-workers, whose transition patterns
325
are statistically and compared by time, with the expansions features modeled with time.
326
The main findings of this research and potential implications are summarized as
fol-327
lows.
328
• Workers travel longer (distance coverage) and more often (frequency coverage) than
329
non-workers, but cover less area (space) on weekdays. Opposite patterns occur
dur-330
ing weekends.
331
The commute plays a dominant role in affecting the generation of non-work-related
332
travels. Workers spend more time traveling than non-workers. This is caused by
333
the imbalanced distribution of housing-working opportunities in Beijing. Hence,
334
re-balancing the distribution of existing housing-working opportunities and
con-335
struction of transit-oriented districts with mixed functions (e.g., residence, working,
336
recreation, education, etc) might be the focus of land use and transport planning in
337
the coming years.
338
Non-workers contribute to a considerably large proportion of non-work trips, whose
339
impact on public transport facilities cannot be ignored. The fact that a number of
340
non-workers still compete with workers in peak hours to take public transport
con-341
gests weekday travel. Yet, the finding that non-workers possess a larger fluctuation
342
in distance coverage provides guidance for load-balancing travel demands of
work-343
ers and non-workers smoothly over time. For example, transport agencies may vary
344
prices to induce non-workers to travel during non-peak hours on weekdays.
345
• Traveling in the ‘North-South’ direction is weakly correlated with traveling in the
‘East-West’ direction. Workers in weekdays, as well as non-workers in weekends,
347
travel long in the former direction, where better housing-job matches can be found.
348
The expansion of distance coverage is faster than that of space. A majority of
pas-349
sengers, who may be impeded by the gravity effect or constrained by their social
350
roles, cover larger areas or longer distances, as the amount of time considered
in-351
creases. This highlights the importance of injecting more diversity in land use
func-352
tions in Beijing, especially in the ‘East-West’ direction by evening out the
distribu-353
tion of workplaces, housing, entertainment, and education, to attract workers and
354
non-workers simultaneously.
355
• Workers and non-workers experience similar distributions in either transition or
ex-356
pansion. As to the transition patterns, Manhattan distance, trip count, and perimeter
357
present a∩shape in their PDFs, while the remaining features decline dramatically,
358
with PDFs fit by the Exponential distribution. As to the expansion patterns, the
359
polynomial distribution model best fits the evolution regularity of each expansion
360
over time. These derived curves can be adopted to predict the transition or
expan-361
sion patterns of space coverage or distance coverage of workers or non-workers in
362
the future given a specific day, week or month attribute.
References
364
Chen, Y.C., Dobra, A., 2017. Measuring human activity spaces with density ranking based
365
on gps data. arXiv preprint arXiv:1708.05017 .
366
Dharmowijoyoa, D.B., Susiloc, Y.O., Karlstr ¨om, A., 2017. Analysing the complexity of
day-367
to-day individual activity-travel patterns using a multidimensional sequence alignment
368
model: A case study in the bandung metropolitan area, indonesia. Journal of Transport
369
Geography 64, 1–12.
370
Duncan, M.J., Badland, H.M., Mummery, W.K., 2009. Applying gps to enhance
under-371
standing of transport-related physical activity. Journal of Science and Medicine in Sport
372
12, 549–556.
373
Ermagun, A., Levinson, D., 2017. Transit makes you short: On health impact assessment
374
of transportation and the built environment. Journal of Transport & Health 4, 373–387.
375
Fan, Y., Khattak, A., 2008. Urban form, individual spatial footprints, and travel:
Examina-376
tion of space-use behavior. Transportation Research Record: Journal of the
Transporta-377
tion Research Board , 98–106.
378
Fotheringham, A.S., O’Kelly, M.E., 1989. Spatial interaction models: formulations and
379
applications. volume 1. Kluwer Academic Publishers Dordrecht.
380
Gonzalez, M.C., Hidalgo, C.A., Barabasi, A.L., 2008. Understanding individual human
381
mobility patterns. nature 453, 779.
382
H¨agerstrand, T., 1970. What about people in regional science? Papers in regional science
383
24, 6–21.
384
Huang, J., Levinson, D., Wang, J., Jin, H., 2019. Job-worker spatial dynamics in Beijing:
385
Insights from smart card data. Cities 86, 83–93.
Huang, J., Levinson, D., Wang, J., Zhou, J., Wang, Z.j., 2018. Tracking job and housing
387
dynamics with smartcard data. Proceedings of the National Academy of Sciences 115,
388
12710–12715.
389
Huberty, C.J., 1994. A note on interpreting anr2value. Journal of Educational and
Behav-390
ioral Statistics , 351–356.
391
J¨arv, O., Ahas, R., Witlox, F., 2014. Understanding monthly variability in human activity
392
spaces: A twelve-month study using mobile phone call detail records. Transportation
393
Research Part C: Emerging Technologies 38, 122–135.
394
Kang, H., Scott, D.M., 2010. Exploring day-to-day variability in time use for household
395
members. Transportation Research Part A: Policy and Practice 44, 609–619.
396
Kieu, L.M., Bhaskar, A., Chung, E., 2015. Passenger segmentation using smart card data.
397
IEEE Transactions on intelligent transportation systems 16, 1537–1548.
398
Kim, J., Rasouli, S., Timmermans, H.J., 2018. Social networks, social influence and
activity-399
travel behaviour: a review of models and empirical evidence. Transport Reviews 38,
400
499–523.
401
Kitamura, R., Yamamoto, T., Susilo, Y.O., Axhausen, K.W., 2006. How routine is a
rou-402
tine? an analysis of the day-to-day variability in prism vertex location. Transportation
403
Research Part A: Policy and Practice 40, 259–279.
404
Kumar, A., Levinson, D., 1995. Temporal variations on allocation of time. Transportation
405
Research Record , 118–127.
406
Lee, J.H., Davis, A.W., Yoon, S.Y., Goulias, K.G., 2016. Activity space estimation with
407
longitudinal observations of social media data. Transportation 43, 955–977.
Lenntorp, B., 1976. Paths in space-time environments: A time-geographic study of
move-409
ment possibilities of individuals. Lund Studies in Geography Series B Human
Geogra-410
phy .
411
Leung, Y., Zhao, Z., Ma, J.H., 2016. Uncertainty analysis of space–time prisms based on
412
the moment-design method. International Journal of Geographical Information Science
413
30, 1336–1358.
414
Long, Y., Thill, J.C., 2015. Combining smart card data and household travel survey to
an-415
alyze jobs–housing relationships in Beijing. Computers, Environment and Urban
Sys-416
tems 53, 19–35.
417
Ma, X., Liu, C., Wen, H., Wang, Y., Wu, Y.J., 2017. Understanding commuting patterns
418
using transit smart card data. Journal of Transport Geography 58, 135–145.
419
Ma, X., Wu, Y., Wang, Y., Chen, F., Liu, J., 2013. Mining smart card data for transit riders’
420
travel patterns. Transportation Research Part C: Emerging Technologies 36, 1–12.
421
Miller, H.J., 1991. Modelling accessibility using space-time prism concepts within
geo-422
graphical information systems. International Journal of Geographical Information
Sys-423
tem 5, 287–301.
424
Moiseeva, A., Timmermans, H., Choi, J., Joh, C.H., 2014. Sequence alignment analysis
425
of variability in activity travel patterns through 8 weeks of diary data. Transportation
426
Research Record 2412, 49–56.
427
Neter, J., Kutner, M.H., Nachtsheim, C.J., Wasserman, W., 1996. Applied linear statistical
428
models. volume 4. Irwin Chicago.
429
Newsome, T.H., Walcott, W.A., Smith, P.D., 1998. Urban activity spaces: Illustrations
430
and application of a conceptual model for integrating the time and space dimensions.
431
Transportation 25, 357–377.
Parthasarathi, P., Hochmair, H., Levinson, D., 2015. Street network structure and
house-433
hold activity spaces. Urban Studies 52, 1090–1112.
434
Rai, R.K., Balmer, M., Rieser, M., Vaze, V.S., Sch ¨onfelder, S., Axhausen, K.W., 2007.
Cap-435
turing human activity spaces: New geometries. Transportation Research Record 2021,
436
70–80.
437
Raux, C., Ma, T.Y., Cornelis, E., 2016. Variability in daily activity-travel patterns: the case
438
of a one-week travel diary. European Transport Research Review 8, 1–14.
439
Sch ¨onfelder, S., Axhausen, K., 2002. Measuring the size and structure of human activity
440
spaces. Transp Res. https://doi. org/10.3929/ethz-a-004444846 .
441
Sch ¨onfelder, S., Axhausen, K.W., 2003. Activity spaces: measures of social exclusion?
442
Transport policy 10, 273–286.
443
Sch ¨onfelder, S., Axhausen, K.W., 2004. Structure and innovation of human activity spaces.
444
Arbeitsberichte Verkehrs-und Raumplanung 258, 1–40.
445
Susilo, Y.O., Axhausen, K.W., 2014. Repetitions in individual daily activity–travel–location
446
patterns: a study using the herfindahl–hirschman index. Transportation 41, 995–1011.
447
Susilo, Y.O., Kitamura, R., 2005. Analysis of day-to-day variability in an individual’s
ac-448
tion space: exploration of 6-week mobidrive travel diary data. Transportation Research
449
Record 1902, 124–133.
450
Wang, D., Chai, Y., 2009. The jobs– relationship and commuting in Beijing, China: the
451
legacy of Danwei. Journal of Transport Geography 17, 30–38.
452
Wang, D., Xu, F., 2010. A study on the commuting problems in Beijing–based on the
453
investigation to the citizens of Beijing. Urban Studies 12, 18.
Wiehe, S.E., Hoch, S.C., Liu, G.C., Carroll, A.E., Wilson, J.S., Fortenberry, J.D., 2008.
Ado-455
lescent travel patterns: pilot data indicating distance from home varies by time of day
456
and day of week. Journal of Adolescent Health 42, 418–420.
457
Wilson, W.C., 1998. Activity pattern analysis by means of sequence-alignment methods.
458
Environment and Planning A 30, 1017–1038.
459
Xianyu, J., Rasouli, S., Timmermans, H., 2017. Analysis of variability in multi-day gps
460
imputed activity-travel diaries using multi-dimensional sequence alignment and panel
461
effects regression models. Transportation 44, 533–553.
462
Zhao, X., Zhang, Y., Liu, H., Wang, S., Qian, Z.S., Hu, Y., Yin, B., 2019. Detecting
463
pickpocketing gangs on buses with smart card data. IEEE Intelligent Transportation
464
Systems Magazine URL: https://ieeexplore.ieee.org/document/8745669,
465
doi:10.1109/MITS.2019.2919525.
466
Zhou, J., Long, Y., 2014. Jobs-housing balance of bus commuters in Beijing: Exploration
467
with large-scale synthesized smart card data. Transportation Research Record 2418,
468
1–10.
469
Zou, Q., Yao, X., Zhao, P., Wei, H., Ren, H., 2018. Detecting home location and trip
pur-470
poses for cardholders by mining smart card transaction data in Beijing subway.
Trans-471
portation 45, 919–944.
A
Appendix
Fig. 7illustrates the distribution of average hourly number of trips generated by workers
and non-workers by the time of a day on weekdays and weekends. Workers present sharp morning and evening peaks on weekdays for work affairs, along with two slight peaks generated by non-workers, who might head toward government agencies or enterprises in office hours for business, or head toward schools to deliver(/pick up) kids in mornings or afternoons. Daily trips are more evenly distributed over a day for workers and non-workers during the weekends.
0 10,000 20,000 30,000 40,000 50,000 60,000 70,000 80,000 90,000 100,000 110,000 120,000 130,000 140,000 0:0 0 0:30 1:00 1:30 2:00 2:30 3:00 3:30 4:00 4:30 5:00 5:30 6:00 6:30 7:00 7:30 8:00 8:30 9:00 9:30 10:0 0 1 0 :3 0 1 1 :0 0 1 1 :3 0 1 2 :0 0 1 2 :3 0 1 3 :0 0 1 3 :3 0 1 4 :0 0 1 4 :3 0 1 5 :0 0 1 5 :3 0 16 :00 1 6 :3 0 1 7 :0 0 1 7 :3 0 1 8 :0 0 1 8 :3 0 1 9 :0 0 1 9 :3 0 2 0 :0 0 20 :30 2 1 :0 0 2 1 :3 0 2 2 :0 0 2 2 :3 0 2 3 :0 0 2 3 :3 0 2 4 :0 0 A v era g e Ho u rly Nu m b er o f T rip s Hours Worker-Weekday Worker-Weekend Non-worker-Weekday Non-worker-Weekend
Figure 7: Average Hourly Number of Trips Generated by Workers and Non-workers. Potential distribution models are selected to model the transition or expansion patterns in activity network for travelers according to the following criteria.
• The model shares a similar distribution to the scatter plot.
• The parameters of the model are easy to deduce.
• The model follows a non-negative distribution.
Five probabilistic models, as well as their essential parameters, are shown in Table 5
to fit the PDF of the transition features, with their specific distributional parameters on
another five models are chosen to model how the expansion-based features evolve with
time, with the results shown in Table5.
Table 5: Description of the Chosen Distribution Models.
Distribution Probability Density Function Arg Loc Scale
exponential f(x) = λe−λx - - λ gamma f(x) = x β α−1e−xβ Γ(α)β α ? β levy f(x) = p c 2π e− c 2(x−µ) (x−µ)3/2 ? µ c lognormal f(x) = 1 x√2πln(σ)e −[ln(x)−ln(µ)]2 2[ln(σ)]2 ? µ σ weibull f(x) = αβhxβi α−1 e−(βx) α α ? β
The ‘Arg’, ‘Loc’ and ‘Scale’ signify that how a distribution is shaped, shifted or scaled.
Table 6: PDFs of Transition Features on Weekdays
Feature Manhattan Distance Trip Count Area>1km2 Perimeter Vertical Distance Horizontal Distance
Category W N W N W N W N W N W N Exponential R 2 0.81 0.96 0.43 0.52 0.92 0.87 0.63 0.91 0.83 0.86 0.92 0.88 Scale 25.56 15.17 1.93 1.60 27.32 25.28 27.43 20.16 8.80 6.17 8.16 6.32 Gamma R 2 0.98 0.50 0.95 0.98 0.50 0.50 0.98 0.95 0.57 0.50 0.50 0.50 Arg 2.52 0.48 14.80 5.45 0.08 0.60 5.03 1.37 0.95 0.61 0.71 0.76 Loc -1.88 0.00 -0.90 -0.22 1.00 1.00 -7.32 -0.40 0.00 0.00 0.00 0.00 Scale 10.91 3.40 0.19 0.33 3.93 38.25 6.91 14.96 8.91 10.18 12.12 6.13 Levy R 2 0.66 0.87 0.17 0.27 0.71 0.84 0.42 0.71 0.63 0.77 0.72 0.78 Loc -1.13 -0.75 -0.08 -0.08 0.28 0.40 -1.45 -1.24 -0.56 -0.32 -0.45 -0.33 Scale 14.65 5.88 1.59 1.23 6.32 4.69 17.96 9.31 3.61 1.78 3.23 1.91 Lognormal R 2 0.50 0.98 0.95 0.99 0.79 0.86 0.97 0.93 0.75 0.78 0.83 0.79 Arg 2.94 0.90 0.20 0.37 1.34 1.52 0.40 0.66 0.64 1.11 0.70 1.06 Loc 0.00 -1.43 -1.79 -0.49 0.66 0.85 -9.77 -4.82 -2.76 -0.43 -1.99 -0.51 Scale 1.73 11.38 3.65 1.95 12.65 9.76 33.63 20.24 9.53 3.93 8.07 4.25 Weibull R 2 0.50 0.50 0.87 0.93 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50 Arg 0.34 0.33 2.72 2.05 0.30 0.65 0.68 0.43 0.78 0.79 0.70 0.87 Loc 0.00 0.00 -0.08 -0.05 1.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 Scale 2.07 2.28 2.25 1.86 23.38 40.97 2.56 2.99 3.02 5.25 3.69 6.00
Valuesin boldare optimum fit.
Table 7: Expansion-based Features Evolving with Days
Category Manhattan Distance Trip Count Area>1km2 Perimeter Vertical Distance Horizontal Distance
W N W N W N W N W N W N exponential R2 0.858 0.897 0.859 0.890 0.817 0.800 0.869 0.858 0.872 0.862 0.865 0.855 SE 61.004 15.270 8.084 2.037 9.919 8.281 1.040 1.275 0.336 1.079 0.961 0.336 linear R2 0.999 0.998 0.999 0.994 0.975 0.950 0.933 0.926 0.936 0.927 0.932 0.921 SE 2.581 1.565 0.525 0.543 2.664 2.655 0.747 0.839 0.239 0.260 0.860 0.238 lognormal R2 0.723 0.697 0.719 0.686 0.826 0.800 0.890 0.866 0.887 0.864 0.890 0.866 SE 63.590 17.503 9.807 3.781 8.427 8.472 0.957 1.178 0.318 0.382 0.880 0.302 power R 2 0.959 0.918 0.957 0.909 0.976 0.974 0.942 0.932 0.941 0.937 0.944 0.932 SE 26.277 9.140 4.052 1.959 3.133 2.234 0.605 0.706 0.201 0.228 0.823 0.188
’SE’ is short for standard errors.