2.5 Aggregate Models 1 Overview
2.5.6 Issues with Aggregate Modelling
2.5.6.1 Catchment Area Specification and Population
As mentioned previously, catchment area specification is a major issue for most aggregate models. The simplest models do not consider catchment areas at all, instead using dummy variables to represent origin and destination populations, which allows fare and service elasticities to be estimated (Whelan & Wardman, 1999b). However, such methods are only suitable for time-series models, as flows can only be forecast for stations included in the set of calibration flows. To make models transferable it is necessary to consider the origin and/or destination characteristics during calibration; in other words, to use a gravity model. When populations are aggregated it is assumed that the number of decisions made per year is constant across all individuals within a zone, and that journey times to the station, and thus the utility of travel, are similarly constant across the zone. The most straightforward way of incorporating population is to replace the dummy variables with the population within a certain straight-line distance of the station; such models can still be calibrated using linear regression. Defining catchment areas in this way is inevitably arbitrary, as there is no clear boundary in reality, and is also unrealistic, because catchments are in practice defined by access and egress times rather than by distance (Krygsman et al., 2004). Using two or more weighted distance bands with differing elasticities (Preston, 1987) does, however, allow for a limited distance decay effect.
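The distance-band idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the specification used by Preston (1987): the band limits (0.8 km and 2 km), the band weights, the coordinates and the function name `banded_population` are all hypothetical.

```python
import math

def banded_population(station, units, band_edges_km, weights):
    """Population around a station, split into straight-line distance bands.

    station: (x, y) in km; units: list of (x, y, population);
    band_edges_km: ascending outer limit of each band;
    weights: one weight per band (hypothetical values, not calibrated).
    Returns the unweighted band totals and their weighted sum.
    """
    if len(band_edges_km) != len(weights):
        raise ValueError("need exactly one weight per band")
    totals = [0.0] * len(band_edges_km)
    for x, y, pop in units:
        d = math.hypot(x - station[0], y - station[1])
        for k, edge in enumerate(band_edges_km):
            if d <= edge:
                totals[k] += pop
                break  # a unit counts in one band only; units beyond the last edge are ignored
    weighted = sum(w * t for w, t in zip(weights, totals))
    return totals, weighted

# Hypothetical population units: (x_km, y_km, residents)
units = [(0.3, 0.4, 1200), (1.0, 1.0, 800), (3.0, 0.0, 500)]
totals, weighted = banded_population((0.0, 0.0), units, [0.8, 2.0], [1.0, 0.5])
print(totals, weighted)
```

In a regression setting the band totals would more likely enter as separate explanatory variables, each with its own elasticity. The fixed weights here simply make the distance decay effect visible.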
The use of non-linear regression (summation) models allows more complex and realistic model forms to be adopted, by allocating the population around the station to a number of non-overlapping zones, and gives the following generic model form:
$$V_{ij} = \mu \left( \sum_{a} P_{ai}^{\alpha} A_{ai}^{\beta} \right) \left( \sum_{b} P_{bj}^{\delta} E_{bj}^{\lambda} \right) GC_{ij}^{\gamma} \qquad (2.13)$$
Where:
$P_{ai}$ is the usually resident population in zone a (related to station i)
$P_{bj}$ is the usually resident population in zone b (related to station j)
$A_{ai}$ is the drive time from zone a to the origin station i
$E_{bj}$ is the drive time from zone b to the destination station j
$GC_{ij}$ is the generalised cost of rail travel from station i to station j
$\lambda$ is the egress elasticity
$\alpha$, $\beta$, $\gamma$, $\delta$ and $\mu$ are constants determined by calibration
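As a rough illustration of how equation (2.13) might be evaluated for a single station pair, the sketch below assumes the model is read as the product of an origin-side sum, a destination-side sum and a generalised cost term. All parameter values and zone data are invented for the example, the exponent named `delta` (on destination population) is assumed to be a calibrated constant, and the function name `predicted_flow` is hypothetical.

```python
def predicted_flow(origin_zones, dest_zones, gc_ij, mu, alpha, beta, delta, lam, gamma):
    """Evaluate, under the assumed reading of equation (2.13):

        V_ij = mu * sum_a(P_ai**alpha * A_ai**beta)
                  * sum_b(P_bj**delta * E_bj**lam)
                  * GC_ij**gamma

    origin_zones: list of (P_ai, A_ai) pairs, population and access drive time;
    dest_zones:   list of (P_bj, E_bj) pairs, population and egress drive time.
    """
    origin_term = sum(p ** alpha * a ** beta for p, a in origin_zones)
    dest_term = sum(p ** delta * e ** lam for p, e in dest_zones)
    return mu * origin_term * dest_term * gc_ij ** gamma

# Made-up inputs: two zones at each end, drive times in minutes
origin_zones = [(1000.0, 5.0), (4000.0, 10.0)]
dest_zones = [(2000.0, 4.0), (3000.0, 6.0)]

v_ij = predicted_flow(origin_zones, dest_zones, gc_ij=20.0,
                      mu=0.001, alpha=1.0, beta=-1.0,
                      delta=1.0, lam=-1.0, gamma=-1.0)
print(round(v_ij, 3))  # about 30 trips under these invented values
```

Because the sums sit inside the product, the parameters cannot be estimated by taking logarithms and applying ordinary linear regression, which is why a non-linear estimation routine is needed for this model form.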
Whelan & Wardman (1999b) used a GIS to allocate populations to a set of ‘doughnut’ zones separated by lines of equal travel time (see Figure 2.2). However, this type of zoning system implies that catchments are identical for all journeys starting or finishing at a station, when in fact they will be affected by the distance and direction of the destination (Lythgoe, 2004). If the majority of journeys from a station are to a particular destination or in a particular direction, then it may be possible to use a parabolic catchment area
boundary as suggested by Farhan & Murray (2005) for park and ride services, although this still does not consider the distance to the destination. Such catchments, which may be particularly suitable for urban local stations, have been termed ‘commuter-sheds’ (Dickins, 1991), as illustrated by Figure 2.3.
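The allocation of population to doughnut zones can be sketched as a simple binning exercise, assuming drive times from each population unit to the station have already been obtained from a GIS. The isochrone limits (5, 10, 15, 20 and 30 minutes), the data and the function name `doughnut_zone_populations` are hypothetical and are not taken from Whelan & Wardman (1999b).

```python
from bisect import bisect_left

def doughnut_zone_populations(units, isochrones_min):
    """Sum population into rings bounded by drive-time isochrones.

    units: list of (drive_time_to_station_min, population);
    isochrones_min: ascending outer drive-time limit of each ring.
    Units beyond the outermost isochrone are left out of the catchment.
    """
    rings = [0] * len(isochrones_min)
    for minutes, pop in units:
        # index of the first ring whose outer limit is >= the drive time
        k = bisect_left(isochrones_min, minutes)
        if k < len(rings):
            rings[k] += pop
    return rings

# Hypothetical population units: (drive time in minutes, residents)
units = [(3, 900), (5, 400), (8, 1500), (14, 700), (31, 250)]
rings = doughnut_zone_populations(units, [5, 10, 15, 20, 30])
print(rings)
```

Note that the same ring totals are returned whatever the destination, which is the limitation of doughnut zones raised by Lythgoe (2004).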
Figure 2.2: Doughnut zones (diagram labels: station; zones 1 to 5)
Figure 2.3: Commuter-sheds (diagram labels: transit terminal, local station, rail line to the CBD, road access, commuter-shed catchments for the terminal and for an intermediate station)
Source: Dickins (1991)
A further refinement would be to define catchments in terms of the generalised cost of access to and from the station (Sargious & Janarthanan, 1983). While attractive in terms of realism, this would complicate model calibration and would require a large amount of additional data, particularly if multimodal access were considered. Similarly, the disaggregate approach developed by Farhan & Murray (2005), which considers travel times and distances for individual users, requires more information than is generally available.
Lythgoe (2004) developed a new zoning system to allow models to deal more effectively with differing destination distances and directions. A grid of zonal ‘seed points’ is fixed in location relative to the station, arranged in a series of squares increasing in size, and zones are then defined by allocating units of population to the nearest seed point. Access times and distances from the zonal centres of population should thus give a reasonable representation of these variables for individual residents of zones to any destination, not just to a particular station as with doughnut zones (Wardman et al., 2007). The system allows zones further away from the station to contain larger populations with larger variations in the utility of travel, although given that the model is looking at competition between stations it is questionable how realistic this is.
While this system is an improvement on previous models, there are problems with applying it to local rail journeys. Lythgoe (2004) acknowledges that the probability of individual (or population unit) a travelling to location b via stations i and j would not be constant for all individuals residing in a zone if demographic and socio-economic variables were included in the model. The inclusion of such variables would seem to be particularly desirable for local stations, as they are likely to possess relatively more explanatory power at this smaller scale. This is because smaller zonal populations mean that there will be less intra-zonal variation (and correspondingly more inter-zonal variation), making it more straightforward to estimate significant parameters for such variables in demand equations. Another problem is the size of the zones used: those defined by Lythgoe (2004) tended to be at least 6 km across, which is much too large for modelling short-distance local flows. While the zone sizes could be reduced, the aggregation of population units into zones would then become problematic, as the areas covered by the population units would not be significantly smaller than the zones, with many units likely to overlap the notional zonal boundaries.
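The nearest-seed-point allocation described above can be sketched as follows. The seed point layout, the population units and the function name `allocate_to_seeds` are hypothetical, and straight-line distance is assumed for the nearest-point test; the sketch does not reproduce the grid actually used by Lythgoe (2004).

```python
import math
from collections import defaultdict

def allocate_to_seeds(seeds, units):
    """Assign each population unit to its nearest seed point.

    seeds: list of (x, y) offsets from the station in km (hypothetical layout);
    units: list of (x, y, population) in the same coordinates.
    Returns {seed_index: (zone_population, (centroid_x, centroid_y))},
    where the centroid is the population-weighted centre of the zone.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])  # population, sum of x*pop, sum of y*pop
    for x, y, pop in units:
        nearest = min(range(len(seeds)),
                      key=lambda k: math.hypot(x - seeds[k][0], y - seeds[k][1]))
        acc = sums[nearest]
        acc[0] += pop
        acc[1] += x * pop
        acc[2] += y * pop
    return {k: (p, (sx / p, sy / p)) for k, (p, sx, sy) in sums.items() if p > 0}

# Hypothetical seed points 6 km apart and four population units
seeds = [(0.0, 0.0), (6.0, 0.0), (0.0, 6.0)]
units = [(1.0, 0.0, 100), (2.0, 0.0, 300), (5.0, 1.0, 200), (1.0, 7.0, 50)]
zones = allocate_to_seeds(seeds, units)
print(zones)
```

Access times and distances would then be measured from each zone's population-weighted centre rather than from the seed point itself. When the seed spacing approaches the size of the population units, many units straddle the notional boundary between two seeds, which is the aggregation problem noted above.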
One way around this problem would be to use the population units (for example census output areas or wards) as zones in the modelling process. Despite their irregular size and shape, they might give better results than artificially created zones which do not accurately
reflect the actual distribution of the population. An alternative solution would be to spatially disaggregate the population data into small regular-shaped zones, using a combination of GIS and microsimulation models. Spiekermann & Wegener (2000) produced a raster-based zonal dataset using a spatial interpolation method to create
probabilistic population allocations. Another method was suggested by Zhao et al. (2003), who used extremely detailed GIS data on the spatial distribution of households to define catchments using a distance decay function. However, all these methods would be extremely complicated to apply, and the resulting improvements in accuracy might not justify the extra effort involved.
It has been argued that inappropriate catchment specification will lead to inflated rail demand elasticities, because stations with good services will have relatively large
catchment areas and failing to capture this size effect will lead to the extra demand being attributed to service quality (Whelan & Wardman, 1999b). If the increased catchment size is ultimately a result of service quality this should not be a problem, but catchment size may also be affected by the quality of access and egress modes.
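The inflation argument can be illustrated with a small simulation on synthetic data. Everything in it is an assumption made for the example: the log-linear demand form, the "true" elasticities (1.0 for catchment population and 0.8 for service quality), the strength of the link between service quality and catchment size, and the noise levels. It is not drawn from Whelan & Wardman (1999b).

```python
import random

def centred(values):
    """Subtract the mean so that regressions can omit the intercept."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(1)
n = 5000
true_service, true_pop = 0.8, 1.0  # hypothetical elasticities

# Stations with better services are given larger catchment populations
log_s = [rng.gauss(0.0, 1.0) for _ in range(n)]
log_p = [0.5 * s + rng.gauss(0.0, 0.5) for s in log_s]
log_v = [true_pop * p + true_service * s + rng.gauss(0.0, 0.3)
         for p, s in zip(log_p, log_s)]

s, p, v = centred(log_s), centred(log_p), centred(log_v)

# Model 1: catchment population omitted, demand regressed on service only
naive_service = dot(s, v) / dot(s, s)

# Model 2: both variables included (two-variable least squares via the normal equations)
det = dot(s, s) * dot(p, p) - dot(s, p) ** 2
full_service = (dot(p, p) * dot(s, v) - dot(s, p) * dot(p, v)) / det
full_pop = (dot(s, s) * dot(p, v) - dot(s, p) * dot(s, v)) / det

print(f"service elasticity, population omitted:  {naive_service:.2f}")
print(f"service elasticity, population included: {full_service:.2f}")
```

Under these assumptions the service elasticity from the first model comes out at roughly 1.3, against roughly 0.8 once catchment population is included. Whether the higher figure counts as biased depends on whether the larger catchment is itself a consequence of service quality, which is the caveat made above.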