4 A method for estimating benchmark mobility levels
4.6 Specification of a final “best” model
I develop a final model (Model 5) using the hurdle framework (as in Model 4), but differing from Model 4 in the exact mix of explanatory variables included. This version differs importantly from Model 4 in that it allows different explanatory variables for the binary and count portions of the hurdle model, and that only the high-access segment of sample was considered in
experimenting with alternative specifications. As such, it designed to be a best predictive model for that segment of the data.
After experimenting with many variations of instrumenting the predictors in Table 26, I opted to retain variables in either the “any” or “volume” parts of the model only if their coefficient had a statistical significance at p<0.10. Given comparably significant alternatives, I opted for versions of variables (or combinations of variables) that were theoretically meaningful or easier to interpret. I retained a few variables whose main effects were not significant but interaction effects were, as well as some that were close to significant and theoretically meaningful and/or part of a set of dummy variables that belonged together.
In general, when a variable is included as a predictor in both the any and count parts of the model, two different coefficients are estimated to capture that variable’s influence on the probability of making any trips, on the hand, and on the probability of making successively higher volumes of trips above one, on the other; together, and with the other variables in the model, the two effects contribute to the overall expected volume of trips for a given individual.
When a variable is only included in one part of the model and not the other, it means that the factor only contributes to whether the individual makes any trips and not the volume of trips, or vice versa.
Table 37 shows the final specification. With respect to various performance measures, this version is improved only slightly over Model 4c, previously shown in Table 35: McFadden’s (adjusted) pseudo-R2 increases by 0.003 (from 0.025 to 0.028); information-criteria type measures such as the AIC decrease by 0.012 (from 3.941 to 3.929); the average absolute difference between predicted and actual values decreases by 0.039 trips (from 1.568 to 1.529);
and the percent of cases whose prediction is within 1 trip of their actual increases by 1.1 percentage points (from 38.0% to 39.1%). In the final specification, the categories of predictors
from Table 26 are operationalized as follows and have the following estimated effects. Recall that the model was estimated only using the segment of the sample with “unlimited” vehicle access, as defined in Table 24, and therefore these effects apply specifically to that subgroup.
Built environment: Dummy variables for each community type (urban, second city, and suburban) all have a positive effect on both the probability of any trips and higher volumes of trips, relative to town/country areas, which was omitted as the reference group (where the largest portion of households lives, 39.8% of the weighted sample). Of the three types, suburban environments contribute most to making any trips but not as much to the volume of trips (though still more than town/country areas). Urban and second city types contribute more to the volume of trips than do suburban areas, but not as much to the probability of making any trips (with the urban coefficient only significant at p=0.179). Transit score for the MSA, continuously valued (but taking on only 26 unique values) has a negative effect on both the probability of any trips and higher volumes of trips. This may be serving as an indicator for major urban area rather than a reflection of the effect of transit service per se. Density of employees per square mile and housing units per square mile for the Census tract of the home residence, valued at the midpoint of eight categories and treated as continuous variables in the model, have a positive effect on the probability of any trips only (and housing units only with significance at p=0.112 in the presence of the other measures of the built environment).
Economic status: Dummy variables for each level of household income per adult household member, in $5,000 increments, have an increasingly larger effect on both any trips and the volume of trips, but especially the former (in part due to different reference groups used for the any and volume components of the model: <$7,500 per adult for making any trips and <$17,500 per adult for the volume of trips). Dummy variables for educational attainment affect both any trips and volume of trips, with a seemingly larger effect on the probability of
making any trips. Relative to the reference group of those with either a high school diploma and/or some college but less than a four-year degree (these were grouped together because of no significant difference between their effects), those with less than high school make fewer trips and those with graduate degrees make more, and those with college degrees but not graduate degrees make the most, all else equal. Homeownership, reflecting economic status (but likely also correlated with residence in a detached single-family homes, which did not have a significant coefficient in either part of the model in the presence of the other
built-environment measures) significantly and positively affects only the probability of any trips but not volume of trips. As another indication of economic status, having a household member in a manufacturing/construction/maintenance/farming occupation has a negative effect on both the probability of any trips and volume of trips, while having a household member in a
professional/managerial/technical occupation has a negative effect on making any trips. (Note additional effects for those personally employed in certain occupational categories, described below.)
Vehicles and driving: Among this segment including only drivers in households with high-ownership levels, the number of additional drivers in the household was not significant.
Higher numbers of vehicles (instrumented as a continuous measure of the number of vehicles per household member excluding non-driving children —so number of vehicles divided by the number of all adults plus any minors who drive) has a significant effect on the volume of trips only, and positively. This is potentially endogenous (those with greater mobility needs or more diverse mobility needs, may be more inclined to own more cars, such as specific types of vehicles for different activities). Having a limiting medical condition has a negative effect on making any trips and on trip volume, and having one specifically affecting driving has an additional negative effect on both.
Lifecycle stage: Age is instrumented as a series of dummy variables for each decade, retaining in each part of the model only those with significant coefficients. Among those making any trips, peak trip-making occurs among those in their 50s and 60s, with those both younger and older making fewer trips, with significant additional negative effect among older females.
Age effects (on their own) are less pronounced for the probability of making any trips, though with decreased probability among those in their 80s and 30s, all else equal (and no significant interaction with gender).
Children, employment, and household roles: Those in households with (more) children have a greater probability of making any trips and more trips (instrumented as a continuous variable for total number of children), an effect that is even greater for females with children (instrumented as the interaction of female and dummy variable for any children). However, the effect on making any trips (but not volume, given any) is attenuated for those with very young children (age 0 to 5), again even more so among females. People in multi-adult households (three or more adults) are more likely to make any trips but fewer trips, perhaps reflecting more sharing of household duties, more time spent at home with multiple adults for companions, or less need for travel among the type of people in this living situation (e.g. young people living with roommates, or households including an elderly parent). Relative those living with one or more other adults, lone adults make a higher volume of trips, an effect that is even greater among those with children (who are also mostly female), perhaps reflecting the need to single-handedly perform household-maintenance duties. In general, people who are employed (versus not working) have a greater probability of making any trips but fewer trips. Those with multiple jobs have an even greater probability of making any trips, but those with only part-time with less. Additionally, the effect on trip volume is attenuated (that is, less negative) among those employed part-time, in multiple jobs, and/or in certain occupation categories: sales/service,
clerical/admin support, professional/managerial/technical (but not
manufacturing/construction/maintenance/farming or other). This may have to do with the nature of the work, the opportunities for trip-chaining to and from work, or the likelihood of these jobs being located in certain built environments, such as more urban areas. Interactions between employment and gender, or employment and presence of children were not
significant. The overall effect of gender is that females have a less probability of making any trips but a more probability of making a higher volume of trips, consonant with their frequent role in performing more childcare and household maintenance duties in and outside of the home, regardless of employment status.
Race, ethnicity, and foreign-born status: Hispanics of any race have a greater probability of making any and more trips, while Blacks (American and mixed African-Americans) have a greater probability of making any trips (relative to non-Blacks of any other race), and Asians have a greater probability of making more trips (relative to non-Asians of any other race). These may reflect cultural differences, patterns in the types of the built
environments where they live, or be compensating for some other effect in the model that affects these groups differently, such as income (that is, these groups may make more trips with less income). Recent immigrants are more likely than less-recent immigrants and the native-born to make any trips, but also a lesser volume of trips. The effect on volume diminishes with time spent in the United States (shown by the diminishing magnitude of the coefficients for immigrating less than 6 years ago, 6-10 years, and 11 or more years ago, relative to the native-born).
Day of the week: Weekday trip-making is somewhat reduced on Mondays and
somewhat greater on Fridays (in volume), with no significant differences among those who are working versus not working. The probability of weekend trip-making differs by employment
status. Among non-workers, the probability of any trips and the volume of trips is less on weekends than on weekdays; among workers, the probability of any trips is even less on weekends versus weekdays than among those who do not work, but if they do make any trips, they are somewhat more likely to make more trips than are non-workers, especially on Saturdays.
Table 37. Results for model (5), a final best hurdle model of total trip volume, using only the high-access segment Explanatory variable
Vehicles per person (excluding non-driving minors) 0.034 ***
Has HH member in manuf/farm/const/maint -0.184 *** -0.029 *
Has HH member in prof/mang -0.076 * 0.015 *
Limiting medical condition (vs. none) -0.678 *** -0.099 ***
Condition results in giving up driving (vs. no condition or not this result) -1.219 *** -0.250 ***
Employed (vs. not working) 1.208 *** -0.206 ***
Explanatory variable
Metro-level transit-score -0.001 ** -0.001 ***
Community type (vs. town/country)
4.7 Summary and conclusions
This chapter describes the development of a model for estimating benchmark mobility levels, for a given demographic profile. Mobility is measured as the number of trips made in a day (defined as anytime someone moves between addresses, except for returning to home or work), as a proxy for the overall volume of out-of-home activities in which someone engages. This is a coarse way to capture activity-participation, treating all types of activity (outside the home) the same, when in reality some sorts of activities may be more important than others in their contribution to (or detraction from) the overall welfare of individuals.
Using this measure of activity, there are apparently meaningful differences in the sorts of factors influencing whether someone makes any trips (versus staying in the same place all day) and those influencing the volume of activity on a given day. I account for these by using a two-part hurdle model that allows different explanatory variables (with separately estimated coefficients) for each part. However, because both parts are difficult to predict with a high level of precision, the two-part hurdle model is only slightly improved over a simpler single-equation count model, such as the Negative Binomial.
The sorts of factors used as predictors of trip volume are those available in the NHTS dataset, including: household attributes such as income, household size, homeownership, occupations of household workers; individual attributes such as gender, employment, occupation type, education level, immigrant status, and race/ethnicity; attributes of the built environment such as community type (e.g. urban, suburban), density, and MSA-level transit availability; and day of the week. A potential limitation of the model is the overarching difficulty in predicting individual activity levels on any given day deterministically based solely on these sorts of predictors. That is, actual behavior appears idiosyncratic, and model fit is poor, though not necessarily worse than other disaggregate models of travel behavior. Even with noise in the
model (only 38.2% of cases have predicted values within +/-1 trip of the actual value), there may be meaningful patterns at the aggregate level. That is, aggregating will help cancel out some of the noise evident at the individual level, allowing for overall comparisons of outcomes among groups.
The group of people used as the basis for benchmark mobility levels are those whose vehicle access is “unlimited,” defined as adults over age 18, who drive, and who live in a household with at least as many cars as people excluding non-driving minors (so including driving teens plus all adults, whether or not they drive). (It also includes a small minority who experience medical conditions or handicaps making travel outside the home difficult, and even a very few who report giving up driving as a result of the condition, but who apparently still drive.)
A consideration in light of poor model fit is what might be omitted from the model, especially if it is something that might have a different role among the high-access owners used to calibrate the model versus the low-car households to whom the model will be applied. For instance, Ory and Mokhtarian (2005) have examined the role of overall travel liking. If travel liking were an important predictor of trip volume, and differed — for instance, is generally greater — among the high-access owners, then the benchmark model might systematically overestimate trip volumes for those in low-car households. This sort of bias would be less important if it impacted the entire group of low-car households equally. However, if impacting some more than others, it could potentially bias the sorts of subgroup comparisons explored in Chapter 5, whose focus is to evaluate mobility fulfillment among members of no-car and low-car households.