2.6. Data sources and methods

2.6.2. Methods

To evaluate changes in travel behaviour and other impacts following the introduction of car sharing services, different methodologies were adopted, depending on the available data source and the aim of the analysis. The adopted approaches can be grouped into statistical methods (both descriptive and inferential), random-utility based models, data mining techniques, simulation and optimization approaches.

Statistical approaches

Descriptive statistics are the most widely adopted technique and were often used to perform a preliminary analysis of the sample (Becker et al., 2017a; Cervero, 2003; Costain et al., 2012b, 2012a; Dill et al., 2019; Guirao et al., 2018; Lane, 2005; Shaheen et al., 2015). Statistical inference, on the other hand, allows conclusions to be drawn about unknown properties of a population from a random sample of that population (Alonso-Almeida, 2019; Burghard and Dütschke, 2019; Celsor and Millard-Ball, 2007; Clark et al., 2015; Seign et al., 2015). Several authors adopted these techniques to understand the differences between two samples of the population. For instance, Cooper et al. (Cooper et al., 2000) used descriptive statistics to show differences between the socio-economic characteristics and travel behaviour of citizens of Portland and those of car sharing members of the same city. Martin and Shaheen (Martin and Shaheen, 2016) analysed data on car2go members to highlight the effect of joining car sharing. Becker et al. (Becker et al., 2018) adopted a difference-in-difference approach to estimate changes in car ownership, comparing a sample of car sharing members with a control group. Shaheen et al. (Shaheen et al., 2006) described the market dynamics of car sharing services in North America.
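The difference-in-difference logic mentioned above can be illustrated with a minimal sketch. It is not the estimation procedure of any cited study: the function name and the ownership figures are hypothetical, chosen only to show how the change in a control group is subtracted from the change among members.

```python
def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Hypothetical mean number of cars per household (illustrative values only).
members_before, members_after = 0.90, 0.60   # car sharing members
control_before, control_after = 0.95, 0.90   # control group

effect = difference_in_differences(members_before, members_after,
                                   control_before, control_after)
print(f"Estimated effect on cars per household: {effect:.2f}")
```

Under these assumed figures, the part of the decline among members that is also observed in the control group is not attributed to car sharing membership.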

Several authors adopted linear (Lempert et al., 2019) and logistic regressions (de Lorimier and El-Geneidy, 2012; Hu et al., 2018b). Nobis (Nobis, 2006) used linear regression to identify the variables related to car sharing acceptance, and logistic regression to model monomodal and multimodal behaviours. Similarly, Zheng et al. (Zheng et al., 2009) developed logistic regression models to evaluate willingness to participate in two car sharing plans. Ko et al. (Ko et al., 2019) and Le Vine and Polak (Le Vine and Polak, 2019) used logistic regressions to analyse impacts on car ownership.

Other authors adopted factor analysis (Nobis, 2006). For instance, Efthymiou et al. (Efthymiou et al., 2013) used this technique to synthesize the variables affecting the decision to own a car in Greece. Kim et al. (Kim et al., 2015) applied factor analysis to identify the factors influencing satisfaction with the current electric car sharing program in Seoul. Similarly, in the same city, Ko et al. (Ko et al., 2019) adopted this method to create variables describing the level of satisfaction with car sharing, which were then introduced in car disposal and purchase models.

Random-Utility based Models

Starting from the work of McFadden (McFadden, 1974), models based on Random Utility Maximization theory have been extensively adopted to understand and simulate the travel behaviour of users (Tang et al., 2015; Yamamoto et al., 2007), in particular the multinomial logit (MNL) (Hagenauer and Helbich, 2017; Moons et al., 2007; Sekhar et al., 2016; Xie et al., 2007). However, these models rest on several statistical and mathematical assumptions about the data used to calibrate them (Chen et al., 2018; Yamamoto et al., 2007); if these assumptions are violated, errors in parameter estimation might occur, leading to biased predictions (Chen et al., 2018; Hagenauer and Helbich, 2017; Lindner et al., 2017; Xie et al., 2007). In particular, MNL requires independence of irrelevant alternatives (IIA) (Chen et al., 2018; Hagenauer and Helbich, 2017; Lindner et al., 2017; Tang et al., 2015; Xie et al., 2007) and assumes that the effects of attributes are compensatory (Xie et al., 2007; Yamamoto et al., 2007). Several models, such as probit models (Train, 2003), were developed to overcome these limitations. Furthermore, to introduce correlation effects among alternatives, nested logit, cross-nested logit, ordered generalized extreme value and mixed logit models were implemented (Zhu et al., 2018).
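As a minimal sketch of what the IIA property implies, the following computes MNL choice probabilities from the standard formula P_i = exp(V_i) / sum_j exp(V_j). It then checks that the ratio between two alternatives does not change when a third alternative is removed. The systematic utilities and alternative names are hypothetical and do not come from any cited study.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit: P_i = exp(V_i) / sum_j exp(V_j)."""
    v_max = max(utilities.values())  # subtract the maximum for numerical stability
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}


# Hypothetical systematic utilities (illustrative values only).
utilities = {"car": 0.5, "public_transport": 0.0, "car_sharing": -0.8}

p_full = mnl_probabilities(utilities)

# Remove car sharing from the choice set and recompute the probabilities.
reduced = {alt: v for alt, v in utilities.items() if alt != "car_sharing"}
p_reduced = mnl_probabilities(reduced)

# IIA: the car / public transport odds depend only on their own utilities.
ratio_full = p_full["car"] / p_full["public_transport"]
ratio_reduced = p_reduced["car"] / p_reduced["public_transport"]
print(ratio_full, ratio_reduced)
```

Both ratios equal exp(V_car - V_public_transport), regardless of which other alternatives are available. This is the restriction that nested, cross-nested and mixed logit models relax.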

When the outcome to be predicted was binary, several authors adopted binomial logit or probit models. For example, Cervero (Cervero, 2003), Cervero and Tsai (Cervero and Tsai, 2004) and Cervero et al. (Cervero et al., 2007) used binomial logits to model car ownership and car sharing usage. Costain et al. (Costain et al., 2012b, 2012a) adopted this type of model to understand the decision of car sharing members to buy carbon offsetting and collision deductible in Toronto. Habib et al. (Habib et al., 2012) and Morency et al. (Morency et al., 2012) used a binary probit to estimate the probability of being an active member in any given month in Montreal. Becker et al. (Becker et al., 2017a) and Juschten et al. (Juschten et al., 2017) developed binomial logit models to predict car sharing membership in Basel and in Switzerland, respectively. With the same aim, Becker et al. (Becker et al., 2017c) introduced latent variables in a multivariate probit model, which simultaneously models multiple correlated binary outcomes. Dill et al. (Dill et al., 2019) estimated a binomial logit to predict whether car owners participating in peer-to-peer car sharing reduced the use of their vehicle.

On the other hand, if the outcome was an ordered variable, ordered logit and probit models were used. For instance, Efthymiou et al. (Efthymiou et al., 2013) adopted an ordered logit model to understand the satisfaction of young Greek travellers with their current travel patterns. Kim et al. (Kim et al., 2015) analysed the willingness to use electric car sharing vehicles and to dispose of a private car through an ordered probit. Efthymiou and Antoniou (Efthymiou and Antoniou, 2016) implemented an ordered logit, including latent variables, to model the willingness of young Greeks to join car sharing. Becker et al. (Becker et al., 2017a) developed an ordered probit to estimate the frequency of use of car sharing in Basel. To evaluate the frequency of use of car sharing in Washington State, Dias et al. (Dias et al., 2017) adopted a bivariate ordered probit model. Recently, Ko et al. (Ko et al., 2019) implemented an ordered probit to study car ownership changes after participating in car sharing in Seoul.
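A short sketch may help clarify how an ordered probit maps a linear index onto ordered categories: P(y = k) = Phi(tau_k - xb) - Phi(tau_(k-1) - xb), where Phi is the standard normal cumulative distribution function. The thresholds, the index value and the category labels below are hypothetical and are not estimates from any of the studies cited above.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit_probabilities(linear_index, thresholds):
    """Category probabilities for an ordered probit.

    thresholds must be strictly increasing; K thresholds give K + 1 categories.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf(c - linear_index) for c in cuts]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]


# Hypothetical values: four frequency-of-use categories
# (e.g. never, rarely, monthly, weekly) and one respondent's linear index.
probs = ordered_probit_probabilities(linear_index=0.4,
                                     thresholds=[-0.5, 0.3, 1.2])
print([round(p, 3) for p in probs])
```

An ordered logit follows the same structure, with the logistic distribution function in place of the normal one.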

When the variable to predict was categorical with multiple levels, multinomial logit models were developed. In San Francisco, Cervero et al. (Cervero et al., 2006) analysed car sharing adoption by including this mode in an MNL. Catalano et al. (Catalano et al., 2008) adopted an MNL to forecast mode choice in the city of Palermo, Italy, considering four alternatives: private car, public transport, car sharing and car pooling. Costain et al. (Costain et al., 2012b, 2012a) developed an MNL to analyse the choice of vehicle type by car sharing members in Toronto. Carroll et al. (Carroll et al., 2017) used an MNL to model the mode choice of citizens of Dublin, considering private car, car sharing and car pooling.

Other authors adopted a mixed logit model, in which some coefficients of the variables in the utility function are modelled as random variables. Rotaris et al. (Rotaris et al., 2019) used a mixed logit to understand how the adoption of car sharing among college students in Milan and Rome, Italy, would change when the attributes of the current service are varied. Zoepf and Keith (Zoepf and Keith, 2016) compared the performance of MNL and mixed logit in quantifying how members trade off service attributes in car sharing reservation decisions. Some authors adopted other types of logit models, such as nested logit. Winter et al. (Winter et al., 2017) used a nested logit to model mode choice in the Netherlands, also considering shared autonomous vehicles. De Luca and Di Pace (de Luca and Di Pace, 2015) used MNL, cross-nested logit and mixed logit to forecast the effects of an inter-urban car sharing service near Salerno, Italy.
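The idea of a random coefficient in a mixed logit can be sketched by simulation: logit probabilities are averaged over draws of the coefficient. The example assumes a single cost attribute with a normally distributed coefficient. The attribute values, distribution parameters and function names are hypothetical, and this is only an illustration of the probability calculation, not an estimation routine.

```python
import math
import random


def logit_probabilities(utilities):
    """Standard logit probabilities for one set of systematic utilities."""
    v_max = max(utilities.values())
    exp_v = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}


def mixed_logit_probabilities(costs, mean_beta, sd_beta, n_draws=2000, seed=0):
    """Simulated mixed logit probabilities with beta ~ Normal(mean_beta, sd_beta)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = {alt: 0.0 for alt in costs}
    for _ in range(n_draws):
        beta = rng.gauss(mean_beta, sd_beta)  # one draw of the cost coefficient
        probs = logit_probabilities({alt: beta * c for alt, c in costs.items()})
        for alt, p in probs.items():
            totals[alt] += p
    return {alt: t / n_draws for alt, t in totals.items()}


# Hypothetical trip costs per alternative (illustrative values only).
costs = {"car": 4.0, "public_transport": 2.0, "car_sharing": 3.0}
simulated = mixed_logit_probabilities(costs, mean_beta=-0.5, sd_beta=0.3)
print({alt: round(p, 3) for alt, p in simulated.items()})
```

Setting `sd_beta` to zero reduces this sketch to a plain MNL, which makes it easy to see that the random coefficient is the only difference between the two specifications here.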

Data mining

Overcoming the drawbacks described above (Lindner et al., 2017), data mining techniques do not require any statistical or mathematical assumptions about the data structure (Chang and Chen, 2005; Tang et al., 2015; Thill and Wheeler, 2007; Zhang et al., 2017). Furthermore, they have a more flexible structure than traditional logit models (Tang et al., 2015; Wang and Ross, 2018; Xie et al., 2007; Yamamoto et al., 2007; Zenina et al., 2018), extracting significant patterns from the dataset and leading to a deeper understanding of the relationships among explanatory variables (Chang and Chen, 2005; Chen et al., 2018; Hagenauer and Helbich, 2017; Lindner et al., 2017; Pitombo et al., 2011; Tang et al., 2015; Xie et al., 2007; Yamamoto et al., 2007; Zenina et al., 2018). Moreover, they can be easily applied to large databases (Zhu et al., 2018), even with highly unbalanced data (Wang and Ross, 2018).

On the other hand, results that are useful for planning and forecasting purposes and that are commonly derived through an econometric approach, such as the Value Of Time and demand elasticities, cannot be obtained from such techniques, which are also very sensitive to training data (Zhu et al., 2018). Furthermore, they often lack interpretability, since they tend to focus more on predictive accuracy than on counterfactual analysis (Waddell and Besharati-Zadeh, 2019), although some authors have recently been able to extract interpretable economic information, such as elasticities, from a data mining approach (Wang and Zhao, 2018).

In transportation analysis, data mining techniques were mostly used to reproduce existing scenarios (Pitombo et al., 2011; Wang and Kim, 2019), modelling users’ choices based on current conditions and options (Yamamoto et al., 2007; Zhang et al., 2017). Although traditional mode choice models are based on random utility maximization theory, data mining techniques have also been used to predict the future travel behaviour of users and, in particular, the mode choices of travellers (Pitombo et al., 2015). Following this approach, mode choice can be defined as a pattern recognition task in which multiple behavioural attributes, described by explanatory variables, determine the prediction of the choice among different alternatives (Pitombo et al., 2015; Xie et al., 2007). Therefore, a data mining approach can be adopted for modal analysis and prediction.

For example, Morency et al. (Morency et al., 2007) used cluster analysis to identify different types of car sharing users in Montreal. Schmöller et al. (Schmöller et al., 2015) adopted the same technique to define groups of days with similar spatial booking patterns in Munich and Berlin, Germany. Lee et al. (Lee et al., 2016) applied Association Rules to show relationships among variables related to a car sharing service in Cagliari, Italy (such as rate plan and vehicle type). Wang et al. (Wang et al., 2017) implemented a hierarchical tree-based regression to understand the factors affecting the choice to join an electric car sharing service in China. With a similar aim, Hu et al. (Hu et al., 2018b) applied a Random Forest in Shanghai, China.

Simulation

Some of the approaches described above were introduced as mode choice models in travel simulators, in order to estimate travel demand and its effects on traffic networks. For instance, Rodier and Shaheen (Rodier and Shaheen, 2003) introduced the car sharing option in a four-step model of the Sacramento region, California, in order to estimate travel demand, emissions and economic benefits for travellers. Li et al. (Li et al., 2018) developed an Activity-Based model, including a One-way Free-floating service, to model the dynamic choice of car sharing. Ciari et al. (Ciari et al., 2014) used an Agent-Based model to evaluate the use of One-way Free-floating and the existing One-way Station-based services in Berlin. Fagnant and Kockelman (Fagnant and Kockelman, 2014) adopted an Agent-Based simulation technique to study the impacts of shared autonomous vehicles on car ownership and traffic pollution. Ciari et al. (Ciari et al., 2015) implemented an Activity-based Multi-Agent model to estimate the effects on travel demand of different pricing strategies of a car sharing service in Zurich. Heilig et al. (Heilig et al., 2017) developed a combined destination and mode choice model that included car sharing, in order to introduce it in an Agent-Based model of Stuttgart, Germany. Martínez et al. (Martínez et al., 2017) used the same type of model to simulate the daily operation of a car sharing service in Lisbon.

Optimization approaches

Some authors adopted optimization techniques to solve problems related to car sharing services and operations. For instance, Correia and Antunes (Correia and Antunes, 2012) developed an optimization model for depot location under different trip selection schemes for a One-way car sharing operator in Lisbon, maximizing its profits. Later, Correia et al. (Correia et al., 2014) improved the previous work by considering the user’s flexibility in choosing a car sharing station. Jorge et al. (Jorge et al., 2015a) developed an algorithm to optimize a car sharing system that can work both as Round-trip and One-way in the case of a specific generator of high travel demand, and applied their model to the airport of Boston. Jorge et al. (Jorge et al., 2015b) adopted an optimization approach to design the best trip pricing for a One-way car sharing service in Lisbon, in order to maximize its profits. Recently, considering time-dependent and uncertain travel demand, Hua et al. (Hua et al., 2019) proposed a model to optimize both the long-term and the real-time operations of an electric vehicle car sharing operator in New York. The former relate to infrastructure planning, such as the location of charging stations and fleet distribution, whereas the latter concern fleet operations, such as the relocation of cars and charging decisions.