Chapter 4 Methodological Framework
5.2 Data Cleaning, Coding, and Manipulation
A sizeable proportion of respondents in the BSA datasets from 2011 to 2014 declared themselves non-car users at the time of completing the survey, which created some redundancy in the datasets. Because this research focussed on the attitudes of car users, whether as drivers or passengers, these respondents were removed. The data available for the car users were grouped into three types, namely socio-demographic characteristics, travel behaviour, and travel attitudes, based on the questionnaires completed during the annual surveys. At the initial stage of the analysis, responses such as “not answered”, “skip this question” and “can’t choose” were removed during the data cleaning process, bringing the final dataset to a total of 1509 car users for the four-year period commencing in 2011. Removing these data was considered not to cause bias, on the assumption that such responses occurred at random. Each questionnaire was checked and then assigned a serial number, and Table 5.1 shows the coding system that was used for each variable in the dataset. These records were retained in a database, which formed the basis for all statistical analyses.
Coding Variables /Questions Categories /Answer coding
Serial Serial Number -
HH# Number living in household, including respondent
1 = One, 2 = Two,
Drive May I just check, do you yourself drive a car at all these days?
1 = Yes, 2 = No.
Cong_MWs How serious a problem for you is congestion on motorways?
1 = A very serious problem, 2 = A serious problem,
3 = Not a very serious problem, 4 = Not a problem at all.
Cong_cities How serious a problem for you is traffic congestion in towns and cities?
1 = A very serious problem, 2 = A serious problem,
3 = Not a very serious problem, 4 = Not a problem at all.
Exhaustfumes How serious a problem for you are exhaust fumes from traffic in towns
Car_driver How often nowadays do you usually travel by car as a driver?
Car_passenger How often nowadays do you usually travel by car as a passenger?
Bus_usage How often nowadays do you usually travel by local bus?
1 = Every day or nearly every day, 2 = 2-5 days a week,
3 = Once a week,
4 = Less often but at least once a month,
Train_usage How often nowadays do you usually
Bike_usage How often nowadays do you usually travel by bicycle?
3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
ReducTravCar I am willing to reduce the amount I travel by car (To help reduce the impact of CC).
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
CCView View on climate change and causes.
1 = I don't believe that CC is taking place,
CartoWalk Many of the short journeys that I now make by car I could just as easily walk.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
CartoBus Many of the short journeys that I now make by car I could just as easily go by bus.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
CartoBike Many of the short journeys that I now make by car I could just as easily cycle.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
HiTaxforCarUse For the sake of the environment, car users should pay higher taxes.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
AllowCarUse People should be allowed to use their cars as much as they like, even if it causes damage to the environment.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
ReducCarUse For the sake of the environment, everyone should reduce how much they use cars.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
ReducCarUse_NP There is no point in reducing my car use to help the environment unless others do the same.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
CarBetterPayLess People who drive cars that are better for the environment should pay less to use roads.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
CycDang It is too dangerous for me to cycle on the roads.
1 = Agree strongly, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree, 5 = Disagree strongly.
Bike_own Bike ownership.
1 = Own bicycle yourself, 2 = Have regular use of a bicycle owned by someone else,
3 = Have no regular use of a bicycle.
Bike_ride Have you ridden a bicycle during the last 12 months?
1 = Yes, 2 = No.
*CC = Climate change, FT = Full time, PT = Part time
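The cleaning step described before Table 5.1 can be illustrated with a minimal sketch. It is not the procedure actually used in this research: the `car_user` flag, the record layout and the exact spelling of the non-substantive response labels are assumptions made for illustration, and the sample records are invented. The sketch drops non-car users and any record containing a non-substantive answer, then assigns serial numbers to the records that remain.

```python
# Assumed labels for non-substantive answers (spelling is illustrative).
NON_SUBSTANTIVE = {"not answered", "skip this question", "can't choose"}


def is_complete(record):
    """True when no answer in the record is a non-substantive response."""
    return not any(
        isinstance(value, str) and value.strip().lower() in NON_SUBSTANTIVE
        for value in record.values()
    )


def clean(records):
    """Keep car users with complete answers and assign serial numbers."""
    # `car_user` is a hypothetical flag marking drivers or passengers.
    kept = [r for r in records if r.get("car_user") and is_complete(r)]
    # Serial numbers are assigned to the retained records, starting at 1.
    return [{"Serial": i, **r} for i, r in enumerate(kept, start=1)]


# Invented sample records using variable names from Table 5.1.
sample = [
    {"car_user": True, "Cong_MWs": 2, "CartoWalk": 4},
    {"car_user": False, "Cong_MWs": 1, "CartoWalk": 2},
    {"car_user": True, "Cong_MWs": "Can't choose", "CartoWalk": 3},
    {"car_user": True, "Cong_MWs": 3, "CartoWalk": 1},
]
cleaned = clean(sample)
print(len(cleaned), [r["Serial"] for r in cleaned])
```

Dropping whole records in this way is only defensible under the assumption stated above, namely that the non-substantive responses occurred at random.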
It was important not to over-parameterise the multivariate probit model (MPM) set up in Chapter 8. Therefore, to avoid this problem, the original employment status categories were grouped and recoded into four new variables. Table 5.2 presents the new codes for employment status, and each variable is described as follows:
1. In work: takes the value 1 if the person is in work (current categories 1 – 5), and -1 otherwise (current categories 6 – 11).
2. Employee: takes the value 1 for current categories 1 and 2, -1 for current categories 3 and 4, and 0 for current categories 5 – 11.
3. Full time: takes the value 1 for current categories 1 and 3, -1 for current categories 2 and 4, and 0 for current categories 5 – 11.
4. Non-employed status: treated as a three-category variable for those not in work. Category 1 comprises current categories 6 and 7 (unemployed and waiting to take up work), category 2 comprises current categories 8 and 9 (looking after the home and retired), and category 3 comprises current categories 10 and 11 (in full-time education and other). Respondents in current categories 1 – 5 take the value 4.
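The recoding rules listed above can be expressed as a short function. This is a sketch only: the function name and the output keys (`in_work`, `employee`, `full_time`, `non_employed`) are hypothetical labels chosen for illustration, and the input is assumed to be the original employment status code, an integer from 1 to 11.

```python
def recode_employment(status):
    """Map an original employment code (1-11) to the four derived variables."""
    if status not in range(1, 12):
        raise ValueError(f"unexpected employment code: {status!r}")

    # In work: 1 for original codes 1-5, -1 for codes 6-11.
    in_work = 1 if status <= 5 else -1
    # Employee and full-time contrasts apply only to original codes 1-4;
    # every other code receives 0.
    employee = {1: 1, 2: 1, 3: -1, 4: -1}.get(status, 0)
    full_time = {1: 1, 3: 1, 2: -1, 4: -1}.get(status, 0)
    # Non-employed status: 1 = codes 6-7, 2 = codes 8-9, 3 = codes 10-11,
    # and 4 for anyone in work (codes 1-5).
    non_employed = 4 if status <= 5 else (status - 6) // 2 + 1

    return {
        "in_work": in_work,
        "employee": employee,
        "full_time": full_time,
        "non_employed": non_employed,
    }


# Print the derived values for every original code.
for code in range(1, 12):
    print(code, recode_employment(code))
```

Coding the employee and full-time contrasts as 1, -1 and 0 means that respondents outside original categories 1 to 4 contribute nothing to those two terms, which is consistent with the stated aim of limiting the number of parameters in the MPM.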
No. Variable Coding Description
Table 5.2: Manipulation of employment status variable