

4.6.3 Matching smart card data with bus data

This step describes how the smart card trips are matched with the actual bus trips. As with the trip planner trips, we match the smart card trips using the start location, end location, start time and end time (see Figure 4.19). Unlike the trip planner trips, we cannot use a line number, since none is available. The trip matching process is not an easy task, for multiple reasons:

Several factors influence the punctuality of the bus (see Figure 4.20), so the recorded passage times often differ from the planned times. Furthermore, a passenger may check in after the bus has departed, especially if the bus driver has to make up for a delay and the passenger does not have the smart card at hand. In such cases the check-in time is later than the time of departure. The same applies when a passenger checks out just before arriving at a stop.

The way the bus equipment determines the check-in and check-out stop complicates matters further. Circular geofences with a radius of 15 meters are used to detect whether a bus has arrived at a stop or has departed from it. As soon as a bus leaves this geofence, the current bus stop is set to the next stop. So, if a person checks in when the bus has already departed, the check-in may be registered for the next stop. And if a passenger checks out as soon as the bus departs, the check-out may be registered for the previous stop.

We will match the transaction trips with the actual bus trips using two methods. In both cases we will use the absolute time difference between the departure times and the arrival times as the performance measure. For the bus times we will use the recorded variant where available; otherwise the target times are used. Using target times has consequences, since in only 45% of the bus passages recorded in the dataset the bus is within one minute of the planned time, see Figure 4.2.
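The performance measure can be illustrated with a minimal sketch. The function names `effective_time` and `summed_abs_diff_seconds` are hypothetical, and expressing the difference in seconds is an assumption made for this example only.

```python
from datetime import datetime
from typing import Optional


def effective_time(recorded: Optional[datetime], target: datetime) -> datetime:
    """Return the recorded passage time where available, otherwise the target time."""
    return recorded if recorded is not None else target


def summed_abs_diff_seconds(
    check_in: datetime,
    check_out: datetime,
    bus_departure: datetime,
    bus_arrival: datetime,
) -> float:
    """Sum of |check-in - departure| and |check-out - arrival|, in seconds (assumed unit)."""
    departure_diff = abs((check_in - bus_departure).total_seconds())
    arrival_diff = abs((check_out - bus_arrival).total_seconds())
    return departure_diff + arrival_diff


# Illustrative values only: no recorded departure, so the target time is used.
departure = effective_time(None, datetime(2020, 1, 6, 8, 0, 0))
arrival = effective_time(datetime(2020, 1, 6, 8, 11, 0), datetime(2020, 1, 6, 8, 10, 0))
example_diff = summed_abs_diff_seconds(
    datetime(2020, 1, 6, 8, 0, 30), datetime(2020, 1, 6, 8, 10, 0), departure, arrival
)
```

A lower value indicates a closer agreement between the transaction and the bus trip.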

The first method picks the best result from the bus trips that depart from the same stop and later arrive at the same stop, with a target departure time within an interval of plus or minus 4 hours around the check-in and check-out times. We will use this method as a baseline: it is not likely that a traveler checks in 4 hours before or after the bus departs from a stop, but the method is useful for validating the next method.
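A minimal sketch of this baseline is given below. It assumes each candidate has already been flattened to one record per bus trip and origin-destination stop pair. The `BusTrip` and `Transaction` classes and their field names are hypothetical, and applying the 4-hour window to both the check-in and the check-out time is one possible reading of the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BusTrip:  # hypothetical record: one trip between one pair of stops
    trip_id: str
    origin_stop: str
    destination_stop: str
    target_departure: datetime
    departure: datetime  # recorded where available, otherwise target
    arrival: datetime    # recorded where available, otherwise target


@dataclass
class Transaction:  # hypothetical smart card trip
    origin_stop: str
    destination_stop: str
    check_in: datetime
    check_out: datetime


def cost(tx: Transaction, trip: BusTrip) -> float:
    """Summed absolute time difference in seconds."""
    return (abs((tx.check_in - trip.departure).total_seconds())
            + abs((tx.check_out - trip.arrival).total_seconds()))


def match_baseline(tx: Transaction, trips: List[BusTrip],
                   window: timedelta = timedelta(hours=4)) -> Optional[BusTrip]:
    """Pick the lowest-cost trip between the same stops within the time window."""
    candidates = [
        t for t in trips
        if t.origin_stop == tx.origin_stop
        and t.destination_stop == tx.destination_stop
        and t.departure < t.arrival  # the destination is served after the origin
        and abs(t.target_departure - tx.check_in) <= window
        and abs(t.target_departure - tx.check_out) <= window
    ]
    return min(candidates, key=lambda t: cost(tx, t), default=None)
```

The function returns `None` when no bus trip between the two stops falls inside the window.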

The second method also picks the best result from the bus trips that depart from the same stop and later arrive at the same stop. However, instead of using the 8-hour period, the check-in and check-out must have happened between the departure time at the previous stop and the departure time at the next stop. The results of this method still need a constraint afterwards, because the departure times at the previous stop and at the next stop are sometimes based on the planned time, and sometimes on an arbitrary interval of 30 minutes, instead of the recorded time (see also paragraph 4.2.3). Therefore, a constraint is needed to limit the maximum time difference in order to reduce faulty matches and thus noise. We will determine this limit using the elbow method.
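The elbow method can be sketched as follows. The text does not state how the elbow is located, so this example assumes a simple heuristic: count the matches that fall within each candidate limit and pick the limit where the normalised curve lies furthest above the straight line between its end points. The function names and the toy values are hypothetical.

```python
from typing import List, Sequence


def matches_within(diffs: Sequence[float], limits: Sequence[float]) -> List[int]:
    """Number of matches whose summed time difference is at most each limit."""
    return [sum(1 for d in diffs if d <= limit) for limit in limits]


def elbow_index(xs: Sequence[float], ys: Sequence[float]) -> int:
    """Index where the normalised curve is furthest above the chord between its ends."""
    x_span = (xs[-1] - xs[0]) or 1.0
    y_span = (ys[-1] - ys[0]) or 1.0
    best_index, best_gap = 0, float("-inf")
    for i, (x, y) in enumerate(zip(xs, ys)):
        gap = (y - ys[0]) / y_span - (x - xs[0]) / x_span
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


# Toy values for illustration only, not taken from the dataset.
toy_diffs = [1, 1, 2, 2, 3, 3, 4, 20, 30]
toy_limits = [0, 2, 4, 6, 8, 10, 20, 30]
toy_counts = matches_within(toy_diffs, toy_limits)
toy_limit = toy_limits[elbow_index(toy_limits, toy_counts)]
```

In the toy example most matches are gained at small limits, after which the curve flattens; the heuristic selects the point where that flattening starts.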

In theory, a check-in time or a check-out time after a passage is not possible, since the bus would most likely have left the 15-meter geofence and the transaction would be recorded for the next stop. However, because both systems are independent and keep their own time, we will allow matches within a margin to allow for errors.
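The window test of the second method, including a tolerance for clock differences, could look like the sketch below. The function name `within_stop_window` and the `margin` parameter are hypothetical; the size of the margin is not specified here and is left to the caller.

```python
from datetime import datetime, timedelta


def within_stop_window(
    event_time: datetime,
    previous_stop_departure: datetime,
    next_stop_departure: datetime,
    margin: timedelta = timedelta(0),
) -> bool:
    """True if a check-in or check-out falls between the departure at the previous
    stop and the departure at the next stop, widened on both sides by the margin."""
    return (previous_stop_departure - margin
            <= event_time
            <= next_stop_departure + margin)
```

With a zero margin the test is strict; a positive margin accepts transactions that fall slightly outside the window because the two systems keep their own time.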

In case a transaction matches equally well with multiple bus trips, we use the following order of preference to pick the best:

1. check-in time and check-out time are both earlier than their respective bus departure time,

2. check-in time is later and check-out time is earlier than their respective bus departure time,

3. check-in time is earlier and check-out time is later than their respective bus departure time,

4. both check-in time and check-out time are later than their respective bus departure time.
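The order of preference above can be expressed as a rank, where a lower rank is preferred. This is a minimal sketch: the function name and arguments are hypothetical, and treating a time equal to the bus departure time as "not later" is an assumption.

```python
from datetime import datetime


def tie_break_rank(
    check_in: datetime,
    check_out: datetime,
    bus_departure_at_origin: datetime,
    bus_departure_at_destination: datetime,
) -> int:
    """Return 1 to 4 following the order of preference; lower is better."""
    check_in_late = check_in > bus_departure_at_origin
    check_out_late = check_out > bus_departure_at_destination
    if not check_in_late and not check_out_late:
        return 1  # both earlier
    if check_in_late and not check_out_late:
        return 2  # check-in later, check-out earlier
    if not check_in_late and check_out_late:
        return 3  # check-in earlier, check-out later
    return 4      # both later
```

Among bus trips with the same time difference, the one with the lowest rank would be selected, for example with `min(candidates, key=...)`.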

The matching of the transactions is not biased towards the date, hour, hour type or date type, as is shown in the table in Appendix L (Matching trip planner trips to bus trips). However, some origins, destinations, OD pairs and travel times are less present after the matching. Furthermore, method 1 and method 2 show similar characteristics, where method 2 has the most matches. However, the quality of these matches is less assured because of the less constrained matching method. Thus, we will use the matches from method 2 with a maximum summed time difference of 10, as seen in Figure 4.24.

Figure 4.24: The number of transactions matched relative to the maximum summed absolute difference between departure time and check-in time and between arrival time and check-out time
