CHAPTER 3. MULTIPLE CHANGEPOINTS DETECTION OF SPEED TIME
3.3 Data Descriptions
3.3.1 Data Collection and Data Reduction
The major data collection effort in this research was identifying work zone locations within the SHRP 2 data. The RID contains 511 data for most of the states involved in the NDS for the duration of the study (October 2010 to November 2013). The 511 data and the variables collected differed considerably among the states. The main field in the 511 data containing information about potential work zones was the traffic event description. This field was queried for potential work zones using key words such as “road work”, “lane closure”, “construction”, “maintenance”, “cross over”, or “head-to-head”. About two million records had to be searched. The RID did not have 511 data for the state of Indiana, so this state was not included in the analysis.
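The key word query described above can be sketched as follows. This is a minimal illustration, not the procedure actually run against the RID: the record layout, the field name `event_description`, and the choice of case-insensitive substring matching are all assumptions.

```python
import re

# Key words quoted in the text. Case-insensitive matching is an assumption.
KEYWORDS = [
    "road work", "lane closure", "construction",
    "maintenance", "cross over", "head-to-head",
]
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def is_potential_work_zone(description):
    """Return True if the event description contains any key word."""
    return bool(description) and PATTERN.search(description) is not None


def filter_events(records, field="event_description"):
    """Keep records whose description field matches a key word.

    `field` is a hypothetical column name; the 511 schema differs by state.
    """
    return [r for r in records if is_potential_work_zone(r.get(field, ""))]


# Made-up records for illustration only.
sample = [
    {"event_id": 1, "event_description": "Road work on I-80 EB, right lane closed"},
    {"event_id": 2, "event_description": "Crash cleared, all lanes open"},
    {"event_id": 3, "event_description": "Bridge MAINTENANCE, expect delays"},
]
potential = filter_events(sample)
```

Because the 511 fields varied among the states, a real query would likely need a per-state mapping from the local field name to the description text.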
The 511 data also contain information on the beginning and end of traffic events, from which the duration of each event (a work zone, in our case) was calculated. Work zones lasting less than three days were removed because short-term work zones were unlikely to have a sufficient number of NDS time series traces. As a result, 9,290 potential work zones were identified. These work zones were overlaid on NDS trip density data and mapped to the corresponding roadway link IDs in the RID. The locations of the 9,290 potential work zones were sent to VTTI to acquire the number of NDS time series traces, unique drivers, and driver demographic data associated with the links of interest for trips that occurred within the duration of the work zones.
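The duration screen can be illustrated with a short sketch. The field names (`start`, `end`), the time stamp format, and the sample events are hypothetical; only the three-day threshold comes from the text.

```python
from datetime import datetime, timedelta

# Events shorter than three days are dropped, as stated in the text.
MIN_DURATION = timedelta(days=3)


def event_duration(start, end, fmt="%Y-%m-%d %H:%M"):
    """Duration between two time stamp strings (format is an assumption)."""
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


def keep_long_events(events):
    """Keep events lasting at least MIN_DURATION."""
    return [
        e for e in events
        if event_duration(e["start"], e["end"]) >= MIN_DURATION
    ]


# Made-up events for illustration only.
events = [
    {"event_id": "A", "start": "2012-05-01 08:00", "end": "2012-05-02 17:00"},
    {"event_id": "B", "start": "2012-05-01 08:00", "end": "2012-05-04 08:00"},
    {"event_id": "C", "start": "2012-06-10 00:00", "end": "2012-07-15 00:00"},
]
kept = keep_long_events(events)
```

Event A lasts about a day and a half and is removed; event B lasts exactly three days and is retained under the "less than three days" rule.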
VTTI provided a list of potential trips on the links of interest, along with driver information for those trips. The data were examined and work zones with at least 15 potential trips were selected, resulting in 1,680 potential work zones. The next step was to request the time series data associated with these potential work zones. An estimate of the physical extent of each potential work zone was needed to increase the likelihood that the actual work zone was included. For this purpose, the identified roadway links were mapped to the RID and the corresponding links were extracted. The dynamic segmentation function in ArcMap was used to add links upstream and downstream of each identified work zone.
The next step in this extensive data reduction effort was to submit the list of identified link IDs to acquire a sample time series trace and the corresponding forward video for each potential work zone. About 3,000 traces were received, and the forward video was reviewed to determine whether a work zone was actually present. The data collected from the forward videos are shown in Table 3.1.
Table 3.1 Extracted work zone characteristics from forward videos
Presence of work zone (yes or no)
Lane closure: right or left; number of lanes closed
Shoulder closures: right, left, or both
Dynamic message sign
Lane shift
Work zone speed limit
Locations of channelization
Type of channelization
Spatial locations of work zone start and end points
Presence and locations of workers
Presence and locations of equipment
Types and locations of barriers (e.g., barrels)
Active work zone
The criteria used to identify an active work zone included lane closure, shoulder closure, workers present, and equipment present. In some locations where barrels were present along the side of the roadway, the work zone was considered inactive and was excluded. At this stage, two main criteria for requesting the final set of time series data were set and confirmed. The forward videos were used to identify the true beginning and end points of each work zone and to confirm whether the work zone was actually active. A set of 118 coded active work zones, covering various work zone configurations (such as lane closure and shoulder closure) and types (such as multi-lane divided and 4-lane divided), was requested. Around 4,800 time series traces with associated forward and rear video images were received from VTTI. At this stage, traces with more than 25% missing network speed data were removed from the dataset. Missing values in the remaining speed traces were interpolated assuming a constant rate of increase or decrease, that is, linearly between the neighboring observed values. All congested traces were removed, and only traces under free-flow conditions were kept in the analysis. Traces with very poor image quality were also excluded because the vehicle’s position could not be identified or the presence of an active work zone could not be confirmed.
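The missing-data screen and the interpolation step can be sketched as below. This is an illustrative reading of the text rather than the code used in the study: missing samples are represented as `None`, the samples are assumed to be evenly spaced in time, and gaps at the very start or end of a trace are filled by holding the nearest observed value, a choice the text does not specify.

```python
def missing_fraction(speeds):
    """Share of samples in the trace that are missing (None)."""
    return sum(v is None for v in speeds) / len(speeds)


def interpolate_linear(speeds):
    """Fill missing samples assuming a constant increase or decrease
    between the neighboring observed values (evenly spaced samples)."""
    filled = list(speeds)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        return filled
    # Assumption: leading and trailing gaps hold the nearest observed value.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    # Interior gaps: straight line between the bracketing observations.
    for a, b in zip(known, known[1:]):
        step = (filled[b] - filled[a]) / (b - a)
        for i in range(a + 1, b):
            filled[i] = filled[a] + step * (i - a)
    return filled


def clean_trace(speeds, max_missing=0.25):
    """Return the interpolated trace, or None if more than
    `max_missing` of the samples are missing."""
    if not speeds or missing_fraction(speeds) > max_missing:
        return None
    return interpolate_linear(speeds)


# Made-up traces for illustration only.
kept = clean_trace([60.0, None, 64.0, 65.0, None, 63.0, 62.0, 61.0])
dropped = clean_trace([60.0, None, None, None, 64.0])
```

In the first trace two of eight samples (25%) are missing, which does not exceed the threshold, so the gaps are filled. In the second trace 60% of the samples are missing and the trace is discarded.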
The final step of the process was to identify work zone features such as work zone signage, the start of the work zone, the start of the taper, and the start of the work area. Features identified in the forward video were spatially located by noting the nearest video time stamp. The time stamp was then matched, by interpolation, with the time stamps in the time series data. The locations of the features relative to the start of the taper, which was defined as zero, were calculated using the speed of the vehicle. In addition, the position of the vehicle relative to each safety feature was calculated using the same technique.
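One way to carry out the calculation described above is to integrate the vehicle speed over time to obtain the distance traveled, interpolate that distance at each feature's video time stamp, and subtract the distance at the start of the taper. The sketch below assumes speed in m/s, time in seconds, trapezoidal integration, and linear interpolation between samples; the feature names and values are invented for illustration.

```python
def cumulative_distance(times, speeds):
    """Distance traveled at each sample, by trapezoidal integration of speed."""
    dist = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        dist.append(dist[-1] + 0.5 * (speeds[i] + speeds[i - 1]) * dt)
    return dist


def interp(x, xs, ys):
    """Linear interpolation of ys at x; xs must be strictly increasing.
    Values outside the range are clamped to the end points."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + w * (ys[i] - ys[i - 1])


def feature_offsets(times, speeds, feature_times, origin="taper_start"):
    """Location of each feature relative to the origin feature (zero).
    Negative values are upstream of the origin."""
    dist = cumulative_distance(times, speeds)
    origin_d = interp(feature_times[origin], times, dist)
    return {
        name: interp(t, times, dist) - origin_d
        for name, t in feature_times.items()
    }


# Made-up trace: time (s) and speed (m/s, an assumed unit).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
speeds = [30.0, 30.0, 28.0, 26.0, 26.0]
# Hypothetical video time stamps of three features.
feature_times = {"first_sign": 0.5, "taper_start": 2.0, "work_area_start": 3.5}
offsets = feature_offsets(times, speeds, feature_times)
```

The vehicle's position relative to a feature at any sample follows from the same quantities: it is the cumulative distance at that sample minus the interpolated distance at the feature's time stamp.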