CHAPTER 2: HALOGEN SUBSTITUTION PATTERNS AMONG DISINFECTION
3.3 ANALYSIS PROCEDURES
3.3.3 Data Screening
The data review conducted for this work addressed both questionable and missing records, encompassing Aux 1 monthly plant and treatment data and water quality results for influent through finished water samples. Questionable data were flagged, and missing data for categorical variables were replaced with appropriate values where possible (as described below). In a few cases, questionable numerical data were replaced with corrected values using methods described in the Results section. The resulting flags, replacement values, and explanations were compiled in a project quality assurance table.
Due to the presence of a few unrealistically high reported bromide concentrations, all bromide results from plants having any bromide result above 1 mg/L were screened. TOC data were examined for all plant/months where reported TOC concentration at the first point of chlorine addition was 1 mg/L higher than at the plant influent. Similarly, UV254 data were examined where this difference exceeded 0.5 cm⁻¹. Aux 1 values for both UV254 and TOC are averages of reported duplicate sample results, and validated ICR data had to meet a 20% sample pair relative percent difference limit for these analytes. Assuming this would reduce the chance of utility data entry error, anomalous Aux 1 data for these parameters were flagged only in cases of extreme discrepancy.
All Aux 1 pH results below 4.0 or above 11.0 were examined to determine their plausibility. pH values below 2.0, considered implausible for full-scale treatment, were flagged. Otherwise, results were compared with upstream and downstream values, and with results for the same location in other sampling periods.
Unit process flow and volume records were screened as a quality assurance measure for derived chlorine contact time estimates. For each distinct process at each plant, the relative ranges for flow and volume were calculated as (maximum value - minimum value)/(average value). Data were examined further if the relative range for flow or volume exceeded 1.0 for a unit process.
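The relative range screen and the pH limits described above can be illustrated with a minimal Python sketch. The function names and the sample flow values are hypothetical and are not taken from the project data set; only the formula, the 1.0 threshold, and the pH limits of 2.0, 4.0, and 11.0 come from the text.

```python
from statistics import mean


def relative_range(values):
    """Return (maximum - minimum) / average for one unit process."""
    avg = mean(values)
    if avg == 0:
        return None  # the ratio is undefined when the average is zero
    return (max(values) - min(values)) / avg


def needs_review(values, threshold=1.0):
    """True when the relative range exceeds the screening threshold."""
    rr = relative_range(values)
    return rr is not None and rr > threshold


def ph_screen(ph):
    """Triage one pH result using the limits stated in the text."""
    if ph < 2.0:
        return "flag"      # considered implausible for full-scale treatment
    if ph < 4.0 or ph > 11.0:
        return "examine"   # compare with upstream/downstream and other periods
    return "ok"


# Hypothetical monthly flows for two unit processes (illustrative only).
steady_flow = [10.0, 11.0, 9.0, 10.0]
erratic_flow = [10.0, 1.0, 25.0, 12.0]

print(needs_review(steady_flow))   # relative range 0.2, below the threshold
print(needs_review(erratic_flow))  # relative range 2.0, above the threshold
print(ph_screen(1.5), ph_screen(3.2), ph_screen(7.8))
```

A relative range above 1.0 means the spread of reported values is larger than their average, which is why it works as a coarse trigger for closer examination rather than as an automatic flag.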
Null (missing) values for categorical variables describing source water type (MSRC_CAT), plant disinfectant type (WTP_DIS), and distribution system disinfectant type (DS_DIS) were traceable to deviations from various Aux 1 requirements for underlying primary information. Appropriate replacement values for these entries could usually be determined with confidence based on underlying water resource and process information. Additionally, some Aux 1 WTP_DIS and DS_DIS entries that were incorrect due to faulty underlying disinfectant process information were replaced with corrected values. Data were also examined wherever MSRC_CAT, WTP_DIS, DS_DIS, or MWTPTYPE (treatment plant type) values varied across the 18-month ICR period for an individual plant.
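The check for categorical values that vary across the ICR period could be sketched as follows. This is an illustration only: the record layout, the `plant_id` key, and the "CONV" code are assumptions made for the example ("OTH" is the only category code named in this chapter), and the actual review may have been carried out with different tools.

```python
from collections import defaultdict


def plants_with_varying_category(records, field):
    """Return {plant_id: sorted distinct values} for plants where `field` changed."""
    seen = defaultdict(set)
    for rec in records:
        value = rec.get(field)
        if value is not None:  # nulls are treated separately as missing values
            seen[rec["plant_id"]].add(value)
    return {plant: sorted(vals) for plant, vals in seen.items() if len(vals) > 1}


# Hypothetical monthly records (plant IDs and the "CONV" code are invented).
records = [
    {"plant_id": 101, "month": 1, "MWTPTYPE": "CONV"},
    {"plant_id": 101, "month": 2, "MWTPTYPE": "OTH"},
    {"plant_id": 101, "month": 3, "MWTPTYPE": "CONV"},
    {"plant_id": 202, "month": 1, "MWTPTYPE": "CONV"},
    {"plant_id": 202, "month": 2, "MWTPTYPE": None},
]

varying = plants_with_varying_category(records, "MWTPTYPE")
print(varying)  # only plant 101 reports more than one category
```

Plants returned by such a check are candidates for closer examination, not automatic corrections, since a category can legitimately change during the sampling period.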
Graphical analysis was used to aid in reviewing water quality and treatment information. Plots of monthly data were scanned for unusual fluctuations in water quality or disinfectant dose across sampling months and data were examined more closely as needed. Plots of individual plant/month data, illustrating patterns of water quality data and disinfectant doses through the process train, were used to assess data consistency within and between sampling months.
3.4 RESULTS
3.4.1 Data Screening
Results of the data screening and review effort are summarized in Table 3.1, which lists the total number of Aux 1 records and the number and percent of entries flagged or replaced for each parameter considered.
Table 3.1 Aux 1 Data Screening and Review Results
Parameter             Number of Aux 1 records   Number of records replaced (a)   Number of records flagged (b)   % of records affected
Bromide               8,720                     12                               —                               0.13
pH                    50,350                    —                                20                              0.04
TOC                   31,895                    —                                37                              0.12
UV254                 31,930                    1                                32                              0.10
MWTPTYPE              8,953                     3                                —                               0.03
MSRC_CAT              8,953                     482                              —                               5.4
WTP_DIS               8,953                     1,797                            —                               20.1
DS_DIS                8,953                     227                              —                               2.5
Chlorine dose         14,489                    —                                296                             2.0
Ammonia dose          3,030                     —                                25                              0.8
Residual free Cl2     30,139                    —                                164                             0.5
Residual total Cl2    31,490                    —                                101                             0.3
Unit process volume   39,595                    —                                217                             0.5
Unit process flow     43,000                    —                                491                             1.1
(a) Missing or questionable value replaced in analytical data set.
(b) Value tagged as questionable in analytical data set.
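As a worked check on the last column of Table 3.1, the short Python sketch below recomputes the percentage for a few rows. It assumes that "% of records affected" equals (replaced + flagged) divided by the number of Aux 1 records, a definition the table does not state explicitly; the function name is hypothetical.

```python
def percent_affected(total, replaced=0, flagged=0):
    """Percent of records either replaced or flagged (assumed definition)."""
    return 100.0 * (replaced + flagged) / total


# Counts copied from Table 3.1: (records, replaced, flagged).
rows = {
    "pH": (50350, 0, 20),
    "UV254": (31930, 1, 32),
    "WTP_DIS": (8953, 1797, 0),
    "Unit process flow": (43000, 0, 491),
}

for name, (total, replaced, flagged) in rows.items():
    print(f"{name}: {percent_affected(total, replaced, flagged):.2f}%")
```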
Very few data were flagged for bromide, TOC, UV254, or pH. Seven questionable bromide results for one plant were replaced after determining that decimal placement data entry errors had been made (this was verified with utility personnel at the plant in question). Five additional bromide results from four other plants were replaced based on decimal placement data entry errors. Flagged TOC and UV254 results involved substantial and implausible increases or decreases in the parameter value across a treatment train for a plant/month or across sampling months at the same plant. For example, a plant having flocculation tank effluent TOC concentrations typically in the 2–3 mg/L range reported 30 mg/L at this location in one month, although influent TOC was below 5 mg/L. In another case, an extremely high flocculation tank effluent UV254 result of 0.810 cm⁻¹, far outside the expected range for this parameter, was five times greater than the plant influent result. Besides the 14 pH results below 2.0, six other pH results were flagged based on inconsistencies with data for upstream/downstream samples or with samples for the same location in other sampling periods.
As indicated in Table 3.1, replacement values were determined for substantial numbers of MSRC_CAT and WTP_DIS entries (5 and 20% of ICR records, respectively). These were mostly replacement of missing values. Some corrections to WTP_DIS (42) and DS_DIS (51) entries were made after discovering errors in process train disinfection information. Three MWTPTYPE values of “OTH” (other) for one treatment plant were corrected after verifying with the plant that the same conventional process was used throughout the ICR period.
Reasons for flagged flow and volume data included apparently mistaken loss or gain of digits in data entry, substantial change in the process flow across plant-months without corresponding change in volume (or vice versa), intermittent zero flow entry for a major process with normal volume entry (or vice versa), and unit process flow values substantially higher than a plant’s finished water flow for the same month. Unit process flow and volume data were employed to estimate chlorine contact times. Less than 1% of calculated chlorine contact times for unit processes downstream of chlorine addition were affected by flagged flow or volume records.
Reasons for flagging chlorine and ammonia dose and chlorine residual records included large discrepancies in doses across sampling months at a given plant (e.g., 50 mg/L Cl2 at a plant normally dosing 5 mg/L), chlorine doses of zero reported with concurrent measurable downstream chlorine residual levels for processes where a plant normally applied chlorine, and anomalous chlorine residual patterns across treatment trains within a given plant/month. Overall, 2% of chlorine dose values and less than 1% of ammonia dose values and chlorine residual measurements were flagged (see Table 3.1).
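Two of the dose flagging reasons above lend themselves to a simple rule-based illustration. The sketch below is hypothetical and is not the procedure used in this work: the tenfold ratio is an assumption suggested only by the 50 mg/L versus 5 mg/L example, the median is an assumed stand-in for a plant's "normal" dose, and the detection limit parameter is invented.

```python
from statistics import median


def flag_dose(dose, typical_doses, downstream_residual,
              ratio=10.0, detect_limit=0.0):
    """Return a reason string if a chlorine dose looks questionable, else None.

    ratio and detect_limit are illustrative assumptions, not project values.
    """
    typical = median(typical_doses)  # assumed proxy for the plant's normal dose
    if dose == 0 and typical > 0 and downstream_residual > detect_limit:
        return "zero dose with measurable downstream residual"
    if typical > 0 and dose >= ratio * typical:
        return "dose far above the plant's usual level"
    return None


# Hypothetical doses (mg/L Cl2) from other sampling months at one plant.
usual = [5.0, 5.0, 6.0, 4.0]

print(flag_dose(50.0, usual, downstream_residual=1.2))
print(flag_dose(0.0, usual, downstream_residual=0.8))
print(flag_dose(5.0, usual, downstream_residual=1.0))
```

Anomalous residual patterns across a treatment train are harder to reduce to a single rule and are left out of this sketch.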