Statistical process control - Temporal surveillance

2.3 Temporal surveillance

2.3.2 Statistical process control

The essential difference between modelling data via time series methods (above) and using statistical process control methods is that time series analysis explicitly models the internal structure that data points taken over time may have (such as autocorrelation, trend, or seasonal variation). Statistical process control can be thought of as a plotted time series with control limits applied.

In the 1920s, Walter Shewhart developed a number of business production analysis techniques, which were designed to detect changes in the quality of the output from continuous production processes (Shewhart 1931). Statistical process control (SPC), or quality control, is widely used in many industries to facilitate objective evaluation of business operations and production processes. SPC is used to monitor the level of a production trait, and to give a notification when the level changes beyond some predefined limit. The basic control charts are designed to monitor a process that is expected to be constant, although they allow for some random fluctuation.

The use of SPC has gone beyond industry to monitor any process, and identify when it changes to being ‘out of control’. As early as 1942, Deming proposed the potential value of SPC for disease surveillance and rare events monitoring (Deming 1942). An overview of the use of SPC in health care and surveillance is given by Woodall (2006).

Consider the following as a time series of surveillance data: $X = \{X(t) : t = 1, 2, \ldots\}$. Using this example, a standard Shewhart control chart would consider $X$ as a continuously varying quality (e.g. number of thermometers sold in Auckland per day) with a mean of $\mu_X$ and a standard deviation of $\sigma_X$. The upper control limit would be set as $\mathrm{UCL} = \mu_X + K\sigma_X$, the centre line at $\mu_X$, and the lower control limit at $\mathrm{LCL} = \mu_X - K\sigma_X$. Historically, $K = 3$ has become an accepted standard in industry.
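The control limits described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the daily counts are hypothetical, and the mean and standard deviation are assumed to be estimated from a baseline period during which the process was in control.

```python
from statistics import mean, stdev

def shewhart_limits(baseline, k=3.0):
    """Return (LCL, centre line, UCL) estimated from in-control baseline data."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    return mu - k * sigma, mu, mu + k * sigma

def out_of_control(observations, lcl, ucl):
    """Return the indices of observations falling outside the control limits."""
    return [t for t, x in enumerate(observations) if x < lcl or x > ucl]

# Hypothetical daily sales counts (illustrative only, not real data).
baseline = [10, 12, 11, 9, 10, 13, 11, 10, 12, 12]
lcl, centre, ucl = shewhart_limits(baseline)

new_days = [11, 12, 19, 10]
print(out_of_control(new_days, lcl, ucl))  # prints [2]
```

With K = 3 the limits here are roughly 7.3 and 14.7 around a centre line of 11, so only the count of 19 is flagged.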

It is important to remember that although statistical process control charts are among the most prevalent and valid methods for monitoring time series data, their use usually requires observations to be independent random variables when the process is in statistical control. Health surveillance data generally do not satisfy this assumption and present problems that do not arise in industrial process control: health data often exhibit correlation, non-stationarity (in the mean and/or variance), and seasonality. However, these limitations may be substantially overcome by using one of two techniques (Stoumbos et al. 2000). Firstly, the past behaviour of the series can be corrected for by including seasonal or historical adjustments. In other words, the original data are presented as a standard control chart but the control limits are adjusted for the autocorrelation in the series. A good example of this is the Early Aberration Reporting System (EARS), which applies aberration detection algorithms to surveillance data and flags anomalies. Two methods are implemented: (1) a seasonally adjusted quality control statistic; and (2) a historical limits model that compares the current 4-week total to the mean of nine 4-week periods (the previous, comparable and subsequent 4-week periods in each of the past three years). These methods can result in three different flags: one for method (1), one for method (2), and a third when both models exceed the established thresholds. EARS uses Shewhart variants that use a moving sample average and sample standard deviation to standardise each observation.
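The historical limits comparison can be illustrated with a short Python sketch. The text does not state the threshold used, so the rule below (flag when the current total exceeds the historical mean by more than two standard deviations) is an assumption made for illustration, as are the function name and the example totals.

```python
from statistics import mean, stdev

def historical_limits_flag(current_total, historical_totals, n_sd=2.0):
    """Compare the current 4-week total with nine historical 4-week totals.

    Flags when the current total exceeds mean + n_sd * SD of the historical
    totals. The value n_sd=2 is an assumption, not taken from the text.
    Returns (flag, historical mean, threshold).
    """
    if len(historical_totals) != 9:
        raise ValueError("expected nine historical 4-week totals")
    mu = mean(historical_totals)
    threshold = mu + n_sd * stdev(historical_totals)
    return current_total > threshold, mu, threshold

# Hypothetical totals: the previous, comparable and subsequent 4-week
# periods in each of the past three years (illustrative only).
history = [40, 44, 42, 38, 41, 45, 39, 43, 46]

flag, mu, threshold = historical_limits_flag(60, history)
print(flag, mu, round(threshold, 1))  # prints True 42 47.5
```

Because the comparison periods come from the same time of year, seasonal variation is built into the baseline rather than modelled explicitly.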

The EARS system can be applied to daily, weekly and monthly data and allows for stratification of the data (e.g. by geographic region) and for specified threshold limits. For rare diseases such as typhoid, the system can be set to flag every occurrence of a case. EARS is used in the national notifiable disease surveillance system (EpiSurv) in New Zealand. Figure 2.15 shows the system in use for flagging high numbers of campylobacteriosis cases. Flags consistently occur over the period December 2006 to February 2007 but not at Christmas/New Year time. As the historical mean is also reduced at Christmas/New Year time, this discrepancy is more likely a result of fewer notifications, due to people taking holidays, than of fewer actual cases of disease.

tion of cusum). They found that the use of a negative binomial distribution accommodated the over-dispersion evident in disease notification data, and provided a lower rate of false alarms for a given sensitivity. However, these advantages were associated with decreased early timeliness performance when using the negative binomial cusum algorithm.

The second option for overcoming the problem of the lack of independence is to plot the residuals from a time series model on a standard control chart (Stoumbos et al. 2000). Williamson & Weatherby Hudson (1999) combine statistical process control with ARIMA time series modelling to detect aberrations in hepatitis A, meningococcal disease, typhus fever, and other infectious diseases.
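A minimal sketch of this second approach follows. It does not reproduce the ARIMA models used in the cited work: a first-order autoregressive model fitted by least squares is swapped in as a simple stand-in, and all function names and data are hypothetical. The one-step-ahead forecast errors (residuals) are then compared with limits of K residual standard deviations.

```python
from statistics import mean, stdev

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t]; returns (c, phi)."""
    x_prev, x_curr = series[:-1], series[1:]
    mp, mc = mean(x_prev), mean(x_curr)
    sxx = sum((p - mp) ** 2 for p in x_prev)
    sxy = sum((p - mp) * (x - mc) for p, x in zip(x_prev, x_curr))
    phi = sxy / sxx
    return mc - phi * mp, phi

def residuals(series, c, phi):
    """One-step-ahead forecast errors e[t] = x[t] - (c + phi * x[t-1])."""
    return [x - (c + phi * p) for p, x in zip(series[:-1], series[1:])]

def flag_residuals(new_series, c, phi, sigma, k=3.0):
    """Indices in new_series whose residual lies outside +/- k * sigma."""
    res = residuals(new_series, c, phi)
    return [t + 1 for t, e in enumerate(res) if abs(e) > k * sigma]

# Hypothetical autocorrelated baseline counts (illustrative only).
baseline = [20, 22, 25, 24, 21, 19, 20, 23, 26, 24, 22, 20, 19, 21, 24, 25]
c, phi = fit_ar1(baseline)
sigma = stdev(residuals(baseline, c, phi))

new_series = [22, 23, 24, 40, 30]
flagged = flag_residuals(new_series, c, phi, sigma)
print(flagged)  # prints [3]
```

In this example the jump to 40 is flagged, while the following value of 30 is not, because the model already expects a high count after a high count.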

One problem with this methodology is the need to have sufficient baseline data to produce a stable model. Generally, to account for seasonality, three years of data is considered the minimum (Diggle 1990). To overcome this problem, methods have been developed that incorporate short seven-day baseline periods for threshold comparisons (Hutwagner et al. 2005). These thresholds were based on a cusum calculation, and the baseline was varied to give different sensitivities. Cusum is a cumulative sum calculation as follows:

$$S_t = \max\left(0,\; S_{t-1} + \frac{X(t) - (\mu_X + K\sigma_X)}{\sigma_X}\right) \qquad (2.6)$$

with a decision value of $S_t > 2$, where $X(t)$ is the count or percent, e.g. number of thermometers sold in Auckland per day. The other parameters are described above, but here $K$ is the detectable shift in the mean and not necessarily 3 as it is for a Shewhart chart.
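The recursion in Equation 2.6 can be sketched in Python as follows. The function name and the example values are hypothetical, and K = 1 is used purely for illustration; the decision value of 2 is the one given in the text. The sketch does not reset the statistic after a flag, which is a design choice the text does not specify.

```python
def cusum(observations, mu, sigma, k=1.0, h=2.0):
    """One-sided upper cusum following Equation 2.6.

    S_t = max(0, S_{t-1} + (x_t - (mu + k * sigma)) / sigma), starting from
    S_0 = 0; an observation is flagged when S_t > h.
    Returns (list of S_t values, list of boolean flags).
    """
    s = 0.0
    stats, flags = [], []
    for x in observations:
        s = max(0.0, s + (x - (mu + k * sigma)) / sigma)
        stats.append(s)
        flags.append(s > h)
    return stats, flags

# Hypothetical daily counts with an assumed in-control mean of 10 and SD of 2.
stats, flags = cusum([10, 13, 14, 15, 9], mu=10.0, sigma=2.0)
print(stats)  # prints [0.0, 0.5, 1.5, 3.0, 1.5]
print(flags)  # prints [False, False, False, True, False]
```

None of these counts exceeds the three-standard-deviation Shewhart limit of 16, yet the cusum flags the fourth day because it accumulates several moderately raised values in a row.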

Figure 2.15: Use of EARS for Campylobacteriosis surveillance in New Zealand. Flags consistently occur over the period Dec. 2006 to Feb. 2007 but not at Christmas/New Year. Source: Institute of Environmental Science and Research Limited.

For example, the method with the least sensitivity used a baseline from the previous seven days in closest proximity to the current value. This is because if a flag (denoting an aberrant value) is raised on a particular day, t, then a flag is less likely on the next day, t+1, as the high count from the previous day is immediately incorporated into the new baseline. The methods were designed for enhanced bioterrorism surveillance to identify aberrations quickly, e.g. within the first day or two of a special event such as the Olympic Games. Since these methods are based only on current information, they would not be useful for identifying an infectious disease event that develops gradually, e.g. the start of the influenza season.
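The effect of baseline placement on sensitivity can be illustrated with a simplified Python sketch. Each count is standardised against a seven-day moving baseline rather than passed through the full cusum calculation, and the `lag` parameter, the flagging threshold of 3 and the counts are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev

def moving_baseline_scores(counts, window=7, lag=0):
    """Standardise each count against the mean and SD of a moving baseline.

    The baseline for day t is the `window` days ending `lag` + 1 days
    before t. Returns {t: z}, skipping days without a full baseline or
    with zero baseline SD.
    """
    scores = {}
    for t in range(window + lag, len(counts)):
        baseline = counts[t - lag - window : t - lag]
        sd = stdev(baseline)
        if sd > 0:
            scores[t] = (counts[t] - mean(baseline)) / sd
    return scores

def flagged(scores, threshold=3.0):
    """Days whose standardised score exceeds the (assumed) threshold."""
    return [t for t, z in sorted(scores.items()) if z > threshold]

# Hypothetical daily counts with two consecutive high days (days 9 and 10).
counts = [5, 6, 4, 5, 7, 5, 6, 5, 4, 20, 18, 5]

print(flagged(moving_baseline_scores(counts, lag=0)))  # prints [9]
print(flagged(moving_baseline_scores(counts, lag=2)))  # prints [9, 10]
```

With no lag, the count of 20 on day 9 enters the baseline for day 10 and inflates its standard deviation, so the second high day is missed; a two-day gap between baseline and current value keeps both days flagged.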

Generally, changes will be of the step form, where a parameter changes from one constant level to another, but changes can also be linear, exponential or gradual; the latter methods, such as EARS, are not well suited for the syndromic surveillance problem in which outbreaks do not occur instantaneously and are transient.

Further information on the use of SPC methods for spatio-temporal surveillance is provided by Rogerson (2005).

In document Epidemiological investigations of surveillance strategies of zoonotic Salmonella : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy at Massey University (Page 74-78)