

4 Evaluating sensor characteristics in the case of threshold sensors

4.4 Results and Discussion

4.4.5 Statistical considerations for interpreting threshold data

A key objective of this chapter is to investigate how a system-level analysis may help identify an optimal mix of sensor characteristics for populating a sensor network. The sensor type selected for this illustration is the alarm-type, or binary-signal, sensor. While developing an optimal algorithm for interpreting binary data is not an objective of this chapter, it is worth discussing some related points that may provide insight if further enhancement of the approach is desired.

Threshold sensors produce a binary output that indicates only whether a contaminant’s concentration is above or below a threshold. The likelihood function compares modeled concentrations and sensor signals at each time to the respective sensor threshold level. Hence, the only information that is needed, and used, from the library is whether the modeled concentration is above or below the threshold level. The information content for each sensor in each realization in the library can therefore be described with parameters that indicate the times at which the threshold level is crossed. Most sensors may experience two threshold level crossings: the time at which the threshold level is first exceeded as concentrations rise, and the time at which concentrations fall back below it. For sensors with high threshold levels, the threshold level may never be exceeded; for others, oscillatory behavior may cause the threshold level to be crossed multiple times. In each case, the amount of data needed to describe the threshold level crossing events is less than the full concentration time series.
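As an illustration of how a concentration time series might be reduced to crossing-time parameters, the following is a minimal sketch. The function name `crossing_times`, the linear interpolation between samples, and the sample profile are assumptions made for this example; they are not taken from the implementation used in this chapter.

```python
def crossing_times(times, concentrations, threshold):
    """Return (time, direction) pairs for each threshold crossing.

    direction is +1 for an upward crossing and -1 for a downward crossing.
    The crossing time is linearly interpolated between adjacent samples
    (an assumption of this sketch).
    """
    events = []
    for i in range(1, len(times)):
        c0, c1 = concentrations[i - 1], concentrations[i]
        above0, above1 = c0 >= threshold, c1 >= threshold
        if above0 == above1:
            continue  # no crossing in this interval
        # Fraction of the interval at which the concentration equals the threshold.
        frac = (threshold - c0) / (c1 - c0)
        t_cross = times[i - 1] + frac * (times[i] - times[i - 1])
        events.append((t_cross, 1 if above1 else -1))
    return events


# Hypothetical well-mixed profile: a rise followed by a decay (arbitrary units).
times = [0, 10, 20, 30, 40, 50, 60]
conc = [0.0, 0.04, 0.08, 0.06, 0.04, 0.02, 0.01]
events = crossing_times(times, conc, threshold=0.05)
```

For this hypothetical profile, the seven concentration samples reduce to two events: an upward crossing near t = 12.5 and a downward crossing near t = 35. A threshold that is never exceeded yields an empty list.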

Rather than implementing a likelihood function based on comparing thresholds to modeled concentrations, it may be possible to compare each sensor signal to the respective times at which the threshold level is crossed in each realization in the library. A time-based likelihood function may offer two advantages: (1) the library information for each realization can be consolidated from large sets of data to fewer parameters, which has implications for computational efficiency; and (2) the uncertainty in the time at which the threshold level is crossed can be treated directly by the likelihood function.
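One way such a time-based likelihood function might look is sketched below. Everything in it is an assumption made for illustration: the Gaussian form of the crossing-time uncertainty (`sigma`), the treatment of the up- and down-crossings as independent, the `false_rate` parameter, and all function names. It is not the likelihood function implemented in this chapter.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_above(t, t_up, t_down, sigma):
    """Probability that the concentration exceeds the threshold at time t.

    t_up, t_down: modeled up- and down-crossing times for one realization
                  (None if the crossing never occurs).
    sigma:        assumed standard deviation of the crossing-time uncertainty.
    """
    if t_up is None:
        return 0.0  # threshold never exceeded in this realization
    p_risen = normal_cdf((t - t_up) / sigma)
    p_fallen = 0.0 if t_down is None else normal_cdf((t - t_down) / sigma)
    # Assumes the two crossing times are uncertain independently of each other.
    return p_risen * (1.0 - p_fallen)


def signal_likelihood(signal, t, t_up, t_down, sigma, false_rate):
    """Likelihood of a binary signal (1 = alarm) given one realization."""
    p_above = prob_above(t, t_up, t_down, sigma)
    # Mix in an assumed probability that the sensor reports the wrong state.
    p_alarm = p_above * (1.0 - false_rate) + (1.0 - p_above) * false_rate
    return p_alarm if signal == 1 else 1.0 - p_alarm


# Hypothetical values: alarm observed at t = 60 s, realization crosses up at 30 s.
like = signal_likelihood(1, t=60.0, t_up=30.0, t_down=None,
                         sigma=10.0, false_rate=0.1)
```

In this form, each realization contributes only two stored numbers per sensor (`t_up` and `t_down`), and a signal observed close to a modeled crossing time is penalized less sharply than one observed far from it.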

Turning to another issue, the concentration data shown for Experiment 1 indicate that the concentration profiles do not consist of smooth curves, particularly at early times following the release (Figure 4.1). Oscillatory behavior of the contaminant concentration can result from spatial variability of the contaminant concentrations within each respective room. After some time these oscillations decay, and the concentration follows a well-mixed profile. The current statistical treatment does not account for oscillations about a threshold level in any special way. Because the realizations are generated using a multizone model that assumes the contaminant is well mixed within a room, the realizations do not replicate any kind of oscillatory behavior.

For the BMC algorithm in its current form, the presence of oscillations is a liability. However, oscillatory behavior is potentially rich in information, irrespective of its cause, which may be error or variability of the within-room concentrations. Oscillations about a threshold level suggest that the true concentration, which is unknown, is close to the threshold level. It is unlikely that any modeling approach can capture these oscillations accurately; even computational fluid dynamics is unlikely to capture the frequency and timing of the oscillations exactly, unless a large eddy simulation is used.

Hence, the burden is on the algorithm to capture the information richness exhibited by oscillations. One approach may be to implement backwards Bayesian updating, in which previous rounds of data are recycled to re-estimate the current posterior probabilities of the realizations based on current information. Thus, while oscillations may be a liability as they are being interpreted in real time by the algorithm, they become useful once they cease, and their negative effects can be transformed through backwards updating.
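The sketch below shows one possible reading of backwards updating, and it should be treated as an interpretation rather than a method developed in this chapter. It assumes the full signal history is stored, flags readings that fall within a window containing several signal changes, and then recomputes the posterior from the prior with those readings made uninformative (confidence 0.5). The function names, the window length, the flip count, and the base confidence of 0.9 are all hypothetical.

```python
def posterior(prior, predicted, observed, confidence):
    """Bayes update over library realizations for one sensor's signal history.

    prior:      prior probabilities, one per realization
    predicted:  predicted[k][i] is the modeled binary state of realization k at step i
    observed:   observed[i] is the sensor signal at step i
    confidence: confidence[i] is the assumed probability that reading i is correct
    """
    weights = []
    for p, states in zip(prior, predicted):
        like = 1.0
        for state, signal, conf in zip(states, observed, confidence):
            like *= conf if state == signal else 1.0 - conf
        weights.append(p * like)
    total = sum(weights)
    return [w / total for w in weights]


def oscillating_steps(observed, window, min_flips):
    """Flag steps inside any window with at least min_flips signal changes."""
    flagged = [False] * len(observed)
    for start in range(len(observed) - window + 1):
        chunk = observed[start:start + window]
        flips = sum(a != b for a, b in zip(chunk, chunk[1:]))
        if flips >= min_flips:
            for i in range(start, start + window):
                flagged[i] = True
    return flagged


def backward_update(prior, predicted, observed,
                    base_conf=0.9, window=4, min_flips=2):
    """Recompute the posterior from the prior, discounting oscillating readings."""
    flagged = oscillating_steps(observed, window, min_flips)
    # A confidence of 0.5 makes a reading uninformative for every realization.
    confidence = [0.5 if f else base_conf for f in flagged]
    return posterior(prior, predicted, observed, confidence)


# Hypothetical history: the signal oscillates early, then stays in alarm.
prior = [0.5, 0.5]
observed = [0, 1, 0, 1, 1, 1, 1, 1]
predicted = [[0, 0, 1, 1, 1, 1, 1, 1],  # realization that crosses the threshold
             [0] * 8]                   # realization that never crosses
post = backward_update(prior, predicted, observed)
```

This version only neutralizes readings taken during an oscillation; it does not exploit the suggestion that the true concentration is close to the threshold during that period, which would require a likelihood that rewards realizations whose modeled concentration is near the threshold level.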

4.5 Conclusion

The premise of this chapter is that the selection of sensor characteristics is best performed from a systems perspective. Hence, the primary purpose of this chapter is to illustrate the relationship between sensor characteristics and sensor system performance. Here, I have demonstrated — albeit for a limited set of circumstances — that a network of single-level threshold sensors can be used to determine the location and magnitude of a short-term release quickly and accurately using a Bayes Monte Carlo framework.

Sensor networks with sensor response times ranging from 20 s to 120 s and threshold levels ranging from 0.0023 to 0.11 could identify the release location to a probability exceeding 90% within 2 min. Sensors with very high or very low threshold levels have little discriminating power and could not characterize the release to useful levels. The BMC algorithm could estimate the release mass to a narrow confidence interval within 4 to 10 min across different threshold levels and response times. Higher threshold levels reduced the uncertainty of the released mass more quickly than lower threshold levels. Sensor response time had comparatively less influence on the estimation of the mass.

When the sensor signals were spiked with random error of 10% and 30%, the systems were able to identify the release location to a 90% probability for a majority of networks. Higher threshold levels, however, increased the risk of failure because they reduce the posterior probabilities of a greater number of realizations in the library than lower threshold levels do. Networks with more accurate but slower sensors resulted in more reliable predictions of the release location than networks with faster but less accurate sensors. This result suggests that the sensor response times evaluated are adequate for capturing the dynamics of the contaminant transport.

More important than the specific results, this chapter demonstrates that treating the network as a system may lead to better choices for sensor characteristics, like response time and error, than might be the case when considering sensors individually.

With respect to the specific investigations, the actual rate of false readings (i.e., false positives or negatives) for real sensors is likely to be less than 30%. Probabilistic results reached by the Bayesian algorithm based on the assigned confidence in this chapter are therefore likely to be conservative for release conditions considered in the library.

Several questions emerge from the research reported in this chapter. Would networks consisting of fewer sensors have success at characterizing the release? And, if fewer sensors are adequate, where should they be placed? If only a few chemical sensors are available, and they are inadequate for achieving system performance goals, can other types of sensors be incorporated into the system to help achieve those goals in a more cost-effective manner? This chapter only investigated an instantaneous release. How well would the algorithm work against a full array of release conditions, including slow, steady releases? Currently, the algorithm does not advantageously use sensor oscillations, which are potentially information rich. Such oscillations are not easily accounted for in the multizone transport and fate model. To what extent does this behavior, or other unmodeled contaminant transport behavior, present challenges or opportunities for developing more robust sensor systems? These questions inspire the investigations reported in Chapters 5 and 6.

Chapter 5

5 Influence of transport and mixing time scales

In document Bayesian based design of real-time sensor systems for high-risk indoor contaminants (Page 186-191)