2.1 Spatial Trend
2.1.1 Stream Distance Based Trend
As the literature on modelling river network data has focused on using stream distance to obtain a more intuitive covariance structure, it is a natural extension to use stream distance to fit the trend as well. This avenue has not been explored before: the literature has not considered detrending with either distance metric prior to analysis. This section details a novel approach that uses stream distance to construct a trend in the spirit of the tail-up model. The tail-up stream distance based covariance model was designed to incorporate characteristics such as stream distance, the relative size of rivers (via flow data) and whether locations are flow connected, in order to reflect more accurately the processes at work in a particular river. However, this kind of structure is not accounted for in the detrending process used in the previous section. It therefore seems more appropriate to detrend the data using stream distance as the distance metric and to take the flow connectivity of the network into account.
CHAPTER 2. ESTIMATING TREND AND COVARIANCE 46
In order to detrend the data in a way that remains more faithful to the ethos of the tail-up model, a novel non-parametric smooth estimate was constructed by estimating the trend at point x according to (2.5). This is based on a very simple local mean smoother. A local linear approach was also considered for constructing this trend, but it had the drawback that a trend could not be estimated on any stream more than one bandwidth away from all of the monitoring stations. It could be argued that one would not wish to make predictions at such streams anyway, as there is little information on which predictions can be based. However, in order to have complete results on the network, the trend was estimated using (2.5). Here, w(d; h) is a Normal kernel density function N(0,h) and is given by (2.2); yi is the overall average nitrate level at station i; d is the stream distance between point x and station xi; and δi(x) is an indicator function which takes the value 1 if station i is flow connected to point x and 0 if it is not. The h parameter in the weight function is the chosen bandwidth.
$$\hat{m}_{\mathrm{riv}}(x) = \frac{\sum_{i=1}^{n} y_i \, w(d; h) \, \delta_i(x)}{\sum_{i=1}^{n} w(d; h) \, \delta_i(x)} \tag{2.5}$$
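The estimator in (2.5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy data are hypothetical, the stream distances and flow-connectivity indicators are taken as already computed, and N(0,h) is read as a Normal density with standard deviation h (the kernel in (2.2) is not reproduced here, and the normalising constant cancels in the ratio in any case).

```python
import numpy as np

def normal_kernel(d, h):
    """Normal density with mean 0; h is ASSUMED to be the standard deviation."""
    d = np.asarray(d, dtype=float)
    return np.exp(-0.5 * (d / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def stream_local_mean(y, d, connected, h):
    """Local mean estimate at a single point x, following the form of (2.5).

    y         : station averages y_i
    d         : stream distance from x to each station
    connected : 1 if station i is flow connected to x, otherwise 0
    h         : bandwidth
    """
    y = np.asarray(y, dtype=float)
    # Kernel weight, set to zero for stations that are not flow connected to x.
    w = normal_kernel(d, h) * np.asarray(connected, dtype=float)
    total = w.sum()
    if total == 0.0:
        return float("nan")  # no flow-connected station carries any weight
    return float(np.dot(w, y) / total)

# Hypothetical toy data: the third station is close to x but not flow connected.
y = [2.0, 4.0, 10.0]
d = [5.0, 5.0, 1.0]
connected = [1, 1, 0]
estimate = stream_local_mean(y, d, connected, h=15.0)
print(estimate)
```

In this toy case the two flow-connected stations are equidistant from x, so the estimate is their simple average (3.0 up to rounding), and the nearby but unconnected station with value 10 has no influence.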
This predictor can also be formulated in a similar, though simpler, fashion to that shown for the Euclidean distance trend in (2.1), and can again be rearranged into vector-matrix form so that it can be solved using (2.4). In this equation, when using stream distance, X now denotes a vector of 1’s, while W has elements w(d; h) on the diagonal, where d is the stream distance between point x and point xi. This will be explicitly stated and discussed in Section 5.1.1. It is worth noting that here both stream distance and the indicator function are symmetric: the stream distance between points A and B is equal to the distance between points B and A, while the indicator function depends only on whether the two sites are flow connected and not on where the points lie in relation to one another.
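The equivalence between the ratio form and the vector-matrix form can be checked numerically. The sketch below assumes that (2.4) is the usual weighted least squares solution, (X'WX)^{-1} X'Wy, which is not reproduced in this section; it also assumes that the indicator is folded into the diagonal of W by multiplication. The function name and data are hypothetical.

```python
import numpy as np

def wls_local_constant(y, w):
    """Weighted least squares fit of a constant: (X'WX)^{-1} X'Wy.

    X is a column of ones and W = diag(w), so the single fitted
    coefficient is the local mean estimate at the target point.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    X = np.ones((y.size, 1))
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return float(beta[0])

# Hypothetical toy data for one target point x.
y = np.array([2.0, 4.0, 10.0])      # station averages
d = np.array([5.0, 12.0, 1.0])      # stream distances from x
delta = np.array([1.0, 1.0, 0.0])   # flow-connectivity indicators
h = 15.0                            # bandwidth

# Unnormalised Normal kernel weights; the constant cancels in both forms.
w = np.exp(-0.5 * (d / h) ** 2) * delta

ratio_form = float(np.sum(w * y) / np.sum(w))
matrix_form = wls_local_constant(y, w)
print(ratio_form, matrix_form)
```

With X a column of ones, X'WX reduces to the sum of the weights and X'Wy to the weighted sum of the observations, so the two forms agree.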
It should be noted that neither the indicator function, δi(x), nor the weight, w(d; h), contains the flow related weightings included in the tail-up model. It was initially considered that these could form part of the indicator function, and such an approach is adopted for the smoothing function used in the additive models in Section 5.1.1, but it was ultimately rejected for defining the trend here. Including the flow based weightings seemed to go against the wish to keep the trend fairly general. With flow weighting, the smooth estimate is dominated much more strongly by nearby observations, because more distant observations tend to lie beyond one or several points of confluence and the flow weighting then attaches much less importance to them. Fitted values incorporating flow based weightings therefore tend to track the data too closely, which seemed inappropriate for an estimated trend. This property may be desirable as part of the covariance structure in kriging, but is not ideal when the aim is to keep the trend general.
Figure 2.5. Estimated stream and Euclidean distance based trends: (a) stream distance based trend; (b) Euclidean distance based trend.
Figure 2.5(a) shows the stream distance based trend fitted to the Tweed data with a bandwidth of 15 km. This bandwidth was chosen to provide a reasonable balance between generality and accuracy. As mentioned previously, Section 3.3 explores the effect of changing the bandwidth in more detail, and suggests that the results of kriging are not very sensitive to small adjustments in bandwidth.
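The trade-off between generality and accuracy can be illustrated by fitting the local mean smoother at the stations themselves for several bandwidths. The sketch below is not based on the Tweed data: the four-station network, its distances and its values are hypothetical, all stations are assumed to be flow connected, and h is again read as the standard deviation of the Normal kernel.

```python
import numpy as np

def fitted_at_stations(y, D, C, h):
    """Local mean fits at every station for bandwidth h.

    D[j, i] : stream distance between stations j and i (symmetric)
    C[j, i] : 1 if stations j and i are flow connected, otherwise 0
    """
    # Unnormalised Normal kernel weights, zeroed for unconnected pairs.
    W = np.exp(-0.5 * (D / h) ** 2) * C
    return (W @ y) / W.sum(axis=1)

# Hypothetical four stations along a single stream (distances in km).
y = np.array([1.0, 2.0, 6.0, 7.0])
D = np.array([[0.0, 10.0, 30.0, 40.0],
              [10.0, 0.0, 20.0, 30.0],
              [30.0, 20.0, 0.0, 10.0],
              [40.0, 30.0, 10.0, 0.0]])
C = np.ones((4, 4))

bandwidths = (5.0, 15.0, 50.0)
rss_values = []
for h in bandwidths:
    fit = fitted_at_stations(y, D, C, h)
    rss = float(np.sum((y - fit) ** 2))  # residual sum of squares at stations
    rss_values.append(rss)
    print(f"h = {h:4.1f} km, fitted = {np.round(fit, 2)}, RSS = {rss:.3f}")
```

A small bandwidth gives fitted values that follow the station values closely, while a large bandwidth pulls every fitted value towards the overall mean; an intermediate choice sits between these two extremes.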
Comparing Figure 2.5(a) with the estimated Euclidean distance based trend, replicated in Figure 2.5(b), highlights the differences that arise when using a flow connected stream distance as opposed to a Euclidean based approach. The Euclidean approach produces a surface that takes no account of the structure of the river. In the stream distance based approach, the effect of connectivity can be clearly seen, with some minor streams in the centre of the plot tending to the mean value (an orange colour) despite all the surrounding streams having a much lower (green) value. This reflects the fact that the low nitrate levels that cause the surrounding streams to have low values are observed at locations that are not flow connected to these minor streams. The stream distance based trend also has some quite sharp changes in nitrate levels at points of confluence. This is another feature brought about by the flow connectivity information, and it is likely to reflect the behaviour of a river network better than the smooth changes seen in the Euclidean trend, which occur regardless of where the stream segments join.
Neither nonparametric trends nor trends based on stream distance have been considered in the literature involving prediction on river networks using the tail-up model. Literature such as Ver Hoef and Peterson (2010) has tended to account for trend in the data using universal kriging to fit a simple polynomial function. Given the irregular nature of environmental data in general, a nonparametric trend would seem to be an obvious extension of existing work. A trend based on stream distance would also seem an intuitive direction in which to extend this work. Later chapters will investigate further the possibility of using a nonparametric, stream distance based trend for the first time, and will aim to assess the benefits it would bring over a Euclidean distance based trend.