6.3 Index Support for Interval-Focused Queries
6.3.2 Distance Estimation Using Interval Boxes
Assume that a time series is represented by a set of interval boxes. In this section we show how the exact interval-focused distance between a query object $Q$ and any $X \in D$ can be estimated by means of an upper and a lower bound using the information of $\mathrm{rep}(X)$.
At each relevant time slot $i$, we can lower bound the $i$-th summand of the $L_p$-norm by the well-known MINDIST function.
Definition 6.7 (MINDIST).
Let $Q$ be a query time series and $q_i$ the amplitude value of $Q$ at time slot $t_i$. Let $X \in D$ be a time series and let $r \in \mathrm{rep}(X)$ be an interval box that overlaps $t_i$, i.e. $l_r \le t_i \le u_r$. Then the MINDIST between $q_i$ and $r = (l_r, u_r, lv_r, uv_r)$ is defined as
\[
\mathrm{MINDIST}(q_i, r) =
\begin{cases}
lv_r - q_i & \text{if } q_i \le lv_r,\\
q_i - uv_r & \text{if } q_i \ge uv_r,\\
0 & \text{else.}
\end{cases}
\]
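The case distinction of Definition 6.7 can be sketched in a few lines of Python. The `Box` type and its field names are assumptions made for this illustration only; they mirror the tuple $(l_r, u_r, lv_r, uv_r)$ and are not part of the original text.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int      # first time slot covered by the box
    u: int      # last time slot covered by the box
    lv: float   # lower amplitude bound
    uv: float   # upper amplitude bound


def mindist(q_i: float, r: Box) -> float:
    """MINDIST between a query amplitude q_i and a box r (Definition 6.7)."""
    if q_i <= r.lv:
        return r.lv - q_i   # query value lies below the box
    if q_i >= r.uv:
        return q_i - r.uv   # query value lies above the box
    return 0.0              # query value lies inside the amplitude range


box = Box(l=0, u=3, lv=2.0, uv=5.0)
below = mindist(1.0, box)   # 2.0 - 1.0
inside = mindist(3.5, box)  # inside [2.0, 5.0]
above = mindist(7.5, box)   # 7.5 - 5.0
```

The three calls exercise the three branches of the definition in order.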
If there are several interval boxes $r \in \mathrm{rep}(X)$ with $l_r \le t_i \le u_r$, we determine the maximal value over all the corresponding MINDIST values. We will use the MINDIST value to define a lower bound for the interval-focused similarity. In order to get a tight lower bound (i.e. a value as high as possible), the maximal possible MINDIST value has to be used at every time slot.
Definition 6.8 (Lower Bound at a Single Time Slot).
Let $Q$ and $X$ be time series of length $N$ and let $X$ be represented by a collection of interval boxes $\mathrm{rep}(X)$. Then $LB_i(Q, X)$ for time slot $i$ is defined as
\[
LB_i(Q, X) = \max\Bigl\{0,\ \max_{\{r \,\mid\, r \in \mathrm{rep}(X),\ l_r \le i \le u_r\}} \mathrm{MINDIST}(q_i, r)\Bigr\}.
\]
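As a minimal sketch of Definition 6.8, the following Python snippet takes the maximum MINDIST over all boxes that overlap a slot and falls back to 0 when no box overlaps. The `Box` type, the function names, and the use of integer slot indices are assumptions of this example.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int
    u: int
    lv: float
    uv: float


def mindist(q_i: float, r: Box) -> float:
    if q_i <= r.lv:
        return r.lv - q_i
    if q_i >= r.uv:
        return q_i - r.uv
    return 0.0


def lb_slot(q_i: float, i: int, rep: Sequence[Box]) -> float:
    """LB_i(Q, X): largest MINDIST over all boxes overlapping slot i, or 0."""
    overlapping = [mindist(q_i, r) for r in rep if r.l <= i <= r.u]
    return max([0.0, *overlapping])


rep = [Box(0, 3, 2.0, 5.0), Box(2, 5, 4.0, 6.0)]
one_box = lb_slot(1.0, 0, rep)    # only the first box overlaps slot 0
two_boxes = lb_slot(1.0, 2, rep)  # both boxes overlap; the larger MINDIST wins
no_box = lb_slot(1.0, 7, rep)     # no box overlaps slot 7
```

At slot 2 the two candidate values are 1.0 and 3.0, so the tighter (larger) value 3.0 is kept.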
Now we extend the lower bound at each time slot i to an interval.
Definition 6.9 (Lower Bound for a Single Interval).
Let $X$ and $Q$ be time series and let $I = (l_I, u_I)$ be a time interval. Then the lower bound $LB_I(Q, X)$ is defined as
\[
LB_I(Q, X) = \sqrt[p]{\sum_{i=l_I}^{u_I} \bigl(LB_i(Q, X)\bigr)^p}.
\]
Finally, we can define a lower bound value for a set of non-overlapping relevant time intervals.
Definition 6.10 (Lower Bound for Interval-Focused Similarity).
Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then the lower bound $LB_{\mathcal{I}}(Q, X)$ is defined as
\[
LB_{\mathcal{I}}(Q, X) = \sqrt[p]{\sum_{I \in \mathcal{I}} \bigl(LB_I(Q, X)\bigr)^p}.
\]
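The aggregation from single time slots to one interval and then to a set of intervals can be sketched as follows. This is an illustration under assumptions: the `Box` type, the function names, 0-based integer slot indices, and the default $p = 2$ are choices of this example, not of the original text.

```python
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int
    u: int
    lv: float
    uv: float


def mindist(q_i: float, r: Box) -> float:
    if q_i <= r.lv:
        return r.lv - q_i
    if q_i >= r.uv:
        return q_i - r.uv
    return 0.0


def lb_slot(q_i: float, i: int, rep: Sequence[Box]) -> float:
    return max([0.0, *(mindist(q_i, r) for r in rep if r.l <= i <= r.u)])


def lb_interval(q: Sequence[float], rep: Sequence[Box],
                interval: Tuple[int, int], p: float = 2.0) -> float:
    """LB_I(Q, X): p-th root of the summed p-th powers of the slot bounds."""
    lo, hi = interval
    return sum(lb_slot(q[i], i, rep) ** p for i in range(lo, hi + 1)) ** (1.0 / p)


def lb_intervals(q: Sequence[float], rep: Sequence[Box],
                 intervals: Sequence[Tuple[int, int]], p: float = 2.0) -> float:
    """Lower bound for a set of non-overlapping intervals (Definition 6.10)."""
    return sum(lb_interval(q, rep, iv, p) ** p for iv in intervals) ** (1.0 / p)


q = [0.0, 0.0, 0.0, 0.0]
rep = [Box(0, 0, 3.0, 5.0), Box(1, 1, 4.0, 6.0)]
single = lb_interval(q, rep, (0, 1))               # slots 0 and 1 contribute 3 and 4
combined = lb_intervals(q, rep, [(0, 1), (3, 3)])  # slot 3 has no box, contributes 0
```

With $p = 2$ the slot bounds 3 and 4 combine to 5, and adding an interval without any overlapping box leaves the bound unchanged.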
Lemma 6.1 (Lower Bounding Property of $LB_{\mathcal{I}}$).
Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then $LB_{\mathcal{I}}(Q, X)$ is a lower bound for $L_p^{\mathcal{I}}(Q, X)$, i.e.
\[
LB_{\mathcal{I}}(Q, X) \le L_p^{\mathcal{I}}(Q, X).
\]
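Proof. Let $X$ and $Q$ be two time series and let $I = (l_I, u_I)$ be a time interval. Assume that an interval box $r = (l_r, u_r, lv_r, uv_r)$ overlapping time slot $i$ exists, i.e. $lv_r \le x_i \le uv_r$ for all $i$ with $l_r \le i \le u_r$. First, we show $\mathrm{MINDIST}(q_i, r) \le |q_i - x_i|$:
1) $q_i \ge uv_r$: $\mathrm{MINDIST}(q_i, r) = q_i - uv_r \le q_i - x_i \le |q_i - x_i|$ (because $uv_r \ge x_i$).
2) $q_i \le lv_r$: $\mathrm{MINDIST}(q_i, r) = lv_r - q_i \le x_i - q_i \le |q_i - x_i|$ (because $lv_r \le x_i$).
3) $lv_r < q_i < uv_r$: $\mathrm{MINDIST}(q_i, r) = 0 \le |q_i - x_i|$.
So $\mathrm{MINDIST}(q_i, r) \le |q_i - x_i|$. This holds for all $r \in \mathrm{rep}(X)$ that overlap time slot $i$, and hence also for the maximum over these boxes. If no box is available, the summand for the corresponding time slot equals 0 (see Definition 6.8). Therefore $LB_i(Q, X) \le |q_i - x_i|$. It follows that
\[
\sum_{i=l_I}^{u_I} \bigl(LB_i(Q, X)\bigr)^p \le \sum_{i=l_I}^{u_I} |q_i - x_i|^p.
\]
This is equivalent to $(LB_I(Q, X))^p \le (L_p^I(Q, X))^p$. We apply this observation to a set of intervals $\mathcal{I}$:
\[
\begin{gathered}
\sum_{I \in \mathcal{I}} \bigl(LB_I(Q, X)\bigr)^p \le \sum_{I \in \mathcal{I}} \bigl(L_p^I(Q, X)\bigr)^p \\
\Downarrow \\
\sqrt[p]{\sum_{I \in \mathcal{I}} \bigl(LB_I(Q, X)\bigr)^p} \le \sqrt[p]{\sum_{I \in \mathcal{I}} \bigl(L_p^I(Q, X)\bigr)^p} \\
\Downarrow \\
LB_{\mathcal{I}}(Q, X) \le L_p^{\mathcal{I}}(Q, X) \qquad \Box
\end{gathered}
\]

Figure 6.4: Lower and upper bounding the $L_p$-distance within the interval $(t_i, t_{i+9})$.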
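The lower bounding property of Lemma 6.1 can also be checked numerically on synthetic data. The sketch below is not part of the original text and rests on several assumptions: the `Box` type and function names are hypothetical, the boxes are built as the minimum and maximum of $X$ over blocks of three slots (so that $lv_r \le x_i \le uv_r$ holds by construction), and the exact distance is taken to be the $p$-th root of $\sum |q_i - x_i|^p$ over all slots of the relevant intervals, as used in the proof.

```python
import random
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int
    u: int
    lv: float
    uv: float


def mindist(q_i: float, r: Box) -> float:
    if q_i <= r.lv:
        return r.lv - q_i
    if q_i >= r.uv:
        return q_i - r.uv
    return 0.0


def lb_slot(q_i: float, i: int, rep: Sequence[Box]) -> float:
    return max([0.0, *(mindist(q_i, r) for r in rep if r.l <= i <= r.u)])


def slots(intervals: Sequence[Tuple[int, int]]):
    """Yield every time slot covered by the (non-overlapping) intervals."""
    for lo, hi in intervals:
        yield from range(lo, hi + 1)


def lower_bound(q, rep, intervals, p):
    return sum(lb_slot(q[i], i, rep) ** p for i in slots(intervals)) ** (1.0 / p)


def exact_distance(q, x, intervals, p):
    return sum(abs(q[i] - x[i]) ** p for i in slots(intervals)) ** (1.0 / p)


rng = random.Random(0)  # fixed seed so the run is repeatable
x = [rng.uniform(-1.0, 1.0) for _ in range(12)]
q = [rng.uniform(-1.0, 1.0) for _ in range(12)]

# Boxes over slots 0-2, 3-5 and 9-11; slots 6-8 are deliberately left uncovered.
rep = [Box(s, s + 2, min(x[s:s + 3]), max(x[s:s + 3])) for s in (0, 3, 9)]
intervals = [(1, 4), (6, 10)]

lb = lower_bound(q, rep, intervals, p=2.0)
exact = exact_distance(q, x, intervals, p=2.0)
holds = lb <= exact + 1e-12  # small tolerance for floating-point rounding
```

Summing the $p$-th powers slot by slot gives the same value as first aggregating per interval, because the intervals do not overlap.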
Analogously, an upper bounding distance estimation can be defined. At each relevant time slot $i$, we now need to use the MAXDIST between $q_i$ and any interval box $r \in \mathrm{rep}(X)$ that overlaps $i$ to define an upper bound of the $i$-th summand of $L_p(Q, X)$. The MAXDIST function is defined as follows:
Definition 6.11 (MAXDIST).
Let $Q$ be a query time series and $q_i$ the amplitude value of $Q$ at time slot $t_i$. Let $X \in D$ be a time series and let $r \in \mathrm{rep}(X)$ be an interval box that overlaps $t_i$, i.e. $l_r \le t_i \le u_r$. Then the MAXDIST value between $q_i$ and $r = (l_r, u_r, lv_r, uv_r)$ is defined as
\[
\mathrm{MAXDIST}(q_i, r) = \max\{|q_i - lv_r|,\ |q_i - uv_r|\}.
\]
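Definition 6.11 translates directly into code. As before, the `Box` type and its field names are assumptions of this sketch.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int
    u: int
    lv: float
    uv: float


def maxdist(q_i: float, r: Box) -> float:
    """MAXDIST: distance from q_i to the farther amplitude border of r."""
    return max(abs(q_i - r.lv), abs(q_i - r.uv))


box = Box(l=0, u=3, lv=2.0, uv=5.0)
below = maxdist(1.0, box)   # farther border is uv_r = 5.0
inside = maxdist(3.0, box)  # farther border is uv_r = 5.0
above = maxdist(7.5, box)   # farther border is lv_r = 2.0
```

Unlike MINDIST, the value is positive even when the query amplitude lies inside the box, since $x_i$ may sit at either border.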
If there is an interval box $r \in \mathrm{rep}(X)$ that overlaps time slot $t_i$, we can upper bound the true distance between $q_i$ and $x_i$ using the MAXDIST value. If there are several interval boxes $r \in \mathrm{rep}(X)$ with $l_r \le t_i \le u_r$, we compute the minimum over all the corresponding MAXDIST values in order to obtain the tightest approximation to the actual distance. If no overlapping box is available, we can only estimate the upper bound for a certain time slot by the value
\[
u := \max\{|q_i - \mathrm{MAX}|,\ |q_i - \mathrm{MIN}|\}.
\]
Definition 6.12 (Upper Bound at a Single Time Slot).
Let $Q$ and $X$ be time series of length $N$ and let $X$ be represented by a collection of interval boxes $\mathrm{rep}(X)$. Then $UB_i(Q, X)$ for time slot $i$ is defined as
\[
UB_i(Q, X) = \min\Bigl\{u,\ \min_{\{r \,\mid\, r \in \mathrm{rep}(X),\ l_r \le i \le u_r\}} \mathrm{MAXDIST}(q_i, r)\Bigr\}.
\]
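A possible sketch of Definition 6.12 is given below. It assumes that MIN and MAX denote global lower and upper limits of the amplitude values; they are passed in as the hypothetical parameters `v_min` and `v_max`. The `Box` type and function names are likewise assumptions of this example.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int
    u: int
    lv: float
    uv: float


def maxdist(q_i: float, r: Box) -> float:
    return max(abs(q_i - r.lv), abs(q_i - r.uv))


def ub_slot(q_i: float, i: int, rep: Sequence[Box],
            v_min: float, v_max: float) -> float:
    """UB_i(Q, X): smallest MAXDIST over overlapping boxes, capped by u."""
    # Fallback u, the only estimate available when no box overlaps slot i.
    u = max(abs(q_i - v_max), abs(q_i - v_min))
    overlapping = [maxdist(q_i, r) for r in rep if r.l <= i <= r.u]
    return min([u, *overlapping])


rep = [Box(0, 3, 2.0, 5.0), Box(2, 5, 4.0, 6.0)]
two_boxes = ub_slot(1.0, 2, rep, v_min=0.0, v_max=10.0)  # candidates: 9, 4, 5
no_box = ub_slot(1.0, 7, rep, v_min=0.0, v_max=10.0)     # only u = 9 is available
```

At slot 2 the smaller MAXDIST value is kept, which mirrors the use of the larger MINDIST value for the lower bound.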
Now we extend the definition of the upper bound at a single time slot $i$ to an interval.
Definition 6.13 (Upper Bound for a Single Interval).
Let $X$ and $Q$ be time series and let $I = (l_I, u_I)$ be a time interval. Then the upper bound $UB_I(Q, X)$ is defined as
\[
UB_I(Q, X) = \sqrt[p]{\sum_{i=l_I}^{u_I} \bigl(UB_i(Q, X)\bigr)^p}.
\]
Finally, we can define an upper bound value for a set of non-overlapping relevant time intervals.
Definition 6.14 (Upper Bound for Interval-Focused Similarity).
Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then the upper bound $UB_{\mathcal{I}}(Q, X)$ is given by
\[
UB_{\mathcal{I}}(Q, X) = \sqrt[p]{\sum_{I \in \mathcal{I}} \bigl(UB_I(Q, X)\bigr)^p}.
\]
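The upper bound aggregates in the same way as the lower bound. The following sketch is illustrative only: the `Box` type, the function names, the parameters `v_min` and `v_max` (standing in for MIN and MAX), 0-based slot indices, and the default $p = 2$ are all assumptions of this example.

```python
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int
    u: int
    lv: float
    uv: float


def maxdist(q_i: float, r: Box) -> float:
    return max(abs(q_i - r.lv), abs(q_i - r.uv))


def ub_slot(q_i, i, rep, v_min, v_max):
    u = max(abs(q_i - v_max), abs(q_i - v_min))  # fallback without a box
    return min([u, *(maxdist(q_i, r) for r in rep if r.l <= i <= r.u)])


def ub_interval(q: Sequence[float], rep: Sequence[Box], interval: Tuple[int, int],
                v_min: float, v_max: float, p: float = 2.0) -> float:
    """UB_I(Q, X) for a single interval (Definition 6.13)."""
    lo, hi = interval
    total = sum(ub_slot(q[i], i, rep, v_min, v_max) ** p for i in range(lo, hi + 1))
    return total ** (1.0 / p)


def ub_intervals(q, rep, intervals, v_min, v_max, p=2.0):
    """Upper bound for a set of non-overlapping intervals (Definition 6.14)."""
    total = sum(ub_interval(q, rep, iv, v_min, v_max, p) ** p for iv in intervals)
    return total ** (1.0 / p)


q = [0.0, 0.0, 0.0, 0.0]
rep = [Box(0, 0, -3.0, 1.0), Box(1, 1, 2.0, 4.0)]
single = ub_interval(q, rep, (0, 1), v_min=-10.0, v_max=10.0)  # slots give 3 and 4
combined = ub_intervals(q, rep, [(0, 1), (3, 3)], v_min=-10.0, v_max=10.0)
```

Slot 3 has no overlapping box, so it contributes the fallback value 10 and the combined bound grows to $\sqrt{25 + 100}$.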
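Lemma 6.2 (Upper Bounding Property of $UB_{\mathcal{I}}$).
Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then $UB_{\mathcal{I}}(Q, X)$ is an upper bound for $L_p^{\mathcal{I}}(Q, X)$, i.e.
\[
UB_{\mathcal{I}}(Q, X) \ge L_p^{\mathcal{I}}(Q, X).
\]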
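At time slot $t_{i+6}$, the value for the upper bound can only be estimated as $UB_{t_{i+6}}(Q, X) = \max\{|q_{t_{i+6}} - \mathrm{MAX}|,\ |q_{t_{i+6}} - \mathrm{MIN}|\}$, which is the fallback used when no interval box overlaps the slot. In contrast, at time $t_{i+1}$ the interval box $r = (t_i, t_{i+3}, lv_r, uv_r) \in \mathrm{rep}(X)$ approximates the time series $X$. So we can estimate $LB_{t_{i+1}}(Q, X) = \mathrm{MINDIST}(q_{t_{i+1}}, r) = 0$ and $UB_{t_{i+1}}(Q, X) = \mathrm{MAXDIST}(q_{t_{i+1}}, r) = |q_{t_{i+1}} - lv_r|$.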
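The two situations described above, one slot covered by a box and one slot without any box, can be reproduced with concrete numbers. All values below are invented for illustration, and the `Box` type, the function name, and the parameters `v_min` and `v_max` (standing in for MIN and MAX) are assumptions of this sketch.

```python
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    """Hypothetical container for an interval box r = (l_r, u_r, lv_r, uv_r)."""
    l: int
    u: int
    lv: float
    uv: float


def bounds_at_slot(q_i: float, i: int, rep: Sequence[Box],
                   v_min: float, v_max: float) -> Tuple[float, float]:
    """Return (LB_i, UB_i) for a single time slot."""
    lbs = [0.0]                                      # lower bound without a box
    ubs = [max(abs(q_i - v_max), abs(q_i - v_min))]  # fallback value u
    for r in rep:
        if r.l <= i <= r.u:
            # MINDIST, written as a maximum; equals the case distinction
            # of Definition 6.7 whenever lv <= uv.
            lbs.append(max(r.lv - q_i, q_i - r.uv, 0.0))
            # MAXDIST as in Definition 6.11.
            ubs.append(max(abs(q_i - r.lv), abs(q_i - r.uv)))
    return max(lbs), min(ubs)


rep = [Box(0, 3, 1.0, 4.0)]  # one box covering slots 0..3

# Slot 1 is covered and the query value lies inside the box, close to uv_r.
covered = bounds_at_slot(3.5, 1, rep, v_min=0.0, v_max=5.0)

# Slot 6 is not covered by any box, so only the fallback u is available.
uncovered = bounds_at_slot(3.0, 6, rep, v_min=0.0, v_max=5.0)
```

In the covered case the lower bound is 0 and the upper bound equals $|q_i - lv_r| = 2.5$, matching the pattern in the text; in the uncovered case the upper bound is the fallback $u = 3$.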