
6.3 Index Support for Interval-Focused Queries

6.3.2 Distance Estimation Using Interval Boxes

Let us assume that a time series is represented by a set of interval boxes. In this section we show how the exact interval-focused distance between a query object $Q$ and any $X \in D$ can be estimated by means of an upper and a lower bound using the information of $rep(X)$.

At each relevant time slot $i$, we can lower bound the $i$-th summand of the $L_p$-norm by the well-known MINDIST function.

Definition 6.7 (MINDIST).

Let $Q$ be a query time series and $q_i$ the amplitude value of $Q$ at time slot $t_i$. Let $X \in D$ be a time series and let $r \in rep(X)$ be an interval box that overlaps $t_i$, i.e. $l_r \le t_i \le u_r$. Then the MINDIST between $q_i$ and $r = (l_r, u_r, lv_r, uv_r)$ is defined as

$$\mathrm{MINDIST}(q_i, r) = \begin{cases} lv_r - q_i & \text{if } q_i \le lv_r \\ q_i - uv_r & \text{if } q_i \ge uv_r \\ 0 & \text{else.} \end{cases}$$

If there are several interval boxes $r \in rep(X)$ with $l_r \le t_i \le u_r$, we determine the maximal value over all the corresponding MINDIST values. We will use the MINDIST value to define a lower bound for the interval-focused similarity. In order to get a tight lower bound (i.e. a value as high as possible), the maximal possible MINDIST value has to be used at every time slot.

Definition 6.8 (Lower Bound at a Single Time Slot).

Let $Q$ and $X$ be time series of length $N$ and let $X$ be represented by a collection of interval boxes $rep(X)$. Then $LB_i(Q, X)$ for time slot $i$ is defined as

$$LB_i(Q, X) = \max\Bigl\{0,\ \max_{\{r \mid r \in rep(X),\ l_r \le i \le u_r\}} \mathrm{MINDIST}(q_i, r)\Bigr\}.$$
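Definitions 6.7 and 6.8 can be illustrated by a minimal Python sketch. It is not taken from the original implementation: the names `IntervalBox`, `min_dist` and `lb_slot` are hypothetical, time slots are assumed to be 0-based integer indices, and `rep` is assumed to be a plain list of boxes.

```python
from typing import NamedTuple, Sequence


class IntervalBox(NamedTuple):
    """Hypothetical container for a box r = (l_r, u_r, lv_r, uv_r)."""
    l: int      # first time slot covered by the box
    u: int      # last time slot covered by the box (inclusive)
    lv: float   # lower amplitude bound
    uv: float   # upper amplitude bound


def min_dist(q_i: float, r: IntervalBox) -> float:
    """MINDIST of Definition 6.7."""
    if q_i <= r.lv:
        return r.lv - q_i
    if q_i >= r.uv:
        return q_i - r.uv
    return 0.0


def lb_slot(q: Sequence[float], rep: Sequence[IntervalBox], i: int) -> float:
    """LB_i of Definition 6.8: largest MINDIST over all boxes covering slot i, or 0."""
    candidates = [min_dist(q[i], r) for r in rep if r.l <= i <= r.u]
    return max([0.0] + candidates)


# Illustrative data, made up for this sketch.
q = [1.0, 1.0, 6.0, 6.0, 4.0]
rep = [IntervalBox(0, 2, 2.0, 5.0), IntervalBox(1, 3, 3.0, 4.0)]

print(lb_slot(q, rep, 1))  # two boxes cover slot 1; the larger MINDIST is 2.0
print(lb_slot(q, rep, 4))  # no box covers slot 4, so the bound is 0.0
```

Taking the maximum over all covering boxes is what makes the bound as tight as the representation allows; a slot without any box simply contributes 0.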

Now we extend the lower bound at each time slot i to an interval.

Definition 6.9 (Lower Bound for a Single Interval).

Let $X$ and $Q$ be time series and let $I = (l_I, u_I)$ be a time interval. Then the lower bound $LB_I(Q, X)$ is defined as

$$LB_I(Q, X) = \sqrt[p]{\sum_{i=l_I}^{u_I} (LB_i(Q, X))^p}.$$
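As a small worked instance of Definition 6.9, assume $p = 2$ and an interval $I = (3, 5)$ whose per-slot lower bounds are $LB_3 = 0$, $LB_4 = 3$ and $LB_5 = 4$ (values chosen purely for illustration):

```latex
LB_I(Q, X) = \sqrt[2]{0^2 + 3^2 + 4^2} = \sqrt{25} = 5
```

A slot without an overlapping box contributes 0 to the sum and therefore never raises the bound.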

Finally, we can define a lower bound value for a set of non-overlapping relevant time intervals.

Definition 6.10 (Lower Bound for Interval-Focused Similarity).

Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then the lower bound $LB_{\mathcal{I}}(Q, X)$ is defined as

$$LB_{\mathcal{I}}(Q, X) = \sqrt[p]{\sum_{I \in \mathcal{I}} (LB_I(Q, X))^p}.$$

Lemma 6.1 (Lower Bounding Property of $LB_{\mathcal{I}}$).

Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then $LB_{\mathcal{I}}(Q, X)$ is a lower bound for $L_p^{\mathcal{I}}(Q, X)$, i.e.

$$LB_{\mathcal{I}}(Q, X) \le L_p^{\mathcal{I}}(Q, X).$$

Figure 6.4: Lower and upper bounding the $L_p$-distance within the interval $(t_i, t_{i+9})$.

Proof. Let $X$ and $Q$ be two time series. Let $I = (l_I, u_I)$ be a time interval. Let us assume an overlapping interval box $r = (l_r, u_r, lv_r, uv_r)$ exists, i.e. $lv_r \le x_i \le uv_r$ for all $i$ with $l_r \le i \le u_r$. At first, we show $\mathrm{MINDIST}(q_i, r) \le |q_i - x_i|$:

1) $q_i \ge uv_r$: $\mathrm{MINDIST}(q_i, r) = q_i - uv_r \le q_i - x_i \le |q_i - x_i|$ (because $uv_r \ge x_i$)

2) $q_i \le lv_r$: $\mathrm{MINDIST}(q_i, r) = lv_r - q_i \le x_i - q_i \le |q_i - x_i|$ (because $lv_r \le x_i$)

3) $lv_r < q_i < uv_r$: $\mathrm{MINDIST}(q_i, r) = 0 \le |q_i - x_i|$.

So, $\mathrm{MINDIST}(q_i, r) \le |q_i - x_i|$. This holds for all $r \in rep(X)$ that overlap time slot $i$. If no box is available, the summand for the corresponding time slot equals 0 (see Definition 6.8). Therefore $LB_i(Q, X) \le |q_i - x_i|$. It follows that

$$\sum_{i=l_I}^{u_I} (LB_i(Q, X))^p \le \sum_{i=l_I}^{u_I} |q_i - x_i|^p.$$

This is equivalent to $(LB_I(Q, X))^p \le (L_p^I(Q, X))^p$. We apply this observation to a set of intervals $\mathcal{I}$:

$$\begin{gathered}
\sum_{I \in \mathcal{I}} (LB_I(Q, X))^p \le \sum_{I \in \mathcal{I}} (L_p^I(Q, X))^p \\
\Downarrow \\
\sqrt[p]{\sum_{I \in \mathcal{I}} (LB_I(Q, X))^p} \le \sqrt[p]{\sum_{I \in \mathcal{I}} (L_p^I(Q, X))^p} \\
\Downarrow \\
LB_{\mathcal{I}}(Q, X) \le L_p^{\mathcal{I}}(Q, X)
\end{gathered}$$

□

Analogously, an upper bounding distance estimation can be defined. At each relevant time slot $i$, we now need to use the MAXDIST between $q_i$ and any interval box $r \in rep(X)$ that overlaps $i$ to define an upper bound of the $i$-th summand of $L_p(Q, X)$. The MAXDIST function is defined as follows:

Definition 6.11 (MAXDIST).

Let $Q$ be a query time series and $q_i$ the amplitude value of $Q$ at time slot $t_i$. Let $X \in D$ be a time series and let $r \in rep(X)$ be an interval box that overlaps $t_i$, i.e. $l_r \le t_i \le u_r$. Then the MAXDIST value between $q_i$ and $r = (l_r, u_r, lv_r, uv_r)$ is defined as

$$\mathrm{MAXDIST}(q_i, r) = \max\{|q_i - lv_r|,\ |q_i - uv_r|\}.$$

If there is an interval box $r \in rep(X)$ that overlaps time slot $t_i$, we can upper bound the true distance between $q_i$ and $x_i$ using the MAXDIST value. If there are several interval boxes $r \in rep(X)$ with $l_r \le t_i \le u_r$, we compute the minimal value over all the corresponding MAXDIST values in order to get the best approximation to the actual distance. If no overlapping box is available, we can only estimate the upper bound for a certain time slot by a value

$$u := \max\{|q_i - \mathit{MAX}|,\ |q_i - \mathit{MIN}|\}.$$

Definition 6.12 (Upper Bound at a Single Time Slot).

Let $Q$ and $X$ be time series of length $N$ and let $X$ be represented by a collection of interval boxes $rep(X)$. Then $UB_i(Q, X)$ for time slot $i$ is defined as

$$UB_i(Q, X) = \min\Bigl\{u,\ \min_{\{r \mid r \in rep(X),\ l_r \le i \le u_r\}} \mathrm{MAXDIST}(q_i, r)\Bigr\}.$$
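The upper bound of Definitions 6.11 and 6.12 can be sketched in the same way. The names `IntervalBox`, `max_dist` and `ub_slot` are hypothetical, time slots are assumed to be 0-based indices, and MIN and MAX are assumed to be known bounds on all amplitude values, passed in as the illustrative parameters `v_min` and `v_max`.

```python
from typing import NamedTuple, Sequence


class IntervalBox(NamedTuple):
    """Hypothetical container for a box r = (l_r, u_r, lv_r, uv_r)."""
    l: int      # first time slot covered by the box
    u: int      # last time slot covered by the box (inclusive)
    lv: float   # lower amplitude bound
    uv: float   # upper amplitude bound


def max_dist(q_i: float, r: IntervalBox) -> float:
    """MAXDIST of Definition 6.11."""
    return max(abs(q_i - r.lv), abs(q_i - r.uv))


def ub_slot(q: Sequence[float], rep: Sequence[IntervalBox], i: int,
            v_min: float, v_max: float) -> float:
    """UB_i of Definition 6.12: smallest MAXDIST over all covering boxes, capped by u."""
    # Fallback value u, used when no box covers slot i (assumes v_min <= x_i <= v_max).
    u = max(abs(q[i] - v_max), abs(q[i] - v_min))
    candidates = [max_dist(q[i], r) for r in rep if r.l <= i <= r.u]
    return min([u] + candidates)


# Illustrative data, made up for this sketch.
q = [1.0, 1.0, 6.0, 6.0, 4.0]
rep = [IntervalBox(0, 2, 2.0, 5.0), IntervalBox(1, 3, 3.0, 4.0)]

print(ub_slot(q, rep, 1, v_min=0.0, v_max=10.0))  # tighter of the two boxes: 3.0
print(ub_slot(q, rep, 4, v_min=0.0, v_max=10.0))  # no box, falls back to u = 6.0
```

Here the minimum plays the role that the maximum plays for the lower bound: every covering box yields a valid upper bound, so the smallest one is the most informative.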

Now we extend the definition of the upper bound at a single time slot $i$ to an interval.

Definition 6.13 (Upper Bound for a Single Interval).

Let $X$ and $Q$ be time series and let $I = (l_I, u_I)$ be a time interval. Then the upper bound $UB_I(Q, X)$ is defined as

$$UB_I(Q, X) = \sqrt[p]{\sum_{i=l_I}^{u_I} (UB_i(Q, X))^p}.$$

Finally, we can define an upper bound value for a set of non-overlapping relevant time intervals.

Definition 6.14 (Upper Bound for Interval-Focused Similarity).

Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then the upper bound $UB_{\mathcal{I}}(Q, X)$ is given by

$$UB_{\mathcal{I}}(Q, X) = \sqrt[p]{\sum_{I \in \mathcal{I}} (UB_I(Q, X))^p}.$$
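The aggregation over a set of intervals has the same shape for the lower bound, the upper bound and the exact distance. The following sketch assumes the per-slot values are already available as plain lists; the function names `interval_bound` and `focused_bound` and all numbers are hypothetical, chosen so that each slot satisfies $LB_i \le |q_i - x_i| \le UB_i$.

```python
from typing import Sequence, Tuple

Interval = Tuple[int, int]  # (l_I, u_I), both ends inclusive, 0-based slots


def interval_bound(slot_values: Sequence[float], interval: Interval, p: float) -> float:
    """p-th root of the summed p-th powers of the slot values inside one interval."""
    l_I, u_I = interval
    return sum(v ** p for v in slot_values[l_I:u_I + 1]) ** (1.0 / p)


def focused_bound(slot_values: Sequence[float], intervals: Sequence[Interval],
                  p: float = 2.0) -> float:
    """Combine the per-interval values over a set of non-overlapping intervals."""
    return sum(interval_bound(slot_values, I, p) ** p for I in intervals) ** (1.0 / p)


# Made-up per-slot values for six time slots.
lb_slots = [0.0, 2.0, 0.0, 0.0, 3.0, 0.0]   # LB_i(Q, X)
exact    = [2.0, 3.0, 1.0, 0.5, 4.0, 2.0]   # |q_i - x_i|
ub_slots = [5.0, 4.0, 2.0, 6.0, 4.5, 7.0]   # UB_i(Q, X)
intervals = [(0, 1), (4, 5)]                # relevant intervals; slots 2 and 3 are ignored

lower = focused_bound(lb_slots, intervals)
dist = focused_bound(exact, intervals)
upper = focused_bound(ub_slots, intervals)
print(lower <= dist <= upper)  # the exact distance lies between the two bounds
```

Because the bounds hold slot by slot and the aggregation is monotone, the ordering carries over to the aggregated values; slots outside the relevant intervals never enter the computation.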

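Lemma 6.2 (Upper Bounding Property of $UB_{\mathcal{I}}$).

Let $X$ and $Q$ be time series and let $\mathcal{I}$ be a set of relevant time intervals. Then $UB_{\mathcal{I}}(Q, X)$ is an upper bound for $L_p^{\mathcal{I}}(Q, X)$, i.e.

$$L_p^{\mathcal{I}}(Q, X) \le UB_{\mathcal{I}}(Q, X).$$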

The value for the upper bound can only be estimated as $UB_{t_{i+6}}(Q, X) = \max\{|q_{t_{i+6}} - \mathit{MAX}|,\ |q_{t_{i+6}} - \mathit{MIN}|\}$. In contrast to that, at time $t_{i+1}$ the interval box $r = (t_i, t_{i+3}, lv_r, uv_r) \in rep(X)$ approximates the time series $X$. So we can estimate $LB_{t_{i+1}}(Q, X) = \mathrm{MINDIST}(q_{t_{i+1}}, r) = 0$ and $UB_{t_{i+1}}(Q, X) = \mathrm{MAXDIST}(q_{t_{i+1}}, r) = |q_{t_{i+1}} - lv_r|$.
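To make these estimates concrete, assume (purely for illustration) $lv_r = 2$, $uv_r = 5$, $\mathit{MIN} = 0$, $\mathit{MAX} = 10$, $q_{t_{i+1}} = 4$ and $q_{t_{i+6}} = 7$. None of these values are read off Figure 6.4.

```latex
\begin{align*}
LB_{t_{i+1}}(Q, X) &= \mathrm{MINDIST}(q_{t_{i+1}}, r) = 0
    && \text{since } 2 < 4 < 5 \\
UB_{t_{i+1}}(Q, X) &= \max\{|4 - 2|,\ |4 - 5|\} = 2 = |q_{t_{i+1}} - lv_r| \\
UB_{t_{i+6}}(Q, X) &= \max\{|7 - 10|,\ |7 - 0|\} = 7
\end{align*}
```

Under these assumed values the box narrows the upper bound at $t_{i+1}$ to 2, whereas the fallback value $u$ at the same slot would be $\max\{|4 - 10|, |4 - 0|\} = 6$.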