
4.7 Pruning Changepoint Vectors

4.7.2 Subset Pruning

CHAPTER 4. MULTIVARIATE CHANGEPOINT DETECTION 102

We have seen how retrospective pruning can be used to remove previous changepoint vectors from future consideration. However, supposing we are at some current time-point $\tau^*$ within the algorithm, this method of pruning does not prune any of the $c_{\tau^*} \in \bar{C}_{\tau^*}$, each of which has to be considered at $\tau^*$. Pruning these vectors would reduce the number of vectors $c_{\tau^*} \in \bar{C}_{\tau^*}$ for which $h_{c_{\tau^*}}(c)$ has to be calculated for each $c \in C_{\tau^*-1}(c_{\tau^*})$. Within this section we introduce further theory which allows for the pruning of such vectors at each time-point $\tau^*$, which we refer to herein as subset pruning.

Before continuing, we define some new notation in order to accommodate this theory. We use $f_j(t)$ to denote the minimum cost from time 0 up to time $t$ in variable $j$, including the $\alpha$ penalties but not the $\beta$ penalties. We exclude these because $f_j(t)$ represents a univariate cost, whereas $\beta$ represents a multivariate penalty. Also, recall that for some changepoint vector $c \in C_n$, $M(c)$ is the number of changepoint locations occurring in any variable up to and including those in $c$. Hence, for some changepoint vector $(t_1, t_2, \ldots, t_p)$, we can decompose $F(\cdot)$ as follows:

$$F(t_1, t_2, \ldots, t_p) = \sum_{j=1}^{p} f_j(t_j) + \beta M(t_1, t_2, \ldots, t_p).$$
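The decomposition above can be illustrated with a minimal Python sketch. The function name `total_cost`, the toy cost functions and the argument `m_c` are assumptions made for illustration only; in particular, $M(c)$ is passed in by the caller because it depends on the full changepoint history, which is not modelled here.

```python
from typing import Callable, Sequence


def total_cost(
    c: Sequence[int],
    f: Sequence[Callable[[int], float]],
    beta: float,
    m_c: int,
) -> float:
    """Return F(c) = sum_j f_j(c[j]) + beta * M(c).

    f[j] is assumed to already include the alpha penalties; m_c is M(c),
    supplied by the caller because it depends on the changepoint history.
    """
    if len(c) != len(f):
        raise ValueError("need one univariate cost function per variable")
    univariate = sum(f_j(t_j) for f_j, t_j in zip(f, c))
    return univariate + beta * m_c


# Toy example with p = 2 (illustrative cost functions, not from the source).
f = [lambda t: 2.0 * t, lambda t: t + 1.0]
print(total_cost((3, 5), f, beta=1.5, m_c=4))
```

Keeping the univariate sum separate from the $\beta M(c)$ term mirrors the reason given in the text for excluding $\beta$ from $f_j$.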

Further, for a given $J \in \{1, \ldots, p\}$, we use $\bar{C}^J_{\tau^*}$ to denote the distinct subsets of $\bar{C}_{\tau^*}$ such that $\bar{C}^J_{\tau^*}$ contains only the $c_{\tau^*} \in \bar{C}_{\tau^*}$ which have $J$ variables changing at time $\tau^*$, so that $\sum_{j=1}^{p} I(c^j_{\tau^*} = \tau^*) = J$. This can be expressed by

$$\bar{C}^J_{\tau^*} = \left\{ c_{\tau^*} \in \bar{C}_{\tau^*} : \sum_{j=1}^{p} I(c^j_{\tau^*} = \tau^*) = J \right\}. \tag{4.7.6}$$

Note that $\bar{C}^p_{\tau^*} = \{(\tau^*, \tau^*, \ldots, \tau^*)\}$. For ease of notation, we define $P$ to be the set of all variables, so that $P = \{1, \ldots, p\}$.
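As a rough illustration of (4.7.6), the sketch below groups a list of candidate vectors by the number of components equal to $\tau^*$. Representing a changepoint vector as a tuple of integers, and the name `partition_by_num_changing`, are assumptions made for this example rather than anything specified in the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Vector = Tuple[int, ...]


def partition_by_num_changing(
    candidates: Iterable[Vector], tau_star: int
) -> Dict[int, List[Vector]]:
    """Group candidate vectors by J = number of entries equal to tau_star."""
    groups: Dict[int, List[Vector]] = defaultdict(list)
    for c in candidates:
        J = sum(1 for c_j in c if c_j == tau_star)  # the indicator sum in (4.7.6)
        groups[J].append(c)
    return dict(groups)


# Toy example with p = 3 and tau_star = 5 (values chosen for illustration).
candidates = [(5, 5, 5), (5, 3, 5), (2, 5, 5), (5, 3, 1)]
for J, vectors in sorted(partition_by_num_changing(candidates, 5).items()):
    print(J, vectors)
```

In this toy input the group for $J = p = 3$ contains only $(5, 5, 5)$, in line with the note that $\bar{C}^p_{\tau^*}$ is a singleton.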

The motivation behind subset pruning is the consideration of the following scenario. Suppose that we have some $p$-variate series $X$ of length $n$, time-points $w$ and $\tau^*$ such that $\tau^* < w$, and some $c_w \in \bar{C}_w$. Suppose further that we make the assumption that the minimum cost to $c_w$ from the changepoint vector $(\tau^*, \tau^*, \ldots, \tau^*)$ is lower than the minimum cost from all changepoint vectors $c_J \in \bar{C}^J_{\tau^*}$, for some $J \in P$. The property of interest is then that the minimum cost from $(\tau^*, \tau^*, \ldots, \tau^*)$ to $c_w$ is also lower than the minimum cost from all $c_i \in \bar{C}^i_{\tau^*}$, for $i < J$, to $c_w$. If such a property holds true, then this would allow for the pruning of different subsets of affected variables, depending on the number of variables they contain which are changing at $\tau^*$.

We will see in the following proposition that this characteristic does indeed hold under certain conditions. Before examining this result, it is necessary to introduce some further notation. For a given time-point $\tau^*$ and changepoint vector $c_{\tau^*}$, define $P_{\tau^*}(c_{\tau^*})$ to be the set of variable indices of $c_{\tau^*}$ such that $c^j_{\tau^*} = \tau^*$, so that $|P_{\tau^*}(c_J)| = J$ for each $c_J \in \bar{C}^J_{\tau^*}$. That is,

$$P_{\tau^*}(c_{\tau^*}) = \left\{ j \in P : c^j_{\tau^*} = \tau^* \right\}. \tag{4.7.7}$$

Finally, for a given $c_{\tau^*} \in \bar{C}^{J^*}_{\tau^*}$, for $J < J^*$ define the following set:

$$E^J_{\tau^*}(c_{\tau^*}) = \left\{ c \in \bar{C}^J_{\tau^*} : c^j \le c^j_{\tau^*} \ \forall\, j \in P \right\}, \tag{4.7.8}$$

so that $E^J_{\tau^*}(c_{\tau^*})$ is the set of previous time-point vectors which are 'viable' for being changepoint vectors prior to $c_{\tau^*}$. Proposition 4.7.2 establishes that, under certain conditions regarding the changepoint vectors with one variable changing at some time-point $\tau^*$, we can prune the changepoint vectors which have $i$ variables changing at $\tau^*$.
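The set in (4.7.8) can be sketched as a simple filter. The code below is a minimal illustration assuming changepoint vectors are tuples of integers and that the full candidate set at $\tau^*$ is available as a list; the name `viable_predecessors` is hypothetical.

```python
from typing import Iterable, List, Tuple

Vector = Tuple[int, ...]


def viable_predecessors(
    c_tau: Vector, candidates: Iterable[Vector], tau_star: int, J: int
) -> List[Vector]:
    """Return the candidates with exactly J entries equal to tau_star
    that are componentwise <= c_tau, as in (4.7.8)."""
    result = []
    for c in candidates:
        num_changing = sum(1 for c_j in c if c_j == tau_star)
        dominated = all(c_j <= ct_j for c_j, ct_j in zip(c, c_tau))
        if num_changing == J and dominated:
            result.append(c)
    return result


# Toy example: c_tau has J* = 2 entries equal to tau_star = 5; take J = 1.
c_tau = (5, 5, 3)
candidates = [(5, 2, 3), (5, 2, 4), (1, 5, 2), (5, 5, 3)]
print(viable_predecessors(c_tau, candidates, tau_star=5, J=1))
```

Here $(5, 2, 4)$ is rejected because its third component exceeds that of `c_tau`, and $(5, 5, 3)$ is rejected because it has two components changing at $\tau^*$ rather than one.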

Proposition 4.7.2. Suppose that for some $J \in \{1, \ldots, p\}$ and each $c_J \in \bar{C}^J_{\tau^*}$, we have for every $c_{J-1} \in \{E^{J-1}_{\tau^*}(c_J) : c^j_{J-1} = c^j_J \ \forall\, j \in P \setminus P_{\tau^*}(c_J)\}$ that

$$h_{c_w}(c_J) < h_{c_w}(c_{J-1}) \tag{4.7.9}$$

for some future vector $c_w \in \bar{C}_w$, where $w > \tau^*$.

Suppose further that we have changepoint vectors $\{c_{J-1,j^*_1}, c_{J-1,j^*_2}, \ldots, c_{J-1,j^*_i}\} \in E^{J-1}_{\tau^*}(c_J)$ such that for each $x = 1, \ldots, i$, we have $c^{j^*_x}_{J-1,j^*_x} = t_{j^*_x}$ and $c^{j^*_x}_J = \tau^*$ (with $t_{j^*_x} < \tau^*$), and $c^j_{J-1,j^*_x} = c^j_J$ for all $j \in \{P \setminus P_{\tau^*}(c_J)\}$.


Then if it holds that $(i - 1) M(c_J) \ge \sum_{x=1}^{i} M(c_{J-1,j^*_x})$, we have

$$h_{c_w}(c_J) < h_{c_w}(c_{J-i}) \tag{4.7.10}$$

for every $c_{J-i} \in \{E^{J-i}_{\tau^*}(c_J) : c^j_{J-i} = c^j_J \ \forall\, j \in P \setminus P_{\tau^*}(c_J)\}$, $i = 2, \ldots, J - 1$.

Proof. See Appendix A.3 for a full proof.

Proposition 4.7.2 implies that, when its conditions hold, we do not need to calculate $h_{c_w}(c_{J-i})$ for any of the $c_{J-i}$. Hence, these $c_{J-i}$ can be 'pruned' from our considerations for $c_w$. If the conditions do not hold, it is not necessarily true that $h_{c_w}(c_J) < h_{c_w}(c_{J-i})$, and so we are not able to use such an inequality for pruning purposes.
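To make the ingredients of Proposition 4.7.2 concrete, the following sketch shows two pieces under stated assumptions: the restricted set of vectors $c_{J-i}$ that the proposition refers to, and the counting condition on $M$. Vectors are assumed to be tuples of integers, the $M$ values are assumed to be supplied by the caller, and both function names are hypothetical. The sketch does not check inequality (4.7.9), which must also hold before any pruning is justified.

```python
from typing import Iterable, List, Sequence, Tuple

Vector = Tuple[int, ...]


def restricted_predecessors(
    c_J: Vector, candidates: Iterable[Vector], tau_star: int, i: int
) -> List[Vector]:
    """Candidates with J - i entries equal to tau_star that are componentwise
    <= c_J and agree with c_J on every variable not changing at tau_star."""
    J = sum(1 for t in c_J if t == tau_star)
    # 0-based positions of the variables in P \ P_tau*(c_J).
    fixed = [j for j, t in enumerate(c_J) if t != tau_star]
    result = []
    for c in candidates:
        num_changing = sum(1 for t in c if t == tau_star)
        if (
            num_changing == J - i
            and all(a <= b for a, b in zip(c, c_J))
            and all(c[j] == c_J[j] for j in fixed)
        ):
            result.append(c)
    return result


def counting_condition(m_cJ: int, m_prev: Sequence[int]) -> bool:
    """Check (i - 1) * M(c_J) >= sum_x M(c_{J-1, j*_x}), with i = len(m_prev)."""
    i = len(m_prev)
    return (i - 1) * m_cJ >= sum(m_prev)


# Toy example: c_J has J = 3 entries equal to tau_star = 5; take i = 2.
c_J = (5, 5, 5, 2)
candidates = [(5, 1, 3, 2), (5, 1, 3, 1), (5, 5, 1, 2), (4, 5, 0, 2)]
print(restricted_predecessors(c_J, candidates, tau_star=5, i=2))
print(counting_condition(6, [4, 4, 4]))
```

If (4.7.9) and the counting condition both hold, the vectors returned by `restricted_predecessors` are the ones that could be skipped when evaluating costs to $c_w$.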

In document Changepoint detection for acoustic sensing signals (Page 112-115)