2.4 A Lower Bound
2.4.2 Properties of C LR and C RL
In this section I present properties of compatible intervals and provide proofs. My first the- orem states that “The covers CLR and CRL have the same number of intervals k, and k
is the minimal number of intervals possible for any complete cover.” A second theorem identifies certain core subintervals are common to all completek-covers of a given SNP set (Figure 2.2a). These subintervals are the intersections of corresponding intervals fromCLR
andCRLwhich I have called cores (Figure 2.4b). This implies that ak-cover must containk
any interval that does not contain an entire core is not part of any completek-cover and the second is that theith core is only contained within theith interval of anyk-cover. Cores have several interesting properties worth noting. All SNPs in a core are compatible because each core is an intersection of two compatible regions. Adjacent cores must contain at least one pair of incompatible SNPs.
LEMMA 1. Anyc-interval cover covering the range[1, ec], withec≤m, satisfiesec≤eLc,
whereeL
c is the endpoint of thecthinterval ofCLR over the same sequence.
PROOF. By induction, a single-interval cover must end at, or short of, eL
1 (an overage
would indicate LRScan closed the interval prematurely). Assume that the Lemma holds for ani-interval cover, thus ei ≤ eLi. This implies for any(i+ 1)-interval cover bi+1 ≤ bLi+1.
We now proveei+1 ≤eLi+1. Since there exists a SNP,s, in the(i+ 1)th interval ofCLR(i.e.
within[bLi+1, eLi+1]) wheresis incompatible withseL
i+1+1,[bi+1, ei+1]cannot be a compatible
interval ifei+1 > eLi+1. Therefore, we haveei+1 ≤eLi+1.
A symmetric Lemma forCRLstates that anyC-interval cover covering the range[b1, m],
withb1 ≥ 1, satisfies bc ≥ bRc, wherebRc is the start of thecth interval ofCRL over the same
sequence.
THEOREM 1. The coversCLR andCRLhave the same number of intervalsk, andk is the
minimal number of intervals possible for any complete cover.
PROOF. Assume kCLRk = i. According to Lemma 1, for any c-interval cover, C, with
c < iand starting from the left-most SNP, the end of the cover’s rangeec ≤ bLc < bLi =m.
Therefore,C cannot be a complete cover if kCk < kCLRk, implying that a complete cover
must have at leastkCLRkintervals. A similar conclusion can be drawn forkCRLkusing the
symmetric version of Lemma 1. Therefore, we have kCLRk = kCRLk = k, and k is the
LEMMA 2. For alli= 1. . . k,Corei 6=∅.
PROOF. By induction. Assume CLR = {I1L, . . . , IkL}, and CRL = {I1R, . . . , IkR}. By
definitionCorei is[bLi, eiR]; therefore, it is sufficient to prove bLi ≤ eRi . This is trivially the
case fori = 1;bL
1 ≤ eR1. Assume thatbLi ≤eRi holds fori, we now prove it holds for i+ 1.
From bL
i ≤ eRi , we know bLi < bRi+1. If biL+1 ≤ eRi+1 does not hold, implying eRi+1 ≤ eLi ,
then we knowIiL ⊃ IiR+1, and SNP seR i (∈ I
L
i )must be compatible with all SNPs in IiR+1.
Figure 2.5b illustrates this proof. seR
i is outside and adjacent to I
R
i+1. By definition of the
Right-to-Left cover, SNP seR
i must be incompatible with at least one SNP in I
R
i+1, which
results in a contradiction.
THEOREM 2. For any completek-coverC1,m={I1, . . . , Ik}, theith interval contains the
entire ith core: Corei ∩ Ii = Corei, and, it does not contain any part of another core
Corej∩Ii =φ,1≤j ≤k, j 6=i.
PROOF. Since Corei = [bLi, eRi ], to prove the ith interval Ii = [bi, ei] contains Corei,
it is sufficient to prove bi ≤ bLi and ei ≥ eRi . Since {I1, . . . , Ii−1} is an (i −1)-cover
covering the range[1, ei−1]starting from the leftmost SNP, according to Lemma 1, we have ei−1 ≤eLi−1, which meansbi ≤bLi . Similarly, we can proveei ≥eRi (see Fig. 2.5c). To prove
Corej ∩Ii = ∅ for any other coreCorej, it is sufficient to prove ei < bLi+1 and bi > eRi−1.
Since{I1, . . . , Ii} is an i-cover covering the range[1, ei], according to Lemma 1, we have
ei ≤eLi < bLi+1. Similarly, we can provebi > eRi−1.
Stated formally, Corei = IiL ∩ IiR. According to Lemma 1, eRi ≤ eLi and bLi ≥ bRi
(Figure 2.5a), therefore Corei = [bLi, eRi ]. Theorem 2 states, “For any complete k-cover,
C{1, m} = {I1, . . . , Ik}, theith interval contains the entire ith core: Corei∩Ii = Corei,
and, it does not contain any part of another core Corej ∩Ii = Φ,1 ≤ j ≤ k, j 6= i”.
This is due to the interleaving of the non-overlapping intervals of CLR and CRL. Corei
is necessarily compatible with both IL
i and IiR and cannot be extended beyond the outside
PROPERTY 1. All SNPs in a core are compatible, since each core is an intersection of two compatible intervals
PROPERTY 2. Adjacent cores must contain at least one pair of incompatible SNPs
PROOF. This can be demonstrated by contradiction. Assume that two adjacent cores,corei
andcorei+1, are compatible. By definition,corei = [bLi , eRi ]. Since corei ⊆ IiL = [bLi, eLi],
corei is compatible with [eRi + 1, eLi ]. Similarly, corei+1 is compatible with [eRi + 1, eLi].
Therefore,corei∪[eRi + 1, eLi]∪corei+1 = [bLi , eRi+1]is a compatible interval containing both
cores and contradicting Theorem 2 (Fig. 2.5d).