
Join Index Summary (JIS)

In document Updating compressed column-stores (Page 170-173)

7.4 Join Index Maintenance

7.4.3 Join Index Summary (JIS)

To find the foreign-RID (FRID) belonging to an arbitrary child-side tuple at LRID=i, we need to sequentially scan and decompress the parent-side JI column from the start, until we reach the i-th decompressed value (counting from 0), or, alternatively, sum the JI counts until we find the bucket that covers i. For range scans, this is undesirable. The join index summary (JIS), already depicted in Figure 7.2, alleviates this problem by providing a clustered, sparse index on (FSID, LSID), i.e. the uncompressed image of the stable part of the JI column, thereby providing sync points for the decompression process. JIS partitions the SID space of table C into SID ranges, in such a way that the start of a partition coincides with a change of value in C’s foreign key column, C.FK, which we call a cluster head.

A cluster head is the first child tuple, in terms of sort key ordering, within a cluster of child tuples that match on foreign key.

A sync point is a special (FSID, LSID) pair that marks the start of a JIS partition, with the requirement that LSID refers to a cluster head. A consequence is that parent tuples that are not referenced by any stable child tuple, and therefore have a stable P.JI count of 0, are never used as a sync point.

For each partition, JIS stores LSID, as SID_C, and the corresponding FSID, as SID_P, which can be used to match the count in P.JI to its first child-side reference, allowing us to initiate decompression of the join index.
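To make the benefit of sync points concrete, here is a minimal Python sketch of the lookup over stable data only (no PDT updates). The function names and the `every` spacing parameter are assumptions made for illustration; the text does not say how densely sync points are placed, only that each one must start at a cluster head and that unreferenced parents are skipped.

```python
from bisect import bisect_right

def find_fsid_by_scan(ji_counts, lsid, start_fsid=0, start_lsid=0):
    """Sum JI counts from (start_fsid, start_lsid) until the bucket covering lsid."""
    covered = start_lsid  # child SID at which the run of start_fsid begins
    for fsid in range(start_fsid, len(ji_counts)):
        covered += ji_counts[fsid]
        if lsid < covered:
            return fsid
    raise IndexError("lsid lies beyond the last child tuple")

def build_sync_points(ji_counts, every=2):
    """Hypothetical JIS construction: one (SID_P, SID_C) pair per `every`
    referenced parents; parents with a JI count of 0 never become sync points."""
    points, lsid, referenced = [], 0, 0
    for fsid, count in enumerate(ji_counts):
        if count > 0:
            if referenced % every == 0:
                points.append((fsid, lsid))  # lsid is a cluster head here
            referenced += 1
        lsid += count
    return points

def find_fsid_via_jis(ji_counts, sync_points, lsid):
    """Start the scan at the nearest preceding sync point instead of at 0.
    Assumes lsid >= 0 and that the first sync point has SID_C = 0."""
    idx = bisect_right([sid_c for _, sid_c in sync_points], lsid) - 1
    sid_p, sid_c = sync_points[idx]
    return find_fsid_by_scan(ji_counts, lsid, sid_p, sid_c)

ji = [2, 0, 3, 1, 4]          # stable JI counts per parent SID
jis = build_sync_points(ji)   # [(0, 0), (3, 5)]
print(find_fsid_by_scan(ji, 7), find_fsid_via_jis(ji, jis, 7))
```

Both calls return the same FSID; the second one only touches the JI counts from the sync point onwards.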

158 CHAPTER 7. INDEX MAINTENANCE

[Figure 7.4, panels (a) Safe insert and (b) Dangerous insert: each panel shows Table P (columns JK, SK, JI) and Table C (columns JK, SK2, FK), annotated with SID, FSID and LSID.]

Figure 7.4: PDT inserts, shown in gray, that share the SID of a sync point (dashed line). Note that the join index column (JI) shows up-to-date values.

Unfortunately, we cannot always rely on a sync point to provide a valid offset to start a scan. The reason is that JIS indexes SIDs, and SIDs are not guaranteed to be unique as soon as PDT inserts are involved. This might lead to FSID references crossing the boundary defined by a sync point, rendering that sync point invalid. In Figure 7.4 we see a minimal example of such a scenario.

Figure 7.4 shows two distinct inserts into the child table, both ending up at the same child position, at LSID=1. We do not show the entire JIS structure, but restrict ourselves to a single partition boundary, assumed to be defined in the JIS, but here only marked by the dashed line. This corresponds to a sync point of (FSID=1, LSID=1), meaning that both child inserts share their LSID with the sync point.

The crucial difference between the inserts (shown in gray) in Figure 7.4a and 7.4b is that they refer to different tuples in the parent. Both (’O’, 1) and (’X’, 7) sort directly before the (’O’, 2) sync point. However, the (’O’, 1) tuple belongs to the same FK cluster as the stable (’O’, 2) tuple, in fact introducing a new cluster head, one that shares its SID with the former (’O’, 2) cluster head. Given that the (’O’, 1) insert also increments the JI count of the parent-side sync point, at FSID=1, this sync point remains valid, or stable.

The (’X’, 7) insert, on the contrary, is the new tail of the ’X’ cluster that originates in the preceding partition, as indicated by its reference to FSID=0. We cannot prevent this tuple from acquiring an LSID of 1, by definition of the way we handle inserts, and have to accept that updates can cause an FK-cluster to overlap a sync point. This scenario renders a sync point invalid, or dirty. A dirty sync point cannot be used to initiate decompression, as the (’X’, 7) tuple would contribute towards the JI count of the parent-side sync point, at FSID=1, which would corrupt the decompression.
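The difference between the two inserts can be replayed on a deliberately simplified model, sketched below in Python. The model is an assumption, not the thesis data structure: the child table is a list of SIDs in RID order, a PDT insert takes the SID of the stable tuple it is placed before, and `decode_from_sync` is a hypothetical helper that expands up-to-date JI counts starting at a sync point. The two-parent tables mirror the roles in Figure 7.4 but their contents are made up.

```python
def decode_from_sync(ji_counts, child_sids, sync):
    """Expand JI counts into one FSID per child tuple, starting at `sync`.
    Returns the child position where decoding starts and the decoded FSIDs."""
    sync_fsid, sync_lsid = sync
    # First child position (RID order) whose SID reaches the sync point's LSID.
    start = next(pos for pos, sid in enumerate(child_sids) if sid >= sync_lsid)
    decoded = []
    for fsid in range(sync_fsid, len(ji_counts)):
        decoded.extend([fsid] * ji_counts[fsid])
    return start, decoded

sync = (1, 1)            # (FSID=1, LSID=1), as in Figure 7.4
child_sids = [0, 1, 1]   # the insert shares SID 1 with the old cluster head

# Safe insert: the new tuple refers to FSID=1, so JI becomes [1, 2].
safe_truth = [0, 1, 1]
start, decoded = decode_from_sync([1, 2], child_sids, sync)
print(decoded == safe_truth[start:])    # decoding agrees with the real references

# Dangerous insert: the new tuple refers to FSID=0, so JI becomes [2, 1].
dirty_truth = [0, 0, 1]
start, decoded = decode_from_sync([2, 1], child_sids, sync)
print(decoded == dirty_truth[start:])   # decoding disagrees: the sync point is dirty
```

In the dangerous case the tuple at the sync point's LSID belongs to the preceding parent, so counting from FSID=1 assigns it the wrong FSID.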

For completeness, note that one scenario is lacking from Figure 7.4: a scenario where a child-side tuple, γ, in some partition P_i, refers to a parent tuple, π, in a succeeding partition, P_{i+1}. Assuming that insertion of child-side tuples respects the joint sort-key ordering, such a situation can only arise in case the parent-side sync point of P_{i+1}, say tuple σ, is marked deleted, with π being the first visible parent tuple directly after that deletion. Now assume that in such a scenario we could insert tuple γ, which refers to π, into P_i. This would require all tuples referring to σ to still be present in the child, to force γ into P_i in terms of sort ordering. However, due to referential integrity, such a scenario cannot occur, as all tuples referring to σ need to be deleted before we can delete σ. This forces γ to the end of the corresponding delete chain in the child, moving γ into π’s partition, P_{i+1}. (Recall that parent tuples that are not referenced, i.e. with a stable JI count of 0, are ignored in the construction of sync points in the JIS.)

JIS Maintenance and Lookup

Given that PDT updates can render a sync point invalid, we need a JIS maintenance mechanism that marks dirty sync points, together with lookup mechanisms that avoid usage of them. This is why we have a third field, MIN_P, in the JIS layout of Figure 7.2. For each JIS partition, MIN_P holds the current minimum parent SID referenced by child tuples in that partition, and is the only field in JIS that can change under updates. SID_P and SID_C are kept stable.

MIN_P starts out with a value equal to SID_P. As soon as MIN_P < SID_P, the sync point for the given partition is invalid. Note that MIN_P can only become smaller than SID_P, as we learned in Section 7.4.3 that child-side FK references in partition P_j can only refer to partitions P_i such that i ≤ j. Also note that, in general, i can indeed be smaller than j − 1, meaning that child tuples can refer to parent tuples in any preceding partition [3]. This happens if a block of consecutive child-side tuples that covers more than one sync point is deleted, after which new child-side inserts that sort within the deleted range end up as PDT inserts at the tail of that delete chain. Such an out-of-order insert renders all the sync points in the deleted range invalid, by updating their MIN_P field to the FSID of the newly inserted tuple.

A JIS index can be maintained by passing it an (FRID, LRID) pair for every child-side insert. Such a pair represents the RID of the parent tuple being referenced, FRID, together with the child-side insert position, LRID. It is generated during a (very efficient) [4] join between the parent and child table, required to find the correct insert position within the clustered child. After converting (FRID, LRID) into (FSID, LSID), using RidToSid (Algorithm 11), we locate the JIS partition that LSID falls into, and update MIN_P if it is greater than FSID.
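The maintenance step just described might look as follows in Python. This is a sketch under assumptions: the `Partition` class, `make_jis` and `on_child_insert` are illustrative names, the input is taken to be already converted to (FSID, LSID) (RidToSid is not reproduced here), the first partition is assumed to start at SID_C = 0, and only the single partition that LSID falls into is updated.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class Partition:
    sid_p: int  # stable parent SID of the sync point
    sid_c: int  # stable child SID of the sync point (a cluster head)
    min_p: int  # minimum parent SID referenced from this partition

def make_jis(sync_points):
    """MIN_P starts out equal to SID_P for every partition."""
    return [Partition(sid_p, sid_c, sid_p) for sid_p, sid_c in sync_points]

def on_child_insert(jis, fsid, lsid):
    """Record a child-side insert at `lsid` that references parent `fsid`.
    Returns the index of the partition that was inspected."""
    idx = bisect_right([p.sid_c for p in jis], lsid) - 1
    part = jis[idx]
    if part.min_p > fsid:  # MIN_P can only ever decrease
        part.min_p = fsid
    return idx

def is_dirty(part):
    return part.min_p < part.sid_p

jis = make_jis([(0, 0), (1, 1), (4, 6)])
on_child_insert(jis, fsid=1, lsid=1)  # same FK cluster as the sync point
print(is_dirty(jis[1]))
on_child_insert(jis, fsid=0, lsid=1)  # tail of the preceding cluster
print(is_dirty(jis[1]))
```

Storing the minimum FSID instead of a flag is what later lets a lookup decide how far back it has to go.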

[3] Which is the reason for maintaining MIN_P SIDs for every sync point, rather than a simple “dirty bit”.

[4] This join can be computed by a linear scan of the parent- and child-side sort key attributes,

When performing a JIS lookup, the goal is, given either an FSID or an LSID, to locate the nearest preceding stable sync point, which allows us to initiate a range scan. For a given FSID, this process is outlined in Algorithm 14. The algorithm first searches the JIS for the partition that covers fsid, and returns the nearest non-dirty sync point it finds. In case of a child-side LSID, we first locate the partition it falls into, and use the MIN_P field of that partition as an input to Algorithm 14.

Algorithm 14 JIS.findSyncPoint(fsid)

For an arbitrary parent SID, fsid, this routine searches backwards through the partitions in a JIS, until a non-dirty sync point is found, and returns the corresponding stable sync point, (SID_P, SID_C).

sidP ← 0
sidC ← 0
for i ← this.size() − 1; i ≥ 0; i ← i − 1 do
    sidP ← partition[i].SID_P
    sidC ← partition[i].SID_C
    if partition[i].MIN_P ≤ fsid then
        break {Note that MIN_P is unique due to FK clustering}
    end if
end for
while i > 0 and partition[i].SID_P ≠ partition[i].MIN_P do
    i ← i − 1
    sidP ← partition[i].SID_P
    sidC ← partition[i].SID_C
end while {Search backwards for nearest non-dirty sync-point}
return (sidP, sidC)
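Algorithm 14 and the LSID variant can be transcribed into Python roughly as follows. This is a sketch, not the thesis implementation: the `Partition` tuple and both function names are assumptions, the partition list is assumed to be sorted by SID_C, and the example JIS contents are made up.

```python
from bisect import bisect_right
from typing import NamedTuple

class Partition(NamedTuple):
    sid_p: int  # stable parent SID of the sync point
    sid_c: int  # stable child SID of the sync point
    min_p: int  # minimum parent SID referenced from this partition

def find_sync_point(partitions, fsid):
    """Return the nearest preceding non-dirty sync point (SID_P, SID_C) for fsid."""
    if not partitions:
        return (0, 0)
    i = len(partitions) - 1
    # Walk backwards to the last partition whose MIN_P does not exceed fsid.
    while i > 0 and partitions[i].min_p > fsid:
        i -= 1
    # Skip dirty sync points, i.e. those with MIN_P different from SID_P.
    while i > 0 and partitions[i].sid_p != partitions[i].min_p:
        i -= 1
    return (partitions[i].sid_p, partitions[i].sid_c)

def find_sync_point_for_lsid(partitions, lsid):
    """Locate the partition covering lsid, then search with its MIN_P."""
    idx = max(bisect_right([p.sid_c for p in partitions], lsid) - 1, 0)
    return find_sync_point(partitions, partitions[idx].min_p)

jis = [
    Partition(0, 0, 0),
    Partition(2, 3, 2),
    Partition(5, 8, 3),   # dirty: a child here refers to parent SID 3
    Partition(7, 12, 7),
]
print(find_sync_point(jis, 6))           # falls back past the dirty partition
print(find_sync_point_for_lsid(jis, 9))  # same fallback, entered via an LSID
```

Both lookups land on the sync point of the partition that precedes the dirty one, from which decompression can safely start.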
