10.1.1 Basic components of the algorithm
Next, we present some basic elements that will be used later in the description of the algorithm.
10.1.1.1 Pointers
If pr is a pointer to an R-tree node, pr.MBR returns the MBR (minimum bounding rectangle) of that node and pr.ref returns the list of references to its children. If pk is a pointer to a k²-raster node, pk.quad returns the quadrant of that node.
10.1.1.2 Checking the overlapping
The most frequent operation of the algorithm is checking whether the MBR of an R-tree node overlaps a region of the k²-raster whose cells have values in the queried range. To do so, we first identify the smallest quadrant of the raster that completely contains the MBR; we then check whether any cell inside that quadrant that also overlaps the MBR stores a value in the queried range. A negative result means that subtrees of both indexes can be pruned.
This check must be fast: it is performed repeatedly throughout the algorithm, so a slow check would dominate the running time. We therefore perform, whenever possible, a fast, "coarse-grained" check (checkQuadrant), and only when that is inconclusive, a more thorough and thus more costly "fine-grained" check (checkMBR).
The operation checkQuadrant(pr, pk, Range) takes a pointer pr to an R-tree node, a pointer pk to a node of the k²-raster, and the query range. It returns a pair ⟨pkd, typeOverQuad⟩. The component pkd is a pointer to the deepest descendant of the node pointed to by pk that completely contains pr.MBR. The component typeOverQuad can take one of the following values:
• None means that pkd.quad has no cells with values in the queried range. Therefore, we can conclude without further inspection that pr.MBR does not overlap any portion of the raster having cells with values in the queried range, and thus the subtree rooted at pr can be pruned.
• Possible means that pkd.quad contains cells with values in the queried range, but also cells with values outside it. This value does not allow a decision to be made, so the algorithm has to perform a deeper analysis.
• Full means that pkd.quad contains exclusively cells with values in the queried range. This value also allows a decision to be made: all the MBRs in the leaves of the subtree rooted at pr, together with their overlapping cells, are part of the solution (specifically, of the definitive list), so the check of that subtree can stop.
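The three values of typeOverQuad can be sketched as a comparison between two intervals. This is a minimal illustration, not the authors' implementation: it assumes that the only information available about a quadrant is the minimum and maximum of its cell values, and the names OverQuad and classify_quadrant are hypothetical.

```python
from enum import Enum


class OverQuad(Enum):
    """Hypothetical encoding of the three typeOverQuad values."""
    NONE = "None"
    POSSIBLE = "Possible"
    FULL = "Full"


def classify_quadrant(q_min, q_max, lo, hi):
    """Classify a quadrant with cell values in [q_min, q_max] against [lo, hi]."""
    if q_max < lo or q_min > hi:
        # The two ranges are disjoint: no cell can hold a queried value.
        return OverQuad.NONE
    if lo <= q_min and q_max <= hi:
        # Every cell value lies inside the queried range.
        return OverQuad.FULL
    # The ranges intersect only partially: min and max alone cannot decide.
    return OverQuad.POSSIBLE
```

Under this assumption, Possible is a conservative answer: the min-max pair shows that queried values may be present, and only an inspection of the cells can confirm it.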
The operation checkMBR(pr, pk, Range) takes a pointer pr to an R-tree node, a pointer pk to a node of the k²-raster, and the query range. It returns a value typeOverMBR, with the following meaning:
• None means that the geometry of pr.MBR does not overlap cells having values in the queried range. Therefore, pr.MBR is not part of the solution.
• Partial means that pr.MBR overlaps cells with values in the queried range, but it also overlaps cells with values outside of the queried range. Therefore, pr.MBR and its overlapping cells with values in the queried range are part of the list of probable results.
• Full means that pr.MBR overlaps exclusively cells with values in the queried range. Therefore, pr.MBR and its overlapping cells are part of the list of definitive results.
The operation checkQuadrant is very fast, since it only checks the minimum and maximum values stored at the internal nodes of the k²-raster. Starting from the k²-raster node provided as input, it traverses the tree downwards, following the unique child that completely contains the input MBR, as long as the query range intersects the range delimited by the minimum and maximum values of the node. The operation ends when it reaches a node none of whose children completely contains the MBR, or a node whose range of minimum and maximum values does not intersect the query range.
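The descent just described can be sketched as follows. This is only an illustration under stated assumptions: the k²-raster is modelled as a plain pointer-based tree rather than its compact representation, rectangles are tuples (x0, y0, x1, y1) of inclusive cell coordinates, and the names QuadNode, contains and check_quadrant, as well as the sample 8x8 raster, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QuadNode:
    """Hypothetical stand-in for a k²-raster node: a quadrant plus min-max values."""
    x0: int
    y0: int
    x1: int
    y1: int
    vmin: int
    vmax: int
    children: list = field(default_factory=list)


def contains(node, mbr):
    """True if the quadrant of node completely contains the rectangle mbr."""
    mx0, my0, mx1, my1 = mbr
    return node.x0 <= mx0 and node.y0 <= my0 and mx1 <= node.x1 and my1 <= node.y1


def check_quadrant(mbr, node, lo, hi):
    """Return (deepest node reached, 'None' | 'Possible' | 'Full')."""
    # Descend while the node's [vmin, vmax] intersects the query range [lo, hi].
    while node.vmin <= hi and node.vmax >= lo:
        # At most one child can completely contain the MBR.
        child = next((c for c in node.children if contains(c, mbr)), None)
        if child is None:
            break
        node = child
    if node.vmax < lo or node.vmin > hi:
        return node, "None"
    if lo <= node.vmin and node.vmax <= hi:
        return node, "Full"
    return node, "Possible"


# Made-up 8x8 raster split into four 4x4 quadrants (values are illustrative only).
q0 = QuadNode(0, 0, 3, 3, 1, 2)
q1 = QuadNode(4, 0, 7, 3, 3, 4)
q2 = QuadNode(0, 4, 3, 7, 2, 3)
q3 = QuadNode(4, 4, 7, 7, 1, 5)
root = QuadNode(0, 0, 7, 7, 1, 5, [q0, q1, q2, q3])

node, kind = check_quadrant((1, 5, 2, 6), root, 4, 5)  # MBR lies inside q2
```

Only one node per level is visited, which is what keeps the check cheap; an MBR that straddles two quadrants simply stops the descent at their common parent.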
The operation checkMBR is more complex because it must descend all the k²-raster branches that intersect the MBR and retrieve all the cells of the k²-raster that intersect the MBR, which may require extracting portions of different quadrants.
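A simplified sketch of this multi-branch traversal is given below. It is not the authors' procedure: it assumes a pointer-based tree in which a node without children covers a region of uniform value, it only classifies the overlap instead of materialising the cells, and the names QuadNode, intersects and check_mbr, together with the sample 4x4 raster, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QuadNode:
    """Hypothetical stand-in for a k²-raster node (inclusive cell coordinates)."""
    x0: int
    y0: int
    x1: int
    y1: int
    vmin: int
    vmax: int
    children: list = field(default_factory=list)


def intersects(node, mbr):
    """True if the quadrant of node shares at least one cell with mbr."""
    mx0, my0, mx1, my1 = mbr
    return not (node.x1 < mx0 or node.x0 > mx1 or node.y1 < my0 or node.y0 > my1)


def check_mbr(mbr, node, lo, hi):
    """Return 'None', 'Partial' or 'Full' for the cells under node that intersect mbr."""
    found_in = found_out = False
    stack = [node]
    while stack:
        n = stack.pop()
        if not intersects(n, mbr):
            continue  # this branch cannot contribute any cell
        if n.vmax < lo or n.vmin > hi:
            found_out = True  # every overlapped cell here is outside the range
        elif lo <= n.vmin and n.vmax <= hi:
            found_in = True  # every overlapped cell here is inside the range
        else:
            stack.extend(n.children)  # mixed values: inspect all children
        if found_in and found_out:
            return "Partial"  # both kinds of cells seen, no need to go on
    return "Full" if found_in else "None"


def cell(x, y, v):
    """Single-cell leaf holding value v."""
    return QuadNode(x, y, x, y, v, v)


# Made-up 4x4 raster (values are illustrative only).
q0 = QuadNode(0, 0, 1, 1, 1, 1)
q1 = QuadNode(2, 0, 3, 1, 4, 4)
q2 = QuadNode(0, 2, 1, 3, 2, 3, [cell(0, 2, 2), cell(1, 2, 3), cell(0, 3, 3), cell(1, 3, 2)])
q3 = QuadNode(2, 2, 3, 3, 4, 5, [cell(2, 2, 4), cell(3, 2, 5), cell(2, 3, 5), cell(3, 3, 4)])
root = QuadNode(0, 0, 3, 3, 1, 5, [q0, q1, q2, q3])

result = check_mbr((1, 0, 2, 1), root, 4, 5)  # MBR straddles q0 and q1
```

Unlike the quadrant check, this traversal may visit several children per level, which is why it is reserved for the cases the cheaper check cannot settle.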
Let us illustrate these operations with the example in Figure 10.1. Suppose that we have a pointer to the node with MBR b, a pointer to the root node of the k²-raster (which encloses the whole raster), and the query range [4–5]. First, checkQuadrant checks whether the min-max values of the root define a range that overlaps or includes the queried range. Since min = 1 and max = 5 at the root, the search continues and checks whether one of its children completely encloses b. This is true for the child corresponding to quadrant q2. Next, it checks whether the values of q2 fall within the queried range. Since min = 2 and max = 3 for q2, the range [2–3] does not intersect [4–5], so checkQuadrant ends and the result is typeOverQuad = None. Observe that, without checking the actual cells of the raster, and already at the second level of the k²-raster, we can prune the subtree rooted at the node with MBR b.
Now consider another example, taking as input a pointer to the node with MBR d, a pointer to the root of the k²-raster, and the query range [4–5]. As before, the range defined by the min-max values of the root overlaps the query range (but is not included in it), so the search continues. The child of the root that contains d is q3. Again, we check whether the min-max values of q3 overlap or include the query range. Quadrant q3 is the deepest node that completely contains d, and its min-max values are min = 1 and max = 5; thus, typeOverQuad = Possible. Since checkQuadrant cannot determine the result of the join in this case, it is necessary to use checkMBR, which takes as input a pointer to the node of d and a pointer to q3. The answer is typeOverMBR = Full, and thus d and its overlapping cells are added to the definitive list.
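The way the two checks cooperate in both examples can be summarised in a short sketch. It is an illustration only: the function join_step is hypothetical, the two checks are passed in as parameters with assumed signatures (MBR, k²-raster node, range bounds lo and hi), and the labels "pruned", "probable" and "definitive" are shorthand for the outcomes described in the text.

```python
def join_step(mbr, pk, lo, hi, check_quadrant, check_mbr):
    """Decide the fate of one R-tree MBR using the coarse check first.

    check_quadrant(mbr, pk, lo, hi) is assumed to return (pkd, type), with
    type in {'None', 'Possible', 'Full'}; check_mbr(mbr, pkd, lo, hi) is
    assumed to return one of 'None', 'Partial', 'Full'.
    """
    pkd, over_quad = check_quadrant(mbr, pk, lo, hi)
    if over_quad == "None":
        return "pruned"  # as for MBR b in the first example
    if over_quad == "Full":
        return "definitive"
    # 'Possible': the min-max values are inconclusive, so run the costly
    # check, starting from the deepest quadrant found by the coarse check.
    over_mbr = check_mbr(mbr, pkd, lo, hi)
    return {"None": "pruned", "Partial": "probable", "Full": "definitive"}[over_mbr]
```

The fine-grained check is invoked only on the Possible branch, and it starts from pkd rather than from the root, so the work already done by the coarse check is not repeated.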