PART 3 REPLICATION BASED RARE OBJECT LOCATION
3.5 Query Routing
To achieve a high search success rate for both popular and rare objects, we propose a search mechanism that combines two ideas: the evolution of successful query paths, and an overlay graph property, the Connected Dominating Set (CDS). Together they allow peers to make query routing decisions.
Upon generation or receipt of a query for an object o, a peer P estimates the popularity of the requested object. P's action depends on whether the object is popular or not.
3.5.1 Popular Object Search
In our work, an object o is popular if at least one copy of the object is within reachable distance (bounded by the TTL) of P. This is indicated by the existence of an entry for o in either P's routing index or its Bloom filter data structure. If the object has records in the routing index, P computes a score for each of its neighbors, called the Path Gradient (PG), based on the evolution of successful query paths for o through that neighbor.
Best Path Gradient (BPG) Criterion
If the R-SPUN index of P already contains entries for the queried object, P uses this local knowledge to choose the neighbors leading to the most successful query paths for the object. We define a new metric called Path Gradient (PG) to quantitatively measure the success of the query path through a given neighbor N for a given object o as follows:
\[ PG_N = \frac{\sum_{i=1}^{d-1} \left( RSRV_N^o[i] - RSRV_N^o[i-1] \right)}{d-1} \tag{3.4} \]
where d is the length of the vector $RSRV_N^o$. One of the major properties of a successful query path is that the success ratios assigned to the edges between neighbors along the path increase monotonically. The PG metric captures this property by calculating the gradient of the success ratios in a query path. A successful query path has a non-negative PG value with large magnitude. For example, if $RSRV_N^o = [30, 40, 50, 60]$ then $PG_N = [(40-30)+(50-40)+(60-50)]/3 = 10$. P calculates PG for each neighbor from the RSRVs stored in P's R-SPUN index for object o. The top k neighbors with the highest PG values are then selected to deploy search walkers. However, when the $RSRV_N^o$ vectors of the neighbors have length d = 1, only $RSR_0$ values exist in P's R-SPUN index for the neighbors for object o, and the $RSR_0$ value of each neighbor is used directly as its score.
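The PG computation and the top-k selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `path_gradient` and `top_k_neighbors`, the use of string neighbor ids, and the dictionary layout are all assumptions made for the example.

```python
from typing import Dict, List


def path_gradient(rsrv: List[float]) -> float:
    """Average step-to-step change of the success ratios in one RSRV vector."""
    d = len(rsrv)
    if d == 0:
        raise ValueError("RSRV vector must not be empty")
    if d == 1:
        # Only RSR_0 is known, so it is used directly as the score.
        return rsrv[0]
    # Equation 3.4: sum of consecutive differences divided by d - 1.
    return sum(rsrv[i] - rsrv[i - 1] for i in range(1, d)) / (d - 1)


def top_k_neighbors(rsrvs: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k neighbor ids with the highest scores (hypothetical helper)."""
    scores = {n: path_gradient(v) for n, v in rsrvs.items()}
    return sorted(scores, key=lambda n: scores[n], reverse=True)[:k]
```

With the vector from the example, `path_gradient([30, 40, 50, 60])` returns `10.0`. Because the consecutive differences telescope, the sum in Equation 3.4 equals the last entry minus the first, so PG can also be read as the average gain in success ratio per hop.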
On the other hand, if no entry exists for o in P's index, P examines the Bloom filters representing its neighbors to select the subset of neighbors whose filters contain o. Then k neighbors are selected at random from this subset to launch the query. Traditional Bloom filters give no indication of the strength of peers; Spectral Bloom Filters address this limitation by providing an indication of the frequency of items in the filter. The frequency usually represents the number of times the object was inserted into the filter and thus can be taken as the number of replicas of o reachable through the neighbor. Using each neighbor's object frequency as its score therefore lets P select the neighbors leading to replica-rich areas for o, increasing the probability of finding o.
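The frequency-based scoring can be illustrated with a simplified counting filter. The sketch below is a stand-in for a Spectral Bloom Filter, not a full implementation of one: it keeps one counter per position and estimates frequency as the minimum counter for the item. The class name `CountingFilter`, the salted SHA-256 hashing, the filter size, and the helper `rank_by_frequency` are all assumptions chosen for the example.

```python
import hashlib
from typing import Dict, List


class CountingFilter:
    """Simplified stand-in for a Spectral Bloom Filter (hypothetical helper)."""

    def __init__(self, size: int = 64, hashes: int = 3) -> None:
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _positions(self, item: str) -> List[int]:
        # Derive one counter position per hash from salted SHA-256 digests.
        positions = []
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.size)
        return positions

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def frequency(self, item: str) -> int:
        # Minimum over the item's counters: never below the true insert count.
        return min(self.counters[pos] for pos in self._positions(item))


def rank_by_frequency(filters: Dict[str, CountingFilter], obj: str, k: int) -> List[str]:
    """Return up to k neighbors whose filter reports obj, most replicas first."""
    scores = {n: f.frequency(obj) for n, f in filters.items()}
    hits = [n for n in scores if scores[n] > 0]
    return sorted(hits, key=lambda n: scores[n], reverse=True)[:k]
```

As with any Bloom filter, hash collisions can make the estimate too high but never too low, so a neighbor may occasionally be ranked above its true replica count.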
3.5.2 Rare Object Search
If no entry for the object exists in P's index or Reachable Objects Bloom Filter, P decides that the object is rare and changes the search mode to CDS routing. Here, P selects one of its CDS neighbors and sends the query to that CDS peer. Once in CDS mode, the CDS peer routes the query along the CDS backbone until the TTL expires or a CDS member holding a reference to the requested object is found.
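The backbone walk can be sketched as a single walker that hops between CDS members. This is an illustrative simplification under several assumptions: the backbone is given as an adjacency dictionary, each CDS peer's references are a set of object ids, already visited peers are skipped, and the next hop is chosen at random. The function name `cds_walk` and all parameter names are hypothetical.

```python
import random
from typing import Dict, List, Optional, Set


def cds_walk(
    start: str,
    cds_neighbors: Dict[str, List[str]],
    references: Dict[str, Set[str]],
    obj: str,
    ttl: int,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return the CDS peer holding a reference to obj, or None on a miss."""
    rng = rng or random.Random()
    current = start
    visited = {start}
    while True:
        if obj in references.get(current, set()):
            return current  # a CDS member with a reference was found
        if ttl <= 0:
            return None  # TTL expired before a reference was found
        candidates = [n for n in cds_neighbors.get(current, []) if n not in visited]
        if not candidates:
            return None  # no unvisited backbone neighbor left
        current = rng.choice(candidates)
        visited.add(current)
        ttl -= 1
```

For a chain-shaped backbone `c1 - c2 - c3` where only `c3` holds a reference, a walk started at `c1` succeeds with a TTL of 2 and misses with a TTL of 1.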
Algorithm 3.2 R-SPUN Query Routing
Input:
Q = Query message
P = Message receiver peer
N = Message sender peer
o = Requested object
SearchMode = the search mode
Index(P) = R-SPUN index of P
BReach(P) = Reachable Documents Bloom Filter of P
Candidates(Q) = Neighborhood(P) \ Q.VisitedPeers
NeighborhoodCDS(P) = neighbors of P in CDS backbone
k = the walker count
Output:
R = Retrieved documents and references
SearchMode = the search mode
Selected(Q) = the selected peers for query routing
Selected(Q) ← φ
Q.TTL ← Q.TTL − 1
R ← LocalSearch(o)
▷ Found target object
if R ≠ φ then
    Return a HIT with R and terminate search
else if P ∈ CDS AND P has a reference to o then
    R ← P's reference to o
    Return a HIT with R and terminate search
else if Q.TTL ≤ 0 then
    Return a MISS with R and terminate search
else
    if Index(P) contains entries for o OR BReach(P) contains entries for o then
        ▷ Regular Search Mode
        if Index(P) contains entries for o then
            for each Peer N ∈ Candidates(Q) do
                Score(N) ← CalculateBPGradient(N, o)
            end for
        else
            for each Peer N ∈ Candidates(Q) do
                Score(N) ← Get frequency of o for N in BReach(P)
            end for
        end if
        Sort Candidates(Q) based on scores
        Selected(Q) ← Top k Candidates(Q) with highest score
        SearchMode ← Regular
    else
        ▷ CDS Search Mode
        Selected(Q) ← Randomly select 1 neighbor from NeighborhoodCDS(P) ∩ Candidates(Q)
        SearchMode ← CDS
    end if
    Return Selected(Q), SearchMode and R
end if
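One possible reading of the routing decision in Algorithm 3.2 is sketched below. It is not the authors' code. It assumes that a MISS is returned once the TTL is exhausted, that the per-neighbor PG scores and Bloom filter frequencies have already been computed and are simply looked up, and that the candidates are the neighbors not yet visited by the query. The `Peer` class, its field names, and the `route_query` signature are hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Peer:
    # Hypothetical stand-ins for the inputs of Algorithm 3.2.
    local_objects: Set[str] = field(default_factory=set)
    in_cds: bool = False
    cds_references: Set[str] = field(default_factory=set)
    index_scores: Dict[str, Dict[str, float]] = field(default_factory=dict)  # obj -> neighbor -> PG
    bloom_freq: Dict[str, Dict[str, float]] = field(default_factory=dict)  # obj -> neighbor -> frequency
    neighbors: List[str] = field(default_factory=list)
    cds_neighbors: List[str] = field(default_factory=list)


def route_query(
    peer: Peer,
    obj: str,
    ttl: int,
    visited: Set[str],
    k: int,
    rng: Optional[random.Random] = None,
) -> Tuple[str, List[str], int]:
    """Return (outcome, selected peers, remaining TTL).

    The outcome is "HIT", "MISS", "Regular" or "CDS".
    """
    rng = rng or random.Random()
    ttl -= 1
    if obj in peer.local_objects:
        return "HIT", [], ttl
    if peer.in_cds and obj in peer.cds_references:
        return "HIT", [], ttl
    if ttl <= 0:
        return "MISS", [], ttl  # assumption: an exhausted TTL ends the search

    candidates = [n for n in peer.neighbors if n not in visited]
    if obj in peer.index_scores:
        scores = peer.index_scores[obj]  # popular: rank by Path Gradient
    elif obj in peer.bloom_freq:
        scores = peer.bloom_freq[obj]  # popular: rank by replica frequency
    else:
        # Rare object: hand the query to one unvisited CDS neighbor.
        pool = [n for n in peer.cds_neighbors if n in candidates]
        return "CDS", ([rng.choice(pool)] if pool else []), ttl

    ranked = sorted(candidates, key=lambda n: scores.get(n, 0), reverse=True)
    return "Regular", ranked[:k], ttl
```

A caller would forward the query to each peer in the returned list and carry the remaining TTL and the search mode along with it.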