Proactive Resource Contention Avoidance in InterGrid Gateway
4.2 Analytical Queuing Model
The queuing model that represents a gateway (IGG) along with several RPs is depicted in Figure 4.1. We consider each RP as a non-dedicated Cluster (i.e., a Cluster whose resources are shared between local and external requests). There are N Clusters, and Cluster j receives requests from two independent sources. One source is a stream of local requests with arrival rate $\lambda_j$; the other is a stream of external requests sent by the IGG with arrival rate $\hat{\Lambda}_j$. The IGG receives external requests from other peer IGGs [131] ($G_1, \ldots, G_{peer}$ in Figure 4.1).
Therefore, the arrival rate of external requests to the IGG is $\Lambda = \bar{\Lambda}_1 + \bar{\Lambda}_2 + \cdots + \bar{\Lambda}_{peer}$, where peer indicates the number of IGGs that can potentially send external requests to the IGG.
Local requests submitted to Cluster j must be executed on Cluster j unless the requested resources are occupied by another local request or by a non-preemptable external request (see Chapter 3). The first and second moments of the service time of local requests in Cluster j are $\tau_j$ and $\mu_j$, respectively. An external request can be allocated to any Cluster, but it might be subject to future preemption. We consider $\theta_j$ and $\omega_j$ as the first and second moments of the service time of external requests on Cluster j, respectively. For the sake of clarity, Table 4.1 lists the symbols we use in this chapter along with their meaning.
The analytical model aims at distributing the total original arrival rate of external requests (Λ) amongst the Clusters. In this situation, if we consider each Cluster as a single queue and IGG as a meta-scheduler that redirects each incoming external request to one of the Clusters, then the problem of scheduling external requests in IGG can be considered as a routing problem in distributed parallel queues [132].
Table 4.1: Description of symbols used in the queuing model.

Symbol               Description
$N$                  Number of Clusters
$M_j$                Number of computing elements in Cluster j, where $1 \le j \le N$
$\bar{\Lambda}_j$    Original arrival rate of external requests to Cluster j
$\hat{\Lambda}_j$    Arrival rate of external requests to Cluster j after load distribution
$\Lambda$            $= \sum_{i=1}^{peer} \bar{\Lambda}_i = \sum_{j=1}^{N} \hat{\Lambda}_j$
$\theta_j$           Average service time of an external request on Cluster j
$\omega_j$           Second moment of external requests service time on Cluster j
$\gamma_j$           $= \theta_j \cdot \hat{\Lambda}_j$
$\lambda_j$          Arrival rate of local requests on Cluster j
$\kappa_j$           Arrival rate of local requests plus external requests to Cluster j
$\tau_j$             Average service time of local requests on Cluster j
$\mu_j$              Second moment of local requests service time on Cluster j
$\rho_j$             $= \tau_j \cdot \lambda_j$
$m_j$                $= \frac{\hat{\Lambda}_j}{\kappa_j}\omega_j + \frac{\lambda_j}{\kappa_j}\mu_j$
$u_j$                Utilisation of Cluster j ($= \gamma_j + \rho_j$)
$r_j$                Average response time of local requests on Cluster j
$\eta_j$             Number of VM preemptions that happen in Cluster j
$T$                  Average response time of all external requests
$T_j$                Average response time of external requests on Cluster j
$\bar{v}_j$          Average number of VMs required by external requests
$\bar{d}_j$          Average duration of external requests
$s_{ij}$             Processing speed (MIPS) of processing element i in Cluster j

Figure 4.1: Queuing model for resource provisioning in a Grid with N RPs (Clusters).

Considering this situation, the goal of the scheduling in IGG is to schedule the external requests amongst the Clusters in a way that minimises the overall number of VM preemptions in a Grid. Therefore, our primary objective function can be expressed as follows:
\[ \min \sum_{j=1}^{N} \eta_j \tag{4.1} \]
To the best of our knowledge, there is no scheduling policy for such an environment with the goal of minimising the number of VM preemptions. However, several studies have addressed similar settings with the goal of minimising the average response time of external requests.
Figure 4.2: Regression between the number of VMs preempted and response time of the external requests (axis label: Number of VM Preemptions).
Initial experiments suggest that there is an association between response time and the number of VM preemptions in the Grid. A least-squares regression analysis (depicted in Figure 4.2 and expressed in Equation 4.2) shows a positive correlation between the two factors. In Equation 4.2, R and η indicate the response time of external requests and the number of VM preemptions, respectively.
R = 3.09 + 0.012η (4.2)
Therefore, we expect that a reduction in the average response time has a similar impact on the overall number of VM preemptions. Simulation results, which are discussed in Section 4.4.3, also confirm the correlation between response time and the number of VM preemptions in the system. Details of the analysis are discussed in the following paragraphs.
For this purpose, we extend the approach developed by Li [5], which has been applied within a Cluster, to a Grid system in which some external requests are more valuable than others (i.e., they have different QoS levels).
Thus, we can define a new objective function that aims at minimising the average response time of the external requests (Equation 4.3):
\[ T = \frac{1}{\Lambda} \sum_{j=1}^{N} \hat{\Lambda}_j T_j \tag{4.3} \]
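Given the M/G/1 queue for each Cluster, and considering preemption of external requests in favour of local requests, the response time of external requests in Cluster j ($T_j$) is defined based on Equation 4.4 [133].
\[ T_j = \frac{1}{1-\rho_j} \left( \theta_j + \frac{\lambda_j \mu_j + \hat{\Lambda}_j \omega_j}{2(1-\rho_j-\gamma_j)} \right) \tag{4.4} \]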
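A minimal sketch of how the per-Cluster response time and the overall average could be evaluated numerically. It assumes the standard preemptive-resume priority result for an M/G/1 queue, with local requests as the high-priority class; the function names and the dictionary layout of a Cluster are illustrative assumptions, not part of the original model description.

```python
def external_response_time(lam, tau, mu, lam_hat, theta, omega):
    """Mean response time T_j of external requests on one Cluster.

    lam, tau, mu          : arrival rate, mean and second moment of the
                            service time of local (high-priority) requests
    lam_hat, theta, omega : the same three quantities for external requests
    Assumes a preemptive-resume priority M/G/1 queue.
    """
    rho = tau * lam            # load offered by local requests
    gamma = theta * lam_hat    # load offered by external requests
    if rho + gamma >= 1.0:
        raise ValueError("unstable queue: rho + gamma must be below 1")
    waiting = (lam * mu + lam_hat * omega) / (2.0 * (1.0 - rho - gamma))
    return (theta + waiting) / (1.0 - rho)


def mean_external_response_time(clusters, lam_hats):
    """Rate-weighted average T of the per-Cluster response times."""
    total = sum(lam_hats)
    weighted = sum(
        lh * external_response_time(c["lam"], c["tau"], c["mu"],
                                    lh, c["theta"], c["omega"])
        for c, lh in zip(clusters, lam_hats)
        if lh > 0  # Clusters that receive no external requests add nothing
    )
    return weighted / total
```

For example, a Cluster with local load 0.2 and external load 0.3 (both classes with mean service time 1 and second moment 2) yields a response time of about 2.5 time units for external requests.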
The constraint for Equation 4.3 is:
\[ \sum_{j=1}^{N} \hat{\Lambda}_j - \Lambda = 0 \tag{4.5} \]
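The Lagrange multiplier method is applied to minimise Equation 4.3. We consider Equation 4.3 as $f(\hat{\Lambda}_j)$, Equation 4.5 as $g(\hat{\Lambda}_j) - c$, and z as the Lagrange multiplier. Then, the Lagrange function is defined as follows:
\[ h(\hat{\Lambda}_j, z) = f(\hat{\Lambda}_j) + z \cdot \bigl( g(\hat{\Lambda}_j) - c \bigr) = \frac{1}{\Lambda} \sum_{j=1}^{N} \hat{\Lambda}_j T_j + z \Bigl( \sum_{j=1}^{N} \hat{\Lambda}_j - \Lambda \Bigr) \tag{4.6} \]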
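By setting the partial derivatives of the Lagrange function with respect to every $\hat{\Lambda}_j$ ($1 \le j \le N$) and z to zero and solving the resulting equations, the input arrival rate of each Cluster is calculated based on Equation 4.7:
\[ \hat{\Lambda}_j = \frac{(1-\rho_j)}{\theta_j} - \frac{1}{\theta_j} \sqrt{\frac{(1-\rho_j)\bigl(\omega_j(1-\rho_j) + \theta_j \lambda_j \mu_j\bigr)}{2\theta_j(1-\rho_j)z + (\omega_j - 2\theta_j^2)}} \tag{4.7} \]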
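The step from the Lagrange function to the per-Cluster arrival rate can be sketched as follows. This is a hedged reconstruction: it assumes that $T_j$ takes the preemptive-resume M/G/1 form $T_j = \frac{1}{1-\rho_j}\bigl(\theta_j + \frac{\lambda_j\mu_j + \hat{\Lambda}_j\omega_j}{2(1-\rho_j-\theta_j\hat{\Lambda}_j)}\bigr)$ and that the constant factor $-\Lambda$ produced by the differentiation is absorbed into z. The shorthand $y_j$ is introduced here only for illustration.

```latex
\begin{align*}
y_j &= 1-\rho_j-\theta_j\hat{\Lambda}_j, \\
\frac{\partial(\hat{\Lambda}_j T_j)}{\partial\hat{\Lambda}_j}
  &= \frac{1}{1-\rho_j}\left[\theta_j-\frac{\omega_j}{2\theta_j}
     +\frac{(1-\rho_j)\bigl(\omega_j(1-\rho_j)+\theta_j\lambda_j\mu_j\bigr)}
           {2\theta_j\,y_j^{2}}\right] = z, \\
y_j^{2} &= \frac{(1-\rho_j)\bigl(\omega_j(1-\rho_j)+\theta_j\lambda_j\mu_j\bigr)}
               {2\theta_j(1-\rho_j)z+(\omega_j-2\theta_j^{2})}, \\
\hat{\Lambda}_j &= \frac{1-\rho_j-y_j}{\theta_j}.
\end{align*}
```

Setting $\hat{\Lambda}_j = 0$ (that is, $y_j = 1-\rho_j$) in the second line gives $z = \frac{\lambda_j\mu_j}{2(1-\rho_j)^2} + \frac{\theta_j}{1-\rho_j}$, the value of z below which Cluster j would receive no external requests.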
In fact, Equation 4.8 expresses the relation between the different parameters of the system, in which z is unknown. By solving Equation 4.8 for all Clusters and calculating z, Equation 4.7 can be solved. However, finding a generic closed-form solution for z in Equation 4.8 is not possible [5]. Nonetheless, z can be found numerically in the range [lb, ub]. For this purpose, considering that $\hat{\Lambda}_j \ge 0$, we can infer from Equation 4.7 that:
\[ z \ge \frac{\lambda_j \mu_j}{2(1-\rho_j)^2} + \frac{\theta_j}{(1-\rho_j)} \tag{4.9} \]
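Therefore, since Equation 4.9 must hold for all $1 \le j \le N$, the lower bound (lb) of the interval is:
\[ lb = \max_{1 \le j \le N} \left( \frac{\lambda_j \mu_j}{2(1-\rho_j)^2} + \frac{\theta_j}{(1-\rho_j)} \right) \tag{4.10} \]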
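If we define $\phi_j(z)$ according to Equation 4.11:
\[ \phi_j(z) = \frac{1}{\theta_j} \sqrt{\frac{(1-\rho_j)\bigl(\omega_j(1-\rho_j) + \theta_j \lambda_j \mu_j\bigr)}{2\theta_j(1-\rho_j)z + (\omega_j - 2\theta_j^2)}} \tag{4.11} \]
and considering Equation 4.8, then we have: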
\[ \sum_{j=1}^{N} \phi_j(z) = \sum_{j=1}^{N} \frac{(1-\rho_j)}{\theta_j} - \Lambda \tag{4.12} \]
The upper bound (ub) can also be determined based on Equation 4.13: ub is reached by repeatedly doubling lb until the following condition is met.
\[ \sum_{j=1}^{N} \phi_j(ub) \le \sum_{j=1}^{N} \frac{(1-\rho_j)}{\theta_j} - \Lambda \tag{4.13} \]
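The numerical search for z can be sketched as a doubling phase followed by bisection. This is an illustrative sketch, not the author's implementation: it assumes that the rate assigned to Cluster j for a given z is $(1-\rho_j)/\theta_j - \phi_j(z)$, that the numerator of $\phi_j$ is grouped as $(1-\rho_j)(\omega_j(1-\rho_j)+\theta_j\lambda_j\mu_j)$, and that each Cluster is described by a dictionary with the hypothetical keys `lam`, `tau`, `mu`, `theta` and `omega`.

```python
import math


def spare(c):
    """1 - rho_j: fraction of capacity left after local requests."""
    return 1.0 - c["tau"] * c["lam"]


def phi(c, z):
    """phi_j(z) for one Cluster (assumed grouping of the numerator)."""
    a = spare(c)
    num = a * (c["omega"] * a + c["theta"] * c["lam"] * c["mu"])
    den = 2.0 * c["theta"] * a * z + (c["omega"] - 2.0 * c["theta"] ** 2)
    return math.sqrt(num / den) / c["theta"]


def rates(clusters, z):
    """External arrival rate assigned to each Cluster for a given z."""
    return [spare(c) / c["theta"] - phi(c, z) for c in clusters]


def solve_z(clusters, total_rate, rel_tol=1e-12):
    """Find z such that the assigned rates sum to total_rate."""
    # lb: largest per-Cluster threshold, so that no rate is negative.
    lb = max(c["lam"] * c["mu"] / (2.0 * spare(c) ** 2) + c["theta"] / spare(c)
             for c in clusters)
    if sum(rates(clusters, lb)) > total_rate:
        raise ValueError("lb too high: remove heavily loaded Clusters first")
    if total_rate >= sum(spare(c) / c["theta"] for c in clusters):
        raise ValueError("total_rate exceeds the spare capacity")
    ub = 2.0 * lb
    while sum(rates(clusters, ub)) < total_rate:   # doubling phase
        ub *= 2.0
    while ub - lb > rel_tol * ub:                  # bisection phase
        mid = 0.5 * (lb + ub)
        if sum(rates(clusters, mid)) < total_rate:
            lb = mid
        else:
            ub = mid
    z = 0.5 * (lb + ub)
    return z, rates(clusters, z)
```

Bisection is used here because the assigned rate grows monotonically with z, so a single sign change is guaranteed inside [lb, ub] once the doubling phase stops.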
If the condition in Equation 4.12 is not met, then lb has to be decreased by removing Clusters which are heavily loaded. The load of Cluster j comprises the local requests that have been received and the external requests that are already assigned to the Cluster. The load can be calculated as follows:
\[ \psi_j = \frac{\lambda_j \mu_j}{2(1-\rho_j)^2} + \frac{\theta_j}{(1-\rho_j)} \tag{4.14} \]
It is worth mentioning that Clusters whose index exceeds k would not receive any external request from the IGG (i.e., $\hat{\Lambda}_{k+1} = \hat{\Lambda}_{k+2} = \cdots = \hat{\Lambda}_N = 0$).
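The exclusion of heavily loaded Clusters can be sketched as follows. The sketch rests on several assumptions that are not stated explicitly in the text: the load $\psi_j$ is taken to be $\frac{\lambda_j\mu_j}{2(1-\rho_j)^2} + \frac{\theta_j}{1-\rho_j}$, Clusters are sorted by ascending $\psi_j$, and k is the largest number of Clusters for which the rates assigned at the lower bound do not already exceed Λ. The function names and dictionary keys are hypothetical.

```python
import math


def psi(c):
    """Assumed load index of a Cluster (threshold value of z)."""
    a = 1.0 - c["tau"] * c["lam"]          # 1 - rho_j
    return c["lam"] * c["mu"] / (2.0 * a * a) + c["theta"] / a


def rate_at(c, z):
    """External arrival rate assigned to one Cluster for a given z."""
    a = 1.0 - c["tau"] * c["lam"]
    num = a * (c["omega"] * a + c["theta"] * c["lam"] * c["mu"])
    den = 2.0 * c["theta"] * a * z + (c["omega"] - 2.0 * c["theta"] ** 2)
    return (a - math.sqrt(num / den)) / c["theta"]


def select_clusters(clusters, total_rate):
    """Split Clusters into those that receive external requests and the rest.

    Tries k = N, N-1, ..., 1 and keeps the k least loaded Clusters as soon
    as the rates assigned at lb (the largest psi among them) fit within
    total_rate.
    """
    ordered = sorted(clusters, key=psi)
    for k in range(len(ordered), 0, -1):
        lb = psi(ordered[k - 1])
        if sum(rate_at(c, lb) for c in ordered[:k]) <= total_rate:
            return ordered[:k], ordered[k:]
    # Fallback for rounding noise: keep only the least loaded Cluster.
    return ordered[:1], ordered[1:]
```

The Clusters returned in the second list correspond to indices k+1 to N and are assigned an external arrival rate of zero; the search for z then proceeds on the first list only.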