
CHAPTER 5. PERFORMANCE ANALYSIS

5.6. Performance Model

Figure 16 shows the components of the average response time. These components are as follows:

TP = Processing time per request at the manager

TR = Processing time at the Resource. This is minimal in our tests and represents only the cost of un-marshalling the XML response on arrival; our measurements do not include the cost of processing the response itself.

Apart from these factors, there is the network latency LMB between Manager and Broker and LBR between Broker and Resource, as well as the broker transit time TX.

Thus, for any event, the total event handling time TE is the sum of all of the above:

TE = TP + 2 * (LMB + TX + LBR) + TR ...(1)

Figure 16 Modeling components of response time as seen by the resources

Note that the multiplier 2 accounts for both the request and the response. We now discuss each factor in turn.

To compute TX, note that a single broker, when not saturated, can sustain a throughput of up to 5000 messages/sec for a message size of 512 bytes and more than 4500 messages/sec for a message size of 1024 bytes. For the sake of illustration, let us assume the maximum throughput to be about 4000 messages/sec.

For N concurrent requests there are N responses, so the broker is not saturated as long as 2N < 4000; that is, the broker transit time can be ignored for N < 2000. Since we posit that each broker supports at most 800 resources, we consider TX to be very small.
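The saturation argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the default of 4000 messages/sec (the illustrative figure assumed in the text) are our own choices, not part of the test harness.

```python
def broker_is_saturated(n_concurrent_requests, max_throughput_msgs_per_sec=4000):
    """Return True when the broker would be saturated.

    Each of the N concurrent requests produces one response, so the broker
    carries 2N messages. The model treats the broker as unsaturated while
    2N < max throughput (an assumed, illustrative figure).
    """
    return 2 * n_concurrent_requests >= max_throughput_msgs_per_sec


# At the posited limit of 800 resources per broker, 2N = 1600 < 4000.
print(broker_is_saturated(800))    # False
print(broker_is_saturated(2000))   # True
```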

Further, the network latency is treated as a constant L, i.e.

L = 2*(LMB + LBR) ...(2)

Further, TR is constant and represents the un-marshalling costs of the response (about 1 msec).

We combine all constants into a single term K:

K = L + 2*TX + TR ...(3)

Thus, the total event handling time (as seen by individual resources), from (1), (2) and (3) is:

TE = TP + K ...(4)
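As a sanity check on the algebra, the following sketch evaluates equations (2) to (4) and confirms that they agree with equation (1). The function names and all numeric inputs except TR ≈ 1 msec are hypothetical values chosen purely for illustration; they are not measurements.

```python
def constant_overhead_ms(l_mb, l_br, t_x, t_r):
    """K from equations (2) and (3): K = 2*(LMB + LBR) + 2*TX + TR."""
    latency = 2 * (l_mb + l_br)        # equation (2)
    return latency + 2 * t_x + t_r     # equation (3)


def event_handling_time_ms(t_p, l_mb, l_br, t_x, t_r):
    """TE from equation (4): TE = TP + K."""
    return t_p + constant_overhead_ms(l_mb, l_br, t_x, t_r)


# Hypothetical inputs in msec (only t_r = 1.0 comes from the text).
t_p, l_mb, l_br, t_x, t_r = 8.0, 0.5, 0.5, 0.1, 1.0

te_from_eq4 = event_handling_time_ms(t_p, l_mb, l_br, t_x, t_r)
te_from_eq1 = t_p + 2 * (l_mb + t_x + l_br) + t_r   # equation (1) directly

# Both forms give the same total, up to floating-point rounding.
print(abs(te_from_eq4 - te_from_eq1) < 1e-9)
```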

Further, TP can be broken down into the time the processing thread spends on additional external operations (TEXTERNAL), pure processing (TCPUMANAGER), and time spent in process scheduling (TSCHEDULING).

The external operations include one or more registry or disk accesses. If these external calls are blocking, other requests can be handled while the thread waits for a response from the external component. In our experimental setup there is no external dependency while handling an event, so TEXTERNAL = 0. Our model therefore describes the case where the only time spent is in processing the message.

TSCHEDULING becomes significant when there are more processes than available processors. For example, this explains why the average response time when running 4 managers on a single machine is slightly greater than when running 2 managers on a single machine, given that each GridFarm machine has 2 available processors. While further analysis of this factor is outside the scope of our current work, we note that it should be considered when determining how many managers to run per machine for a desired quality of service (such as the average response time when all managed resources generate events simultaneously).

Finally, TCPUMANAGER covers the Resource-specific processing, including un-marshalling of the request and marshalling of the corresponding response.

Thus, we get,

TP = TCPUMANAGER + TEXTERNAL + TSCHEDULING

Further, on hyper-threaded processors, multiple requests can be processed simultaneously. If C is the number of threads that can be simultaneously active, then up to C requests can be processed in time TP. Thus, the average processing time per request is TP/C, and the total time for processing N requests (TPROC) is

TPROC = (N/C)*TP ...(5)

N = Number of concurrent requests

TP = Processing time per request on manager's side

C = Maximum number of threads that can execute simultaneously (C = 2 in our case, for hyper-threaded processors)

As an illustration, we averaged the event processing times for 150 resources (using a single manager on one machine) and obtained TP = 8.37 msec. Thus,

TPROC = (N/C)*TP = (N/2)*8.37 ≈ 4.2*N msec ...(6)

Thus, the total observed time (theoretical, assuming TP = 8.37 msec) for processing N concurrent events is

TOBV = TPROC + K

= (N/C)*TP + K

≈ 4.2*N + K ...(7)

Here K is the constant that combines network latency, broker transit time and un-marshalling time at the resource. Since the number of resources affects only the processing time at the manager, this constant is independent of N as long as the broker is not saturated.
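Equations (5) and (7) translate directly into a small calculation. The sketch below uses the measured TP = 8.37 msec and C = 2 as defaults; the function names and the value of K passed in the usage example are hypothetical and serve only to show how the model is evaluated.

```python
def total_processing_time_ms(n, t_p_ms=8.37, c=2):
    """TPROC from equation (5): (N/C) * TP, in msec."""
    return (n / c) * t_p_ms


def observed_time_ms(n, k_ms, t_p_ms=8.37, c=2):
    """TOBV from equation (7): TPROC + K, in msec."""
    return total_processing_time_ms(n, t_p_ms, c) + k_ms


# Usage: model prediction for a range of N with an assumed K of 3.2 msec.
for n in (50, 100, 150, 200):
    print(n, observed_time_ms(n, k_ms=3.2))
```

Because K does not depend on N, the predicted curve is a straight line in N with slope TP/C, which is the linear regime the model describes for an unsaturated broker and a manager that is not overloaded.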

Further, our test setup runs on GridFarm machines, on which every processor is hyper-threaded, i.e. it can run at most 2 threads simultaneously (C = 2). Thus, the maximum request processing rate of a single multithreaded manager process is

D = (C/TP) requests/sec, with TP expressed in seconds

Further, the manager will not be overloaded as long as the number of requests to be processed does not exceed its maximum processing rate, i.e. is <= D. When the manager must handle more than D requests, it becomes overloaded and we would see a degradation in performance. Thus, D determines the maximum number of concurrent requests that a single manager can handle with a linear increase in response time.

As an illustration, theoretically (for TP = 8.37 msec and C = 2),

D = (C/TP) * 1000 = (2/8.37) * 1000 ≈ 239 requests/sec
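The arithmetic for D can be reproduced as follows. This is a minimal sketch; the function name is our own, and the factor of 1000 converts TP from msec to seconds.

```python
def max_request_rate_per_sec(t_p_ms, c):
    """D = (C / TP) * 1000, with TP given in msec."""
    return (c / t_p_ms) * 1000.0


d = max_request_rate_per_sec(t_p_ms=8.37, c=2)
print(round(d))  # 239
```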

To find the observed break point of the manager, we steadily increased the number of concurrent requests on one manager. The test setup gives D ≈ 210, as observed in Figure 17.

Thus, we conclude that a single manager can be assigned about 200 resources (D ≈ 200), subject to the conditions of the test setup such as resource-specific event handling. Although D could be much higher than 200, we limit it to 200 after weighing other factors such as the desired average response time. For the sake of illustration, we use D = 200 in later calculations when determining the percentage of additional management infrastructure required.
