
8.2 Scenario Analysis

8.2.3 Composability Analysis

The purpose of the composability analysis component in Figure 8.3 is to provide information about the timing behavior of different applications running concurrently. For this analysis to be fast, the schedules are run neither on the actual platform nor in the TRM. Instead, the pre-processed utilization functions are used. Recall that this component processes all candidate use case configurations for a given use case UC produced by the first component (in Section 8.2.1).

Let $\hat{U}^{UC} = \bigcup_{A_i \in UC} \hat{U}^{RC^{A_i}}_{SOC} = \{ \hat{\vartheta}^{RC^{A_1}}_{PE_1}, \ldots, \hat{\vartheta}^{RC^{A_i}}_{PE_j}, \ldots, \hat{\vartheta}^{RC^{A_m}}_{PE_n} \}$ denote the pre-processed utilization functions for a candidate use case runtime configuration $RC^{UC}$. Two composability functions are analyzed in this thesis. The first one is a mean-criterion, similar to the one in [144]. The second one is a displacement-criterion, which provides a more thorough analysis by testing different application starting times.

Mean-criterion Composability: This is a simple approach in which the mean utilization of every application on every processor is computed. The combined utilization of every processor is then compared against a threshold. More formally, the combined mean utilization of processor PE due to a use case configuration $RC^{UC}$ is:

$$\bar{\vartheta}^{RC^{UC}}_{PE} = \sum_{A \in UC} \left( \frac{1}{N^{A}_{PE}} \cdot \sum_{k=0}^{N^{A}_{PE}-1} \hat{\vartheta}^{RC^{A}}_{PE}(k) \right) \qquad (8.2)$$

where $N^{A}_{PE}$ is the number of samples in the utilization function of application A on processor PE. Whether the use case is feasible or not is a binary decision, given by $\bar{\vartheta}_{max} > \max_{PE \in \mathcal{PE}} (\bar{\vartheta}^{RC^{UC}}_{PE})$, where $\bar{\vartheta}_{max} \in [0, 1]$ is a threshold provided by the user.

Note that the lower the mean utilization on each processor, the better the configuration is, i.e., the more likely it is that the constraints will be met. This is used to define the score of a use case runtime configuration,

$$\omega^{RC^{UC}}_{mc} = \bar{\vartheta}_{max} - \max_{PE \in \mathcal{PE}} (\bar{\vartheta}^{RC^{UC}}_{PE}) \qquad (8.3)$$

Note that $\omega^{RC^{UC}}_{mc} \in [\bar{\vartheta}_{max} - |S^{UC}|, \bar{\vartheta}_{max}]$, so that the higher the value, the better the configuration is. Configurations with a negative score, $\omega^{RC^{UC}}_{mc} < 0$, are discarded.
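The mean-criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested dictionary layout (`use_case[app][pe]` holding a list of utilization samples), the function names, and the sample values are hypothetical and do not come from the tool flow itself.

```python
from typing import Dict, List

# Hypothetical data layout: use_case[app][pe] is the list of utilization
# samples (values in [0, 1]) of application `app` on processor `pe`.
UseCase = Dict[str, Dict[str, List[float]]]


def combined_mean_utilization(use_case: UseCase) -> Dict[str, float]:
    """Per-processor sum of the per-application mean utilizations (Eq. 8.2)."""
    combined: Dict[str, float] = {}
    for per_pe in use_case.values():
        for pe, samples in per_pe.items():
            if samples:  # an application without samples on a PE adds nothing
                mean = sum(samples) / len(samples)
                combined[pe] = combined.get(pe, 0.0) + mean
    return combined


def mean_criterion_score(use_case: UseCase, theta_max: float) -> float:
    """Threshold minus the mean load of the most loaded processor (Eq. 8.3)."""
    return theta_max - max(combined_mean_utilization(use_case).values())


# Made-up sample values, chosen only to keep the arithmetic easy to follow.
example: UseCase = {
    "A1": {"PE1": [0.5, 0.5, 0.0, 0.0], "PE2": [0.25, 0.25]},
    "A2": {"PE1": [0.0, 0.5, 0.5, 0.0]},
}
score = mean_criterion_score(example, theta_max=0.75)
keep = score >= 0  # configurations with a negative score are discarded
```

In this made-up example the combined mean load is 0.5 on PE1 and 0.25 on PE2, so with a threshold of 0.75 the score is 0.25 and the configuration is kept.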

Displacement-criterion Composability: Note that the previous approach provides a coarse guarantee. The mean utilization criterion states that all the computations can be carried out in the available computation time if the mean utilization is not above 100%. This would mean that, for example, the mean throughput of a process may be respected. However, this criterion does not say anything about the variation of the throughput along the entire execution. Additionally, instantaneous load situations may introduce path latency violations that cannot be detected with the mean criterion. This is illustrated with the example in Figure 8.6. Figure 8.6a shows the utilization functions of two different applications ($A_1$, $A_2$) on a single processor $PE_1$. Intuitively, it can be seen that the mean computation in Equation 8.2 would produce a value under 1.0, i.e., less than 100% utilization. Figure 8.6b shows that, depending on the relative starting time of the applications, different load situations are observed. In the upper plot of Figure 8.6b, it is assumed that both applications start at the same time. As a result, the combined utilization is computed by adding the functions in Figure 8.6a, as shown by the solid blue line in Figure 8.6b. In this case, the combined utilization lies below the 100% mark, indicating that both applications may run correctly on processor $PE_1$. However, if the second application starts while the first application is running, the combined load situation changes, as shown in the bottom plot of Figure 8.6b. In this case, the instantaneous load surpasses the 100% mark, indicating that the timing behavior of the applications may be affected. If the computational peak observed in the utilization function of $A_1$ is required to meet a path constraint, it is likely that this constraint will be missed.

Figure 8.6: Combined processor utilization. a) Sample utilization functions on a single processor. b) Combined utilization functions with different displacements.
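The effect described for Figure 8.6 can be reproduced with a small numeric sketch. The sample values below are invented for illustration and are not read off the figure; the helper `shifted_sum` and the assumption that an application contributes no load outside its own sample range are likewise hypothetical.

```python
from typing import List, Sequence


def shifted_sum(u1: Sequence[float], u2: Sequence[float], tau: int) -> List[float]:
    """Add u2, delayed by tau samples, to u1 (samples outside a function count as 0)."""
    length = max(len(u1), len(u2) + tau)
    total = []
    for k in range(length):
        first = u1[k] if k < len(u1) else 0.0
        second = u2[k - tau] if 0 <= k - tau < len(u2) else 0.0
        total.append(first + second)
    return total


# Invented utilization functions of two applications on one processor.
a1 = [0.25, 0.25, 0.75, 0.75, 0.25, 0.25]  # computational peak in the middle
a2 = [0.5, 0.5, 0.125, 0.125]              # computational peak at the start

# Mean criterion: the sum of the mean utilizations stays below 100%.
mean_load = sum(a1) / len(a1) + sum(a2) / len(a2)

# Both applications start together: the peaks do not coincide.
peak_same_start = max(shifted_sum(a1, a2, 0))

# The second application starts two samples later: the peaks coincide.
peak_delayed = max(shifted_sum(a1, a2, 2))
```

With these values the mean load is roughly 0.73, the peak load for a simultaneous start is 0.875, and the peak load for a delay of two samples is 1.25, so only the delayed start exceeds the 100% mark although the mean is identical in both cases.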

The displacement-criterion tries to account for these changes in the instantaneous load by analyzing the combined utilization of every processor for different relative application starting times. For two applications $A_1$ and $A_2$, with candidate configurations $RC^{A_1}$ and $RC^{A_2}$, the displaced combined utilization is defined by:

$$\hat{\vartheta}^{A_1,A_2}_{PE}(k, \tau) = \hat{\vartheta}^{RC^{A_1}}_{PE}(k) + \hat{\vartheta}^{RC^{A_2}}_{PE}(k - \tau) \qquad (8.4)$$

The displacement-criterion then analyzes how often, and by how much, the instantaneous combined load goes above a user-defined threshold $\tilde{\vartheta}_{max}$ for every displacement $\tau$. For this analysis, a displacement function is defined as:

$$\tilde{\vartheta}^{RC^{A_1},RC^{A_2}}_{PE}(\tau) = \sum_{k=0}^{N^{A_1}_{PE}-1} \left( \hat{\vartheta}^{A_1,A_2}_{PE}(k, \tau) - \tilde{\vartheta}_{max} \right) \cdot H\left( \hat{\vartheta}^{A_1,A_2}_{PE}(k, \tau) - \tilde{\vartheta}_{max} \right) \qquad (8.5)$$

where $H$ is the Heaviside function and $\tau \in \{0, \ldots, N^{A_2}_{PE} - 1\}$. Note that the Heaviside function cancels all the values of $\hat{\vartheta}^{A_1,A_2}_{PE}(k, \tau)$ below the threshold $\tilde{\vartheta}_{max}$. In the case of periodic applications, the time shift in Equation 8.4 is replaced by a circular shift.
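A minimal Python sketch of Equations 8.4 and 8.5 follows. It rests on two assumptions that the text does not state: for non-periodic applications, samples of the shifted function that fall outside its range contribute zero load, and for periodic applications the circular shift wraps around the length of the shifted function. All function names and sample values are hypothetical.

```python
from typing import List, Sequence


def heaviside(x: float) -> float:
    """Unit step; its value at x == 0 is irrelevant here, since the excess is 0."""
    return 1.0 if x > 0 else 0.0


def displaced_sum(u1: Sequence[float], u2: Sequence[float],
                  k: int, tau: int, periodic: bool) -> float:
    """Combined utilization at sample k for displacement tau (Eq. 8.4)."""
    j = k - tau
    if periodic:
        shifted = u2[j % len(u2)]  # assumed: wrap around the period of u2
    else:
        shifted = u2[j] if 0 <= j < len(u2) else 0.0  # assumed: no load outside
    return u1[k] + shifted


def displacement_function(u1: Sequence[float], u2: Sequence[float],
                          theta_max: float, periodic: bool = False) -> List[float]:
    """Accumulated load above theta_max for every tau in 0..len(u2)-1 (Eq. 8.5)."""
    result = []
    for tau in range(len(u2)):
        overshoot = 0.0
        for k in range(len(u1)):
            excess = displaced_sum(u1, u2, k, tau, periodic) - theta_max
            overshoot += excess * heaviside(excess)
        result.append(overshoot)
    return result


# Invented utilization functions of two applications on one processor.
a1 = [0.25, 0.25, 0.75, 0.75, 0.25, 0.25]
a2 = [0.5, 0.5, 0.125, 0.125]
overshoot_by_tau = displacement_function(a1, a2, theta_max=1.0)
```

For these invented samples the function evaluates to 0.0, 0.25, 0.5 and 0.25 for displacements 0 to 3, which identifies a delay of two samples as the most critical relative starting time.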

The displacement analysis for two applications given by Equation 8.5 can be extended to more applications. For three applications, the worst-case displacement of the first two applications is determined first. The worst case is the displacement for which the instantaneous load surpasses the threshold the most, i.e., $\tau^* = \arg\max_{\tau} (\tilde{\vartheta}^{RC^{A_1},RC^{A_2}}_{PE}(\tau))$. Thereafter, the displacement function in Equation 8.5 is computed for the functions $\hat{\vartheta}^{RC^{A_3}}_{PE}$ and $\hat{\vartheta}^{A_1,A_2}_{PE}|_{\tau=\tau^*}$ (from Equation 8.4). The resulting function $\tilde{\vartheta}^{RC^{A_1},RC^{A_2},RC^{A_3}}_{PE}$ can then be used to compute the displacement function for more applications. The process can be repeated until all applications have been analyzed and a joint displacement function $\tilde{\vartheta}^{RC^{UC}}_{PE}$ has been computed for every processor.
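The iterative extension to more than two applications could look as follows for a single processor. To keep the sketch short it assumes periodic applications whose utilization functions have all been resampled to the same length, so that the circular shift is well defined. It also assumes that the newly added application is the one shifted against the accumulated worst-case load, and that ties in the argmax are resolved in favor of the smallest displacement. None of these choices is prescribed by the text, and all names and values are hypothetical.

```python
from typing import List, Sequence, Tuple


def circular_sum(u1: Sequence[float], u2: Sequence[float], tau: int) -> List[float]:
    """Eq. 8.4 with a circular shift; u1 and u2 are assumed to have equal length."""
    n = len(u1)
    return [u1[k] + u2[(k - tau) % n] for k in range(n)]


def displacement_function(u1: Sequence[float], u2: Sequence[float],
                          theta_max: float) -> List[float]:
    """Eq. 8.5; (x - t) * H(x - t) is written as max(x - t, 0)."""
    return [sum(max(v - theta_max, 0.0) for v in circular_sum(u1, u2, tau))
            for tau in range(len(u2))]


def joint_displacement(functions: Sequence[Sequence[float]],
                       theta_max: float) -> Tuple[List[float], List[float]]:
    """Fold the applications in one by one, fixing the worst-case shift each time."""
    if len(functions) < 2:
        raise ValueError("at least two utilization functions are required")
    combined = list(functions[0])
    displacement: List[float] = []
    for nxt in functions[1:]:
        displacement = displacement_function(combined, nxt, theta_max)
        # tau* = argmax over tau; the first maximum wins in case of a tie
        worst_tau = max(range(len(displacement)), key=displacement.__getitem__)
        combined = circular_sum(combined, nxt, worst_tau)
    return displacement, combined


# Invented utilization functions of three applications on one processor.
a1 = [0.75, 0.25, 0.25, 0.25]
a2 = [0.5, 0.0, 0.0, 0.0]
a3 = [0.5, 0.25, 0.0, 0.0]
joint, worst_case_load = joint_displacement([a1, a2, a3], theta_max=1.0)
```

The first return value is the joint displacement function obtained in the last step, and the second is the combined load with every application fixed at its worst-case displacement.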

The discrete function $\tilde{\vartheta}^{RC^{UC}}_{PE}$ gives a qualitative measure of the feasibility of a use case. The higher the values of the function, the less likely it is that the applications can run simultaneously and still meet the constraints. In order to provide a single score for the current candidate use case configuration, the maximum average value across all the processors is used. This value is then scaled and inverted, i.e.,

$$\omega^{RC^{UC}}_{dc} = \frac{-1}{N^{A_m}_{PE^*}} \cdot \max_{PE \in \mathcal{PE}} \left( \frac{1}{N^{A_{m-1}}_{PE}} \sum_{\tau=0}^{N^{A_{m-1}}_{PE}-1} \tilde{\vartheta}^{RC^{UC}}_{PE}(\tau) \right) \qquad (8.6)$$

where $PE^*$ is the processor with the maximum average displacement function. The outer normalization ($N^{A_m}_{PE^*}$) in Equation 8.6 is inserted in order to remove the effect of the summation in Equation 8.5. As with the mean-criterion score from Equation 8.3, the higher the displacement-criterion score, the better the configuration is. Note that $\omega^{RC^{UC}}_{dc} \in [\bar{\vartheta}_{max} - |S^{UC}|, 0]$. A hard decision criterion would discard all configurations for which $\omega^{RC^{UC}}_{dc} < 0$. Note, however, that the score does not carry as much information as the individual functions $\tilde{\vartheta}^{RC^{UC}}_{PE}$.
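The score computation can be sketched as below. The sketch assumes that the joint displacement function of every processor is already available as a list with one value per displacement, and that the number of samples accumulated by the summation in Equation 8.5 is supplied separately per processor. The dictionary layout, the parameter names, and the numbers are hypothetical.

```python
from typing import Dict, List


def displacement_score(joint: Dict[str, List[float]],
                       n_summed: Dict[str, int]) -> float:
    """Scaled and inverted maximum average displacement value (Eq. 8.6).

    joint[pe]    -- joint displacement function of processor pe, one value per tau
    n_summed[pe] -- number of samples accumulated per tau in Eq. 8.5 on pe
    """
    # Average of the displacement function over all displacements, per processor.
    averages = {pe: sum(values) / len(values) for pe, values in joint.items()}
    # PE*: the processor with the maximum average displacement function.
    worst_pe = max(averages, key=averages.get)
    # The outer normalization undoes the accumulation over k; the sign inverts
    # the value so that a higher score means a better configuration.
    return -averages[worst_pe] / n_summed[worst_pe]


# Invented joint displacement functions for two processors.
joint = {"PE1": [0.75, 0.25, 0.25, 0.5], "PE2": [0.0, 0.5]}
n_summed = {"PE1": 4, "PE2": 2}
score = displacement_score(joint, n_summed)
no_overload = score == 0.0  # a hard criterion keeps only such configurations
```

Here PE1 has the larger average (0.4375 against 0.25), so the score is -0.4375 / 4 = -0.109375, and a hard decision criterion would discard this configuration.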

In this section, the platform utilization profiles are used to draw conclusions on whether it is possible for two or more applications to run simultaneously while still meeting their constraints. In the case of different application classes, e.g., real-time and non-real-time, a hierarchical scheduler is used in which the best-effort applications are executed only when the real-time applications are blocked (recall Figure 4.1). This ensures that the execution of real-time applications is not affected by the presence of best-effort applications. For applications of the same class, the composability analysis evaluates the jointly required processing bandwidth. How the bandwidth relates to the real-time schedulability of the underlying scheduling algorithm is not addressed in the multi-application flow.

In document Programming heterogeneous MPSoCs : tool flows to close the software productivity gap (Page 161-164)