
This section describes the hardware and software integration of OSIP in heterogeneous MPSoCs and introduces the ESL models used in the test platforms of this thesis. Hardware and software integration are addressed in Sections 4.2.1 and 4.2.2, respectively. The ESL models are the subject of Section 4.2.3.

4.2.1 Hardware Integration

Figure 4.4 shows a generic view of the hardware of an OSIP-enabled MPSoC. The OSIP processor is shown in the upper left corner of the figure, with its program memory (PM), its private data memory (DM) and an interrupt controller (IC). Three conditions must hold for a correct hardware integration: (1) the OSIP register interface must be reachable through the interconnect, (2) all processor interrupt signals must be generated by OSIP, and (3) all peripheral interrupt signals must be rerouted through OSIP. The peripheral interrupts are decoded by OSIP’s interrupt controller, which creates task synchronization events. An additional hardware block is required if the platform contains processing elements without interrupt support (see PE31 and PE32 in Figure 4.4). This block is called HW Proxy and is shown in the bottom right corner of the figure. The HW Proxy makes it possible to interface hardware accelerators and simple processors with OSIP by capturing the interrupt signals from OSIP, processing OSIP’s requests and retaining the information in internal queues.

Figure 4.4: Generic OSIP-based platform.
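The queueing role of the HW Proxy can be illustrated with a minimal C++ sketch. It is not the actual proxy design: the names `TaskRequest`, `HwProxy`, `on_osip_interrupt` and `poll` are hypothetical, and the sketch assumes that a processing element without interrupt support fetches its requests by polling the proxy.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical request descriptor; the real content of an OSIP request is not specified here.
struct TaskRequest {
    std::uint32_t task_id;
};

// Sketch of a proxy that captures OSIP interrupts on behalf of a
// processing element without interrupt support and queues the requests.
class HwProxy {
public:
    // Called when OSIP raises the interrupt line intended for the attached PE.
    void on_osip_interrupt(TaskRequest req) { pending_.push_back(req); }

    // Called by the PE (assumed to poll). Returns false if nothing is queued.
    bool poll(TaskRequest& out) {
        if (pending_.empty()) return false;
        out = pending_.front();
        pending_.pop_front();
        return true;
    }

    // Number of requests that have been captured but not yet fetched.
    std::size_t backlog() const { return pending_.size(); }

private:
    std::deque<TaskRequest> pending_;  // requests are retained in arrival order
};
```

A FIFO is used here only because it is the simplest policy that preserves the order in which OSIP issued its requests; the real proxy may organize its internal queues differently.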

4.2.2 Software Integration

The processors in an OSIP-based platform interact with OSIP by sending low-level commands to its register interface. These commands serve five main purposes: (1) modify the scheduling hierarchy by creating scheduler descriptors and configuring their policies, (2) create tasks and push them into queues, either in the ready or in the pending state, (3) create pending queues, (4) ask for tasks from OSIP upon receiving an interrupt signal, and (5) update information about the running task, e.g., update the priority according to a priority inheritance protocol.

In total, there are 50 different commands, with a binary encoding designed to reduce the traffic in the interconnect. These commands are too low-level to be used directly by the programmer. For this reason, a set of high-level multi-tasking APIs and corresponding data structures were designed. This lightweight software stack includes typical task management functions, such as CreateTask, SuspendTask and DeleteTask, as well as standard synchronization APIs, such as Wait and Signal.
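The layering of high-level APIs over compact binary commands can be sketched as follows. This is only an illustration under stated assumptions: the opcode values, the 32-bit word layout, and the `RegisterInterface` stand-in are invented for the example and do not reflect the actual OSIP command encoding. Only the function names CreateTask, SuspendTask and DeleteTask are taken from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical opcodes; the real command set has 50 entries that are not listed here.
enum class Opcode : std::uint8_t { CreateTask = 1, SuspendTask = 2, DeleteTask = 3 };

// Assumed layout: opcode in bits 31..24, priority in bits 23..16, task id in bits 15..0.
// Packing a command into a single word keeps it to one interconnect transfer.
inline std::uint32_t encode(Opcode op, std::uint16_t task_id, std::uint8_t prio) {
    return (static_cast<std::uint32_t>(op) << 24) |
           (static_cast<std::uint32_t>(prio) << 16) |
           task_id;
}

// Stand-in for the memory-mapped register interface: it only records written words.
struct RegisterInterface {
    std::vector<std::uint32_t> written;
    void write(std::uint32_t word) { written.push_back(word); }
    std::uint32_t last() const {
        assert(!written.empty());
        return written.back();
    }
};

// High-level API that hides the command encoding from the programmer.
class TaskApi {
public:
    explicit TaskApi(RegisterInterface& regs) : regs_(regs) {}
    void CreateTask(std::uint16_t id, std::uint8_t prio) {
        regs_.write(encode(Opcode::CreateTask, id, prio));
    }
    void SuspendTask(std::uint16_t id) { regs_.write(encode(Opcode::SuspendTask, id, 0)); }
    void DeleteTask(std::uint16_t id) { regs_.write(encode(Opcode::DeleteTask, id, 0)); }

private:
    RegisterInterface& regs_;
};
```

With this sketch, `TaskApi(regs).CreateTask(7, 3)` results in a single word being written to the register interface, which reflects the motivation given above for a dense encoding.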

Apart from runtime support for multi-tasking, low-level routines for boot-up and interrupt handling are required for software integration. The architecture-agnostic part of this low-level software is written in C. Different versions of the architecture-dependent part are provided for the processing elements used in this thesis.

4.2.3 System Level Modeling

This section discusses ESL modeling issues for the virtual platforms used in this thesis. OSIP’s client processing elements and other system components are described in Section 4.2.3.1, whereas the models of OSIP itself are treated in Section 4.2.3.2.

66 Chapter 4. MPSoC Runtime Management

4.2.3.1 Platform Models

LTRISC: As mentioned before, the LTRISC is a simple 5-stage pipeline processor distributed with Synopsys PD. Being the baseline architecture of OSIP, it serves to measure the impact of the new instructions described in Section 4.1.2. Additionally, the LTRISC is used in simple virtual platforms for case studies that are concerned with functional correctness. As the system model, the virtual platforms use a cycle-accurate simulator from the Processor Support Package (PSP) generated with Synopsys PD.

LTVLIW: This is a 4-slot Harvard VLIW architecture derived from the VLIW sample model that comes with Synopsys PD, extended with a multiplier and a bus interface. As with the LTRISC, a cycle-accurate PSP is used in the virtual platforms. Since the LTVLIW model has no interrupt support, it is interfaced with the HW Proxy. For multi-tasking, an implementation of Protothreads [67] on top of the OSIP APIs was developed.

IRISC: IRISC stands for the ICE RISC core developed at the ICE chair. It is a RISC processor with a load-store Harvard architecture featuring a fully interlocked 5-stage pipeline. Compared to the LTRISC, it has an optimized general-purpose instruction set. A cycle-accurate simulator of this processor is used in this thesis. In contrast to the LTVLIW and the LTRISC, the IRISC has interrupt support. This eases software integration into OSIP-based MPSoCs. The OSIP software stack, including boot-up code, interrupt handling and task switching, was ported to this processor.

ARM926EJ-S: This is an instruction-accurate model included in the libraries of Synopsys PA [237]. It is used to benchmark OSIP and as host processor for several of the virtual platforms in this thesis. The OSIP software stack is also available for it.

AMBA AHB Bus: The default system interconnect used in the virtual platforms of this thesis is the Advanced High-performance Bus (AHB) of the Advanced Microcontroller Bus Architecture (AMBA) protocol standard [8].

HW Proxy: This model is only included in OSIP-based platforms that contain processing elements with no interrupt support, e.g., the LTVLIW. Its behavior is modeled using an untimed SystemC Transaction Level Modeling (TLM) approach.

Memories: The memory architecture varies across the virtual platforms used in this thesis. The internal memories of some processors are modeled by the processor simulator itself. External memories are modeled using SystemC TLM 2.0.

4.2.3.2 OSIP Models

As with other LISA processors, an automatically generated, cycle-accurate model of OSIP is available for simulation. However, since the general structure of OSIP’s application is clearly defined, it is possible to model the timing behavior in a more abstract way using timing annotations. By doing so, the simulation speed is increased by several orders of magnitude while retaining an acceptable simulation accuracy. Due to the diversity of low-level commands and the open configuration of the hierarchy, it is not possible to characterize OSIP’s timing by a single latency and throughput equation. Instead, the model uses the formalisms of Tagged Signal Models (TSMs) [156] and time-annotated Communication Extended Finite State Machines (tCEFSMs). Such a model has been successfully applied to model virtual processors in [136].

In the TSM formalism, processes (computing nodes) communicate through signals. Signals, in turn, are represented by a set of events, each of which is a pair containing a value and a tag (usually time). In an OSIP-based system, an event is equivalent to a change on the hardware interface of OSIP at a given time (access to the register interface, peripheral events). A signal is a set of events that trigger a given functional behavior in OSIP. For example, a signal for creating a new task is composed of a chronologically ordered set of events: acquire OSIP spin-lock, receive task’s information and trigger task creation. OSIP is represented by a tCEFSM $F_{osip} = (Z, z_0, I, f, O, U)$, where:

• $Z$: The set of explicit states. OSIP can be in one of two states: idle or busy.

• $z_0$: The initial state (idle).

• $I$: The set of input signals, which contains all possible low-level commands and incoming interrupt signals from peripherals.

• $O$: The set of output signals, which contains interrupt signals to every core in the platform.

• $U = \{u_1, u_2, \ldots\}$: The set of implicit states, which model the internals of OSIP, e.g., size of internal queues, state of the register interface.

• $f : (Z \times W^*) \times I \to (Z \times W^*) \times O$: The state transition function, where $W^* = W(u_1) \times W(u_2) \times \cdots$ and $W(u_i)$ is the set of all possible values that the $i$-th implicit state variable can have. The transition function is modeled by a functional C++ model of OSIP.

The execution of the firmware and the individual instructions of OSIP were statically analyzed in order to derive timing equations and bounds. These timing relations are defined on the input signals and the values of the implicit state variables ($W^*$). There are two kinds of timing equations for a value $w \in W^*$ and an input signal $s \in I$:

• $\Delta t_{d}^{s,w} = f_{d}^{s}(w)$: Represents the time delay during which OSIP stays in the busy state.

• $\Delta t_{r}^{s,w} = f_{r}^{s}(w)$: Represents the response time at which OSIP produces an output.
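To make the structure of such a timing-annotated model more concrete, the following C++ sketch combines a functional state update with a delay equation, a response-time equation and an upper bound for the path without output. It is a toy under stated assumptions, not the OSIP model itself: the fields of `ImplicitState`, the linear equations `f_r`, `f_d` and `f_d_upper`, all constants, the preemption rule, and the rule that a command arriving while OSIP is busy starts once OSIP is idle are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

enum class State { Idle, Busy };  // explicit states

// One value w of the implicit state variables (illustrative fields only).
struct ImplicitState {
    std::uint32_t queued_tasks = 0;
    std::uint32_t top_priority = 0;
};

// Placeholder timing equations in cycles; the constants are invented.
inline std::uint64_t f_r(const ImplicitState& w) { return 20 + 2 * w.queued_tasks; }
inline std::uint64_t f_d(const ImplicitState& w) { return 35 + 3 * w.queued_tasks; }
// Upper bound used on the path that produces no output.
inline std::uint64_t f_d_upper(const ImplicitState& w) { return 50 + 4 * w.queued_tasks; }

struct Outcome {
    bool interrupt;                // true if an interrupt is produced
    std::uint64_t interrupt_time;  // meaningful only if interrupt is true
    std::uint64_t idle_time;       // time at which OSIP returns to idle
};

class OsipModel {
public:
    State state_at(std::uint64_t now) const {
        return now < busy_until_ ? State::Busy : State::Idle;
    }

    // Handles a task-creation signal arriving at time `now`.
    Outcome on_new_task(std::uint32_t priority, std::uint64_t now) {
        // Assumption: a command arriving while OSIP is busy starts when OSIP is idle.
        const std::uint64_t start = std::max(now, busy_until_);
        const ImplicitState w = w_;  // timing is evaluated on the state before the update
        const bool preempts = priority > w.top_priority;

        // Functional part of the transition: update the implicit state.
        ++w_.queued_tasks;
        if (preempts) w_.top_priority = priority;

        assert(f_r(w) <= f_d(w));  // the output cannot come after OSIP is idle again
        busy_until_ = start + (preempts ? f_d(w) : f_d_upper(w));
        return Outcome{preempts, preempts ? start + f_r(w) : 0, busy_until_};
    }

private:
    ImplicitState w_;
    std::uint64_t busy_until_ = 0;
};
```

The simulation only evaluates a few arithmetic expressions per command instead of executing firmware instructions cycle by cycle, which is the source of the speed-up that the abstract model aims for.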

Consider the example in Figure 4.5. A general hierarchy is shown in Figure 4.5a, with several parameters that define it ($I_1, k_1, k_2, \ldots$) and form part of the implicit state variables. Figure 4.5b presents a sample transition diagram with the annotated timing equations for the incoming signal $s_{new}$, which models task creation. The diagram shows two possible paths. The one on the left is followed if the new task leads to the generation of an interrupt signal. The interrupt is generated after a response time $\Delta t_{r}^{s,w}$ modeled by the first equation on the right-hand side of the figure. After the interrupt is generated, OSIP remains busy for an additional $\Delta t_{d}^{s,w} - \Delta t_{r}^{s,w}$ cycles. The second path in the diagram corresponds to the case in which the task is added to the structure without generating an output. The processing time along this path varies with parameters that are not modeled by state variables. For this reason, only an upper bound is provided for this path (see the third equation on the right-hand side of the figure). More complex scenarios are modeled in a similar way.

Figure 4.5: Example of OSIP tCEFSM model.

Besides accelerating system simulation, the abstract model helps MPSoC and firmware designers in several ways. For example, typical interrupt latencies ($\Delta t_{r}^{s,w}$) under different load scenarios can be derived easily from the model. Furthermore, the difference between $\Delta t_{d}^{s,w}$ and $\Delta t_{r}^{s,w}$ defines design constraints for low-level subroutines on the MPSoC cores, e.g., the time constraint for a context switch.
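As a small worked example of how such a constraint could be read off the model, the sketch below computes the window in which OSIP is still busy after raising an interrupt and relates it to the duration of a context switch. The names `TimingPair`, `busy_after_interrupt` and `stall_cycles` are hypothetical, and the interpretation is an assumption: the core is taken to contact OSIP right after its context switch and to wait if OSIP is still busy.

```cpp
#include <cassert>
#include <cstdint>

// Response time and busy delay for one input signal and one state value, in cycles.
struct TimingPair {
    std::uint64_t response_cycles;  // time until the interrupt is raised
    std::uint64_t delay_cycles;     // time until OSIP is idle again
};

// Cycles during which OSIP is still busy after the interrupt was raised.
inline std::uint64_t busy_after_interrupt(const TimingPair& t) {
    assert(t.delay_cycles >= t.response_cycles);
    return t.delay_cycles - t.response_cycles;
}

// Assumed reading: a core that needs `context_switch_cycles` before it contacts
// OSIP again waits for whatever remains of the busy window.
inline std::uint64_t stall_cycles(const TimingPair& t, std::uint64_t context_switch_cycles) {
    const std::uint64_t window = busy_after_interrupt(t);
    return context_switch_cycles >= window ? 0 : window - context_switch_cycles;
}
```

For example, with a response time of 20 cycles and a delay of 35 cycles, the busy window is 15 cycles, so under this reading a 10-cycle context switch would leave the core waiting for 5 cycles, whereas a switch of 15 cycles or more would not wait at all.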