

3.3 Implementation

In this section, we discuss the implementation of the hardware management framework in MC^2. The basic hardware-management strategies were presented in Section 3.2; here we provide additional implementation details. We begin with general information on the implementation of hardware management in MC^2. We then discuss the MC^2 task scheduler and a system call that migrates pages in MC^2. We also discuss the coarse-grained OS isolation provided by our MC^2 framework.

[Figure 3.6 labels: Reserved for peripherals; System RAM; Kernel code (.text); Kernel data (.data + .bss); addresses 0x00000000, 0x10000000, 0x10008000, 0x10d80000, 0x4FFFFFFF; sizes 0M, 128M (Bank 0), 1024M]

Figure 3.6: The arrangement of the kernel code and data in memory.

3.3.1 General Information

The hardware management framework consists of two key components: a mixed-criticality task scheduler and a resource partitioning module. We implemented these components as an extension to LITMUS^RT, version 2015.1, which is based on the 4.1.3 Linux kernel.[6]

3.3.2 MC^2 Task Scheduler

Linux can use different scheduling algorithms for different types of tasks through scheduler classes. The base scheduler iterates over the scheduler classes in order of priority, and the highest-priority class that has a runnable process picks the next task to run. LITMUS^RT introduces a new scheduler class, SCHED_LITMUS, to schedule real-time tasks. The SCHED_LITMUS class has the highest priority of all classes, so real-time tasks always take precedence over normal Linux tasks. In LITMUS^RT, a new scheduling algorithm can be added to SCHED_LITMUS as a scheduler plugin, and we implemented our MC^2 scheduler as such a plugin. Because MC^2 has three criticality levels with real-time constraints, we implemented three different scheduling policies for mixed-criticality tasks.[7]

[6] The code is available at https://wiki.litmus-rt.org/litmus/Publications.

[7] Level-D tasks are non-real-time. Thus, the MC^2 scheduler does not schedule Level-D tasks. Such tasks can be scheduled by Linux when a processor has no eligible MC^2 tasks to execute.
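The class-by-class selection described above can be illustrated with a minimal sketch. It is a simplified model and not the kernel's code: the `SchedClass` type, the FIFO policy inside each class, and the task names are assumptions made for illustration only.

```python
from collections import deque
from typing import List, Optional


class SchedClass:
    """Hypothetical scheduler class: a named run queue with a FIFO policy."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.runnable: deque = deque()

    def pick_next_task(self) -> Optional[str]:
        # Return the next runnable task, or None if this class has nothing to run.
        return self.runnable.popleft() if self.runnable else None


def pick_next(classes: List[SchedClass]) -> Optional[str]:
    """Walk the classes from highest to lowest priority; the first class
    that has a runnable task decides what runs next."""
    for sched_class in classes:
        task = sched_class.pick_next_task()
        if task is not None:
            return task
    return None  # nothing runnable in any class


# Highest priority first; "litmus" stands in for SCHED_LITMUS.
litmus = SchedClass("litmus")
fair = SchedClass("fair")
classes = [litmus, fair]

fair.runnable.append("bash")
litmus.runnable.append("rt_task")
first = pick_next(classes)   # the real-time task is chosen first
second = pick_next(classes)  # then the normal Linux task
```

Because the real-time class is consulted first, a normal task runs only when the real-time class has nothing runnable on that processor.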

LITMUS^RT provides several real-time scheduling algorithms as scheduler plugins. The P-RES (partitioned uniprocessor reservation) scheduler in LITMUS^RT supports the partitioned cyclic-executive and P-EDF scheduling algorithms. However, as MC^2 uses G-EDF for scheduling Level-C tasks, we extended the P-RES scheduler to support MC^2 tasks. In P-RES, a reservation is a schedulable entity that may have more than one task. Each reservation has a budget (i.e., an OS-enforced execution time) and a replenishment period. This reservation structure is used to realize budget enforcement, which is required to implement MC^2 (Mollison et al., 2010; Herman et al., 2012).
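As a rough illustration of budget enforcement with a replenishment period, consider the following sketch. The `Reservation` class, its field names, and the tick-based time model are hypothetical and do not reflect the P-RES data structures.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """Hypothetical reservation: a budget that is refilled once per period."""
    budget: int              # execution time allowed per period (ticks)
    period: int              # replenishment period (ticks)
    remaining: int = 0       # budget left in the current period
    next_replenish: int = 0  # time of the next replenishment

    def advance_to(self, now: int) -> None:
        # Refill the budget at every period boundary that has been reached.
        while now >= self.next_replenish:
            self.remaining = self.budget
            self.next_replenish += self.period

    def run(self, now: int, demand: int) -> int:
        """Request `demand` ticks at time `now`; return the ticks granted."""
        self.advance_to(now)
        granted = min(demand, self.remaining)
        self.remaining -= granted
        return granted


res = Reservation(budget=3, period=10)
g1 = res.run(now=0, demand=5)   # capped by the 3-tick budget
g2 = res.run(now=4, demand=2)   # budget is exhausted until time 10
g3 = res.run(now=10, demand=2)  # budget was replenished at the period boundary
```

The point of the sketch is that a reservation never executes for longer than its budget within one period, regardless of how much its task demands.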

In MC^2, we modified the structure of a reservation to have only one task. The budget of a reservation is equal to the associated task’s PET at its own criticality level, and the replenishment period is equal to the task’s period. To statically prioritize each criticality level, we implemented a set of reservations, known as a container (in other literature, called a server). Each processor has a set of reservations, called a local container, that has Level-A and -B reservations for tasks assigned to that processor. In a local container, Level-A reservations are prioritized over Level-B reservations. Thus, the local container can serve as both a dispatching table for the cyclic executive and a ready queue for Level-B tasks. For Level-C tasks, we implemented a set of reservations, called a global container, that is shared among all m processors. Reservations in the global container are sorted in the order of deadlines. When the MC^2 scheduler is invoked to select the next task to run on the current processor, it first selects a container. Then, the scheduler selects the next task from the selected container, according to the scheduling algorithm associated with that particular container.
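The two-step selection (container first, then task) might look as follows in a simplified model. All class and function names are hypothetical. As a further simplification, the sketch orders reservations within a level by deadline, which only approximates the table-driven dispatching of Level-A tasks.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Res:
    deadline: int
    task: str = field(compare=False)
    level: str = field(compare=False)  # "A", "B", or "C"


class LocalContainer:
    """Per-processor container: Level-A reservations beat Level-B ones."""

    def __init__(self) -> None:
        self.ready = []

    def add(self, res: Res) -> None:
        self.ready.append(res)

    def pick(self):
        if not self.ready:
            return None
        # Level first ("A" sorts before "B"), then earliest deadline.
        best = min(self.ready, key=lambda r: (r.level, r.deadline))
        self.ready.remove(best)
        return best


class GlobalContainer:
    """Shared by all processors: Level-C reservations in deadline order."""

    def __init__(self) -> None:
        self.heap = []

    def add(self, res: Res) -> None:
        heapq.heappush(self.heap, res)

    def pick(self):
        return heapq.heappop(self.heap) if self.heap else None


def mc2_pick_next(local: LocalContainer, shared: GlobalContainer):
    """Use the local container if it has a ready reservation,
    otherwise fall back to the global container."""
    res = local.pick()
    if res is None:
        res = shared.pick()
    return res.task if res else None


local, shared = LocalContainer(), GlobalContainer()
local.add(Res(5, "b1", "B"))
local.add(Res(20, "a1", "A"))
shared.add(Res(30, "c2", "C"))
shared.add(Res(10, "c1", "C"))
order = [mc2_pick_next(local, shared) for _ in range(5)]
```

In this run the Level-A reservation is chosen before the Level-B one despite its later deadline, and Level-C reservations are served only once the local container is empty.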

3.3.3 Allocating Colored Pages to Tasks

As we discussed in Section 3.1, the physical address of a page determines both its LLC color and its DRAM bank. Therefore, we can achieve set-based partitioning and DRAM bank partitioning by modifying the memory-management system in Linux. We begin by describing the memory-management system in Linux. We then present our modifications to this system.
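To make the address-to-color and address-to-bank relationship concrete, the sketch below extracts both indices from a physical address. The bit positions used here (4 KiB pages, 16 colors from the lowest page-frame bits, 8 banks selected by bits 27 to 29) are placeholder assumptions chosen for illustration; the actual mapping of the platform is the one given in Section 3.1.

```python
def page_color(phys_addr: int, color_bits: int = 4, page_shift: int = 12) -> int:
    """LLC color of the page containing phys_addr.
    ASSUMPTION: the color is the low `color_bits` bits of the page-frame number."""
    return (phys_addr >> page_shift) & ((1 << color_bits) - 1)


def dram_bank(phys_addr: int, bank_shift: int = 27, bank_bits: int = 3) -> int:
    """DRAM bank of the page containing phys_addr.
    ASSUMPTION: the bank is selected by `bank_bits` bits starting at `bank_shift`."""
    return (phys_addr >> bank_shift) & ((1 << bank_bits) - 1)


# Two pages that are 16 pages apart share a color under these assumptions.
color_a = page_color(0x05000)
color_b = page_color(0x15000)

# A page in a different (assumed) bank with a different color.
other = (1 << 27) | 0x3000
other_color = page_color(other)
other_bank = dram_bank(other)
```

Since both indices are functions of the physical address alone, controlling which physical pages a task receives controls which LLC sets and DRAM banks it can touch.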

[Figure 3.7 labels: free page blocks listed by order, from 2^0-page-sized blocks to 2^10-page-sized blocks]

Figure 3.7: A list of free blocks managed by the buddy allocator.

Memory management in Linux. In Linux, physical pages are managed and allocated by the buddy allocator, which groups free pages into blocks, where each block is 2^k pages for some k. The buddy allocator maintains a list of free blocks.[8] Figure 3.7 illustrates the management of free blocks in Linux. When the OS kernel allocates memory, the buddy allocator searches for an appropriately sized block. When a block of the requested size is not available, a larger block is divided into two half-sized blocks. These two blocks are buddies of each other. One half is used for the allocation and the other is added to the list of free blocks. When a block is freed, its buddy is examined, and the two are combined if the buddy is also free.
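The split-and-combine behavior can be sketched with a toy allocator over page-frame numbers. This is a minimal model of the general buddy technique, not the Linux implementation; the class name and the use of Python sets as free lists are illustrative choices.

```python
class BuddyAllocator:
    """Toy buddy allocator over page-frame numbers (PFNs)."""

    def __init__(self, max_order: int) -> None:
        self.max_order = max_order
        # free_lists[k] holds the starting PFN of each free block of 2**k pages.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)  # start with one maximal free block

    def alloc(self, order: int) -> int:
        # Find the smallest free block that is large enough.
        k = order
        while k <= self.max_order and not self.free_lists[k]:
            k += 1
        if k > self.max_order:
            raise MemoryError("no free block large enough")
        pfn = min(self.free_lists[k])
        self.free_lists[k].remove(pfn)
        # Split repeatedly: keep the lower half, free the upper half (the buddy).
        while k > order:
            k -= 1
            self.free_lists[k].add(pfn + (1 << k))
        return pfn

    def free(self, pfn: int, order: int) -> None:
        # Merge with the buddy for as long as the buddy is also free.
        while order < self.max_order:
            buddy = pfn ^ (1 << order)
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            pfn = min(pfn, buddy)
            order += 1
        self.free_lists[order].add(pfn)


buddy_alloc = BuddyAllocator(max_order=3)  # 8 pages in total
a = buddy_alloc.alloc(0)  # splits the 8-page block down to single pages
b = buddy_alloc.alloc(0)  # takes the buddy of the first page
buddy_alloc.free(a, 0)    # buddy still in use, so no merge yet
buddy_alloc.free(b, 0)    # merges all the way back to one 8-page block
```

The buddy of a block is found by flipping a single bit of its starting page-frame number, which is what makes the merge check cheap.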

[8] The buddy allocator maintains a list of free blocks per zone. A zone is a group of pages that have similar properties. The hardware platform considered in this dissertation has only one zone, ZONE_NORMAL, in the system. However, other architectures may have multiple zones such as ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM. In such platforms, the buddy allocator can have more than one list (Gorman, 2004).

[Figure 3.8 labels: m+1 lists of free blocks (Pages for Level-C & OS, Pages for CPU 0, Pages for CPU 1, Pages for CPU 2, Pages for CPU 3), each listing free page blocks by order, from 2^0-page-sized blocks upward]

Figure 3.8: Modified m+1 lists of free blocks in the buddy allocator.

Our modifications to the buddy allocator. To properly allocate LLC colors and DRAM banks to tasks, we modified the buddy allocator in Linux. Our modification consists of replacing the single list of free pages with m+1 independent lists, where m is the number of processors in the system. This is illustrated in Figure 3.8. Of these, m lists hold free pages for the Level-A and -B tasks on each of the m processors. The additional list holds free pages for Level-C tasks and other non-real-time tasks. By default, we use Level-C/OS pages for all page allocations because a task does not have a criticality when it is created. When we migrate pages via a system call, a new page is allocated from the buddy allocator. As the buddy allocator has information about the requesting task’s criticality and assigned processor, we can allocate properly colored pages to tasks. After the new page is initialized, the contents of the old page are copied to the new page. Then, the old page is removed from the task, and the new page is inserted into the task. This procedure is repeated for every page of the task. Figure 3.9 shows a possible outcome of this migration for a task assigned Colors 0–3. In this figure, each box represents a page with the assigned color. The system call that migrates pages is performed before tasks commence execution, and as a result, tasks incur no runtime overheads due to page migrations.
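The list selection and the per-page migration loop can be sketched as follows. This is a simplified model under stated assumptions: free lists are plain Python lists of page numbers rather than per-order block lists, coloring is represented only by list membership, and `select_list`, `migrate`, and the `released` return value are hypothetical names rather than the actual system call interface.

```python
M = 4       # number of processors (m); the value is arbitrary for this example
SHARED = M  # index of the extra list for Level-C, OS, and non-real-time pages


def select_list(level, cpu):
    """Level-A and -B requests use the list of their assigned processor;
    everything else, including tasks with no criticality yet, uses the shared list."""
    return cpu if level in ("A", "B") else SHARED


def migrate(task_pages, level, cpu, free_lists, memory):
    """Replace every page of a task with a page from the list that matches
    its criticality level and processor, moving the contents across."""
    target = free_lists[select_list(level, cpu)]
    released = []
    for i, old in enumerate(task_pages):
        new = target.pop(0)            # allocate a new page from the chosen list
        memory[new] = memory.pop(old)  # copy the contents, then drop the old page
        task_pages[i] = new            # insert the new page into the task
        released.append(old)
    return released


free_lists = [[100, 101], [200, 201], [300, 301], [400, 401], [900, 901, 902]]

# A newly created task has no criticality, so its pages come from the shared list.
shared = free_lists[select_list(None, 0)]
pages = [shared.pop(0), shared.pop(0)]
memory = {pages[0]: "code", pages[1]: "data"}

# Before execution starts, the task is declared Level B on processor 1.
released = migrate(pages, "B", 1, free_lists, memory)
```

After the call, the task's pages all come from the list of its assigned processor, while the contents seen by the task are unchanged.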

3.3.4 Coarse-Grained OS Isolation

[Figure 3.9 labels: before migration, pages with Colors 8, 15, 12, 0, 4, 3, 2; after the page migration system call, pages with Colors 0, 1, 2, 3, 0, 1, 2]

Figure 3.9: Migration of seven pages to a task assigned the first four LLC colors.

Our hardware management framework isolates the OS from Level-A and -B tasks in the LLC via way-based partitioning. Specifically, whenever the kernel begins executing on a processor as the result of an interrupt, an exception, or a system call, we save the current value of that processor’s lockdown register and then modify it so that the OS accesses only certain LLC ways in kernel mode. When exiting kernel mode, we restore the lockdown register using the saved value. Together with the DRAM bank partitioning, this ensures that the OS only minimally interferes with Levels A and B. However, this coarse-grained OS isolation technique has a limitation. When the OS executes a system call on behalf of a Level-A or -B task, the system call uses the LLC ways reserved for Level C and the OS, so the task may experience interference from Level-C tasks. In the experiments presented in this chapter, we avoid such interference by not allowing tasks to invoke system calls. Fine-grained OS isolation, which is discussed in Section 4.3, resolves this issue.
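The save, modify, and restore sequence around kernel mode can be modeled as below. The register is represented by a dictionary entry per processor, and the register values, `OS_LOCKDOWN`, and `kernel_mode` are hypothetical; the real framework manipulates a hardware lockdown register on the kernel entry and exit paths.

```python
from contextlib import contextmanager

OS_LOCKDOWN = 0x0F  # hypothetical register value that confines the OS to its ways


@contextmanager
def kernel_mode(cpu, lockdown_regs, os_value=OS_LOCKDOWN):
    """Model of kernel entry and exit on one processor."""
    saved = lockdown_regs[cpu]        # save the task's setting on kernel entry
    lockdown_regs[cpu] = os_value     # restrict the OS to certain LLC ways
    try:
        yield
    finally:
        lockdown_regs[cpu] = saved    # restore the saved value on kernel exit


regs = {0: 0xF0}  # hypothetical setting of the task running on processor 0
with kernel_mode(0, regs):
    seen_in_kernel = regs[0]  # value in effect while the kernel executes
seen_after = regs[0]          # value in effect once the task resumes
```

The limitation noted above is visible in this model: everything executed inside the `with` block, including a system call made on behalf of a Level-A or -B task, runs under the OS setting.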
