Real-Time Operating Systems – Part 2
Design & Analysis for Embedded & Time-Critical Systems Dr. Tom Clarke
2.1 Interrupts & Exceptions
Hardware
Background/foreground systems
Interrupts vs tasks
2.2 RTOS Task Lists
Linked List implementation
Bit-mapped task list implementation
2.3 Context Switching
AVR example
Source code
2.4 Timing Functions
Clock Tick
Task Delay implementation
2.5 RTOS Objects
Semaphores
RTOS Course Structure
PART 1 - Real-Time Applications
1.1 Introduction
Tasks
1.2 Task Control in FreeRTOS
Task states
FreeRTOS API
1.3 Shared Data Access
1.4 Semaphores
Theory and applications
1.5 Message Queues
Theory and applications
1.6 Synchronisation
Theory and applications
1.7 Scheduling Theory
Priority scheduling
Round-robin scheduling
RMA, extended RMA, & deadlines
1.8 & 1.9 Scheduling Problems
Deadlock, starvation & livelock
Priority inversion
PART 2 - Implementation
Anatomy of an RTOS
Hardware interface
Interrupt handling
Task switching
Clock tick implementation
Scheduling
Communication & Synchronisation
Memory Allocation
Timers & Delay Implementation
Portability
Minimise code specific to a given architecture/compiler
Scalability
Make feature-set configurable, so the same system can be configured for very small, or large, systems
Performance
Low interrupt-level & task-level latency
Interested in maximum limit, not average
Low RAM use
Rich Feature-set?
Ways to deal with priority-inversion
Deadline scheduling
Rich set of comms & synch primitives
Part 2 - RTOS Implementation
Interrupts
Simple foreground/background systems
How the RTOS responds to events
Polling vs timer ISR vs ISR
Interrupt latency
Task or interrupt?
RTOS Implementation
Task lists & scheduler
Linked lists vs bit-mapped array
Pre-emption & context switch
Task creation RTOS startup
Timing & delays
Clock tick
Delay functions
Calling scheduler from tasks
Semaphores
vTaskIncrementTick()
vPortYieldFromTick()
vPreemptiveTick()
prvCheckDelayedTasks()
vTaskDelay()
vTaskSwitchContext()
portSAVE_CONTEXT()
taskYIELD()
xTaskCreate()
vTaskStartScheduler()
xPortStartScheduler
vPortISRStartFirstTask()
xQueueSend()
prvUnlockQueue()
prvLockQueue()
prvCopyQueueData()
Lecture 2.1 - Interrupts
Interrupts are the fundamental device which allows sequential computer programs to respond to real-time events
Interrupts require hardware support
ALL CPUs provide some way of implementing interrupts
Boundary between hardware & software support is variable
As a minimum save/switch of PC must be implemented in hardware
Interrupt processing
Stop code currently executing
Branch to Interrupt Service Routine (ISR)
Execute ISR
Return transparently to the previously executing code
Provide a very fast response to a hardware event
Timer timeout
I/O request
Note that an ISR is very different from a task – no state is saved from one invocation to the next
Can't implement infinite loop
Can't block
There cannot be greater rudeness than to interrupt another in the current of his discourse
Interrupt Priority
Microprocessors provide prioritised execution of interrupts
During interrupt priority n, all lower priority interrupts are disabled
Disabled interrupts will be enabled and executed at the first possible time
when the current priority level drops below that of the disabled interrupt
Hardware support can implement branch to the correct ISR for each interrupt
Vectored priority interrupt controller
[Figure: three prioritised ISRs (3, 2, 1) and background code]
Rate Monotonic Analysis for Interrupts
Prioritised interrupts provide response to deadlines very similar to prioritised tasks implementing jobs
RMA can be used to determine whether all ISRs will terminate before their next invocation
[Figure: ISR n at interrupt priority n; a hardware event initiates computation, with Tn and Cn marked]
Interrupt: Interrupt n occurs; ISR n runs; ISR n returns
Task: Task n becomes ready; Task n runs; Task n blocks
Foreground/Background System Design
Real-time systems can be implemented without an RTOS using
foreground/background design
Time-critical computation implemented via interrupt-driven ISR (foreground)
Non-critical computation implemented in background loop
Note only one background loop is possible
Non-critical hardware can be serviced from background loop using polling
Loop length is variable depending on which devices need to be serviced
Worst-case loop length determines latency for polled devices
Advantages
Simple
More efficient than RTOS
Disadvantages
More computation must be implemented in ISR – coding more complex & difficult to debug
Polling from Timer interrupts
Interrupt-driven paradigm locates computation in specific ISRs executed in response to hardware events
Interrupt-driven computation in ISR triggered by hardware event
Response time is limited by CPU-determined interrupt time + execution time of any higher or equal priority interrupts
Timer-interrupt ISR polls multiple devices at regular intervals
Provides guaranteed response with less hardware overhead than device-specific hardware interrupts
Simpler to code than multiple separate ISRs
Cf. timer-interrupt polling & background loop polling
Background loop also implements polled computation, but the time between successive polls cannot easily be determined.
BackgroundLoop() {
    for (;;) {
        if (device A is ready) [ Service Device A ];
        if (device B is ready) [ Service Device B ];
    }
}
Timer1_ISR() { /* polls device C and device D */
    if (device C is ready) [ Service Device C ];
    if (device D is ready) [ Service Device D ];
    [ update system clock ]
}
DeviceE_ISR() {
    [ Service Device E ]; /* NB polling not needed */
}
Response-time: Timer ISR polling vs
background loop
Can use one or more timer ISRs running at different speeds –
e.g. 1ms and 10ms – to implement half-way house between
polled & interrupt-driven paradigms
1ms interrupt runs higher priority than 10ms interrupt
1ms interrupt provides response-time (latency) of 1ms worst case, but total length of ISR must be << 1ms
10ms interrupt provides response-time of 10ms worst case. Total length of ISR must be <<10ms. Note that 10ms ISR execution time is effectively slowed down by CPU utilisation due to 1ms interrupt (and other high priority interrupts)
Background loop can be used to implement polled
computation which is not time-critical.
Polling time for background loop is variable and depends on overall loop period.
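As a rough sketch of the two-rate idea, a single 1 ms tick can drive both rates by running the slower work on every tenth tick. This is a simplification: the slide describes two separately prioritised timer interrupts, whereas here both rates share one ISR. The names timer_1ms_isr, poll_fast_devices and poll_slow_devices are hypothetical, and device servicing is replaced by counters so the sketch can run on a host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters standing in for real device servicing. */
static uint32_t fast_polls = 0;      /* work done every 1 ms  */
static uint32_t slow_polls = 0;      /* work done every 10 ms */
static uint8_t  tick_in_decade = 0;  /* counts 0..9 between slow polls */

static void poll_fast_devices(void) { fast_polls++; }
static void poll_slow_devices(void) { slow_polls++; }

/* Called once per 1 ms timer interrupt (simulated here by a plain call). */
void timer_1ms_isr(void)
{
    assert(tick_in_decade < 10);
    poll_fast_devices();
    if (++tick_in_decade == 10) {
        tick_in_decade = 0;
        poll_slow_devices();
    }
}

uint32_t fast_poll_count(void) { return fast_polls; }
uint32_t slow_poll_count(void) { return slow_polls; }
```

With this arrangement the slow work adds to the length of every tenth 1 ms ISR, so the "total length of ISR must be << 1ms" constraint applies to the sum of both.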
Interrupt Latency (1)
The next few slides show how to work out the maximum (worst-case) latency for code servicing a device from a device interrupt or a timer ISR
Interrupt latency is measured from a hardware event to the start of the ISR user code
Interrupt performance is often specified in terms of maximum allowed latency, and therefore requires analysis different from RMA
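[Figure: timeline of a device ISR. Lower level code runs until the interrupt occurs, followed by interrupt entry (E), the ISR user code and interrupt return (R). Latency runs from the interrupt to the start of the user code.]
Interrupt entry (E):
•Hardware entry
•Save lower level context
Interrupt return (R):
•Restore lower level context
•Hardware return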
Interrupt Latency (2)
In real-time systems the deadline for ISR execution is often stricter than that used in RMA.
Deadlines are specified as a maximum allowed interrupt latency: delay from the interrupt being raised to the ISR starting
Worst-case interrupt latency Ln for ISR priority n must be explicitly calculated by considering:
Delay from non-interrupt code to the first ISR
All ISRs of higher priority than n
Interrupt entry & return time
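Best-case latency = $E$
Worst-case latency, where $m$ is the highest interrupt priority, $D$ is the delay from non-interrupt code, $E$ and $R$ are the interrupt entry and return times, and $C_i$ is the execution time of ISR $i$:
$$L_n = D + E + \sum_{i=n+1}^{m} (E + C_i + R)$$
[Figure: ISR 2 latency. ISR 4 and ISR 3 each run with entry E and return R before ISR 2 is entered; labels D4, D3, D2 and "critical section length"]
Polling Latency from Timer Interrupt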
The maximum latency for a device polled from a timer interrupt ($L_P$) is the period of the timer ($P$) + the worst-case difference in timer ISR response-time from one interrupt to the next.
Assuming that the timer ISR is
priority n, and using results on
previous slide
Difference also represents timing
jitter for sample if I/O time is
software controlled
Note that hardware controlled A/D avoids jitter
$$L_n = D + E + \sum_{i=n+1}^{m} (E + C_i + R)$$
$$L_P = P + L_n - E = P + D + \sum_{i=n+1}^{m} (E + C_i + R)$$
[Figure: latency for timer ISR priority n, showing best case latency, worst case latency, best response and worst response, with L_P compared to P]
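The latency calculations can be tried out numerically. The sketch below assumes the worst-case latency has the form L_n = D + E + sum over i = n+1..m of (E + C_i + R), and that the polled latency is L_P = P + L_n - E (timer period plus worst-case minus best-case ISR latency). The type irq_costs_t, the function names and the microsecond unit are all illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All times in microseconds (an assumed unit). */
typedef struct {
    uint32_t d;  /* D: longest delay from non-interrupt code to the first ISR */
    uint32_t e;  /* E: interrupt entry time  */
    uint32_t r;  /* R: interrupt return time */
} irq_costs_t;

/* Worst-case latency of ISR priority n (1..m, m = highest priority).
 * c[i - 1] holds C_i, the execution time of the ISR at priority i. */
uint32_t worst_case_latency(const irq_costs_t *k, const uint32_t *c,
                            size_t m, size_t n)
{
    assert(n >= 1 && n <= m);
    uint32_t latency = k->d + k->e;          /* delay, then entry to ISR n */
    for (size_t i = n + 1; i <= m; i++) {
        latency += k->e + c[i - 1] + k->r;   /* each higher priority ISR   */
    }
    return latency;
}

/* Worst-case latency of a device polled from a timer ISR of priority n
 * with the given period: period + (worst-case - best-case ISR latency). */
uint32_t polled_latency(const irq_costs_t *k, const uint32_t *c,
                        size_t m, size_t n, uint32_t period)
{
    return period + worst_case_latency(k, c, m, n) - k->e;
}
```

For example, with D = 20, E = R = 5 and C_1..C_4 = 100, 50, 30, 10, ISR 2 has worst-case latency 20 + 5 + (5 + 30 + 5) + (5 + 10 + 5) = 85, and a device polled from it at a period of 1000 has latency 1080.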
Interrupts and tasks
In an RTOS devices can still be
handled by interrupts
All interrupts run at a higher priority than all tasks
ISRs must be written to inform RTOS that they are executing (RTOS-dependent mechanism)
NOTE – all interrupts are higher in
priority than all tasks
ISRs can handle devices directly
ISRs can signal tasks which then
handle device (next slide)
[Figure: Tasks (Task 1, Task 2, Task 3) and ISRs (ISR 1, ISR 2) ordered by increasing priority, with all ISRs above all tasks]
Tasks synchronised to interrupts
ISR for device X can signal
a semaphore to unblock a
waiting device-handler task
Make ISR as short as possible
time-critical code has interrupt latency
Rest of code also has task latency
DeviceX_ISR() {
    [ time-critical handler code ]
    SemaGiveFromISR(SemaX);
}
DeviceX_Task() {
    for (;;) {
        SemaTake(SemaX); /* wait for device */
        [ rest of handler code ]
    }
}
[Figure: timeline showing interrupt latency before DeviceX_ISR and task latency before DeviceX_Task]
Lecture 2.1 Summary
Simplest real-time systems are implemented using the
foreground/background model using interrupts and background loop.
Interrupts can be used to provide guaranteed latency & response-time
Interrupts have priority which determines precedence
Interrupt priorities are managed by device-dependent hardware or software.
All CPUs allow individual interrupt sources to be switched on or off which in principle allows software control of priorities
In practice for speed prioritised interrupts are managed by special hardware in big systems
I/O operations can be performed:
From hardware-specific interrupt
From timer interrupt (polled)
From background loop (polled)
RMA can be used to determine whether prioritised interrupt deadlines are met
More precise calculation must add interrupt entry and return times to ISR time when calculating RMA compute time Ci
Lecture 2.2 – Implementing Task Lists
It is easy to go down into Hell; night and day,
the gates of dark Death stand wide; but to climb
back again, to retrace one's steps to the upper
air - there's the rub, the task.
Implementing an RTOS: Data Structures
[Figure: RTOS data structures. TASKS: Task Control Blocks, each with its Task Stack; Task READY List, Task DELAYED List, Task SUSPEND List; Current Task Pointer. RTOS Objects: Semaphore with Semaphore Control Block and Waiter List. Processor context: PC, Registers, PSR.]
How is task state recorded?
Possible states:
READY
BLOCKED on timeout (DELAYED)
BLOCKED on EVENT
SUSPENDED
NB - must be simultaneously delayed & blocked on
event to implement semaphore timeouts etc
Need to know
What is the highest priority task in given state?
Are there any tasks in given state?
Need to change state of given task
SOLUTION - Task "lists"
Task Lists
READY List
Need to extract the highest priority task
DELAYED task list
Need to extract all tasks whose delay-time has elapsed
SUSPENDED list
Need to extract resumed task
EVENT lists
Lists for waiters on semaphores
& other objects
A separate list for each object
Need to extract the highest
Every task links to at most two distinct lists
READY/DELAYED/SUSPEND List – generic list link
Mutually exclusive
At most one EVENT list – event list link
Suspended tasks are not allowed to wait on semaphores while suspended, so they are removed if necessary from their EVENT list
FreeRTOS Lists
Implement generic packages of ordered task lists
Each item in the list is assigned a value
List is ordered by descending value
Implement operations
Insert task in list with value at correct position
Remove task from front of list (largest value)
Remove task from end of list (smallest value)
Remove arbitrary task from list
Use this package to implement all RTOS lists
READY & EVENT lists: sort value = priority
DELAYED List: sort value = wake-up time
Use doubly linked lists to allow quick removal
List nodes are part of Task Control Block
One less level of indirection
Task list data structures: doubly-linked circular list
[Figure: a list header (NumberOfItems: 2, index, ListEnd) and two TCBs. Each list item has itemvalue, next, previous, owner and container fields; each TCB contains a GenericListItem and an EventListItem. The dummy list item ListEnd marks the end & start of the circular list.]
List operations - Remove
[Figure: three linked list items (itemvalue, next, previous, owner, container) with px, pxn and pxp marked, and the list header (numberofitems, index)]
Remove(px)
{
    pxn = px->next;
    pxp = px->previous;
    pxn->previous = pxp;
    pxp->next = pxn;
    px->container->numberofitems--;
    px->container = NULL; /* if px matters */
}
List operations - InsertAfter
[Figure: three linked list items with px, pxn and the new item pz marked, and the list header (numberofitems, index)]
InsertAfter(px, pz)
{
    pxn = px->next;
    px->next = pz;
    pxn->previous = pz;
    pz->next = pxn;
    pz->previous = px;
    pz->container = px->container;
    px->container->numberofitems++;
}
List operations - InsertSorted
Insert new item px into list pxl in sorted position
InsertSorted(px, pxl)
{   /* pxit is list item pointer (iterator) */
    v = px->itemvalue; /* assume v < MAX */
    for (pxit = pxl->listend; pxit->next->itemvalue <= v; pxit = pxit->next) {
        /* There is nothing to do here, we are just iterating to the wanted insertion position. */
    }
    InsertAfter(pxit, px);
}
[Figure: sorted list 3, 10, 22, 22, 99 followed by the MAX-valued dummy item pxl->listend; further values 22, 15 and 101 shown]
O(list length) –
linear time
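The three list operations can be put together as a small compilable sketch. It follows the slide pseudocode, with two assumptions of its own: the dummy end item is embedded in the list header (so the iteration starts at &pxl->listend rather than at a pointer), and the names list_t, list_item_t and ITEM_VALUE_MAX are illustrative rather than the FreeRTOS definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_VALUE_MAX UINT32_MAX  /* sentinel value, above any real item */

struct list;

typedef struct list_item {
    uint32_t          itemvalue;
    struct list_item *next;
    struct list_item *previous;
    void             *owner;      /* e.g. the TCB containing this item   */
    struct list      *container;  /* list the item is in, or NULL        */
} list_item_t;

typedef struct list {
    size_t      numberofitems;
    list_item_t listend;          /* dummy item marking start and end    */
} list_t;

void list_init(list_t *pxl)
{
    pxl->numberofitems = 0;
    pxl->listend.itemvalue = ITEM_VALUE_MAX;
    pxl->listend.next = &pxl->listend;      /* empty list: sentinel links */
    pxl->listend.previous = &pxl->listend;  /* to itself in both directions */
    pxl->listend.owner = NULL;
    pxl->listend.container = pxl;
}

void list_insert_after(list_item_t *px, list_item_t *pz)
{
    list_item_t *pxn = px->next;
    px->next = pz;
    pxn->previous = pz;
    pz->next = pxn;
    pz->previous = px;
    pz->container = px->container;
    px->container->numberofitems++;
}

/* O(list length): walk from the sentinel to the insertion position. */
void list_insert_sorted(list_item_t *px, list_t *pxl)
{
    list_item_t *pxit;
    assert(px->itemvalue < ITEM_VALUE_MAX);
    for (pxit = &pxl->listend;
         pxit->next->itemvalue <= px->itemvalue;
         pxit = pxit->next) {
        /* just iterating to the wanted insertion position */
    }
    list_insert_after(pxit, px);
}

/* O(1): no special cases thanks to the circular list and sentinel. */
void list_remove(list_item_t *px)
{
    px->previous->next = px->next;
    px->next->previous = px->previous;
    px->container->numberofitems--;
    px->container = NULL;
}
```

Because the sentinel holds the maximum value, the loop in list_insert_sorted always stops at or before it, so an empty list needs no separate code path.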
Why doubly-linked circular list?
Doubly-linked list means removing
an item is easy
Circular list means start & end of
list is not a special case
Dummy (sentinel) item means that:
Empty list is not a special case
List start/end is marked
Links from list items to list header
needed for header update on
removal of item
Implementing list items as part of the TCB means that task data structures can be accessed from list items, and list items accessed from tasks, with no overhead.
Advantages
List operations have no special cases and are simple to implement
Tasks can remove themselves from lists – removal is O(1)
Linked list structure is very flexible – no constraints on task priorities or number of tasks
Keeping list nodes separate allows a standard package of list functions to be used
Storage for list nodes is allocated as part of TCB structure
Disadvantages
Sorted insertion takes time O(length of list)
Why is time important?
RTOS operations: e.g. tasks waking up and blocking
involve changes to lists
Worst-case time for an operation is an important RTOS performance characteristic
Deterministic time – e.g. time does not depend on what
other tasks are doing – is desirable
For this doubly-linked list implementation:
Insert, InsertAfter, InsertStart, InsertEnd, Remove: O(1) => deterministic
InsertSorted: depends on length of list and insert position
FreeRTOS lists module
FreeRTOS implements in list.h a set of doubly linked list operations, using C functions & macros, which are used throughout the RTOS
void vListInitialise( xList *pxList )
void vListInsertEnd( xList *pxList, xListItem *pxNewListItem )
void vListInsert( xList *pxList, xListItem *pxNewListItem )
void vListRemove( xListItem *pxItemToRemove )
listSET_LIST_ITEM_OWNER( pxListItem, pxOwner )
listSET_LIST_ITEM_VALUE( pxListItem, xValue )
listGET_LIST_ITEM_VALUE( pxListItem )
listLIST_IS_EMPTY( pxList )
listCURRENT_LIST_LENGTH( pxList )
listGET_OWNER_OF_NEXT_ENTRY( pxTCB, pxList )
FreeRTOS Ready List optimisation
The Ready list changes whenever a task becomes READY or
blocked
Efficient implementation is important
Instead of having a single list the ready list is implemented as an
array of lists, indexed by task priority
Number of tasks in each list is smaller (normally only one!)
Correct list for a given task can easily be found
[Figure: pxReadyTasksLists[] array indexed 31, 30, 29 ... 1, 0 (configMAX_PRIORITIES entries), each entry a doubly linked list]
Optimisation (2)
Task Addition is now much faster, but finding the highest priority
ready task means scanning the array to find highest non-empty
location.
Use extra global variable to store top priority currently in ready list
Highest priority task is normally taken from the uxTopReadyPriority list
If this is empty, search downwards in array until non-empty list is found
Update uxTopReadyPriority
This trades lower cost on access for higher cost on update
Good policy if either the information is accessed more often than it is changed
Or if calculating the change is quicker than calculating the information from scratch
Often "incremental calculation" is quick
[Figure: ready-list array (31, 30, … 2, 1) with uxTopReadyPriority marking the top non-empty list; READY tasks and empty lists shown]
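The cached top-priority idea can be sketched without the full list package by keeping only a task count per priority level. This is a simplified model, not the FreeRTOS code: ready_count, top_ready_priority and the function names are hypothetical, and MAX_PRIORITIES stands in for configMAX_PRIORITIES.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PRIORITIES 32u  /* stands in for configMAX_PRIORITIES */

static uint8_t ready_count[MAX_PRIORITIES]; /* tasks in each per-priority list */
static uint8_t top_ready_priority = 0;      /* cached, like uxTopReadyPriority */

/* Adding a task: O(1), and the cached top priority is updated incrementally. */
void ready_add(uint8_t prio)
{
    assert(prio < MAX_PRIORITIES);
    ready_count[prio]++;
    if (prio > top_ready_priority) {
        top_ready_priority = prio;
    }
}

/* Removing a task: O(1); the cache is repaired lazily on the next lookup. */
void ready_remove(uint8_t prio)
{
    assert(prio < MAX_PRIORITIES && ready_count[prio] > 0);
    ready_count[prio]--;
}

/* Lookup: normally immediate; searches downwards only if the cached level
 * has become empty. Assumes priority 0 (the idle task) is always ready. */
uint8_t ready_highest(void)
{
    while (top_ready_priority > 0 && ready_count[top_ready_priority] == 0) {
        top_ready_priority--;
    }
    return top_ready_priority;
}
```

The downward search costs at most MAX_PRIORITIES steps, so the worst case is bounded by the number of priority levels rather than by the number of tasks.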
MicroC/OS-II Ready List optimisation
MicroC/OS-II uses a completely different and clever implementation for all its priority ordered task lists.
Suppose each task has a unique priority, so tasks can be identified by their priority, and the priorities lie in a small range. We will assume priority P satisfies: 0 ≤ P < 64
Any power of 2 can equally well be used
A task list can be represented as an array with 1 location for each possible priority (and hence task). Locations set to 1 mean that tasks are in the list.
Each item in array needs only 1 bit, so pack 8 items into each byte as shown below
This allows highly efficient operations
Here tasks 5, 4, and 1 are in list
Bit:   7 6 5 4 3 2 1 0
Value: 0 0 1 1 0 0 1 0
One byte stores 8 array items
OSRdyTbl
Bit-mapped array: 1 bit per task
Max 8 bytes => 64 bits
1 => task is waiting
Each row is a single byte, containing 8 bits with the task priorities shown in the diagram
Total 8 bytes required for 64 bits of table
To access bit, first select correct byte (index 0-7), then select correct bit
Very space efficient
OSRdyTbl[0]:  7  6  5  4  3  2  1  0
OSRdyTbl[1]: 15 14 13 12 11 10  9  8
OSRdyTbl[2]: 23 22 21 20 19 18 17 16
OSRdyTbl[3]: 31 30 29 28 27 26 25 24
OSRdyTbl[4]: 39 38 37 36 35 34 33 32
OSRdyTbl[5]: 47 46 45 44 43 42 41 40
OSRdyTbl[6]: 55 54 53 52 51 50 49 48
OSRdyTbl[7]: 63 62 61 60 59 58 57 56
Bit-mapped array characteristics
Sorting is not required since each task is recorded at a
position based on priority
Finding highest priority task requires linear search of table
We will see later that this can be very efficient
Inserting or removing a task means setting or clearing a bit at a predefined position in the array determined by the task priority
This is clearly O(1) and very fast
To speed up the search for highest priority task present
we need one more data structure
OSRdyGrp is a
summary byte.
Each bit is the OR of
all the bits in one byte
(row) of OSRdyTbl
This doubles the time of insert & remove operations, but they are still fast.
OSRdyGrp is used to
implement very fast
searching.
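[Figure: example OSRdyGrp and OSRdyTbl contents; Y is the row (byte) index 0-7, X is the bit position within the row]
OSRdyGrp:    1 1 0 1 0 0 1 1
OSRdyTbl[0]: 0 0 1 0 0 0 1 0
OSRdyTbl[1]: 1 0 0 0 1 0 0 0
OSRdyTbl[2]: 0 0 0 0 0 0 0 0
OSRdyTbl[3]: 0 0 0 0 0 0 0 0
OSRdyTbl[4]: 1 0 0 0 0 0 0 0
OSRdyTbl[5]: 0 0 0 0 0 0 0 0
OSRdyTbl[6]: 0 0 0 0 0 0 0 1
OSRdyTbl[7]: 0 0 1 1 0 0 1 0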
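Insertion and removal with the summary byte can be sketched as follows. The variable names OSRdyGrp and OSRdyTbl come from the slides; the function names rdy_insert, rdy_remove, rdy_contains and rdy_group are hypothetical, and this is an illustration of the technique rather than the MicroC/OS-II source.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t OSRdyGrp;     /* bit y set <=> OSRdyTbl[y] is non-zero        */
static uint8_t OSRdyTbl[8];  /* bit x of row y set <=> priority 8*y+x in list */

void rdy_insert(uint8_t prio)
{
    assert(prio < 64);
    uint8_t y = prio >> 3;   /* row (byte) index      */
    uint8_t x = prio & 7;    /* bit within the row    */
    OSRdyTbl[y] |= (uint8_t)(1u << x);
    OSRdyGrp    |= (uint8_t)(1u << y);
}

void rdy_remove(uint8_t prio)
{
    assert(prio < 64);
    uint8_t y = prio >> 3;
    uint8_t x = prio & 7;
    OSRdyTbl[y] &= (uint8_t)~(1u << x);
    if (OSRdyTbl[y] == 0) {  /* row now empty: clear its summary bit */
        OSRdyGrp &= (uint8_t)~(1u << y);
    }
}

int rdy_contains(uint8_t prio)
{
    assert(prio < 64);
    return (OSRdyTbl[prio >> 3] >> (prio & 7)) & 1u;
}

uint8_t rdy_group(void) { return OSRdyGrp; }
```

Both operations touch one table byte and the summary byte, so their cost is constant and independent of how many tasks are in the list.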
How to find highest non-zero bit in a byte?
Not simple operation
One byte has only 256 distinct
values
Use byte value to index a 256 byte constant array
Store bit number of highest '1' bit of index in each array element
Trades space (table size) for faster
task switch time
For small systems table size can
be reduced since it depends on
number of tasks
OSUnMapTbl
index: bits 76543210, highest set bit
0: 00000000 not used
1: 00000001 0
2: 00000010 1
3: 00000011 1
4: 00000100 2
5: 00000101 2
6: 00000110 2
7: 00000111 2
8: 00001000 3
... ...
255: 11111111 7
Table-driven search for highest priority task
The "highest non-zero bit search" is still quite expensive.
Solution is to remember search results in a constant array OSUnMapTbl[]
8 bits => 256 possible combinations
256 locations needed in table
Result is a number between 7 and 0
Will fit into 1 byte
OSUnMapTbl[] is 256 byte constant array
Each operation 1) & 2) on previous slide is now a byte table lookup
{
    y = OSUnMapTbl[OSRdyGrp];      // highest set bit no in OSRdyGrp
    x = OSUnMapTbl[OSRdyTbl[y]];   // highest set bit no in OSRdyTbl[y]
    prio = (y << 3) + x;           // combine x and y to get highest prio
                                   // << 3 is left shift by 3, used for x8
}
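The lookup can be exercised on a host with a short sketch. Two assumptions are made for convenience: the table is filled at start-up by unmap_init instead of being a constant array, and highest_prio takes the group byte and table as parameters. Both function names are hypothetical. The sketch follows these slides in treating the highest set bit as the highest priority.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t OSUnMapTbl[256];  /* OSUnMapTbl[v] = number of highest '1' bit in v */

/* Fill the table once; an embedded port would store it as a constant array. */
void unmap_init(void)
{
    OSUnMapTbl[0] = 0;  /* not used: a zero byte has no set bit */
    for (unsigned v = 1; v < 256; v++) {
        uint8_t bit = 7;
        while (((v >> bit) & 1u) == 0) {
            bit--;
        }
        OSUnMapTbl[v] = bit;
    }
}

/* Two table lookups give the highest priority present.
 * The caller must ensure the list is not empty (grp != 0). */
uint8_t highest_prio(uint8_t grp, const uint8_t tbl[8])
{
    assert(grp != 0);
    uint8_t y = OSUnMapTbl[grp];     /* highest non-empty row       */
    uint8_t x = OSUnMapTbl[tbl[y]];  /* highest set bit in that row */
    return (uint8_t)((y << 3) + x);
}
```

For the single byte 00110010 from the earlier example (tasks 5, 4 and 1), the lookup returns 5.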
Bit-mapped array list summary
Advantages
All operations are O(1), and can be very fast if number of
priorities is small
Operation time is independent of number of tasks in lists
Disadvantages
All priorities must be unique
Does not allow round-robin scheduling
Not a problem for many real-time systems
Makes dynamic priority-changing (e.g. to avoid priority inversion) much more difficult
Does not scale nicely to increased number of tasks
All tables use at least 1 bit per priority level. 2^16 priorities => 8K bytes per table!
Lecture 2.2 Summary
Task lists are basic data structure used to control task scheduling
in an RTOS
All RTOSs use task lists for:
READY list
Delayed task list
Event waiting task lists (semaphores etc)
Suspended Task List
Task lists can be implemented in various ways. Two popular ones are:
Linked lists – can be singly or doubly linked. FreeRTOS uses doubly-linked lists
Bit-mapped array lists
Less flexible but can be very fast
Task priority determines unique position in a bit array
Lecture 2.3 – Context Switching
The key to any RTOS is ability to switch between tasks
Two main methods
Co-routine (cooperative) switching
Preemptive switching
This lecture covers task startup and preemptive switching
Implementation is necessarily architecture & compiler-specific
Code is very low-level & must be rewritten for a different compiler
Most of the code is written in assembler
Context switching involves changes to system stack etc and
therefore throughout this code interrupts must be switched off
Scheduler decides which task should next run and then if
necessary invokes context switching
RTOS Scheduler
Function of scheduler is to decide which of the available READY
tasks should be run next
Scheduler changes pxCurrentTCB
After calling the scheduler the context switch code (see later) can
complete the task switch
Scheduler is called from task level as taskYIELD() (see later)
FreeRTOS uses priority scheduling with round-robin time-slicing for
tasks of equal priority
All this is implemented using generic doubly-linked list package for compact code
void vTaskSwitchContext( void )
{
    if( uxSchedulerSuspended != ( unsigned portBASE_TYPE ) pdFALSE )
    {
        /* The scheduler is currently suspended - do not allow a context switch. */
        xMissedYield = pdTRUE;
        return;
    }
    /* Find the highest priority queue that contains ready tasks. */
    while( listLIST_IS_EMPTY( &( pxReadyTasksLists[ uxTopReadyPriority ] ) ) )
    {
        --uxTopReadyPriority;
    }
    /* listGET_OWNER_OF_NEXT_ENTRY walks through the list, so the tasks of the same priority get an equal share of the processor time. */
    listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopReadyPriority ] ) );
    vWriteTraceToBuffer();
}
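The round-robin effect of walking the list can be shown with a much reduced model: a fixed array of task names at one priority level plus an index that advances on every call. This is only an illustration of the idea; ready_list_t and next_owner are hypothetical and do not reproduce the listGET_OWNER_OF_NEXT_ENTRY macro, which walks a linked list and skips the dummy end item.

```c
#include <assert.h>
#include <stddef.h>

/* Tasks that are READY at one priority level (fixed size for the sketch). */
typedef struct {
    const char *owners[4];
    size_t      count;   /* number of valid entries in owners[]     */
    size_t      index;   /* position of the entry returned last time */
} ready_list_t;

/* Advance to the next entry, wrapping at the end, and return its owner.
 * Repeated calls give each task at this priority an equal turn. */
const char *next_owner(ready_list_t *l)
{
    assert(l->count > 0 && l->count <= 4);
    l->index = (l->index + 1) % l->count;
    return l->owners[l->index];
}
```

Calling next_owner on each tick for a list holding A, B and C yields A, B, C, A, ... which is the time-slicing behaviour for tasks of equal priority.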
Tasks & Stack
[Figure: Task1 stack holds frames for Task1, f1 and f2, from the stack base up to the stack top during execution of f2(); Task2 stack holds frames for Task2 and f3 plus the saved Task2 context]
Task1() { f1(10); }
f1( int a) { int b; …. f2(); }
Task2() { ….. f3(); …. }
How is a Task Implemented?
To understand context switch we need to know precisely how a
task is implemented
The next five slides show the FreeRTOS code which creates and
manipulates the RTOS data structures which represent tasks
This code is not very complex, but it is quite long because it must do a lot of work
Task creation must work both before the scheduler is started, and
afterwards
Tasks are initially created in a suspended form, before scheduling
is started
Since the RTOS always runs with one running task, the RTOS
startup code must initially move one task to running status &
commence its code.
signed portBASE_TYPE xTaskCreate( pdTASK_CODE pvTaskCode, const signed portCHAR * const pcName,
    unsigned portSHORT usStackDepth, void *pvParameters,
    unsigned portBASE_TYPE uxPriority, xTaskHandle *pxCreatedTask )
{
    signed portBASE_TYPE xReturn;
    tskTCB * pxNewTCB;
    static unsigned portBASE_TYPE uxTaskNumber = 0;
    /* Allocate the memory required by the TCB and stack for the new task, checking that the allocation was successful. */
    pxNewTCB = prvAllocateTCBAndStack( usStackDepth );
    if( pxNewTCB != NULL )
    {
        portSTACK_TYPE *pxTopOfStack;
        /* Setup the newly allocated TCB with the initial state of the task. */
        prvInitialiseTCBVariables( pxNewTCB, usStackDepth, pcName, uxPriority );
        /* Calculate the top of stack address. This depends on whether the stack grows from high memory to low (as per the 80x86) or vice versa. portSTACK_GROWTH is used to make the result positive or negative as required by the port. */
        #if portSTACK_GROWTH < 0
        {
            pxTopOfStack = pxNewTCB->pxStack + ( pxNewTCB->usStackDepth - 1 );
        }
        #else
        {
            pxTopOfStack = pxNewTCB->pxStack;
        }
        #endif
        /* Initialize the TCB stack to look as if the task was already running, but had been interrupted by the scheduler. The return address is set to the start of the task function. Once the stack has been initialised the top of stack variable is updated. */
        pxNewTCB->pxTopOfStack = pxPortInitialiseStack( pxTopOfStack, pvTaskCode, pvParameters );
        /* We are going to manipulate the task queues to add this task to a ready list, so must make sure no interrupts occur. */
        portENTER_CRITICAL();
        {
            uxCurrentNumberOfTasks++;
            if( uxCurrentNumberOfTasks == ( unsigned portBASE_TYPE ) 1 )
            {
                /* As this is the first task it must also be the current task. */
                pxCurrentTCB = pxNewTCB;
                /* This is the first task to be created so do the preliminary initialisation required. We will not recover if this call fails, but we will report the failure. */
                prvInitialiseTaskLists();
            }
            else
            {
                /* If the scheduler is not already running, make this task the current task if it is the highest priority task to be created so far. */
                if( xSchedulerRunning == pdFALSE )
                {
                    if( pxCurrentTCB->uxPriority <= uxPriority )
                    {
                        pxCurrentTCB = pxNewTCB;
                    }
                }
            }
            /* Remember the top priority to make context switching faster. Use the priority in pxNewTCB as this has been capped to a valid value. */
            if( pxNewTCB->uxPriority > uxTopUsedPriority )
            {
                uxTopUsedPriority = pxNewTCB->uxPriority;
            }
            /* Add a counter into the TCB for tracing only. */
            pxNewTCB->uxTCBNumber = uxTaskNumber;
            uxTaskNumber++;
            prvAddTaskToReadyQueue( pxNewTCB );
            xReturn = pdPASS;
        }
        portEXIT_CRITICAL();
    }
    else
    {
        xReturn = errCOULD_NOT_ALLOCATE_REQUIRED_MEMORY;
    }
    if( xReturn == pdPASS )
    {
        if( ( void * ) pxCreatedTask != NULL )
        {
            /* Pass the TCB out - in an anonymous way. The calling function/task can use this as a handle to delete the task later if required. */
            *pxCreatedTask = ( xTaskHandle ) pxNewTCB;
        }
        if( xSchedulerRunning != pdFALSE )
        {
            /* If the created task is higher priority than the current task then it should run now. */
            if( pxCurrentTCB->uxPriority < uxPriority )
            {
                taskYIELD();
            }
        }
    }
    return xReturn;
}
void vTaskStartScheduler( void )
{
    portBASE_TYPE xReturn;
    /* Add the idle task at the lowest priority. */
    xReturn = xTaskCreate( prvIdleTask, ( signed portCHAR * ) "IDLE", tskIDLE_STACK_SIZE, ( void * ) NULL, tskIDLE_PRIORITY, ( xTaskHandle * ) NULL );
    if( xReturn == pdPASS )
    {
        /* Interrupts are turned off here, to ensure a tick does not occur before or during the call to xPortStartScheduler(). The stacks of the created tasks contain a status word with interrupts switched on so interrupts will automatically get re-enabled when the first task starts to run. */
        portDISABLE_INTERRUPTS();
        xSchedulerRunning = pdTRUE;
        xTickCount = ( portTickType ) 0;
        /* Setting up the timer tick is hardware specific and in the portable interface. */
        if( xPortStartScheduler() )
        {
            /* Should not reach here as if scheduler is running function will not return. */
        }
        else
        {
            /* Should only reach here if a task calls xTaskEndScheduler(). */
        }
    }
}
portBASE_TYPE xPortStartScheduler( void )
{
    /* Start the timer that generates the tick ISR. */
    prvSetupTimerInterrupt();
    /* Start the first task. This is done from portISR.c as ARM mode must be used. */
    vPortISRStartFirstTask();
    /* Should not get here! */
    return 0;
}
void vPortISRStartFirstTask( void )
{
    /* Simply start the scheduler. This is included here as it can only be called from ARM mode. */
    portRESTORE_CONTEXT();
}
FreeRTOS Startup
Portable Code
Stack changes on context switching
[Figure: stack changes on context switching. Before: Task1 stack (frames Task1, f1, f2, from stack base to stack top during execution of f2()) and Task2 stack (frames Task2, f3, plus saved Task2 context at the Task2 stack top when suspended). After: Task1 stack with the saved Task1 context (PC, R0 …. R15, PSW) at the Task1 stack top when suspended; Task2 stack top on restart of f3().]
1: save current context (CPU context) on Task1() stack
2: restore Task2() context
FreeRTOS AVR Task switching in detail
As well as context switching, task switching involves
some additional work changing RTOS state to reflect
the task change
We will go through the task switch process in detail for
the simple case of an AVR architecture CPU
This illustrates the necessary steps in a simple form
ARM code is more messy because ARM ISA shadow
registers complicate things
FreeRTOS is designed so that maximum possible
amount of work can be done in C
Of course the actual context switch can't be done in a high level
language
Tick interrupt
We will consider the
case of a task TaskA()
which is executing when
a system clock tick
interrupt happens.
The interrupt results in a
higher priority task
TaskB() waking up &
preempting execution.
The overall result is a task switch.
The SAVE_CONTEXT &
RESTORE_CONTEXT
macros do the difficult
part of the work.
/* Interrupt Service Routine for the RTOS tick. */
void SIG_OUTPUT_COMPARE1A( void )
{
    vPortYieldFromTick();
    asm volatile ( "reti" );
}
/*---*/
void vPortYieldFromTick( void )
{
    portSAVE_CONTEXT();
    vTaskIncrementTick();   /* wakes up TaskB() */
    vTaskSwitchContext();   /* schedules TaskB() */
    portRESTORE_CONTEXT();
    asm volatile ( "ret" );
}
/*---*/
/* Compiled "naked" with nothing saved by compiler */
0: AVR CPU context in TaskA()
[Figure: AVR CPU registers R0-R31 and SREG (32 registers, 8 bits), PC and SP (16 bits) all hold TaskA values; TaskA code (LDI R0,0; LDI R1,1; ADD R0,R1); TaskA stack holds TaskA data; RTOS state: pxCurrentTCB points to the TaskA TCB]
1: CPU context after Tick Interrupt
[Figure: registers still hold TaskA values; the interrupt has pushed PCL, PCH onto the TaskA stack; the timer ISR executes JSR vPortYieldFromTick and the start of the SAVE_CONTEXT macro (PUSH R0; MOV R0, SR; PUSH R0)]
2: CPU context after portSAVE_CONTEXT()
[Figure: the TaskA stack now holds TaskA data, PCL (A), PCH (A) pushed by the interrupt, PCL (ISR), PCH (ISR) pushed by the function call, and R0 (A), SREG (A), R1 (A) …. R31 (A) pushed by SAVE_CONTEXT(); a copy of SP (A) is stored]
3: RTOS tick time increment
At this point the entire context of taskA has been pushed onto
TaskA stack
It can be retrieved later via the stored SP in taskA TCB
The ISR is still executing, using TaskA stack – extra items can be
pushed temporarily without harming the stored context – any
change in SP now will not affect TaskA stored SP
vTaskIncrementTick() will alter system time and, we assume,
wake up TaskB
TaskB moved to READY list
vTaskSwitchContext() will see that TaskB is now the highest
priority task in the READY list and select it as the next task.
4: Start of portRESTORE_CONTEXT()
[Figure: SP now holds the copy of SP (B) (MOV SPL,R0; MOV SPH,R1) while the other registers still hold TaskA values; the TaskB stack holds TaskB data, PCL (B), PCH (B) pushed by the interrupt, PCL (ISR), PCH (ISR) pushed by the function call, and R0 (B), SREG (B), R1 (B) …. R31 (B) pushed by SAVE_CONTEXT()]
5: End of portRESTORE_CONTEXT()
[Figure: R0-R31 and SREG hold TaskB values; the TaskB stack holds TaskB data, PCL (B), PCH (B) pushed by the interrupt and PCL (ISR); the timer ISR is about to execute RET then RETI]
6: Restart of taskB
[Figure: all registers, PC and SP hold TaskB values; the final instruction of the timer ISR has executed and the next TaskB instruction (SUB R0,R1) runs; the TaskB stack holds TaskB data]
AVR portSAVE_CONTEXT
; interrupts are disabled on entry
; PCH, PCL are pushed onto stack by interrupt
push r0                  ; push r0 first to allow SREG to be loaded
in   r0, __SREG__        ; r0 := SREG
push r0                  ; push SREG
push r1
clr  r1                  ; needed since compiler expects 0 in r1
push r2
; push r3-r29 (push all other registers)
push r30
push r31
lds  r26, pxCurrentTCB   ; x := pxCurrentTCB
lds  r27, pxCurrentTCB + 1
in   r0, 0x3d            ; r0 := SPL
st   x+, r0              ; [x] := r0, x := x+1
in   r0, 0x3e            ; r0 := SPH
st   x+, r0              ; [x] := r0, x := x+1
RTOS state during task switch
The RTOS data structures that have changed during
the task switch are:
pxCurrentTCB
– always points to current task TCB
Changes from TaskA to TaskB
pxReadyTasksLists[]
– TaskB is added to ready list
See lecture 2.2 for implementation of READY lists
TaskA TCB: SP is updated to correct value for saved TaskA
context
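The three state changes listed above can be sketched in plain C. This is a minimal model, not FreeRTOS code: the SketchTCB type, the sketch_ names and the fixed three priority levels are assumptions made for illustration, and the real ready lists are the list structures of lecture 2.2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PRIORITIES 3     /* hypothetical number of priority levels */

/* Toy TCB: holds the saved stack pointer of a task that is not running. */
typedef struct SketchTCB {
    uint16_t saved_sp;              /* written when the task context is saved */
    unsigned priority;              /* higher number = higher priority */
    struct SketchTCB *next;         /* link within one ready list */
} SketchTCB;

static SketchTCB *sketch_ready[SKETCH_MAX_PRIORITIES]; /* one list per priority */
static SketchTCB *sketch_current;   /* plays the role of pxCurrentTCB */

/* Add a task at the head of the ready list for its priority. */
static void sketch_make_ready(SketchTCB *t)
{
    assert(t != NULL && t->priority < SKETCH_MAX_PRIORITIES);
    t->next = sketch_ready[t->priority];
    sketch_ready[t->priority] = t;
}

/* Return the head of the highest-priority non-empty ready list. */
static SketchTCB *sketch_select_highest(void)
{
    for (int p = SKETCH_MAX_PRIORITIES - 1; p >= 0; --p) {
        if (sketch_ready[p] != NULL) {
            return sketch_ready[p];
        }
    }
    return NULL;
}

/* One tick-driven switch: save SP, wake a task, reselect the current task. */
static void sketch_tick_switch(uint16_t hw_sp, SketchTCB *woken)
{
    assert(sketch_current != NULL);
    sketch_current->saved_sp = hw_sp;           /* TaskA TCB: SP updated */
    sketch_make_ready(woken);                   /* TaskB added to ready list */
    sketch_current = sketch_select_highest();   /* current-task pointer changes */
}
```

Calling sketch_tick_switch() with TaskA as the current task and a higher-priority TaskB as the woken task leaves sketch_current pointing at TaskB, with TaskA's stack pointer preserved in its TCB for a later restore.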
Porting the RTOS context switch
The code which saves & restores context is written in assembler &
depends on CPU, and (to a lesser extent) compiler.
The ARM7 port has very convoluted save & restore code because
ARM processor switches to a different IRQ-mode stack while
processing an interrupt. The saved PC must be extracted from this
stack and saved on the Task stack.
ARM code will be shown here – with detailed description of how
the ARM stacks are managed during portSAVE_CONTEXT()
Restore operation (not shown here) is very similar but in reverse
ARM CPU context after IRQ interrupts TaskA
[Figure: TaskA registers (System mode): R0, R1 … R12, SP (R13^), LR (R14^), PC (R15), CPSR. IRQ mode shadow registers: SP (R13) pointing to the IRQ stack, LR (R14) holding the TaskA return address, SPSR.]
STM/LDM instructions can transfer either system mode or current (IRQ) mode registers
^ indicates system mode
R0-R12 are shared
STMDB SP!, {R0}      /* Store R0 on IRQ stack temporarily as we need to use it */
STMDB SP, {SP}^      /* Set R0 to point to the task stack pointer */
NOP
SUB SP, SP, #4
LDMIA SP!, {R0}
STMDB R0!, {LR}      /* Push the return address onto the TaskA stack */
MOV LR, R0           /* Now we have saved LR, we can use it as SP for the TaskA stack */
LDMIA SP!, {R0}      /* Pop R0 so we can save it onto the system mode stack */
STMDB LR, {R0-LR}^   /* Push all the system mode registers onto the task stack */
NOP
SUB LR, LR, #60      /* Adjust stack pointer */
MRS R0, SPSR         /* Push the SPSR (saved TaskA PSR) onto the task stack */
STMDB LR!, {R0}
LDR R0, =ulCriticalNesting
LDR R0, [R0]
STMDB LR!, {R0}
LDR R0, =pxCurrentTCB
LDR R1, [R0]
ARM7 portSAVE_CONTEXT()
Messing with ARM stacks
On entry ARM SP (R13) points to IRQ-mode stack – this is a shadow register
STMDB SP!, {R0}    Push R0 on IRQ stack temporarily as we need to use a register
STMDB SP, {SP}^    Store user-mode SP on IRQ stack. Don’t change stack pointer – ARM does not allow a push from user-mode SP
NOP                Wait for result - needed because of pipelining
SUB SP, SP, #4     Decrement stack pointer by one item to complete push of user-mode SP
LDMIA SP!, {R0}    Pop user-mode SP from IRQ stack
Lecture 2.3 Summary
Context-switch is implemented at interrupt-level – this allows
interrupts to wake up a task
Multiple nested interrupts must unwind to the outermost before the
task-switch is implemented
All user interrupts that wake RTOS tasks must call RTOS code at entry & exit to inform RTOS
Task-level switch is handled by software interrupt, followed by
interrupt-level switch
Context-switch involves saving & restoring register values to (from)
task stacks, and changing RTOS current task info.
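The rule that nested interrupts must unwind to the outermost level before the switch happens can be sketched with a nesting counter. This is an illustrative model only: the sketch_ names are hypothetical, and the counter-based entry/exit calls are one common way to meet the requirement, not necessarily how any particular FreeRTOS port does it.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned sketch_isr_nesting;     /* depth of nested ISRs */
static bool sketch_switch_pending;      /* an ISR has woken a task */
static unsigned sketch_switch_count;    /* context switches performed so far */

/* Called at the entry of every user ISR that may wake a task. */
static void sketch_isr_enter(void)
{
    ++sketch_isr_nesting;
}

/* Called by an ISR that has made a higher-priority task ready. */
static void sketch_isr_wake_task(void)
{
    sketch_switch_pending = true;
}

/* Called at the exit of every such ISR.
   Returns true only when the switch is carried out by this call. */
static bool sketch_isr_exit(void)
{
    assert(sketch_isr_nesting > 0);
    --sketch_isr_nesting;
    if (sketch_isr_nesting == 0 && sketch_switch_pending) {
        sketch_switch_pending = false;
        ++sketch_switch_count;  /* stands in for save, select, restore */
        return true;
    }
    return false;               /* still nested, or nothing to do */
}
```

With two ISRs nested, a wake-up requested by the inner one is only acted on when the outer ISR calls sketch_isr_exit(), which is the point at which the interrupted task's context is the one on top of the stack.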
Lecture 2.4 – FreeRTOS Timing functions
High resolution timing in an RTOS requires use of a
dedicated hardware interval timer, with interrupt to signal
timeout
Highly system-specific, not part of this course
All RTOS provide lower resolution timing functions
based on a clock-tick interrupt.
Typically 1ms
Period may be configured but should be >> clock tick processing
time
Implementation of clock-tick requires:
Update system clock
Overview: FreeRTOS Source Organisation
FreeRTOS
  tasks.c – core task control code
  list.c – core list package
  queue.c – queue package
  croutine.c – also provided, for non-preemptive scheduling
Portable
  Arm-Keil port
    port.c – task-level portable code
    portisr.c – ISR-level portable code
    portmacro.h – portable macro definitions
Include
  task.h – interface to tasks.c
  list.h – interface to list.c
  queue.h – interface to queue.c
  FreeRTOS.h – includes:
    projdefs.h – basic definitions including port specification
    portable.h – portable definitions (one file for all ports, which includes the correct portmacro.h)
    plus any other common definitions
  FreeRTOSConfig.h – app-specific definitions
Overview: FreeRTOS Task States
Task state                                                   Event list          Generic list
READY                                                        no                  ReadyList
Waiting on Queue                                             Queue Waiter list   DelayedTaskList
Waiting on Queue, xTicksToWait = portMAX_DELAY (no timeout)  Queue Waiter list   SuspendedTaskList
Delayed                                                      no                  DelayedTaskList
Overview of clock-tick code
System-specific code sets up hardware timer to
provide an interrupt at clock tick frequency
(1000Hz).
Timer ISR performs clock tick operations:
1. Save current task context
2. vTaskIncrementTick(): perform clock tick processing (inline function to reduce entry/exit overhead)
3. Call Scheduler
4. Restore selected task context
Clock-tick operation
vTaskIncrementTick(): clock tick processing
1. Increment system clock
2. prvCheckDelayedTasks(): check all delayed tasks for possible wakeup (implemented as a macro for highest speed)
ARM port of timer ISR (in portISR.c)
This is the code used for preemptive scheduling
The actual C code is different from this (conditional compilation) if
RTOS is configured for nonpreemptive scheduling
void vPreemptiveTick( void ) __task {
/* Save the context of the current task. */
portSAVE_CONTEXT();
/* Increment the tick count - this may make a delayed task ready to run.*/
vTaskIncrementTick();
/* Find the highest priority task that is ready to run. */
vTaskSwitchContext();
/* Ready for the next interrupt. */
T0IR = portTIMER_MATCH_ISR_BIT;
VICVectAddr = portCLEAR_VIC_INTERRUPT;
/* Restore the context of the highest priority task that is ready to run. */
portRESTORE_CONTEXT();
}
Delayed Clock Ticks
FreeRTOS clock tick code must
deal with delayed ticks
When the scheduler is disabled
(preemption lock) clock ticks (which
cause tasks to wake up) are not
executed
They are recorded for later execution
When the scheduler is enabled again
any clock-ticks recorded are executed
immediately as delayed ticks
This may wake up tasks, cause preemption, etc.
vTaskSuspendAll()
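The record-and-replay behaviour described above can be modelled in a few lines of C. This is a simplified sketch under stated assumptions: the sketch_ names are hypothetical, tick processing is reduced to incrementing a counter, and no task wake-up or preemption is modelled.

```c
#include <assert.h>

static unsigned long sketch_tick_count;   /* system clock */
static unsigned sketch_suspend_depth;     /* > 0 while the scheduler is locked */
static unsigned sketch_missed_ticks;      /* ticks recorded while locked */

/* Normal tick processing; a real kernel would also wake delayed tasks here. */
static void sketch_process_tick(void)
{
    ++sketch_tick_count;
}

/* Timer ISR body: process the tick now, or record it for later. */
static void sketch_tick_isr(void)
{
    if (sketch_suspend_depth == 0) {
        sketch_process_tick();
    } else {
        ++sketch_missed_ticks;
    }
}

/* Preemption lock; calls may nest. */
static void sketch_suspend_all(void)
{
    ++sketch_suspend_depth;
}

/* Unlock; when the outermost lock is released, replay the recorded ticks.
   Returns the number of ticks replayed. */
static unsigned sketch_resume_all(void)
{
    unsigned replayed = 0;
    assert(sketch_suspend_depth > 0);
    --sketch_suspend_depth;
    if (sketch_suspend_depth == 0) {
        while (sketch_missed_ticks > 0) {
            sketch_process_tick();
            --sketch_missed_ticks;
            ++replayed;
        }
    }
    return replayed;
}
```

Because the replay runs the normal tick processing once per recorded tick, the system clock does not slip while the scheduler is locked.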
Data Structures used by FreeRTOS clock tick
xTickCount
pxDelayedTaskList
pxOverflowDelayedTaskList
uxMissedTicks
uxSchedulerSuspended
pxReadyTasksLists[0], pxReadyTasksLists[1], pxReadyTasksLists[2], …
Ready Queue: array of task lists indexed by task priority
See prvAddTaskToReadyQueue()
inline void vTaskIncrementTick( void )
{
    if( uxSchedulerSuspended == pdFALSE )
    {
        ++xTickCount; /* increment system clock */
        if( xTickCount == ( portTickType ) 0 )
        {
            [ deal with tick count overflow – see 2.77 ]
        }
        /* See if this tick has made a timeout expire. */
        prvCheckDelayedTasks(); /* Real work (next slide) */
    }
    else
    {
        /* if scheduler is suspended */
        ++uxMissedTicks;
        [ tick hook ]
    }
    if( uxMissedTicks == 0 )
    {
        /* if tick hook needed */
        [ tick hook ]
    }
}
[ tick hook ] stands for:
#if ( configUSE_TICK_HOOK == 1 )
{
    extern void vApplicationTickHook( void );
    vApplicationTickHook();
}
#endif
Normal Case
#define prvCheckDelayedTasks() \
{ \
    register tskTCB *pxTCB; \
    \
    while( ( pxTCB = ( tskTCB * ) listGET_OWNER_OF_HEAD_ENTRY( pxDelayedTaskList ) ) != NULL ) \
    { \
        if( xTickCount < listGET_LIST_ITEM_VALUE( &( pxTCB->xGenericListItem ) ) ) \
        { \
            break; /* this item and all after will wake up in the future, so exit */ \
        } \
        vListRemove( &( pxTCB->xGenericListItem ) ); /* remove from delayed task list */ \
        /* Is the task waiting on an event also? */ \
        if( pxTCB->xEventListItem.pvContainer ) \
        { \
            /* if so remove it from the event waiters list as well */ \
            vListRemove( &( pxTCB->xEventListItem ) ); \
        } \
        /* add task to ready queue for possible scheduling */ \
        prvAddTaskToReadyQueue( pxTCB ); \
    } \
}
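The early exit in the macro relies on the delayed task list being kept in wake-time order. A self-contained sketch of that idea follows, using a hypothetical singly linked DelayNode list in place of the FreeRTOS list package and ignoring tick-count overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a delayed task: only the wake time matters here. */
typedef struct DelayNode {
    uint32_t wake_time;         /* absolute tick at which the task wakes */
    struct DelayNode *next;
} DelayNode;

/* Insert in ascending wake-time order, so the search cost is paid when a
   task delays itself rather than on every clock tick. */
static void delay_insert(DelayNode **head, DelayNode *n)
{
    assert(head != NULL && n != NULL);
    while (*head != NULL && (*head)->wake_time <= n->wake_time) {
        head = &(*head)->next;
    }
    n->next = *head;
    *head = n;
}

/* Per-tick check: unlink every node whose wake time has been reached and
   stop at the first node that is still in the future.
   Returns the number of nodes woken. */
static unsigned delay_check(DelayNode **head, uint32_t now)
{
    unsigned woken = 0;
    assert(head != NULL);
    while (*head != NULL && (*head)->wake_time <= now) {
        DelayNode *n = *head;
        *head = n->next;        /* remove from the delayed list */
        n->next = NULL;         /* a real kernel would add it to a ready list */
        ++woken;
    }
    return woken;
}
```

On most ticks delay_check() inspects only the head node, so the per-tick cost does not grow with the number of delayed tasks.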
Tick count overflow
if( xTickCount == ( portTickType ) 0 )
{
    xList *pxTemp;
    /* Tick count has overflowed so we need to swap the delay lists. If there are any items in pxDelayedTaskList here then there is an error! */
    pxTemp = pxDelayedTaskList;
    pxDelayedTaskList = pxOverflowDelayedTaskList;
    pxOverflowDelayedTaskList = pxTemp;
}
#if ( INCLUDE_vTaskDelay == 1 )
void vTaskDelay( portTickType xTicksToDelay )
{
    portTickType xTimeToWake;
    signed portBASE_TYPE xAlreadyYielded = pdFALSE;
    /* A delay time of zero just forces a reschedule. */
    if( xTicksToDelay > ( portTickType ) 0 )
    {
        vTaskSuspendAll();
        [ Delay the current task – see next slide ]
        xAlreadyYielded = xTaskResumeAll();
    }
    /* Force a reschedule if xTaskResumeAll has not already done so, we may have put ourselves to sleep. */
    if( !xAlreadyYielded )
    {
        taskYIELD();
    }
}
#endif
A task that is removed from the event list while the scheduler is suspended is held on xPendingReadyList until the scheduler is resumed.
/* Calculate the time to wake - this may overflow but this is not a problem. */
xTimeToWake = xTickCount + xTicksToDelay;
/* We must remove ourselves from the ready list before adding ourselves to the blocked list as the same list item is used for both lists. */
vListRemove( ( xListItem * ) &( pxCurrentTCB->xGenericListItem ) );
/* The list item will be inserted in wake time order. */
listSET_LIST_ITEM_VALUE( &( pxCurrentTCB->xGenericListItem ), xTimeToWake );
if( xTimeToWake < xTickCount )
{
    /* Wake time has overflowed. Place this item in the overflow list. */
    vListInsert( ( xList * ) pxOverflowDelayedTaskList, ( xListItem * ) &( pxCurrentTCB->xGenericListItem ) );
}
else
{
    /* The wake time has not overflowed, so we use normal list. */
    vListInsert( ( xList * ) pxDelayedTaskList, ( xListItem * ) &( pxCurrentTCB->xGenericListItem ) );
}
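The overflow test above depends on unsigned wrap-around. The sketch below uses a deliberately narrow 8-bit tick type so the wrap is easy to follow by hand; SketchTick and the sketch_ functions are hypothetical names, and a real port would use portTickType.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t SketchTick;     /* narrow on purpose: wraps after 255 */

typedef enum { SKETCH_NORMAL_LIST, SKETCH_OVERFLOW_LIST } SketchList;

/* Absolute wake time; the cast makes the modulo-256 wrap explicit. */
static SketchTick sketch_wake_time(SketchTick now, SketchTick delay)
{
    return (SketchTick)(now + delay);
}

/* A wake time numerically below the current time can only be the result
   of wrap-around, so such a task belongs on the overflow list. */
static SketchList sketch_choose_list(SketchTick now, SketchTick delay)
{
    assert(delay > 0);
    return (sketch_wake_time(now, delay) < now) ? SKETCH_OVERFLOW_LIST
                                                : SKETCH_NORMAL_LIST;
}
```

For example, at tick 250 a delay of 10 ticks gives a wake time of 4, which is less than 250, so the task goes on the overflow list and becomes visible to the tick code after the lists are swapped at tick 0.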
signed portBASE_TYPE xTaskResumeAll( void )
{
    register tskTCB *pxTCB;
    signed portBASE_TYPE xAlreadyYielded = pdFALSE;
    portENTER_CRITICAL();
    {
        --uxSchedulerSuspended;
        if( uxSchedulerSuspended == ( unsigned portBASE_TYPE ) pdFALSE )
        {
            if( uxCurrentNumberOfTasks > ( unsigned portBASE_TYPE ) 0 )
            {
                portBASE_TYPE xYieldRequired = pdFALSE;
                /* Move any readied tasks from the pending list into the appropriate ready list. */
                while( ( pxTCB = ( tskTCB * ) listGET_OWNER_OF_HEAD_ENTRY( ( ( xList * ) &xPendingReadyList ) ) ) != NULL )
                {
                    vListRemove( &( pxTCB->xEventListItem ) );
                    vListRemove( &( pxTCB->xGenericListItem ) );
                    prvAddTaskToReadyQueue( pxTCB );
                    /* If we have moved a task that has a priority higher than or equal to the current task then we should yield. */
                    if( pxTCB->uxPriority >= pxCurrentTCB->uxPriority )
                    {
                        xYieldRequired = pdTRUE;
                    }
                }
                [ continued on next slide ]
It is possible that an ISR caused a task to be removed from an event list while the scheduler was suspended. If this was the case then the removed task will have been added to the xPendingReadyList. Once the scheduler has been resumed it is safe to move all the pending ready tasks from this list into their appropriate ready list.
xTaskResumeAll() API function
NB – vTaskSuspendAll() increments the counter that xTaskResumeAll() decrements
                if( uxMissedTicks > ( unsigned portBASE_TYPE ) 0 )
                {
                    while( uxMissedTicks > ( unsigned portBASE_TYPE ) 0 )
                    {
                        vTaskIncrementTick();
                        --uxMissedTicks;
                    }
                    /* As we have processed some ticks it is appropriate to yield to ensure the highest priority task that is ready to run is the task actually running. */
                    xYieldRequired = pdTRUE;
                }
                if( ( xYieldRequired == pdTRUE ) || ( xMissedYield == pdTRUE ) )
                {
                    xAlreadyYielded = pdTRUE;
                    xMissedYield = pdFALSE;
                    taskYIELD();
                }
            }
        }
    }
    portEXIT_CRITICAL();
    return xAlreadyYielded;
}
If any ticks occurred while the scheduler was suspended then they should be processed now. This ensures the tick count does not slip, and that any delayed tasks are resumed at the correct time.
Calling the scheduler from task API
taskYIELD() is the FreeRTOS macro that invokes the
scheduler
Called from task level
Performs task switch to new task if current task is not highest
priority ready task
Typically called from task code that has just altered task state
When taskYIELD() returns, either:
No task switch took place, or
The task was suspended and has woken up again
taskYIELD() can be called from inside or outside a critical section. On return the critical section state will be as before. However, if a task switch happened, interrupts will be enabled while the task is suspended, so the critical section is broken.
Other techniques: Method 1
MicroC/OS-II uses a different strategy to
implement task delay, which avoids having to
consider tick-count overflow
Each delayed task is put on the delayed task list with
a field that indicates "time to wakeup" = wakeup
time – current time
Order of tasks in list is not significant
"time to wakeup" precision can be shorter than system
clock precision, so reducing TCB size (e.g. 2 bytes instead
of 4 bytes)
The clock tick chains through the entire delayed task
list decrementing each "time to wakeup"
This strategy has the big disadvantage that the clock-tick execution time is proportional to the number of delayed tasks
Typically all of the tasks in a system may be delayed (except the running task) since blocking on events has an optional timeout.
This makes the clock-tick processing time high
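The per-tick scan described above can be sketched as follows. This is an illustration of the countdown strategy only, not MicroC/OS-II source code; the CountdownTask type and countdown_tick() are hypothetical names, and a 16-bit counter is assumed to show the reduced TCB field size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task record for the countdown strategy. */
typedef struct {
    uint16_t ticks_to_wake;     /* relative time; 0 means not delayed */
    int ready;                  /* 1 once the task may run */
} CountdownTask;

/* Clock-tick work: every task is visited on every tick, so the cost grows
   linearly with the number of tasks. Returns the number of tasks woken. */
static unsigned countdown_tick(CountdownTask *tasks, size_t n)
{
    unsigned woken = 0;
    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (tasks[i].ticks_to_wake != 0 && --tasks[i].ticks_to_wake == 0) {
            tasks[i].ready = 1;
            ++woken;
        }
    }
    return woken;
}
```

No absolute time is stored, so there is no wrap-around case to handle, but the loop body runs for every task on every tick whether or not anything wakes.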
Other Techniques: method 2
Method 2: eliminate possibility of overflow
A 64-bit integer for portTickType would mean that, at any practical tick rate, overflow will take more than 100 years
Ok to ignore overflow & so simplify the code?
Cost: computing or comparing a time delay requires a 64-bit subtraction
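The time-to-overflow claim can be checked with a short calculation. The helper below is a hypothetical sketch that returns whole days until a tick counter of a given width wraps; the tick rates in the usage note are example values, not figures from the course.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECONDS_PER_DAY 86400ULL

/* Whole days until a counter `bits` wide wraps at `tick_hz` ticks/second. */
static uint64_t sketch_days_to_wrap(unsigned bits, uint64_t tick_hz)
{
    assert(bits >= 1 && bits <= 64 && tick_hz > 0);
    /* Shifting a 64-bit value by 64 is undefined, so treat 64 separately. */
    uint64_t max_count = (bits == 64) ? UINT64_MAX : ((1ULL << bits) - 1ULL);
    return max_count / tick_hz / SKETCH_SECONDS_PER_DAY;
}
```

With a 1 kHz tick, sketch_days_to_wrap(32, 1000) gives 49 days, which is why 32-bit tick counts need the overflow handling shown earlier. With 64 bits, even an assumed 1 GHz tick rate gives several hundred years, consistent with the figure quoted above.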
Lecture 2.4 - Summary
FreeRTOS divides source into port-independent & port-specific
files
Clock tick ISR implements system clock, task delays & timeouts.
System clock is incremented once per clock tick
Basic function is to wake up tasks in the delayed task list whose wakeup
time has been reached by the system clock
Clock ticks that occur while scheduling is locked are stacked up for later
execution when scheduling is unlocked.
System clock overflow is dealt with via an overflow delayed task
list
This is necessitated by possibility of wrap-around in system clock making time ordering incorrect
Alternative solution is to store time-till-wakeup with each task that is in delayed task list.
Cleaner, but makes clock tick considerably slower if there are many delayed tasks
Lecture 2.5 – RTOS Objects
The framework we have already examined makes it easy to write
RTOS code that implements synchronisation & communication
operations. We will look at algorithms for implementing two
common constructs which illustrate the principles.
Use FreeRTOS as concrete example
Objects considered:
Semaphores
Message Queues
Implementing new objects is one of the easier RTOS design
problems.
Most of the code – event lists of waiting tasks – is common to all objects and can be reused.
Semaphore Implementation
Semaphore is implemented using a variable to indicate the number of
tokens held by the semaphore
Semaphore Take operation
Decrement this value, or if it is zero suspend the taking task
Semaphore Give operation
Increment this value, or if there are waiting tasks (and therefore the token count = 0) wake up the highest-priority waiting task, leaving the token count unchanged.
Semaphore Create operation
Allocate semaphore control block
Set initial number of tokens
Counting & binary semaphore can have identical implementation – only
difference is initial token value
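The Create, Take and Give operations described above can be sketched in a few lines of C. This is a minimal model under stated assumptions, not the FreeRTOS implementation: SketchSem and the sketch_sem_ functions are hypothetical, waiting tasks are represented only by their non-negative priorities in a small fixed array, and actual blocking and waking are left out.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_WAITERS 4    /* hypothetical limit, for illustration */

typedef struct {
    unsigned tokens;                        /* tokens currently held */
    int waiter_prio[SKETCH_MAX_WAITERS];    /* priorities of waiting tasks */
    size_t waiters;                         /* number of waiting tasks */
} SketchSem;

/* Create: set the initial number of tokens (1 gives a binary semaphore). */
static void sketch_sem_create(SketchSem *s, unsigned initial_tokens)
{
    assert(s != NULL);
    s->tokens = initial_tokens;
    s->waiters = 0;
}

/* Take: returns 1 if a token was taken, 0 if the caller must wait. */
static int sketch_sem_take(SketchSem *s, int caller_prio)
{
    assert(s != NULL && caller_prio >= 0);
    if (s->tokens > 0) {
        --s->tokens;
        return 1;
    }
    assert(s->waiters < SKETCH_MAX_WAITERS);
    s->waiter_prio[s->waiters++] = caller_prio;   /* record the waiter */
    return 0;
}

/* Give: returns the priority of the task woken, or -1 if no task was
   waiting and the token count was incremented instead. */
static int sketch_sem_give(SketchSem *s)
{
    assert(s != NULL);
    if (s->waiters == 0) {
        ++s->tokens;
        return -1;
    }
    size_t best = 0;
    for (size_t i = 1; i < s->waiters; ++i) {
        if (s->waiter_prio[i] > s->waiter_prio[best]) {
            best = i;
        }
    }
    int woken = s->waiter_prio[best];
    s->waiter_prio[best] = s->waiter_prio[--s->waiters];  /* fill the gap */
    return woken;   /* token passes straight to the woken task */
}
```

The same code serves counting and binary semaphores; only the value passed to sketch_sem_create() differs. The linear search in sketch_sem_give() is a simplification: a kernel would normally keep the waiter list ordered by priority.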