Real-Time Scheduling on Multi-core: Theory and Practice
Ref: CISTER-TR-131202       Publication Date: 29, Oct, 2013

Real-Time Scheduling on Multi-core: Theory and Practice

Ref: CISTER-TR-131202       Publication Date: 29, Oct, 2013

Nowadays, multi-core platforms are commonplace for efficiently achieving high computational power, even in embedded systems, with the number of cores steadily increasing and expected to reach hundreds of cores per chip in near future. Real-time computing is becoming increasingly important and pervasive, as more and more industries, infrastructures, and people depend on it. For real-time systems too, multi-cores offer an opportunity for a considerable boost in processing capacity, at relatively low price and power. This could, in principle, help with meeting the timing requirements of computationally intensive applications which could not be met on single-cores. However, real-time system designers must adopt suitable approaches for their system to allow them to fully exploit the capabilities of the multi-core platforms. In this line, scheduling algorithms will, certainly, play an important role.
Real-time scheduling algorithms for multiprocessors are typically categorized as global, partitioned, and semi-partitioned. Global scheduling algorithms store tasks in one global queue, shared by all processors. Tasks can migrate from one processor to another; that is, a task can be preempted during its execution and resume its execution on another processor. At any moment, the m highest-priority tasks are selected for execution on the m processors. Some algorithms of this kind achieve a utilization bound of 100% but generate too many preemptions and migrations. Partitioned scheduling algorithms partition the task set and assign all tasks in a partition to the same processor. Hence, tasks cannot migrate between processors. Such algorithms involve few preemptions but their utilization bound is at most 50%. In semi-partitioned (or task-splitting) scheduling algorithms most tasks are fixed to specific processors (like partitioned), while a few tasks migrate across processors (like global). This approach produces a better balance of the workload among processors than partitioning and consequently presents higher utilization bounds, additionally, it also reduces the contention on shared queues and the number of migrations (by reducing the number of migratory tasks). However, it must ensure that migratory tasks never execute on two or more processors simultaneously.
In this dissertation, we deal with semi-partitioned scheduling algorithms that are categorized as slot-based task-splitting to achieve the above mentioned claims. To this end, we define a set of design principles to efficiently implement those scheduling algorithms in a real operating system (a PREEMPT-RT-patched Linux kernel version). Grounded on implementations of the slot-based task-splitting scheduling algorithms in the Linux kernel, we identify and model all run-time over- heads incurred by those scheduling algorithms. We, then, incorporate them into a new schedu- lability analysis. This new theory is based on exact schedulability tests, thus also overcoming many sources of pessimism in existing analysis. Additionally, since schedulability testing guides the task assignment under the schemes in consideration, we also formulate an improved task as- signment procedure. The outcome is a new demand-based overhead-aware schedulability analysis that permits increased efficiency and reliability. We also devise a new scheduling algorithm for multiprocessor systems, called Carousel-EDF. Although, it presents some similarities with slot- based task-splitting scheduling algorithms, we classify the Carousel-EDF scheduling algorithm as a reserve-based scheduling algorithm. The underlying theory is also overhead-aware and we also implement it in the Linux kernel.

Paulo Baltarejo Sousa

PhD Thesis, Faculdade de Engenharia da Universidade do Porto.
Porto, Portugal.

Record Date: 29, Oct, 2013