3.5 Parallel real-time systems


Date: Tuesday 10 March 2020
Time: 14:30 - 16:00
Location / Room: Bayard

Chair:
Liliana Cucu-Grosjean, Inria, FR

Co-Chair:
Antoine Bertout, ENSMA, FR

This session presents novel techniques for enabling parallel execution in real-time systems. More precisely, the papers address limitations of previous DAG models, devise tool chains that ensure WCET bounds, correct earlier results on heterogeneous processors, and consider wireless networks with application-oriented scheduling.

Time  Label  Presentation Title
Authors
14:30  3.5.1  ON THE VOLUME CALCULATION FOR CONDITIONAL DAG TASKS: HARDNESS AND ALGORITHMS
Speaker:
Jinghao Sun, Northeastern University, CN
Authors:
Jinghao Sun1, Yaoyao Chi1, Tianfei Xu1, Lei Cao1, Nan Guan2, Zhishan Guo3 and Wang Yi4
1Northeastern University, CN; 2The Hong Kong Polytechnic University, CN; 3University of Central Florida, US; 4Uppsala universitet, SE
Abstract
The hardness of analyzing conditional directed acyclic graph (DAG) tasks has remained unknown so far. For example, previous research asserted that the conditional DAG's volume can be computed in polynomial time. However, these results all assume well-nested structures that are recursively composed of single-source-single-sink parallel and conditional components. For general conditional DAGs that do not comply with this assumption, the hardness and algorithms of volume computation are still open. In this paper, we construct counterexamples to show that previous work cannot provide a safe upper bound on the conditional DAG's volume in general. Moreover, we prove that the volume computation problem for conditional DAGs is strongly NP-hard. Finally, we propose an exact algorithm for computing the conditional DAG's volume. Experiments show that our method can significantly improve the accuracy of the conditional DAG's volume estimation.
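As context for the NP-hardness result: an exact volume computation must, in the worst case, examine exponentially many resolutions of the conditional branches. The following brute-force sketch (not the authors' algorithm) makes the notion of volume concrete as the maximum total WCET over all ways of resolving the conditional vertices; the inputs `succ`, `wcet`, and `cond` are hypothetical.

```python
from itertools import product

def volume(succ, wcet, cond, source):
    """Exact volume of a conditional DAG by enumerating branch choices.

    succ:   dict vertex -> list of successors
    wcet:   dict vertex -> worst-case execution time of that vertex
    cond:   set of conditional vertices (exactly one successor is taken)
    source: the single source vertex of the DAG

    Exponential-time illustration only, consistent with the strong
    NP-hardness of the problem; not the paper's exact algorithm.
    """
    best = 0
    choices = {v: range(len(succ[v])) for v in cond if succ[v]}
    keys = list(choices)
    for pick in product(*(choices[k] for k in keys)):
        chosen = dict(zip(keys, pick))
        # Collect the vertices executed under this resolution of conditions.
        seen, stack = set(), [source]
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            if v in chosen:
                stack.append(succ[v][chosen[v]])  # one branch is taken
            else:
                stack.extend(succ[v])             # all successors execute
        best = max(best, sum(wcet[v] for v in seen))
    return best
```

For a DAG where a conditional source chooses between a cheap and an expensive branch, the volume is determined by the expensive one.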

Download Paper (PDF; Only available from the DATE venue WiFi)
15:00  3.5.2  WCET-AWARE CODE GENERATION AND COMMUNICATION OPTIMIZATION FOR PARALLELIZING COMPILERS
Speaker:
Simon Reder, Karlsruhe Institute of Technology, DE
Authors:
Simon Reder and Juergen Becker, Karlsruhe Institute of Technology, DE
Abstract
High performance demands of present and future embedded applications increase the need for multi-core processors in hard real-time systems. Challenges in static multi-core WCET analysis and the more complex design of parallel software, however, hinder the adoption of multi-core processors in that area. Automated parallelization is a promising approach to solving these issues, but specialized solutions are required to preserve static analyzability. With a WCET-aware parallelizing transformation, this work presents a novel solution for an important building block of a real-time-capable parallelizing compiler. The approach includes a technique to optimize communication and synchronization in the parallelized program and supports complex memory hierarchies consisting of both shared and core-private memory segments. In an experiment with four different applications, the parallelization improved the WCET by a factor of up to 3.2 on 4 cores. The studied optimization technique and the support for shared memories contribute significantly to these results.

15:30  3.5.3  TEMPLATE SCHEDULE CONSTRUCTION FOR GLOBAL REAL-TIME SCHEDULING ON UNRELATED MULTIPROCESSOR PLATFORMS
Authors:
Antoine Bertout1, Joel Goossens2, Emmanuel Grolleau3 and Xavier Poczekajlo4
1LIAS, Université de Poitiers, ISAE-ENSMA, FR; 2ULB, BE; 3LIAS, ISAE-ENSMA, Universite de Poitiers, FR; 4Université libre de Bruxelles, BE
Abstract
The seminal work on the global real-time scheduling of periodic tasks on unrelated multiprocessor platforms is based on a two-step method. First, the workload of each task is distributed over the processors, and it is proved that the success of this first step ensures the existence of a feasible schedule. Second, a method for constructing a template schedule from the workload assignment is presented. In this work, we review the seminal work and show, using a counterexample, that this second step is incomplete. We therefore propose a novel and efficient algorithm to build the template schedule, and prove it correct.
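For illustration, a classical way to turn a fractional workload assignment into a template schedule is a Birkhoff-von Neumann decomposition: a doubly stochastic assignment matrix is split into weighted permutation matrices, each of which can serve as one slot of the template in which every processor runs exactly one task. The sketch below assumes a square, doubly stochastic matrix and is only loosely related to the construction in the paper:

```python
def bvn_decompose(M, eps=1e-9):
    """Split a doubly stochastic assignment matrix M (rows: processors,
    columns: tasks) into weighted permutations (Birkhoff-von Neumann).
    Each returned (weight, perm) pair can be read as one template slot of
    length `weight` in which processor i runs task perm[i].
    Illustrative sketch only; the paper's construction differs in detail.
    """
    n = len(M)
    M = [row[:] for row in M]  # work on a copy
    slots = []
    while True:
        # Kuhn's augmenting-path matching on strictly positive entries.
        match = [-1] * n  # match[task] = processor assigned to it

        def augment(i, seen):
            for j in range(n):
                if M[i][j] > eps and j not in seen:
                    seen.add(j)
                    if match[j] == -1 or augment(match[j], seen):
                        match[j] = i
                        return True
            return False

        if not all(augment(i, set()) for i in range(n)):
            break  # no perfect matching left: matrix is (numerically) zero
        perm = [0] * n
        for j in range(n):
            perm[match[j]] = j
        # Peel off the permutation with the largest feasible weight.
        w = min(M[i][perm[i]] for i in range(n))
        slots.append((w, perm))
        for i in range(n):
            M[i][perm[i]] -= w
    return slots
```

By Birkhoff's theorem the loop terminates with the slot weights summing to 1, so the slots tile one period of the template.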

15:45  3.5.4  APPLICATION-AWARE SCHEDULING OF NETWORKED APPLICATIONS OVER THE LOW-POWER WIRELESS BUS
Speaker:
Kacper Wardega, Boston University, US
Authors:
Kacper Wardega and Wenchao Li, Boston University, US
Abstract
Recent successes of wireless networked systems in advancing industrial automation and in spawning the Internet of Things paradigm motivate the adoption of wireless networked systems in current and future safety-critical applications. As reliability is key in safety-critical applications, in this work we present NetDAG, a scheduler design and implementation suitable for real-time applications in the wireless setting. NetDAG is built upon the Low-Power Wireless Bus, a high-performance communication abstraction for wireless networked systems, and enables system designers to directly schedule applications under specified task-level real-time constraints. Access to real-time primitives in the scheduler permits efficient design exploration of trade-offs between power consumption and latency. Furthermore, NetDAG provides support for weakly hard real-time applications with deterministic guarantees, in addition to the previously considered soft real-time applications with probabilistic guarantees. We propose novel abstraction techniques for reasoning about conjunctions of weakly hard constraints and show how such abstractions can be used to handle the significant scheduling difficulties brought on by networked components with weakly hard behaviors.
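For context, a weakly hard (m, k) constraint requires that at most m deadlines are missed in any window of k consecutive jobs. A minimal sliding-window check of a single such constraint might look as follows (the function name and inputs are illustrative; NetDAG's abstraction for conjunctions of weakly hard constraints is more involved):

```python
from collections import deque

def satisfies_weakly_hard(misses, m, k):
    """Check an (m, k) weakly hard constraint: at most m deadline misses
    in any window of k consecutive jobs.

    misses: boolean sequence, True where the job missed its deadline.
    Illustrative sketch of the constraint itself, not NetDAG's analysis.
    """
    window = deque(maxlen=k)  # automatically drops the oldest job
    for miss in misses:
        window.append(miss)
        if sum(window) > m:   # too many misses within the last k jobs
            return False
    return True
```

A schedule violating the constraint is rejected as soon as any length-k window accumulates more than m misses.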

16:01  IP1-15, 453  A PERFORMANCE ANALYSIS FRAMEWORK FOR REAL-TIME SYSTEMS SHARING MULTIPLE RESOURCES
Speaker:
Shayan Tabatabaei Nikkhah, Eindhoven University of Technology, NL
Authors:
Shayan Tabatabaei Nikkhah1, Marc Geilen1, Dip Goswami1 and Kees Goossens2
1Eindhoven University of Technology, NL; 2Eindhoven University of Technology, NL
Abstract
Timing properties of applications strongly depend on the resources that are allocated to them. Applications often have multiple resource requirements, all of which must be met for them to proceed. Performance analysis of event-based systems has been widely studied in the literature. However, existing works consider only one resource requirement per application task. Additionally, they focus mainly on the rate at which resources serve applications (e.g., power, instructions, or bits per second), while another aspect of resources, their provided capacity (e.g., energy, memory ranges, FPGA regions), has been ignored. In this work, we propose a mathematical framework to describe the provisioning rate and capacity of various types of resources. Additionally, we consider the simultaneous use of multiple resources. Conservative bounds on response times of events and their backlog are computed. We prove that the bounds are monotone in event arrivals and in required and provided rate and capacity, which enables verification of real-time application performance based on worst-case characterizations. The applicability of our framework is shown in a case study.

16:02  IP1-16, 778  SCALING UP THE MEMORY INTERFERENCE ANALYSIS FOR HARD REAL-TIME MANY-CORE SYSTEMS
Speaker:
Matheus Schuh, Verimag / Kalray, FR
Authors:
Matheus Schuh1, Maximilien Dupont de Dinechin2, Matthieu Moy3 and Claire Maiza4
1Verimag / Kalray, FR; 2ENS Paris / ENS Lyon / LIP, FR; 3ENS Lyon / LIP, FR; 4Grenoble INP / Verimag, FR
Abstract
In RTNS 2016, Rihani et al. proposed an algorithm to compute the impact of interference from memory accesses on the timing of a task graph. It calculates a static, time-triggered schedule, i.e., a release date and a worst-case response time for each task. The task graph is a DAG, typically obtained by compilation of a high-level dataflow language, and the tool assumes a previously determined mapping and execution order. The algorithm is precise, but suffers from a high O(n^4) complexity, n being the number of input tasks. Since we target many-core platforms with tens or hundreds of cores, applications likely to exploit the parallelism of these platforms are too large to be handled by this algorithm in reasonable time. This paper proposes a new algorithm that solves the same problem. Instead of performing global fixed-point iterations on the task graph, we compute the static schedule incrementally, reducing the complexity to O(n^2). Experimental results show a reduction from 535 seconds to 0.90 seconds on a benchmark with 384 tasks, i.e., 593 times faster.
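To illustrate the incremental idea, a release date can be fixed for each task as soon as its predecessors' finish times are known, instead of iterating globally to a fixed point over the whole graph. The sketch below uses a hypothetical `delay` callback as a stand-in for the memory interference bound; it is not the paper's actual analysis:

```python
def incremental_schedule(tasks, preds, wcet, delay):
    """Incremental static, time-triggered schedule (hypothetical model).

    tasks: list of tasks in a precomputed execution order (topological)
    preds: dict task -> list of predecessor tasks
    wcet:  dict task -> worst-case execution time in isolation
    delay: function(task, overlapping) -> interference bound given the
           set of tasks that may run concurrently; a stand-in for a real
           memory interference analysis.

    One pass over n tasks with an O(n) overlap check per task: O(n^2).
    """
    release, finish = {}, {}
    for t in tasks:
        # A task is released once all its predecessors have finished.
        release[t] = max((finish[p] for p in preds[t]), default=0)
        # Earlier-scheduled tasks still running at release[t] may interfere.
        overlapping = {u for u in finish
                       if finish[u] > release[t] and u not in preds[t]}
        finish[t] = release[t] + wcet[t] + delay(t, overlapping)
    return release, finish
```

Because each task's dates are computed once and never revisited, there is no global fixed-point iteration.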

16:00  End of session