W08 First Workshop on Resource Awareness and Application Autotuning in Adaptive and Heterogeneous Computing

Location / Room: Seminar 3-4



Organisers:
Walter Stechele, Technische Universität München, DE
Cristina Silvano, Politecnico di Milano, IT
Stephan Wong, TU Delft, NL

Adaptive and heterogeneous computing platforms are gaining interest for applications ranging from embedded to high-performance computing because of their promising power/performance ratio. However, sharing hardware resources makes execution time and power consumption harder to predict. In traditional real-time approaches, resources are over-dimensioned to achieve worst-case guarantees, whereas in best-effort approaches, predictability remains a challenge. The goal of the workshop is to bring together researchers from the areas of resource awareness and application autotuning to discuss their various approaches, their commonalities and differences, to foster collaboration between them, and to share their most recent research achievements with the international research community.


08:45 W08.1 Opening Session

09:00 W08.2 Morning Session 1

Chair: Cristina Silvano, Politecnico di Milano, IT

09:00 W08.2.1 Programming and Benchmarking FPGAs with Software-Centric Design Entries
Cathal McCabe, Xilinx, IE

The end of Dennard scaling and increasing performance-to-cost ratios on multicore architectures have recently stimulated a new era of increasingly heterogeneous compute architectures, and a large diversification of integrated circuits has dawned. During this talk, we will elaborate on our ongoing efforts within the Xilinx research organization to shed light on the universal question of which applications work best on which architectures, and to benchmark and characterize a wide spectrum of applications and compute architectures including FPGAs, GPUs, Xeons and Xeon Phis.

09:30 W08.2.2 Adaptive Restriction and Isolation for Predictable MPSoC Stream Processing
Jürgen Teich, Friedrich-Alexander-Universität Erlangen-Nürnberg, DE

Resource sharing and interference among multiple threads of one application, and even worse among multiple application programs running concurrently on a Multi-Processor System-on-a-Chip (MPSoC), today make it very hard to provide timing- or throughput-critical applications with time bounds. Additional interference results from the interaction of OS functions such as thread multiplexing and scheduling, as well as from the complex resource (e.g., cache) reservation protocols used heavily today. Finally, dynamic power and temperature management on a chip might also throttle down processor speed at arbitrary times, leading to additional variations and jitter in execution time. This may be intolerable for many safety-critical applications such as medical imaging or automotive driver assistance systems.

Static solutions that provide the required isolation by allocating distinct resources to safety- or performance-critical applications may not be feasible for reasons of cost and because of their inefficiency and inflexibility.

In this invited talk, we first review and present novel definitions of predictability of execution qualities. Subsequently, we distinguish two techniques for improving predictability, called restriction and isolation, and present new definitions. Then, new techniques for on-demand adaptive isolation of resources on an MPSoC, including processor, I/O, memory as well as communication resources, are introduced based on the paradigm of Invasive Computing. In Invasive Computing, a programmer may specify bounds on the execution quality of a program, or even of a segment of a program, followed by an invade command that returns a constellation of exclusive resources called a claim. The claim is subsequently used in a by-default non-shared way until it is released again by the invader. Through this principle, it becomes possible to isolate applications automatically and on demand. In Invasive Computing, isolation is supported on all levels of hardware and software, including the OS. Together with restriction (of input uncertainties), the level of on-demand predictability of program execution qualities may be fundamentally increased.

For a broad class of streaming applications, and in a concrete demonstration based on a complex object detection algorithm chain taken from robot vision, we show how jitter-minimized implementations become possible, even when the arrivals of other concurrent applications are statically unknown.

10:00 W08.2.3 Introduction to the Poster Session

11:00 W08.3 Morning Session 2

Chair: Walter Stechele, TUM, DE

11:00 W08.3.1 Energy efficiency in high performance computing. Examples from RSC experience.
Alexander Moskovsky, RSC Group, RU

11:30 W08.3.2 New Computer Architectures for High Performance Computing - The DEEP-ER and Mont-Blanc Projects
Axel Auweter, Leibniz Supercomputing Centre, DE

Numerical simulations and scientific computing are driving an increasing demand for high-performance computing resources. However, limits on energy consumption and application scalability require significant research efforts to build the next generation of even larger, yet more efficient supercomputers. This talk will show the concepts and achievements of the EU-funded DEEP-ER and Mont-Blanc projects in researching and prototyping new computer architectures suited for future scientific computing applications, and will also explain the challenges of tuning application performance and energy efficiency at scale.

13:00 W08.4 Afternoon Session 1

Chair: Stephan Wong, TU Delft, NL

13:00 W08.4.1 A DSL-based Approach for Cross Layer Programming: Monitoring, Adaptivity and Tuning
João M. P. Cardoso, Faculty of Engineering (FEUP), University of Porto, PT

The variety of concerns developers face when developing and compiling their applications overloads the development and deployment phases. When targeting contemporary systems (e.g., those including heterogeneous many-core and multicore architectures), developers need to focus on specific mapping decisions and optimizations, the specification of runtime adaptivity strategies, and features related to the tools being used. Monitoring, adaptivity, and tuning are three fundamental features developers may need to consider in order to achieve more efficient software and hardware/software solutions. One of the challenges is how to express these features in general-purpose programming languages (GPLs) such as C and Java. In this talk we present our Domain-Specific Language (DSL) approach and the challenges we face in making the approach highly useful to developers. Our approach is based on LARA, an aspect-oriented programming (AOP) language specifically designed to allow developers to program code instrumentation strategies, to control the application of code transformations and compiler optimizations, and to effectively control different tools in a toolchain. LARA provides a separation of concerns, including non-functional requirements and strategies, while allowing developers to exploit the benefits of an automatic approach for various domain-specific and target component-specific compilation/synthesis tools. Experiments using LARA in the context of different aspects of programming and compilation have revealed its cross-cutting and cross-layer benefits. In this talk, we will briefly introduce the LARA language, its current status and supporting tools, as well as some of the plans for extending LARA in the context of the H2020 ANTAREX project.

13:30 W08.4.2 Resource management in self-aware platforms
Axel Jantsch, Technical University of Vienna, AT

Self-awareness capabilities have been shown to facilitate the sensible management of a system's resources in the presence of faults and when environmental conditions and demands change. This is not surprising, since detailed knowledge of the system's own state, its shortcomings, its performance, and the environment's expectations is a precondition for allocating the available resources where they are most useful.

The talk will demonstrate this principle by way of the Cyber-physical SoC (CPSoC), a platform full of sensors that monitor delays, power consumption, temperature, and aging phenomena. This information is used to continuously adjust resource usage, resulting in minimal power consumption and wear-out for a given performance target.

14:00 W08.4.3 Poster Interactive Presentations

15:00 W08.5 Afternoon Session 2

Chair: Jeronimo Castrillon, TU Dresden, DE

15:00 W08.5.1 Drivers and solutions for tailored automotive ECU architectures
Andreas Rohatschek, Robert Bosch GmbH, DE

15:30 W08.5.2 Panel discussion on "Moore's law is still alive! So why resource awareness?"
Cathal McCabe (1), Axel Auweter (2), João M. P. Cardoso (3), Axel Jantsch (4), Andreas Rohatschek (5), X. Sharon Hu (6) and Michael Hübner (7)
(1) Xilinx, IE; (2) Leibniz Supercomputing Centre, DE; (3) Faculty of Engineering (FEUP), University of Porto, PT; (4) Technical University of Vienna, AT; (5) Robert Bosch GmbH, DE; (6) University of Notre Dame, US; (7) Ruhr-University Bochum, DE

Speakers from the morning and afternoon sessions are invited as panelists.

Chair: Michael Hübner