Thermal- and Cache-Aware Resource Management based on ML-Driven Cache Contention Prediction
Mohammed Bakr Sikal, Heba Khdr, Martin Rapp and Jörg Henkel
Karlsruhe Institute of Technology, Karlsruhe, Germany
bakr.sikal@kit.edu
heba.khdr@kit.edu
martin.rapp@kit.edu
henkel@kit.edu
ABSTRACT
While on-chip many-core systems enable a large number of applications to run in parallel, the increased overall performance may come at the cost of making it harder to satisfy the performance constraints of individual applications due to contention on shared resources. For instance, competition for the last-level cache among concurrently-running applications may slow down their execution and thereby violate individual performance constraints. Clustered many-cores reduce cache contention at the chip level by sharing caches only at the cluster level. To reduce cache contention within a cluster, state-of-the-art techniques aim to co-map a memory-intensive application with a compute-intensive application onto one cluster. However, compute-intensive applications typically consume high power, and therefore, executing another application on nearby cores may lead to high temperatures. Hence, there is a trade-off between cache contention and temperature. This paper is the first to consider this trade-off through a novel thermal- and cache-aware resource management technique. We build a neural network (NN)-based model that predicts the slowdown of application execution induced by cache contention and feeds our resource management technique, which then optimizes the application mapping and selects the voltage/frequency levels of the clusters to compensate for the potential contention-induced slowdown. Thereby, it meets the performance constraints while minimizing temperature. Compared to the state of the art, our technique significantly reduces the temperature by 30% on average, while satisfying the performance constraints of all individual applications.
Keywords: Thermal Optimization, Resource Management, Cache Contention, Application Mapping, DVFS, Machine Learning.
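The abstract describes two cooperating mechanisms: an NN-based predictor of contention-induced slowdown, and a resource manager that raises a cluster's voltage/frequency level just enough to offset that slowdown. The following sketch is illustrative only and not the paper's implementation; the feature set (per-application and co-runner LLC miss rates), network size, training setup, and the V/f table are assumptions chosen for demonstration.

```python
# Illustrative sketch: an MLP that predicts the contention-induced slowdown
# of an application from cache-related features, plus a helper that picks
# the lowest cluster frequency compensating for the predicted slowdown.
# Feature names, network size, and FREQ_LEVELS_GHZ are hypothetical.
import torch
import torch.nn as nn

# Assumed features per application: own LLC misses-per-kilo-instruction
# (MPKI), aggregate MPKI of co-mapped applications, LLC access rate, and
# baseline IPC when running alone.
N_FEATURES = 4

model = nn.Sequential(
    nn.Linear(N_FEATURES, 16),
    nn.ReLU(),
    nn.Linear(16, 16),
    nn.ReLU(),
    nn.Linear(16, 1),  # predicted slowdown factor (>= 1.0)
)

def train(model, features, slowdowns, epochs=200, lr=1e-3):
    """Fit the predictor on offline profiling data (synthetic here)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(features).squeeze(-1), slowdowns)
        loss.backward()
        opt.step()

# Hypothetical per-cluster V/f levels in GHz, lowest to highest.
FREQ_LEVELS_GHZ = [0.6, 0.8, 1.0, 1.2, 1.4]

def compensating_frequency(base_freq_ghz, predicted_slowdown):
    """Return the lowest level whose speedup over the base frequency
    offsets the predicted slowdown; fall back to the highest level."""
    required = base_freq_ghz * predicted_slowdown
    for f in FREQ_LEVELS_GHZ:
        if f >= required:
            return f
    return FREQ_LEVELS_GHZ[-1]

# Example usage on synthetic data: slowdown grows with co-runner MPKI.
x = torch.rand(256, N_FEATURES)
y = 1.0 + x[:, 1]                       # toy ground-truth labels
train(model, x, y)
f = compensating_frequency(1.0, model(x[:1]).item())
print(f"selected cluster frequency: {f} GHz")
```

Selecting the lowest compensating frequency reflects the trade-off the abstract names: running any higher than needed raises power and temperature, while running lower would violate the application's performance constraint.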