DATE 2022

Runtime Energy Minimization of Distributed Many-Core Systems using Transfer Learning

Dainius Jenkus^a, Fei Xia^b, Rishad Shafik^c and Alex Yakovlev^d
School of Engineering, Newcastle University, NE1 7RU, UK
^ad.jenkus1@newcastle.ac.uk
^bfei.xia@newcastle.ac.uk
^crishad.shafik@newcastle.ac.uk
^dalex.yakovlev@newcastle.ac.uk

ABSTRACT

The heterogeneity of computing resources continues to permeate many-core systems, making energy efficiency a challenging objective. Existing rule-based and model-driven methods deliver sub-optimal energy efficiency and scale poorly as system complexity grows into the domain of distributed systems. This is exacerbated by dynamic variations in workloads and quality-of-service (QoS) demands. This work presents a QoS-aware runtime management method for energy minimization using a transfer learning (TL) driven exploration strategy. It enhances standard Q-learning to improve both learning speed and operational optimality (i.e., QoS and energy). The core of our approach is a multi-dimensional knowledge transfer across a task's state-action space, which accelerates the learning of dynamic voltage/frequency scaling (DVFS) control actions for tuning power/performance trade-offs. Firstly, the method identifies already learned policies and transfers them from explored states to behaviorally similar states, referred to as Intra-Task Learning Transfer (ITLT). Secondly, if no similar “expert” states are available, it accelerates exploration at the level of the local state, referred to as Intra-State Learning Transfer (ISLT). A comparative evaluation of the approach indicates faster and more balanced exploration. Compared to existing exploration strategies, it achieves energy savings ranging from 7.30% to 18.06% and QoS improvements ranging from 10.43% to 14.3%. The method is demonstrated under WordPress and TensorFlow workloads on a server cluster.
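The idea of transferring learned policies between similar states can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a tabular Q-learning agent whose actions index DVFS levels, and it assumes that "behaviorally similar" can be approximated by Euclidean distance between state feature vectors. The class name `TransferQAgent`, the `sim_threshold` parameter, and the two-feature state encoding are hypothetical choices made for illustration only.

```python
import numpy as np


class TransferQAgent:
    """Tabular Q-learning with a simple, assumed state-to-state transfer step."""

    def __init__(self, n_actions, alpha=0.5, gamma=0.9, sim_threshold=0.2):
        self.n_actions = n_actions          # number of DVFS levels (assumed)
        self.alpha = alpha                  # learning rate
        self.gamma = gamma                  # discount factor
        self.sim_threshold = sim_threshold  # hypothetical similarity cut-off
        self.q = {}                         # state tuple -> Q-value array
        self.visits = {}                    # state tuple -> update count

    def _nearest_expert(self, state):
        """Return the closest already-updated state, or None if none is close."""
        best, best_dist = None, float("inf")
        for other, count in self.visits.items():
            if count == 0:
                continue  # states with no updates hold no learned policy
            dist = float(np.linalg.norm(np.subtract(state, other)))
            if dist < best_dist:
                best, best_dist = other, dist
        return best if best_dist <= self.sim_threshold else None

    def ensure_state(self, state):
        """Create a Q-row for a new state, seeded from a similar expert if any."""
        if state in self.q:
            return
        expert = self._nearest_expert(state)
        if expert is not None:
            self.q[state] = self.q[expert].copy()   # transfer learned values
        else:
            self.q[state] = np.zeros(self.n_actions)  # no expert: start blank
        self.visits[state] = 0

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        self.ensure_state(state)
        self.ensure_state(next_state)
        target = reward + self.gamma * self.q[next_state].max()
        self.q[state][action] += self.alpha * (target - self.q[state][action])
        self.visits[state] += 1


# Usage: learn in one state, then encounter a similar and a dissimilar state.
agent = TransferQAgent(n_actions=4)
s_known = (0.5, 0.5)                  # hypothetical (load, QoS slack) features
agent.update(s_known, action=2, reward=1.0, next_state=s_known)
agent.ensure_state((0.55, 0.5))       # close to s_known: Q-values are copied
agent.ensure_state((0.9, 0.9))        # far from s_known: starts from zeros
```

In this sketch a newly encountered state inherits the Q-values of its nearest explored neighbour, so exploration there starts from an informed estimate rather than from zeros. How the paper measures similarity, and how ISLT accelerates exploration within a single state, is not specified in the abstract and is not modelled here.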
