Runtime Energy Minimization of Distributed Many-Core Systems using Transfer Learning

Dainius Jenkus, Fei Xia, Rishad Shafik and Alex Yakovlev
School of Engineering, Newcastle University, NE1 7RU, UK


The heterogeneity of computing resources continues to permeate many-core systems, making energy efficiency a challenging objective. Existing rule-based and model-driven methods deliver sub-optimal energy efficiency and scale poorly as system complexity grows into the domain of distributed systems. Dynamic variations in workloads and quality-of-service (QoS) demands exacerbate the problem. This work presents a QoS-aware runtime management method for energy minimization using a transfer learning (TL) driven exploration strategy. It enhances standard Q-learning to improve both learning speed and operational optimality (i.e., QoS and energy). The core of our approach is a multi-dimensional knowledge transfer across a task's state-action space, which accelerates the learning of dynamic voltage/frequency scaling (DVFS) control actions for tuning power/performance trade-offs. First, the method identifies already explored states and transfers their learned policies to behaviorally similar states, a mechanism referred to as Intra-Task Learning Transfer (ITLT). Second, if no similar “expert” states are available, it accelerates exploration at the level of the local state through Intra-State Learning Transfer (ISLT). A comparative evaluation indicates faster and more balanced exploration: energy savings range from 7.30% to 18.06%, and QoS improves by 10.43% to 14.3%, compared with existing exploration strategies. The method is demonstrated under WordPress and TensorFlow workloads on a server cluster.
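To make the ITLT idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a tabular Q-learning agent whose states are integer workload bins and whose actions are indices into a list of DVFS frequency levels. The class name `TransferQAgent`, the distance-based similarity test, the threshold, the frequency list, and the learning constants are all hypothetical choices made for illustration; the abstract does not specify how behavioral similarity is measured, and ISLT is not sketched here because its mechanism is not described.

```python
import random

# Hypothetical DVFS frequency levels; each index is one control action.
FREQS_GHZ = [1.2, 1.6, 2.0, 2.4]
ALPHA, GAMMA = 0.5, 0.9  # assumed learning rate and discount factor


class TransferQAgent:
    """Tabular Q-learning with an ITLT-style warm start (illustrative only)."""

    def __init__(self, n_actions, similarity_threshold=1):
        self.n_actions = n_actions
        self.threshold = similarity_threshold
        self.q = {}       # state -> list of Q-values, one per action
        self.visits = {}  # state -> number of Q-updates applied

    def _nearest_expert(self, state):
        # An "expert" is assumed to be any state updated at least once.
        experts = [s for s, v in self.visits.items() if v > 0 and s != state]
        if not experts:
            return None
        best = min(experts, key=lambda s: abs(s - state))
        # Similarity is assumed to be distance between workload bins.
        return best if abs(best - state) <= self.threshold else None

    def ensure_state(self, state):
        """Create the Q-row for a new state; return True if it was transferred."""
        if state in self.q:
            return False
        expert = self._nearest_expert(state)
        if expert is not None:
            self.q[state] = list(self.q[expert])  # copy the learned values
        else:
            self.q[state] = [0.0] * self.n_actions  # no expert: start blank
        self.visits[state] = 0
        return expert is not None

    def choose(self, state, epsilon=0.1, rng=random):
        """Epsilon-greedy selection of a DVFS action index."""
        self.ensure_state(state)
        if rng.random() < epsilon:
            return rng.randrange(self.n_actions)
        row = self.q[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update."""
        self.ensure_state(state)
        self.ensure_state(next_state)
        target = reward + GAMMA * max(self.q[next_state])
        self.q[state][action] += ALPHA * (target - self.q[state][action])
        self.visits[state] += 1


agent = TransferQAgent(n_actions=len(FREQS_GHZ))
agent.update(state=3, action=2, reward=1.0, next_state=3)  # learn in bin 3
transferred = agent.ensure_state(4)            # neighboring bin is warm-started
greedy_action = agent.choose(4, epsilon=0.0)   # greedy choice uses copied values
```

In this sketch a newly encountered state inherits the Q-values of a nearby explored state instead of starting from zeros, so its first greedy choice already reflects prior experience. The reward passed to `update` would, under the paper's framing, combine energy and QoS; how that reward is computed is left open here.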
