DATE 2020

Dynamic Thermal Management with Proactive Fan Speed Control Through Reinforcement Learning

Arman Iranfar^1,a, Federico Terraneo^2,e, Gabor Csordas^1,b, Marina Zapater^1,c, William Fornaciari^2,f and David Atienza^1,d

^1Embedded Systems Laboratory (ESL), Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland
^aarman.iranfar@epfl.ch
^bgabor.csordas@epfl.ch
^cmarina.zapater@epfl.ch
^ddavid.atienza@epfl.ch
^2Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Italy
^efederico.terraneo@polimi.it
^fwilliam.fornaciari@polimi.it

ABSTRACT

Dynamic Thermal Management (DTM) has become a major challenge because it directly affects the performance, power consumption, and reliability of Multiprocessor Systems-on-Chip (MPSoCs). In this work, we propose a transient fan model that enables simulation of adaptive fan speed control for efficient DTM. Our model is validated on a thermal test chip, achieving less than 2°C error in the worst case. With multiple fan speeds, however, the DTM design space grows significantly, which can ultimately make conventional solutions impractical. We address this challenge through a reinforcement learning-based solution that proactively determines the number of active cores, operating frequency, and fan speed. The proposed solution reduces fan power by up to 40% compared to a DTM with constant fan speed, with less than 1% performance degradation. Also, compared to a state-of-the-art DTM technique, our solution improves performance by up to 19% for the same fan power.
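
As a rough illustration of how a learning agent could choose jointly among core count, frequency, and fan speed, the sketch below builds a joint action space and applies a one-step tabular Q-learning update. The abstract does not specify the learning algorithm, state definition, reward, or knob values, so all of these are assumptions made for illustration: tabular Q-learning is swapped in as a generic technique, and the names `CORE_COUNTS`, `FREQS_GHZ`, `FAN_SPEEDS_RPM`, `choose_action`, and `q_update` are hypothetical.

```python
import itertools
import random

# Hypothetical knob values; the abstract does not list the actual settings.
CORE_COUNTS = [2, 4, 8]
FREQS_GHZ = [1.2, 1.8, 2.4]
FAN_SPEEDS_RPM = [1500, 3000, 4500]

# Each action is one (cores, frequency, fan speed) configuration.
# Adding fan speed as a third knob grows the space from 9 to 27 actions.
ACTIONS = list(itertools.product(CORE_COUNTS, FREQS_GHZ, FAN_SPEEDS_RPM))


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy selection over the joint action space."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q_table.setdefault(state, [0.0] * len(ACTIONS))
    # Index of the highest Q-value; ties resolve to the lowest index.
    return max(range(len(ACTIONS)), key=row.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One-step Q-learning: Q <- Q + alpha * (r + gamma * max Q' - Q)."""
    row = q_table.setdefault(state, [0.0] * len(ACTIONS))
    next_row = q_table.setdefault(next_state, [0.0] * len(ACTIONS))
    target = reward + gamma * max(next_row)
    row[action] += alpha * (target - row[action])
    return row[action]
```

Here a state would be a hashable label such as a discretized temperature band, and the reward would combine performance and fan power in some weighted form. Both are placeholders and are not taken from the paper.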
