CAnDy-TM: Comparative Analysis of Dynamic Thermal Management in Many-Cores using Model Checking

Syed Ali Asadullah Bukhari (1,a), Faiq Khalid Lodhi (1,b), Osman Hasan (1,c), Muhammad Shafique (2) and Jörg Henkel (3)
(1) School of Electrical and Computer Science, National University of Sciences and Technology, Islamabad, Pakistan.
(a) ali.asadullah@seecs.edu.pk
(b) faiq.khalid@seecs.edu.pk
(c) osman.hasan@seecs.edu.pk
(2) Institute of Computer Engineering, Vienna University of Technology, Vienna, Austria.
muhammad.shafique@tuwien.ac.at
(3) Chair for Embedded Systems, Karlsruhe Institute of Technology, Karlsruhe, Germany.
henkel@kit.edu

ABSTRACT


Dynamic thermal management (DTM) techniques based on task migration provide a promising solution to mitigate thermal emergencies and thereby ensure the safe operation and reliability of Many-Core systems. These techniques can be classified as central or distributed, depending on whether a single DTM controller manages the whole system or individual DTM controllers manage each core or set of cores. However, obtaining a trustworthy comparison between central (c-) and distributed (d-) DTM techniques, in order to find the most suitable one for a given system, is quite challenging. This is primarily due to the systemic differences between cDTM and dDTM controllers, and to the inherent non-exhaustiveness of the simulation and emulation methods conventionally used for DTM analysis. In this paper, we present a novel methodology called CAnDy-TM (Comparative Analysis of Dynamic Thermal Management) that employs Model Checking to perform a formal comparative analysis of cDTM and dDTM techniques. We identify a set of generic grounds for their comparison. We demonstrate the usability and benefits of our methodology by comparing state-of-the-art cDTM and dDTM techniques, and illustrate which technique performs better w.r.t. thermal stability and other task migration parameters. Such an analysis helps in selecting the most appropriate DTM technique for a given chip.


