Multi-Armed Bandits for Efficient Lifetime Estimation in MPSoC Design

Calvin Ma (a), Aditya Mahajan (b) and Brett H. Meyer (c)
Department of Electrical and Computer Engineering, McGill University, Montréal, Québec, Canada.
(a) calvin.ma@mail.mcgill.ca
(b) aditya.mahajan@mcgill.ca
(c) brett.meyer@mcgill.ca

ABSTRACT


Reliability in integrated circuits is becoming a critical issue as electronics are miniaturized. Smaller process technologies have led to higher power densities, which result in higher temperatures and earlier device wear-out. One way to mitigate failure is to over-provision resources and remap tasks from failed components to components with spare capacity, or slack. Because the slack allocation design space is large, finding the optimal allocation is difficult, and brute-force approaches are impractical. During design space exploration, device lifetimes are typically evaluated using Monte Carlo Simulation (MCS), with every design sampled equally; this is inefficient because poor designs are evaluated as accurately as good designs. A better method focuses sampling time on the designs that are difficult to distinguish, reducing the time required to evaluate a set of designs; this can be accomplished using Multi-Armed Bandit (MAB) algorithms. This work demonstrates that MAB algorithms achieve the same level of accuracy as MCS using 1.45 to 5.26 times fewer samples.
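
The contrast between uniform sampling and bandit-style sampling can be illustrated with a small sketch. The abstract does not say which MAB algorithms the work uses, so the code below substitutes successive elimination, a standard best-arm identification scheme, purely as an example. The function names, the clipped Gaussian lifetime model, the normalization of lifetimes to [0, 1], and the parameter values are all hypothetical and are not taken from the paper.

```python
import math
import random


def sample_lifetime(mean_lifetime, rng):
    # Hypothetical stand-in for one Monte Carlo lifetime sample of a design.
    # Lifetimes are assumed normalized to [0, 1] so a Hoeffding bound applies.
    return min(1.0, max(0.0, rng.gauss(mean_lifetime, 0.1)))


def uniform_mcs(means, samples_per_design, rng):
    # Baseline: every design receives the same number of samples.
    estimates = []
    for m in means:
        total = sum(sample_lifetime(m, rng) for _ in range(samples_per_design))
        estimates.append(total / samples_per_design)
    best = max(range(len(means)), key=lambda i: estimates[i])
    return best, samples_per_design * len(means)


def successive_elimination(means, max_rounds, rng, delta=0.05):
    # Illustrative bandit scheme (an assumption, not the paper's algorithm):
    # sample each surviving design once per round and drop any design whose
    # upper confidence bound falls below the best lower confidence bound.
    n = len(means)
    active = list(range(n))
    sums = [0.0] * n
    used = 0
    rounds = 0
    while len(active) > 1 and rounds < max_rounds:
        rounds += 1
        for i in active:
            sums[i] += sample_lifetime(means[i], rng)
            used += 1
        # Hoeffding confidence radius with a union bound over designs and rounds.
        radius = math.sqrt(math.log(4 * n * rounds * rounds / delta) / (2 * rounds))
        best_lower = max(sums[i] / rounds for i in active) - radius
        active = [i for i in active if sums[i] / rounds + radius >= best_lower]
    best = max(active, key=lambda i: sums[i] / rounds)
    return best, used


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical mean lifetimes: three clearly poor designs, two close contenders.
    design_means = [0.30, 0.35, 0.50, 0.80, 0.78]
    print("uniform MCS:", uniform_mcs(design_means, 2000, rng))
    print("successive elimination:", successive_elimination(design_means, 2000, rng))
```

In this sketch the clearly poor designs are discarded after relatively few samples, so most of the budget goes to the designs that remain hard to tell apart. Any savings it shows depend entirely on the assumed lifetime model and should not be read as reproducing the 1.45 to 5.26 times reduction reported above.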


