A Case for Emerging Memories in DNN Accelerators

Avilash Mukherjee¹, Kumar Saurav², Prashant Nair¹, Sudip Shekhar¹ and Mieszko Lis¹
¹The University of British Columbia, Vancouver, Canada
avilash@ece.ubc.ca, prashantnair@ece.ubc.ca, sudip@ece.ubc.ca, mieszko@ece.ubc.ca
²QUALCOMM India
saurav@qti.qualcomm.com

ABSTRACT


The popularity of Deep Neural Networks (DNNs) has led to many DNN accelerator architectures, which typically focus on on-chip storage and computation costs. However, much of the energy is spent on accesses to off-chip DRAM. While emerging resistive memory technologies such as MRAM, PCM, and RRAM can potentially reduce this energy component, they suffer from drawbacks, such as low endurance, that prevent them from directly replacing DRAM in DNN applications.
In this paper, we examine how DNN accelerators can be designed to overcome these limitations and how emerging memories can be used for off-chip storage. We demonstrate that through (a) careful mapping of the DNN computation to the accelerator and (b) a hybrid memory setup (combining DRAM with an emerging memory), we can reduce inference energy over a DRAM-only design by a factor ranging from 1.12× on EfficientNet-B7 to 6.3× on ResNet-50, while also extending the memory lifetime from two weeks to over a decade. Because the energy benefits vary dramatically across DNN models, we also develop a simple analytical heuristic, based solely on DNN model parameters, that predicts the suitability of a given DNN for emerging-memory-based accelerators.

Keywords: Machine Learning, Convolutional Neural Networks, Non-Volatile Memories, PCM, RRAM, MRAM.
