DeepNVM: A Framework for Modeling and Analysis of Non-Volatile Memory Technologies for Deep Learning Applications

Ahmet Fatih Inci (a), Mehmet Meric Isgenc (b), and Diana Marculescu (c)

Carnegie Mellon University, Department of Electrical and Computer Engineering, Pittsburgh, PA, USA
(a) ainci@andrew.cmu.edu
(b) misgenc@andrew.cmu.edu
(c) dianam@cmu.edu

ABSTRACT

Non-volatile memory (NVM) technologies such as spin-transfer torque magnetic random access memory (STT-MRAM) and spin-orbit torque magnetic random access memory (SOT-MRAM) have significant advantages compared to conventional SRAM due to their non-volatility, higher cell density, and scalability features. While previous work has investigated several architectural implications of NVM for generic applications, in this work we present DeepNVM, a framework to characterize, model, and analyze NVM-based caches in GPU architectures for deep learning (DL) applications by combining technology-specific circuit-level models and the actual memory behavior of various DL workloads. We present both iso-capacity and iso-area performance and energy analysis for systems whose last-level caches rely on conventional SRAM and emerging STT-MRAM and SOT-MRAM technologies. In the iso-capacity case, STT-MRAM and SOT-MRAM provide up to 4.2× and 5× energy-delay product (EDP) reduction and 2.4× and 3× area reduction compared to conventional SRAM, respectively. Under iso-area assumptions, STT-MRAM and SOT-MRAM provide 2.3× EDP reduction on average across all workloads when compared to SRAM. Our comprehensive cross-layer framework is demonstrated on STT-/SOT-MRAM technologies and can be used for the characterization, modeling, and analysis of any NVM technology for last-level caches in GPU platforms for deep learning applications.


