MARVEL: A Vertical Resistive Accelerator for Low-Power Deep Learning Inference in Monolithic 3D

Fan Chen^a, Linghao Song^b, Hai “Helen” Li^c and Yiran Chen^d
Department of Electrical and Computer Engineering, Duke University, Durham, NC, U.S.A.
^a fan.chen@duke.edu
^b linghao.song@duke.edu
^c hai.li@duke.edu
^d yiran.chen@duke.edu

ABSTRACT


Resistive memory (ReRAM) based Deep Neural Network (DNN) accelerators have achieved state-of-the-art DNN inference throughput. However, the power efficiency of such resistive accelerators is greatly limited by their peripheral circuitry, including analog-to-digital converters (ADCs), digital-to-analog converters (DACs), SRAM registers, and eDRAM buffers. These power-hungry components consume 87% of the total system power, despite the high power efficiency of the ReRAM computing cores. In this paper, we propose MARVEL, a monolithic 3D stacked resistive DNN accelerator that consists of low-power ADCs/DACs, logic, and SRAM built from carbon nanotube field-effect transistors (CNFETs), together with high-density global buffers implemented in cross-point Spin Transfer Torque Magnetic RAM (STT-MRAM). To compensate for the inference throughput lost to the slow CNFET ADCs, we propose to integrate more ADC layers into MARVEL. Unlike CMOS-based ADCs, which can only be implemented in the bottom layer of the 3D structure, CNFET circuits can be implemented in multiple layers using a monolithic 3D stacking technique. Compared to prior ReRAM-based DNN accelerators, MARVEL achieves, on average, the same inference throughput with a 4.5× improvement in performance per Watt. We also demonstrate that increasing the number of integration layers enables MARVEL to achieve 2× the inference throughput with a 7.6× improvement in power efficiency.

Keywords: Accelerator, Monolithic 3D, DNNs, Inference.
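
The 87% peripheral share quoted in the abstract implies an Amdahl-style limit on how much power can be saved by improving the peripherals alone. The sketch below is a back-of-the-envelope illustration, not the paper's evaluation model: it assumes the ReRAM core power stays unchanged, throughput stays the same, and all peripheral power scales by one uniform factor. The function names and the uniform-scaling model are hypothetical.

```python
# Illustrative arithmetic only; the uniform-scaling model is an assumption,
# not something stated in the abstract.

def total_power_fraction(peripheral_share: float, peripheral_reduction: float) -> float:
    """Normalized system power after dividing peripheral power by a factor."""
    core_share = 1.0 - peripheral_share  # ReRAM cores, assumed unchanged
    return core_share + peripheral_share / peripheral_reduction


def required_peripheral_reduction(peripheral_share: float, target_gain: float) -> float:
    """Peripheral power reduction needed for a target perf/W gain at equal throughput."""
    core_share = 1.0 - peripheral_share
    target_power = 1.0 / target_gain  # equal throughput: gain = 1 / power
    if target_power <= core_share:
        raise ValueError("target is unreachable through peripheral savings alone")
    return peripheral_share / (target_power - core_share)


PERIPHERAL_SHARE = 0.87  # figure quoted in the abstract

# Upper bound at equal throughput if peripheral power dropped to zero.
max_gain = 1.0 / (1.0 - PERIPHERAL_SHARE)

# Reduction in peripheral power consistent with a 4.5x perf/W gain.
needed = required_peripheral_reduction(PERIPHERAL_SHARE, 4.5)

print(f"bound at equal throughput: {max_gain:.2f}x")
print(f"peripheral reduction for 4.5x: {needed:.2f}x")
```

Under these assumptions, a 4.5× gain at equal throughput corresponds to cutting peripheral power by roughly 9.4×. The bound applies only to the equal-throughput case, so it says nothing about configurations that also raise throughput.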


