Modular RRAM-Based In-Memory Computing Design for Embedded AI

Xinxin Wang, Qiwen Wang, Mohammed A. Zidan, Fan-Hsuan Meng, John Moon and Wei Lu
University of Michigan, US

ABSTRACT

Deep neural networks (DNNs) are widely and successfully used in many artificial intelligence applications. However, they often come with high computational cost and complexity, making hardware accelerators crucial for improving energy efficiency and throughput, particularly in embedded AI applications. Resistive random-access memory (RRAM) has the potential to enable efficient AI accelerator implementations, since weights can be mapped to the conductance values of RRAM devices and computation can be performed directly in-memory. Specifically, by converting input activations into voltage pulses, vector-matrix multiplications (VMMs) can be performed in the analog domain, in place and in parallel. Moreover, the whole model can be stored on-chip, eliminating off-chip DRAM access completely and achieving high energy efficiency during end-to-end operation. In this presentation, we will discuss how practical DNN models can be mapped onto realistic RRAM arrays in a modular design. The effects of quantization, finite array size, and device non-idealities on system performance will be analyzed using standard DNN models such as VGG-16 and MobileNet. System performance metrics such as throughput and energy per image will also be discussed.
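To make the mapping concrete, below is a minimal NumPy sketch of the analog VMM scheme described above: weights are quantized onto a finite number of conductance levels using a differential (positive/negative device pair) encoding, activations are applied as read-voltage pulses, and a weight matrix larger than one crossbar is tiled across finite-size arrays whose partial column currents are accumulated. All names and parameter values here (ARRAY_SIZE, N_LEVELS, the conductance range G_MIN/G_MAX, V_READ) are illustrative assumptions for this sketch, not values from the presentation.

```python
import numpy as np

# Illustrative parameters -- assumptions for this sketch, not values from the paper.
ARRAY_SIZE = 128            # rows of one RRAM crossbar tile
N_LEVELS = 16               # discrete conductance levels per device (~4 bits)
G_MIN, G_MAX = 1e-6, 1e-4   # device conductance range in siemens (hypothetical)
V_READ = 0.2                # read-pulse amplitude in volts (hypothetical)

def weights_to_conductance(w):
    """Map a real-valued weight matrix onto quantized device conductances.

    Positive and negative weights go to two separate device arrays
    (differential encoding), each quantized to N_LEVELS levels.
    """
    scale = float(np.max(np.abs(w))) + 1e-12
    w_norm = w / scale                                  # normalize to [-1, 1]
    quantize = lambda a: np.round(a * (N_LEVELS - 1)) / (N_LEVELS - 1)
    to_g = lambda a: G_MIN + quantize(a) * (G_MAX - G_MIN)
    return to_g(np.clip(w_norm, 0, None)), to_g(np.clip(-w_norm, 0, None)), scale

def crossbar_vmm(x, w):
    """Analog in-memory VMM: column currents I = V @ G (Ohm's/Kirchhoff's laws).

    A weight matrix taller than one array is tiled along its rows; the
    partial output currents of the tiles are accumulated digitally.
    """
    g_pos, g_neg, scale = weights_to_conductance(w)
    v = x * V_READ                                      # activations as voltage pulses
    out = np.zeros(w.shape[1])
    for r in range(0, w.shape[0], ARRAY_SIZE):          # one crossbar tile per row block
        vs = v[r:r + ARRAY_SIZE]
        out += vs @ g_pos[r:r + ARRAY_SIZE]             # + array column currents
        out -= vs @ g_neg[r:r + ARRAY_SIZE]             # - array column currents
    # Undo voltage/conductance scaling; G_MIN cancels in the differential pair.
    return out * scale / (V_READ * (G_MAX - G_MIN))

# Compare against an ideal floating-point VMM to expose the quantization error.
rng = np.random.default_rng(0)
w = rng.standard_normal((512, 64))
x = rng.standard_normal(512)
ideal, analog = x @ w, crossbar_vmm(x, w)
print(f"relative error from quantized weights: "
      f"{np.linalg.norm(analog - ideal) / np.linalg.norm(ideal):.3f}")
```

Two details mirror the challenges named in the abstract: the differential pair makes the G_MIN offset cancel in the subtraction (one simple way to represent signed weights with non-negative conductances), and the row-block loop is where finite array size enters, since each additional tile adds readout and accumulation overhead of the kind the presentation analyzes for VGG-16 and MobileNet.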