TinyADC: Peripheral Circuit-aware Weight Pruning Framework for Mixed-signal DNN Accelerators

Geng Yuan (1,a), Payman Behnam (2), Yuxuan Cai (1,b), Ali Shafiee (3), Jingyan Fu (4,a), Zhiheng Liao (4,b), Zhengang Li (1,c), Xiaolong Ma (1,d), Jieren Deng (5,a), Jinhui Wang (6), Mahdi Bojnordi (7), Yanzhi Wang (1,e) and Caiwen Ding (5,b)
(1) Northeastern University
(a) yuan.geng@northeastern.edu
(b) cai.yuxu@northeastern.edu
(c) li.zhen@northeastern.edu
(d) ma.xiaol@northeastern.edu
(e) yanz.wang@northeastern.edu
(2) Georgia Institute of Technology
payman.behnam@gatech.edu
(3) Samsung
ali.shafiee@samsung.com
(4) North Dakota State University
(a) jingyan.fu@ndsu.edu
(b) zhiheng.liao@ndsu.edu
(5) University of Connecticut
(a) jieren.deng@uconn.edu
(b) caiwen.ding@uconn.edu
(6) University of South Alabama
jwang@southalabama.edu
(7) University of Utah
bojnordi@cs.utah.edu

ABSTRACT


As the number of weight parameters in deep neural networks (DNNs) continues to grow, the demand for ultra-efficient DNN accelerators has motivated research on non-traditional architectures built on emerging technologies. Resistive random-access memory (ReRAM) crossbars have been used to perform in-situ matrix-vector multiplication for DNNs. DNN weight pruning techniques have also been applied to ReRAM-based mixed-signal DNN accelerators, with a focus on reducing weight storage and accelerating computation. However, existing work accounts for very few characteristics of the peripheral circuits, such as analog-to-digital converters (ADCs), during neural network design. ADCs have become the dominant contributor to the power consumption and area cost of current mixed-signal accelerators, and the large overhead of these peripheral circuits has not been addressed efficiently. To address this problem, we propose TinyADC, a novel weight pruning framework for ReRAM-based mixed-signal DNN accelerators. TinyADC reduces the number of bits required for ADC resolution, and hence the overall area and power consumption of the accelerator, without introducing any computational inaccuracy. Compared to state-of-the-art pruning work on the ImageNet dataset, TinyADC reduces power by 3.5× and area by 2.9×. The TinyADC framework also improves the throughput of a state-of-the-art architecture design by 29% per unit area and by 40% per unit power (GOPs/(s×mm²) and GOPs/(s×W), respectively).
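To make the link between pruning and ADC resolution concrete, the following is a minimal sketch, not the authors' method. It assumes that an ADC must represent the largest possible column sum of a crossbar exactly, and that this sum is bounded by the number of nonzero weights in a column. The function `required_adc_bits` and its parameters (`active_rows`, `input_bits`, `cell_bits`) are hypothetical names introduced here for illustration; the abstract does not specify them.

```python
def required_adc_bits(active_rows: int, input_bits: int = 1, cell_bits: int = 2) -> int:
    """Bits needed to represent the worst-case column sum without loss.

    Assumption (not from the abstract): each of the `active_rows` nonzero
    cells in a column contributes at most (2**input_bits - 1) * (2**cell_bits - 1)
    to the accumulated value read by the ADC.
    """
    if active_rows < 0 or input_bits < 1 or cell_bits < 1:
        raise ValueError("active_rows must be >= 0 and bit widths must be >= 1")
    max_input = (1 << input_bits) - 1    # largest value driven on a row
    max_weight = (1 << cell_bits) - 1    # largest value stored in a cell
    max_sum = active_rows * max_input * max_weight
    # int.bit_length() equals ceil(log2(max_sum + 1)) for max_sum >= 1.
    return max(1, max_sum.bit_length())


if __name__ == "__main__":
    # Hypothetical crossbar column: dense versus pruned to a few nonzero rows.
    for rows in (128, 16, 8):
        print(rows, "active rows ->", required_adc_bits(rows), "ADC bits")
```

Under these assumptions, limiting the number of nonzero weights per crossbar column lowers the worst-case sum, so fewer ADC bits suffice with no loss of precision. The exact relation depends on the accelerator's input encoding and cell precision, which the abstract does not state.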


