An Approach to Improve Performance Using Bit-level Sparsity in Neural Networks
Yesung Kang, Eunji Kwon, Seunggyu Lee, Younghoon Byun, Youngjoo Lee and Seokhyeong Kang
EE Department, POSTECH, South Korea
ashkang@postech.ac.kr
ABSTRACT
This paper presents a convolutional neural network (CNN) accelerator that achieves speedup and improved energy efficiency by skipping zero weights and handling outliers, which are few in number but have a significant impact on CNN accuracy. We propose an offline weight-scheduling algorithm that exploits the bit-level sparsity of CNNs to skip zero weights and combine two non-outlier weights into a single operation. We use a reconfigurable multiplier-and-accumulator (MAC) unit for two purposes: it usually computes two combined non-outlier weights and occasionally computes outliers. We further improve the speedup of our accelerator by clipping some of the outliers with negligible accuracy loss. Compared to the DaDianNao [7] and Bit-Tactical [16] architectures, our CNN accelerator improves speed by 3.34 and 2.31 times and reduces energy consumption by 29.3% and 30.2%, respectively.
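To make the scheduling idea concrete, the sketch below pairs quantized weights by their essential-bit counts. This is a minimal illustration, not the paper's actual algorithm: the function names (essential_bits, schedule_weights, clip_to_top_bits), the 8-bit weight width, and the bit budget are all illustrative assumptions.

```python
def essential_bits(w, width=8):
    # Number of nonzero bits in the magnitude of a quantized weight
    # (its bit-level sparsity); zero weights have no essential bits.
    return bin(abs(int(w)) & ((1 << width) - 1)).count("1")

def clip_to_top_bits(w, k):
    # Keep only the k highest-order nonzero bits of |w|, preserving sign
    # (a sketch of outlier clipping; the paper's clipping rule may differ).
    mag, kept, bit = abs(int(w)), 0, 1 << 31
    while bit and k:
        if mag & bit:
            kept |= bit
            k -= 1
        bit >>= 1
    return kept if w >= 0 else -kept

def schedule_weights(weights, bit_budget=4, clip_bits=None):
    """Offline scheduling sketch: skip zeros, pair non-outlier weights whose
    combined essential-bit count fits one reconfigurable MAC pass, and route
    the remaining outliers to dedicated (slower) MAC passes."""
    paired, outliers, pending = [], [], None
    for w in weights:
        if w == 0:
            continue                        # zero-weight skipping
        if essential_bits(w) > bit_budget // 2:
            outliers.append(w)              # too many bits to combine
            continue
        if pending is None:
            pending = w
        else:
            paired.append((pending, w))     # two non-outliers share one MAC op
            pending = None
    if pending is not None:
        paired.append((pending,))           # leftover non-outlier, unpaired
    if clip_bits is not None:
        outliers = [clip_to_top_bits(w, clip_bits) for w in outliers]
    return paired, outliers
```

For example, with a bit budget of 4, schedule_weights([0, 3, -2, 77]) skips the zero, pairs 3 (two essential bits) with -2 (one essential bit) into one MAC operation, and treats 77 (four essential bits) as an outlier.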