LRP: Predictive Output Activation Based on SVD Approach for CNNs Acceleration

Xinxin Wu1,2, Zhihua Fan1,2, Tianyu Liu1,2, Wenming Li1, Xiaochun Ye1 and Dongrui Fan1,2
1State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences
2School of Computer and Control Engineering, University of Chinese Academy of Sciences
Emails: wuxinxin@ict.ac.cn, fanzhihua@ict.ac.cn, liutianyu@ict.ac.cn, liwenming@ict.ac.cn, yexiaochun@ict.ac.cn, fandr@ict.ac.cn

ABSTRACT

Convolutional Neural Networks (CNNs) achieve state-of-the-art performance in a wide range of applications. However, CNNs contain millions of parameters, and their heavy computational load challenges hardware design. In this paper, we exploit the output activation sparsity of CNNs to reduce the execution time and energy consumption of the network. We propose Low Rank Prediction (LRP), an effective prediction method that leverages output activation sparsity. LRP first predicts the polarity of each convolutional layer's output activations using a singular value decomposition (SVD) of the convolution kernels. It then uses the predicted negative values to skip the corresponding ineffectual computations in the original convolution. In addition, we propose an effective accelerator, LRPPU, that exploits this sparsity to accelerate network inference. Experiments show that LRPPU achieves 1.48× speedup and 2.02× energy reduction over dense networks with only a slight loss of accuracy. It also achieves on average 2.57× speedup over Eyeriss, and offers similar performance with less accuracy loss compared with SnaPEA.
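The prediction idea can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's implementation): a convolution kernel is truncated to a low-rank SVD approximation, the cheap approximation estimates the pre-activation sign, and the full computation is skipped whenever a negative (ReLU-zeroed) output is predicted. The shapes, rank choice, and threshold here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a single k x k kernel applied to one input patch.
k = 5
kernel = rng.standard_normal((k, k))
patch = rng.standard_normal((k, k))

# Truncate the kernel to its rank-1 SVD approximation.
U, S, Vt = np.linalg.svd(kernel)
r = 1  # prediction rank: a tunable accuracy/cost trade-off
kernel_lr = (U[:, :r] * S[:r]) @ Vt[:r, :]

# Cheap prediction of the pre-activation value. (Shown as a full product
# for clarity; a rank-1 kernel is an outer product, so in practice the
# factors can be applied as two separable 1-D filters at lower cost.)
predicted = float(np.sum(kernel_lr * patch))

if predicted <= 0:
    output = 0.0  # predicted negative: ReLU would zero it, so skip the exact conv
else:
    output = max(float(np.sum(kernel * patch)), 0.0)  # compute exactly, then ReLU
```

Because ReLU discards negative pre-activations anyway, a correct negative prediction costs nothing in the final output; accuracy loss arises only when the low-rank estimate mispredicts the sign near zero.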

Keywords: Convolutional Neural Networks, Output activation, SVD approach, Prediction, Sparsity.
