CCR: A Concise Convolution Rule for Sparse Neural Network Accelerators

Jiajun Li, Guihai Yan, Wenyan Lu, Shuhao Jiang, Shijun Gong, Jingya Wu and Xiaowei Li
State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
{lijiajun, yan, luwenyan, jiangshuhao, gongshijun, wujingya, lxw}@ict.ac.cn

ABSTRACT


Convolutional neural networks (CNNs) have achieved great success in a broad range of applications. Because CNN-based methods are often both computation- and memory-intensive, sparse CNNs have emerged as an effective way to reduce computation and memory accesses while maintaining high accuracy. However, dense CNN accelerators can hardly benefit from these reductions because they lack support for irregular, sparse models. This paper proposes a concise convolution rule (CCR) to bridge the gap between sparse CNNs and dense CNN accelerators. CCR transforms a sparse convolution into multiple effective and ineffective convolutions. The ineffective convolutions, in which either the neurons or the synapses are all zeros, do not contribute to the final results, so their computations and memory accesses can be eliminated. The effective convolutions, in which both the neurons and the synapses are dense, map readily onto existing dense CNN accelerators. Unlike prior approaches, which trade complexity for flexibility, CCR reaps the benefits of reduced computation and memory accesses while retaining the acceleration of existing dense architectures, without intrusive PE modifications. As a case study, we implemented a sparse CNN accelerator, SparseK, following the rationale of CCR. Experiments show that SparseK achieves a 2.9× speedup on VGG16 over a comparably provisioned dense architecture. Compared with state-of-the-art sparse accelerators, SparseK improves performance and energy efficiency by 1.8× and 1.5×, respectively.
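The core idea of the abstract, skipping the "ineffective" work in which either the neurons or the synapses are all zeros, can be illustrated with a small software sketch. This is a toy model of the principle, not the paper's hardware design: the function names and the tap-wise decomposition below are our own illustration, and the result is checked against an ordinary dense convolution.

```python
import numpy as np

def dense_conv2d(x, w):
    # Baseline: valid 2-D convolution (cross-correlation) computed densely,
    # multiplying every neuron by every synapse regardless of zeros.
    H, W = x.shape
    K, _ = w.shape
    out = np.zeros((H - K + 1, W - K + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + K, j:j + K] * w)
    return out

def ccr_like_conv2d(x, w):
    # Illustrative CCR-style evaluation: decompose the convolution into
    # per-tap contributions and skip the ineffective ones, i.e. taps whose
    # synapse is zero or whose input slice is all zeros. The surviving
    # (effective) contributions are dense and accumulate as usual.
    H, W = x.shape
    K, _ = w.shape
    out = np.zeros((H - K + 1, W - K + 1))
    # Enumerate only the nonzero (effective) synapses once, up front.
    effective = [(p, q, w[p, q])
                 for p in range(K) for q in range(K) if w[p, q] != 0]
    for p, q, wv in effective:
        patch = x[p:p + out.shape[0], q:q + out.shape[1]]
        if np.any(patch):  # skip all-zero neuron slices as well
            out += wv * patch
    return out
```

With sparse inputs and weights, both functions return the same result, while the CCR-style version issues multiplies only for the effective combinations.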
