DASC: A DRAM Data Mapping Methodology for Sparse Convolutional Neural Networks

Bo-Cheng Lai, Tzu-Chieh Chiang, Po-Shen Kuo, Wan-Ching Wang, Yan-Lin Hung, Hung-Ming Chen, Chien-Nan Liu and Shyh-Jye Jou
Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
bclai@nycu.edu.tw
m56565566.ee08@nycu.edu.tw
kuoposhen@gmail.com
rockruby622@gmail.com
stevenhungs@gmail.com
hmchen@mail.nctu.edu.tw
jimmyliu@nctu.edu.tw
jerryjou@mail.nctu.edu.tw

ABSTRACT

Transferring the sheer volume of model data in a CNN (Convolutional Neural Network) has become one of the main performance challenges in modern intelligent systems. Although pruning can trim away a substantial number of non-effective neurons, the excessive DRAM accesses for the non-zero data of a sparse network still dominate overall system performance. Proper data mapping can enable efficient DRAM accesses for a CNN. However, previous DRAM mapping methods focus on dense CNNs and become less effective when handling the compressed formats and irregular accesses of sparse CNNs. The extensive design space search for mapping parameters also results in a time-consuming process. This paper proposes DASC, a DRAM data mapping methodology for sparse CNNs. DASC is designed to handle the data access patterns and block schedule of sparse CNNs to attain good spatial locality and efficient DRAM accesses. The bank-group feature of modern DDR is further exploited to enhance processing parallelism. DASC also introduces an analytical model that enables fast exploration and quick convergence of the parameter search, reducing it from days in previous work to minutes. Compared with the state-of-the-art, DASC decreases total DRAM latency and attains, on average, 17.1x, 14.3x, and 23.3x better DRAM performance for sparse AlexNet, VGG-16, and ResNet-50, respectively.

Keywords: Sparse CNN, DRAM, Data Mapping, Optimization, Design Space Exploration.
