An Efficient Mapping Approach to Large-Scale DNNs on Multi-FPGA Architectures
Wentai Zhang1,a, Jiaxi Zhang1, Minghua Shen2,b, Guojie Luo1,3,c and Nong Xiao2
1Center for Energy-Efficient Computing and Applications, Peking University
2School of Data and Computer Science, Sun Yat-Sen University
3Peng Cheng Laboratory
archardx@gmail.com, bshenmh6@mail.sysu.edu.cn, cgluo@pku.edu.cn
ABSTRACT
FPGAs are attractive platforms for accelerating deep neural networks (DNNs). While a single FPGA can provide good performance for small-scale DNNs, support for large-scale DNNs is limited by their higher resource demands. In this paper, we propose an efficient mapping approach for accelerating large-scale DNNs on asymmetric multi-FPGA architectures. In this approach, the neural network mapping is formulated as a resource allocation problem, which we solve optimally with a dynamic programming-based partitioning. Experimental results on the large-scale ResNet-152 demonstrate that our approach, deployed across sixteen FPGAs, achieves a 16.4x improvement in GOPS over the state-of-the-art work.
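To make the dynamic programming-based partitioning mentioned in the abstract concrete, the following is a minimal sketch of one plausible reading of such a partitioner: it splits a chain of DNN layers into contiguous segments across FPGAs so that the slowest segment (the pipeline bottleneck) is minimized. The per-layer latency cost model, the min-max objective, and all function names here are illustrative assumptions, not the paper's actual formulation.

```python
# Sketch: dynamic-programming partitioning of a layer chain onto K FPGAs.
# The cost model (per-layer latency estimates) and the min-max objective are
# assumptions for illustration only.

def partition_layers(layer_latency, num_fpgas):
    """Split `layer_latency` (one estimate per layer) into `num_fpgas`
    contiguous segments, minimizing the slowest segment (pipeline bottleneck).
    Returns (bottleneck_latency, segments as (start, end) index pairs)."""
    n = len(layer_latency)
    # prefix[i] = total latency of layers 0 .. i-1
    prefix = [0.0] * (n + 1)
    for i, t in enumerate(layer_latency):
        prefix[i + 1] = prefix[i] + t
    seg = lambda i, j: prefix[j] - prefix[i]  # latency of layers i .. j-1

    INF = float("inf")
    # dp[k][i] = best achievable bottleneck when the first i layers use k FPGAs
    dp = [[INF] * (n + 1) for _ in range(num_fpgas + 1)]
    cut = [[0] * (n + 1) for _ in range(num_fpgas + 1)]
    dp[0][0] = 0.0
    for k in range(1, num_fpgas + 1):
        for i in range(1, n + 1):
            for j in range(k - 1, i):  # last FPGA takes layers j .. i-1
                cand = max(dp[k - 1][j], seg(j, i))
                if cand < dp[k][i]:
                    dp[k][i] = cand
                    cut[k][i] = j

    # Recover the optimal segments by walking the cut table backwards.
    segments, i = [], n
    for k in range(num_fpgas, 0, -1):
        j = cut[k][i]
        segments.append((j, i))
        i = j
    segments.reverse()
    return dp[num_fpgas][n], segments


if __name__ == "__main__":
    # Hypothetical per-layer latency estimates for a small network.
    latencies = [3.0, 1.5, 2.0, 4.0, 0.5, 2.5, 1.0, 3.5]
    bottleneck, parts = partition_layers(latencies, num_fpgas=3)
    print("bottleneck latency:", bottleneck)
    print("layer ranges per FPGA:", parts)
```

Because the layers form a chain and each FPGA receives a contiguous segment, the state space is small (number of FPGAs times number of layers) and the optimum under this cost model is found exactly, which is consistent with the abstract's claim of solving the resource allocation problem optimally.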