Design Space Exploration of FPGA Accelerators for Convolutional Neural Networks

Atul Rahman1, Sangyun Oh2, Jongeun Lee2,a and Kiyoung Choi3
1Samsung Electronics, Suwon, South Korea
2School of Electrical and Computer Engineering, UNIST, Ulsan, South Korea
3Dept. of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea


The increasing use of machine learning algorithms, such as Convolutional Neural Networks (CNNs), makes the hardware accelerator approach very compelling. However, the question of how best to design an accelerator for a given CNN has not yet been answered, even at a very fundamental level. This paper addresses that challenge by providing a novel framework that can universally and accurately evaluate and explore various architectural choices for CNN accelerators on FPGAs. Our exploration framework covers a more extensive design space than any previous work, and to maximize performance it takes into account various FPGA resources, including DSP resources, on-chip memory, and off-chip memory bandwidth. Our experimental results, which use some of the largest CNN models, including one with 16 convolutional layers, demonstrate the efficacy of our framework as well as the need for such a high-level architecture exploration approach to find the best architecture for a CNN model.
