A Runtime Reconfigurable Design of Compute-in-Memory based Hardware Accelerator
Anni Lu, Xiaochen Peng, Yandong Luo, Shanshi Huang, and Shimeng Yu^a
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA
^a shimeng.yu@ece.gatech.edu
ABSTRACT
Compute-in-memory (CIM) is an attractive solution to the “memory wall” challenge posed by the extensive computation in machine learning hardware accelerators. Prior CIM-based architectures can adapt to different neural network models at design time, but each design is implemented as a different custom chip, so a given chip instance is restricted to a specific network at runtime. Meanwhile, the hardware development cycle normally lags far behind the emergence of new algorithms. In this paper, a runtime reconfigurable design methodology for CIM-based accelerators is proposed to support a class of convolutional neural networks running on one pre-fabricated chip instance. First, several design aspects are investigated: 1) the reconfigurable weight mapping method; 2) the input side of data transmission, mainly weight reloading; and 3) the output side of data processing, mainly reconfigurable accumulation. Then, a system-level performance benchmark is performed for the inference of different models, such as VGG-8 on the CIFAR-10 dataset and AlexNet, GoogLeNet, ResNet-18, and DenseNet-121 on the ImageNet dataset, to measure the tradeoffs among runtime reconfigurability, chip area, memory utilization, throughput, and energy efficiency.
Keywords: Convolutional Neural Network, Hardware Accelerator, Compute-In-Memory, Reconfigurable Architecture