Accelerating Spatiotemporal Supervised Training of Large-Scale Spiking Neural Networks on GPU

Ling Liang1,a, Zhaodong Chen1,b, Lei Deng2,e, Fengbin Tu1,c, Guoqi Li2,f and Yuan Xie1,d
1Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA, USA
alingliang@ucsb.edu
bchenzd15thu@ucsb.edu
cfengbintu@ucsb.edu
dyuanxie@ucsb.edu
2Department of Precision Instrument, Tsinghua University, Beijing, China
eleideng@mail.tsinghua.edu.cn
fliguoqi@mail.tsinghua.edu.cn

ABSTRACT


Spiking neural networks (SNNs) have great potential to achieve brain-like intelligence; however, they suffer from the low accuracy of conventional synaptic plasticity rules and from low training efficiency on GPUs. Recently, emerging learning algorithms inspired by backpropagation through time (BPTT) have brought new opportunities to boost the accuracy of SNNs, but training on GPUs still remains inefficient due to the complex spatiotemporal dynamics and huge memory consumption, which restricts model exploration for SNNs and hinders the advance of neuromorphic computing.

In this work, we build a framework to solve the inefficiency of BPTT-based SNN training on modern GPUs. To reduce memory consumption, we optimize the dataflow by saving only the CONV/FC results in the forward pass and recomputing the other intermediate results in the backward pass. Then, we customize kernel functions to accelerate the neural dynamics in all training stages. Finally, we provide a PyTorch interface to make our framework easy to deploy in real systems. Compared to a vanilla PyTorch implementation, our framework achieves up to 2.13× end-to-end speedup and consumes only 0.41× peak memory on the CIFAR10 dataset. Moreover, for distributed training on the large ImageNet dataset, it achieves up to 1.81× end-to-end speedup and consumes only 0.38× peak memory.
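The memory-saving dataflow above can be sketched in a few lines. The idea is that only the CONV/FC outputs (the pre-synaptic inputs to each spiking layer) are kept in the forward pass, while the membrane potentials and spikes are regenerated from them during the backward pass. The following minimal NumPy sketch illustrates this for a standard leaky-integrate-and-fire (LIF) model with decay factor `TAU` and hard reset; the specific constants and function names are illustrative, not taken from the paper.

```python
import numpy as np

TAU, V_TH = 0.5, 1.0  # illustrative leak factor and firing threshold

def lif_dynamics(x):
    """Unroll LIF dynamics over T steps given pre-synaptic inputs x of
    shape [T, N]; return membrane potentials u and binary spikes s."""
    T, N = x.shape
    u = np.zeros((T, N))
    s = np.zeros((T, N))
    u_prev = np.zeros(N)
    s_prev = np.zeros(N)
    for t in range(T):
        u[t] = TAU * u_prev * (1.0 - s_prev) + x[t]  # leak, reset, integrate
        s[t] = (u[t] >= V_TH).astype(x.dtype)        # fire if above threshold
        u_prev, s_prev = u[t], s[t]
    return u, s

# Forward pass: run the dynamics but keep only x (the CONV/FC output);
# u and s can be discarded to save memory.
x_saved = np.random.RandomState(0).randn(4, 8)
u_fwd, s_fwd = lif_dynamics(x_saved)

# Backward pass: recompute u and s from the saved x before using them
# in the gradient computation. The recomputed states match exactly,
# because the dynamics are deterministic functions of x.
u_bwd, s_bwd = lif_dynamics(x_saved)
assert np.allclose(u_fwd, u_bwd) and np.array_equal(s_fwd, s_bwd)
```

This trades one extra unroll of the (cheap) neural dynamics in the backward pass for not storing the per-step states of every layer, which is where the peak-memory savings come from.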

Keywords: Neuromorphic Computing, Spiking Neural Network, GPU Optimization, Training Acceleration.


