ACOUSTIC: Accelerating Convolutional Neural Networks through Or-Unipolar Skipped Stochastic Computing

Wojciech Romaszkan, Tianmu Li, Tristan Melton, Sudhakar Pamarti and Puneet Gupta

Department of Electrical and Computer Engineering, University of California, Los Angeles, Los Angeles, USA
wromaszkan@ucla.edu
litianmu1995@ucla.edu
tristanmelton@ucla.edu
spamarti@ucla.edu
puneetg@ucla.edu

ABSTRACT

As privacy and latency requirements force a move towards edge Machine Learning inference, resource-constrained devices are struggling to cope with large and computationally complex models. For Convolutional Neural Networks, those limitations can be overcome by taking advantage of enormous data reuse opportunities and amenability to reduced precision. Doing so, however, requires a level of compute density unattainable with conventional binary arithmetic. Stochastic Computing can deliver such density, but it has not lived up to its full potential because of multiple underlying precision issues. We present ACOUSTIC: Accelerating Convolutions through Or-Unipolar Skipped sTochastIc Computing, an accelerator framework that enables fully stochastic, high-density CNN inference. Leveraging split-unipolar representation, OR-based accumulation, and a novel computation-skipping approach, ACOUSTIC delivers server-class parallelism within a mobile area and power budget: a 12 mm² accelerator can be as much as 38.7x more energy efficient and 72.5x faster than conventional fixed-point accelerators. It can also be up to 79.6x more energy efficient than state-of-the-art stochastic accelerators. At the lower end, ACOUSTIC achieves an 8x-120x inference throughput improvement with similar energy and area compared to recent mixed-signal and neuromorphic accelerators.
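For context, stochastic computing encodes a value as the probability of observing a 1 in a random bitstream, which lets multiplication and (approximate) accumulation collapse into single logic gates. The following Python sketch is purely illustrative, assuming a software simulation with NumPy; the stream length, seed, and helper names are our own, not the paper's hardware design. It shows unipolar multiplication via AND, OR-based accumulation, and a split-unipolar encoding of signed values, the three primitives named in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, length=1024):
    """Encode a unipolar value p in [0, 1] as a stochastic
    bitstream: each bit is 1 with probability p."""
    return rng.random(length) < p

def from_stream(stream):
    """Decode a bitstream back to a value: the fraction of 1s."""
    return stream.mean()

# Unipolar multiplication: AND of two independent streams,
# since P(a AND b) = pa * pb for independent bits.
a, b = to_stream(0.4), to_stream(0.3)
print(from_stream(a & b))            # ~0.12

# OR-based accumulation: P(a OR b) = pa + pb - pa*pb, which
# approximates pa + pb when the operand values are small.
x, y = to_stream(0.05), to_stream(0.08)
print(from_stream(x | y))            # ~0.126 vs. exact 0.13

def split_unipolar(v):
    """Split-unipolar encoding of a signed value v in [-1, 1]:
    one unipolar stream for the positive part, one for the
    negative part; the decoded result is pos - neg."""
    return to_stream(max(v, 0.0)), to_stream(max(-v, 0.0))

pos, neg = split_unipolar(-0.25)
print(from_stream(pos) - from_stream(neg))   # ~-0.25
```

Note that the OR sum saturates: P(a OR b) = pa + pb - pa*pb, so it matches true addition only when operand values are small, which is the precision issue that accelerator designs using OR accumulation must manage.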

Keywords: Neural nets, Hardware architecture


