Knowledge Distillation and Gradient Estimation for Active Error Compensation in Approximate Neural Networks
Cecilia De la Parra (1,a), Xuyi Wu (2,a), Andre Guntoro (1,b) and Akash Kumar (2,b)
(1) Robert Bosch GmbH, Renningen, Germany
(a) cecilia.delaparra@de.bosch.com
(b) andre.guntoro@de.bosch.com
(2) Technische Universität München, Munich, Germany
(a) xuyi.wu@tum.de
(b) akash.kumar@tu-dresden.de
ABSTRACT
Approximate computing is a promising approach for reducing the computational resources required by error-resilient applications such as Convolutional Neural Networks (CNNs). However, such approximations introduce an error that must be compensated by optimization methods, which typically include a retraining or fine-tuning stage. To recover efficiently from this error, the fine-tuning process needs to be adapted to take the CNN approximations into account. In this work, we present a novel methodology for fine-tuning approximate CNNs with ultra-low bit-width quantization and large approximation error, which combines knowledge distillation and gradient estimation to recover the accuracy lost to approximation. With our proposed methodology, we demonstrate energy savings of up to 38% in complex approximate CNNs with 4-bit weights and 8-bit activations, with less than 3% accuracy loss relative to the full-precision model.
Keywords: Approximate Computing, Neural Networks, Quantization, Approximate Multipliers.