TruLook: A Framework for Configurable GPU Approximation
Ricardo Garcia¹, Fatemeh Asgarinejad¹, Behnam Khaleghi¹, Tajana Rosing¹ and Mohsen Imani²
¹University of California San Diego
²University of California Irvine
m.imani@uci.edu
ABSTRACT
In this paper, we propose TruLook, a framework that applies approximate computing techniques to GPU acceleration. TruLook combines computation reuse with approximate arithmetic operations to eliminate redundant and unnecessary exact computations. To enable computation reuse, the GPU is enhanced with small lookup tables, placed close to the stream cores, that return previously computed values on exact and potentially inexact matches. Inexact matching is subject to a threshold controlled by the number of mantissa bits involved in the search. Approximate arithmetic is provided by a configurable approximate multiplier that dynamically detects and approximates operations whose results are not significantly affected by approximation. TruLook guarantees the accuracy bound required by an application by configuring the hardware at runtime. We have evaluated the efficiency of TruLook on a wide range of multimedia and deep learning applications. Our evaluation shows that, with quality-loss budgets of 0% and less than 1%, TruLook yields on average 2.1× and 5.6× energy-delay product improvement, respectively, over four popular networks on the ImageNet dataset.
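The mantissa-controlled inexact matching described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the TruLook hardware design: the names `ReuseTable` and `masked_key`, the table capacity, the FIFO eviction policy, and the choice of single-precision multiplication as the reused operation are all hypothetical and chosen for illustration. The only idea taken from the text is that a lookup compares operands on the sign, the exponent, and a configurable number of leading mantissa bits, so that fewer mantissa bits give a looser match.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def masked_key(x: float, mantissa_bits: int) -> int:
    """Keep sign, exponent, and the top `mantissa_bits` of the 23-bit mantissa."""
    if not 0 <= mantissa_bits <= 23:
        raise ValueError("mantissa_bits must be between 0 and 23")
    drop = 23 - mantissa_bits            # low-order mantissa bits to ignore
    mask = (0xFFFFFFFF >> drop) << drop  # zero out the ignored bits
    return float_to_bits(x) & mask


class ReuseTable:
    """Hypothetical software model of a small reuse table for multiplications."""

    def __init__(self, mantissa_bits: int = 23, capacity: int = 16):
        self.mantissa_bits = mantissa_bits  # 23 means exact matching only
        self.capacity = capacity            # assumed table size
        self.entries = {}                   # insertion-ordered, used for FIFO eviction
        self.hits = 0
        self.misses = 0

    def multiply(self, a: float, b: float) -> float:
        key = (masked_key(a, self.mantissa_bits),
               masked_key(b, self.mantissa_bits))
        if key in self.entries:
            # Reuse the stored result, even if the operands differ in ignored bits.
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        result = a * b                      # fall back to the exact computation
        if len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))  # evict the oldest entry
        self.entries[key] = result
        return result


if __name__ == "__main__":
    table = ReuseTable(mantissa_bits=8)
    table.multiply(1.5, 2.0)       # miss: computed and stored
    table.multiply(1.5001, 2.0)    # hit: operands agree on the top 8 mantissa bits
    print(table.hits, table.misses)
```

In this model, lowering `mantissa_bits` raises the hit rate at the cost of larger reuse error, which is the kind of knob a runtime configuration step could tune against an application's quality-loss budget.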