Approximate Associative Memristive Memory for Energy-Efficient GPUs
Abbas Rahimi1,a, Amirali Ghofrani2,c, Kwang-Ting Cheng2,d, Luca Benini3,4 and Rajesh K. Gupta1,b
1CSE, UC San Diego, La Jolla, CA 92093, USA.
2ECE, UC Santa Barbara, Santa Barbara, CA 93111, USA.
3DEI, University of Bologna, 40136 Bologna, Italy.
4IIS, Swiss Federal Institute of Technology, 8092 Zurich, Switzerland.
Multimedia applications running on thousands of deep and wide pipelines that work concurrently in GPUs have been an important target for power minimization at both the architectural and algorithmic levels. At the hardware level, energy-efficiency techniques that employ voltage overscaling face a barrier, the so-called “path walls”: reducing the operating voltage beyond a certain point generates a massive number of timing errors that are impractical to tolerate. We propose an architectural innovation, the approximate associative memristive memory (A2M2) module, which exhibits few, tolerable timing errors under voltage overscaling and is therefore suitable for GPU applications. An A2M2 module is integrated with every floating-point unit (FPU) and performs part of the functionality of the associated FPU by pre-storing high-frequency patterns for computational reuse, which avoids the overhead of re-execution. The voltage-overscaled A2M2 is designed to match an input search pattern with any of the stored patterns within a Hamming distance range of 0–2. This matching behavior under voltage overscaling leads to controllable approximate computing for multimedia applications. Our experimental results for the AMD Southern Islands GPU show that four image processing kernels tolerate the mismatches during pattern matching, resulting in a PSNR ≥ 30 dB. The A2M2 module with 8 rows enables 28% voltage overscaling in 45 nm technology, resulting in a 32% average energy saving for the kernels while delivering an acceptable quality of service.