QSLC: Quantization-Based, Low-Error Selective Approximation for GPUs
Sohan Lal, Jan Lucas and Ben Juurlink
Technische Universität Berlin, Germany
ABSTRACT
GPUs use a large memory access granularity (MAG) that often results in a low effective compression ratio for memory compression techniques. The low effective compression ratio is caused by a significant fraction of compressed blocks that have a few bytes above a multiple of MAG. While MAG-aware selective approximation, based on a tree structure, has been used to increase the effective compression ratio and the performance gain, approximation results in a high error that is reduced by using complex optimizations. We propose a simple quantizationbased approximation technique (QSLC) that can also selectively approximate a few bytes above MAG. While the quantizationbased approximation technique has a similar performance to the state-of-the-art tree-based selective approximation, the average error for the quantization-based technique is 5× lower. We further trade-off the two techniques and show that the area and power overhead of the quantization-based technique is 12.1× and 7.6× lower than the state-of-the-art, respectively. Our sensitivity analysis to different block sizes further shows the opportunities and the significance of MAG-aware selective approximation.