Data Subsetting: A Data-Centric Approach to Approximate Computing
Younghoon Kim1,a, Swagath Venkataramani2, Nitin Chandrachoodan3 and Anand Raghunathan1,b
1School of Electrical and Computer Engineering, Purdue University
a kim1606@purdue.edu
b raghunathan@purdue.edu
2IBM T. J. Watson Research Center
swagath.venkataramani@ibm.com
3Indian Institute of Technology, Madras
nitin@ee.iitm.ac.in
ABSTRACT
Approximate Computing (AxC), which leverages the intrinsic resilience of applications to approximations in their underlying computations, has emerged as a promising approach to improving computing system efficiency. Most prior efforts in AxC take a compute-centric approach and approximate arithmetic or other compute operations through design techniques at different levels of abstraction. However, emerging workloads such as machine learning, search and data analytics process large amounts of data and are significantly limited by the memory subsystems of modern computing platforms.
In this work, we shift the focus of approximations from computations to data, and propose a data-centric approach to AxC, which can boost the performance of memory-subsystem-limited applications. The key idea is to modulate the application's data accesses in a manner that reduces off-chip memory traffic. Specifically, we propose a data-access approximation technique called data subsetting, in which all accesses to a data structure are redirected to a subset of its elements so that the overall footprint of memory accesses is decreased. We realize data subsetting in a manner that is transparent to hardware and requires only minimal changes to application software. Recognizing that most applications of interest represent and process data as multidimensional arrays or tensors, we develop a templated data structure called SubsettableTensor that embodies mechanisms to define the accessible subset and to suitably redirect accesses to elements outside the subset. As a further optimization, we observe that data subsetting may cause some computations to become redundant and propose a mechanism for application software to identify and eliminate such computations. We implement SubsettableTensor as a C++ class and evaluate it using parallel software implementations of 7 machine learning applications on a 48-core AMD Opteron server. Our experiments indicate that data subsetting enables 1.33×–4.44× performance improvement with <0.5% loss in application-level quality, underscoring its promise as a new approach to approximate computing.
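To make the redirection idea concrete, the following is a minimal one-dimensional sketch, not the SubsettableTensor implementation evaluated in this work. The class name SubsettableTensor1D, its member functions, and the modulo-based redirection policy are all assumptions chosen for illustration; the abstract does not specify how the subset is defined or how out-of-subset accesses are mapped.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 1-D illustration of data subsetting. The interface and the
// modulo redirection policy are assumptions, not the paper's actual design.
template <typename T>
class SubsettableTensor1D {
public:
    // Takes ownership of the data; initially the subset is the whole array,
    // so every access is exact. Assumes a non-empty array.
    explicit SubsettableTensor1D(std::vector<T> data)
        : data_(std::move(data)), subset_size_(data_.size()) {
        assert(!data_.empty());
    }

    // Restrict the accessible subset to the first n elements.
    void set_subset_size(std::size_t n) {
        assert(n > 0 && n <= data_.size());
        subset_size_ = n;
    }

    // Logical size seen by the application is unchanged by subsetting.
    std::size_t size() const { return data_.size(); }

    // Read access: indices outside the subset are redirected into it
    // (here by modulo), so only subset_size_ elements are ever touched.
    const T& operator[](std::size_t i) const {
        assert(i < data_.size());
        return data_[i % subset_size_];
    }

    // True if index i lies outside the subset and is therefore redirected.
    // An application could use this to detect work that may be redundant.
    bool is_redirected(std::size_t i) const { return i >= subset_size_; }

private:
    std::vector<T> data_;        // declared first: initialized before subset_size_
    std::size_t subset_size_;
};
```

In this sketch the application code keeps indexing over the full logical range, while the memory footprint of reads shrinks to the first subset_size_ elements. The is_redirected query hints at how software might identify computations whose inputs are all duplicates of in-subset data, although the mechanism proposed in this work may differ.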