doi: 10.3850/978-3-9815370-4-8_0348


AHEAD: Automated Framework for Hardware Accelerated Iterative Data Analysis


Ebrahim M. Songhori^a, Azalia Mirhoseini^b, Xuyang Lu^c and Farinaz Koushanfar^d

Rice University, Houston, Texas, USA.

^a ebrahim@rice.edu
^b azalia@rice.edu
^c xl27@rice.edu
^d farinaz@rice.edu

ABSTRACT

This paper introduces AHEAD, a novel domain-specific framework for automated (hardware-based) acceleration of massive data analysis applications with a dense (non-sparse) correlation matrix. Because matrix inversion does not scale, such applications often rely on iterative computation to converge to a solution. AHEAD addresses two sets of domain-specific matrix computation challenges. The first is the I/O and memory bandwidth constraints that limit the performance of hardware accelerators. The second is the difficulty of handling large data, which stems from the complexity of the known matrix transformations and the inseparability of non-sparse correlations. The inseparability problem translates to an increased communication cost with the accelerators. To optimize the performance within these limits, AHEAD learns the dependency structure of the domain data and suggests a scalable matrix transformation. The transformation minimizes the memory access required for matrix computation within an error threshold and thus optimizes the mapping of domain data to the available (bandwidth-constrained) accelerator resources. To facilitate automation, AHEAD also provides an Application Programming Interface (API) so that users can customize the framework to an arbitrary iterative analysis algorithm and hardware mapping. A proof-of-concept implementation of AHEAD is carried out on the widely used compressive sensing and general ℓ1-regularized least squares solvers. On a massive light field imaging data set with 4.6B non-zeros, AHEAD attains up to 320x iteration speed improvement using reconfigurable hardware accelerators compared with the conventional solver, and about 4x improvement compared with our transformed matrix solver on a general-purpose processor (without hardware acceleration).

Keywords: Iterative solver, Gram matrix, Least squares, FPGAs, Sparse approximation, FISTA, HLS, Dense matrix, API.
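
For readers unfamiliar with the class of solvers named in the abstract and keywords, the following is a minimal sketch of a textbook FISTA iteration for ℓ1-regularized least squares, min 0.5*||Ax - y||^2 + lam*||x||_1, written around a precomputed dense Gram matrix G = A^T A. It is not the AHEAD framework, its API, or its matrix transformation. The function names (fista_gram, soft_threshold, matvec), the pure-Python list representation, and the step-size bound taken from the largest absolute row sum of G are all assumptions made for illustration. The sketch only shows why each iteration is dominated by a dense matrix-vector product with G, which is the memory-bound step the abstract refers to.

```python
def matvec(M, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied element-wise."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def fista_gram(A, y, lam, iters=200):
    """Textbook FISTA for min 0.5*||Ax - y||^2 + lam*||x||_1 (illustrative only)."""
    n = len(A[0])
    At = [list(col) for col in zip(*A)]  # A^T, one row per column of A
    # Dense Gram matrix G = A^T A and vector b = A^T y, computed once.
    G = [[sum(p * q for p, q in zip(ci, cj)) for cj in At] for ci in At]
    b = matvec(At, y)
    # Assumed step-size rule: the largest absolute row sum of the symmetric
    # matrix G is an upper bound on its largest eigenvalue.
    L = max(sum(abs(g) for g in row) for row in G)
    x = [0.0] * n
    if L == 0.0:
        return x  # A is all zeros, so x = 0 is optimal
    z = x[:]
    t = 1.0
    for _ in range(iters):
        # Gradient of the smooth term at z: G z - b (the dense, memory-bound step).
        grad = [gz - bi for gz, bi in zip(matvec(G, z), b)]
        x_new = soft_threshold([zi - gi / L for zi, gi in zip(z, grad)], lam / L)
        # Momentum update of the accelerated scheme.
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        z = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

With A equal to the identity, the minimizer is the soft-thresholded observation vector, which gives a quick sanity check of the sketch: fista_gram([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], 1.0) should return a vector close to [2.0, 0.0].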


