On the Design of High Performance HW Accelerator through High-level Synthesis Scheduling Approximations

Siyuan Xu (a) and Benjamin Carrion Schafer (b)

Department of Electrical and Computer Engineering, The University of Texas at Dallas, TX, USA
(a) siyuan.xu@utdallas.edu
(b) schaferb@utdallas.edu

ABSTRACT

High-level synthesis (HLS) takes as input a behavioral description (e.g., C/C++) and generates efficient hardware through three main steps: allocation, scheduling, and binding. The scheduling step times the operations in the behavioral description by assigning different portions of the code to unique clock steps (control steps). The code portions assigned to each clock step depend mainly on the target synthesis frequency and the target technology. This work exploits this dependence to generate smaller and faster circuits in two ways: by approximating the program portions scheduled in each clock step, and by exploiting the slack between different scheduling steps to further increase the performance (reduce the latency) of the resultant circuit. In particular, each individual scheduling step is approximated given a maximum error bound and a library of different approximation techniques. To further optimize the resultant circuit, different scheduling steps are then merged based on the timing slack of the individual control steps, without violating the given timing constraint (target frequency). Experimental results from different domain-specific applications show that our method increases the throughput by 82% on average while at the same time reducing the area by 21% for a given maximum allowable error.

Keywords: Approximate Computing, High-level Synthesis, Machine Learning, Behavioral Hardware Accelerators
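
The slack-based merging of control steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `merge_control_steps`, the delay values, and the greedy strategy are all assumptions made for illustration. It further assumes that the steps form a serial chain, so the delay of a merged step is the sum of the delays of its members, and that approximation has already shortened each step's critical path.

```python
from typing import List


def merge_control_steps(delays_ns: List[float],
                        clock_period_ns: float) -> List[List[int]]:
    """Greedily merge consecutive control steps (hypothetical helper).

    delays_ns[i] is the assumed critical-path delay of control step i.
    Consecutive steps are chained into one clock cycle as long as their
    summed delay still meets the clock period (the timing constraint).
    Returns the groups of original step indices that share a cycle.
    """
    if any(d > clock_period_ns for d in delays_ns):
        raise ValueError("a control step already violates the timing constraint")

    merged: List[List[int]] = []
    current: List[int] = []
    current_delay = 0.0
    for step, delay in enumerate(delays_ns):
        # Close the current group when adding this step would exceed the period.
        if current and current_delay + delay > clock_period_ns:
            merged.append(current)
            current, current_delay = [], 0.0
        current.append(step)
        current_delay += delay
    if current:
        merged.append(current)
    return merged


# Made-up post-approximation delays for five control steps, 10 ns clock period.
groups = merge_control_steps([3.0, 4.0, 6.0, 2.0, 2.5], clock_period_ns=10.0)
print(groups)  # [[0, 1], [2, 3], [4]]
```

With these made-up numbers the schedule shrinks from five control steps to three while every merged step still fits within the 10 ns period. A real HLS flow would take delays from the technology library and would have to respect data dependencies and resource constraints, which this sketch ignores.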
