doi: 10.7873/DATE.2015.1123
Accelerating Arithmetic Kernels with Coherent Attached FPGA Coprocessors
Heiner Giefers (a), Raphael Polig (b) and Christoph Hagleitner (c)
IBM Research - Zurich, Switzerland.
(a) hgi@zurich.ibm.com
(b) pol@zurich.ibm.com
(c) hle@zurich.ibm.com
ABSTRACT
The energy efficiency of computer systems can be
increased by migrating computational kernels that are known to
under-utilize the CPU to an FPGA-based coprocessor. In contrast
to traditional I/O-based coprocessors that require explicit data
movement, coherently attached accelerators can operate on the
same virtual address space as the host CPU. A shared-memory
organization enables widely accepted programming models and
helps to deploy energy-efficient accelerators in general-purpose
computing systems. In this paper, we study an FFT accelerator on
an FPGA attached via the Coherent Accelerator Processor Interface
(CAPI) to a POWER8 processor. Our results show that the
coherently attached accelerator outperforms device-driver-based
approaches in terms of latency. Hardware acceleration delivers a
5× gain in energy efficiency compared to an optimized parallel
software FFT running on a 12-core CPU and improves single-thread
performance by more than 2×. We conclude that the integration
of CAPI into heterogeneous programming frameworks
such as OpenCL will facilitate latency-critical operations and will
further enhance programmability of hybrid systems.