doi: 10.3850/978-3-9815370-4-8_1124


Transparent Offloading of Computational Hotspots from Binary Code to Xeon Phi


Marvin Damschen (1), Heinrich Riebler (2,a), Gavin Vaz (2,b) and Christian Plessl (2,c)

(1) Karlsruhe Institute of Technology, Germany.

marvin.damschen@kit.edu

(2) University of Paderborn, Germany.

(a) heinrich.riebler@uni-paderborn.de
(b) gavin.vaz@uni-paderborn.de
(c) christian.plessl@uni-paderborn.de

ABSTRACT

In this paper, we study how binary applications can be transparently accelerated with novel heterogeneous computing resources without requiring any manual porting or developer-provided hints. Our work is based on Binary Acceleration At Runtime (BAAR), our previously introduced binary acceleration mechanism that uses the LLVM Compiler Infrastructure. BAAR is designed as a client-server architecture. The client runs the program to be accelerated in an environment that allows program analysis and profiling, and it identifies and extracts suitable program parts to be offloaded. The server compiles and optimizes these offloaded program parts for the accelerator and exposes the resulting functions to the client through a remote procedure call (RPC) interface. Our previous work proved the feasibility of our approach, but also showed that communication time and overheads limit the granularity of functions that can be meaningfully offloaded. In this work, we motivate the importance of lightweight, high-performance communication between server and client and present a communication mechanism based on the Message Passing Interface (MPI). We evaluate our approach using an Intel Xeon Phi 5110P as the acceleration target and show that the communication overhead can be reduced from 40% to 10%, thus enabling even small hotspots to benefit from offloading to an accelerator.
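
To illustrate why a lower communication overhead lets smaller hotspots benefit from offloading, the following is a minimal sketch of a break-even check. It is not taken from BAAR. It assumes that the overhead percentage is the share of total offloaded time spent on communication, and the function names and timings are hypothetical.

```python
def offloaded_time(accel_compute_s: float, overhead_fraction: float) -> float:
    """Total time of an offloaded call.

    Assumption: communication takes `overhead_fraction` of the total
    offloaded time, so total = compute / (1 - overhead_fraction).
    """
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return accel_compute_s / (1.0 - overhead_fraction)


def worth_offloading(local_s: float, accel_compute_s: float,
                     overhead_fraction: float) -> bool:
    """Offloading pays off only if it beats running the hotspot locally."""
    return offloaded_time(accel_compute_s, overhead_fraction) < local_s


if __name__ == "__main__":
    # Hypothetical hotspot: 1.0 s on the host, 0.7 s of pure compute
    # on the accelerator. These timings are illustrative only.
    local_s, accel_s = 1.0, 0.7
    for overhead in (0.40, 0.10):
        total = offloaded_time(accel_s, overhead)
        verdict = "offload" if worth_offloading(local_s, accel_s, overhead) else "stay local"
        print(f"overhead {overhead:.0%}: offloaded total {total:.3f} s, {verdict}")
```

Under these assumed timings, the hotspot is slower when offloaded at 40% overhead (about 1.17 s against 1.0 s locally) and faster at 10% overhead (about 0.78 s).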


