Efficient Compilation and Execution of JVM-Based Data Processing Frameworks on Heterogeneous Co-Processors
Christos Kotselidis1, Sotiris Diamantopoulos2, Orestis Akrivopoulos3, Viktor Rosenfeld4, Katerina Doka5, Hazeef Mohammed6, Georgios Mylonas7, Vassilis Spitadakis8 and Will Morgan9
1The University of Manchester
christos.kotselidis@manchester.ac.uk
2Exus Ltd.
s.diamantopoulos@exus.co.uk
3SparkWorks ITC Ltd.
akribopo@sparkworks.net
4German Research Center for Artificial Intelligence
viktor.rosenfeld@dfki.de
5National Technical University of Athens
katerina@cslab.ece.ntua.gr
6Kaleao Ltd.
hazeef.mohammed@kaleao.com
7Computer Technology Institute & Press Diophantus
mylonasg@cti.gr
8Neurocom Luxembourg
v.spitadakis@neurocom.lu
9IProov Ltd.
will.morgan@iproov.com
ABSTRACT
This paper addresses the fundamental question of how modern Big Data frameworks can dynamically and transparently exploit heterogeneous hardware accelerators. After presenting the major challenges that must be addressed towards this goal, we describe our proposed architecture for automatic and transparent hardware acceleration of Big Data frameworks and applications. Our vision is to retain the uniform programming model of Big Data frameworks and to enable automatic, dynamic Just-In-Time compilation of the candidate code segments that benefit from hardware acceleration into the corresponding device-specific format. In conjunction with machine learning-based device selection that respects user-defined constraints (e.g., cost, execution time), we enable dynamic code execution on GPUs and FPGAs transparently to the user. In addition, we dynamically re-steer execution at runtime based on the availability of resources. Our preliminary results demonstrate that our approach can accelerate an existing Apache Flink application by up to 16.5×.
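To make the programming-model claim concrete, the listing below is a minimal sketch of an ordinary Apache Flink batch job written in plain Java against the standard DataSet API; the class name VectorScaleJob, the input data, and the scaling constant are illustrative assumptions and not part of the paper's system. The intent is that user code of this shape remains unchanged, while the proposed runtime JIT-compiles the compute-heavy operator (here, the map function) and, via ML-based device selection, decides transparently whether it executes on a CPU, GPU, or FPGA.

// Illustrative sketch only (not the paper's actual API): a standard Flink job
// whose user-defined map function is a candidate code segment for acceleration.
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class VectorScaleJob {
    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // A simple numeric transformation written against the uniform Flink API.
        DataSet<Double> input = env.fromElements(1.0, 2.0, 3.0, 4.0);
        DataSet<Double> scaled = input.map(new MapFunction<Double, Double>() {
            @Override
            public Double map(Double x) {
                return x * 2.5; // compute kernel; device choice is transparent to the user
            }
        });

        scaled.print(); // triggers job execution
    }
}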