Efficient Compilation and Execution of JVM-Based Data Processing Frameworks on Heterogeneous Co-Processors

Christos Kotselidis1, Sotiris Diamantopoulos2, Orestis Akrivopoulos3, Viktor Rosenfeld4, Katerina Doka5, Hazeef Mohammed6, Georgios Mylonas7, Vassilis Spitadakis8 and Will Morgan9

1The University of Manchester
christos.kotselidis@manchester.ac.uk
2Exus Ltd.
s.diamantopoulos@exus.co.uk
3SparkWorks ITC Ltd.
akribopo@sparkworks.net
4German Research Center for Artificial Intelligence
viktor.rosenfeld@dfki.de
5National Technical University of Athens
katerina@cslab.ece.ntua.gr
6Kaleao Ltd.
hazeef.mohammed@kaleao.com
7Computer Technology Institute & Press Diophantus
mylonasg@cti.gr
8Neurocom Luxembourg
v.spitadakis@neurocom.lu
9IProov Ltd.
will.morgan@iproov.com

ABSTRACT

This paper addresses the fundamental question of how modern Big Data frameworks can dynamically and transparently exploit heterogeneous hardware accelerators. After presenting the major challenges that have to be addressed towards this goal, we describe our proposed architecture for automatic and transparent hardware acceleration of Big Data frameworks and applications. Our vision is to retain the uniform programming model of Big Data frameworks and to enable automatic, dynamic Just-In-Time compilation of the code segments that can benefit from hardware acceleration into the corresponding device-specific format. In conjunction with machine learning-based device selection that respects user-defined constraints (e.g., cost, time), we enable dynamic code execution on GPUs and FPGAs transparently to the user. In addition, we dynamically re-steer execution at runtime based on the availability of resources. Our preliminary results demonstrate that our approach can accelerate an existing Apache Flink application by up to 16.5×.
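For illustration only (this example does not appear in the paper), the sketch below shows a plain Apache Flink batch job whose map operator is a compute-heavy, data-parallel kernel. Under the architecture described above, such an operator would be a candidate for transparent Just-In-Time compilation to a GPU or FPGA, with no change to the user-facing code; the class name and input data are hypothetical.

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

// Hypothetical example: an ordinary Flink job written against the standard API.
// The runtime, not the developer, would decide whether the map operator below
// runs on the CPU, a GPU, or an FPGA.
public class EuclideanNormJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<double[]> vectors = env.fromElements(
                new double[]{1.0, 2.0, 3.0},
                new double[]{4.0, 5.0, 6.0});

        // Data-parallel, arithmetic-heavy user function: a natural candidate
        // for offloading to an accelerator without changing this source code.
        DataSet<Double> norms = vectors.map(new MapFunction<double[], Double>() {
            @Override
            public Double map(double[] v) {
                double sum = 0.0;
                for (double x : v) {
                    sum += x * x;
                }
                return Math.sqrt(sum);
            }
        });

        norms.print();
    }
}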
