PIMProf: An Automated Program Profiler for Processing-in-Memory Offloading Decisions

Yizhou Wei (a), Minxuan Zhou (e), Sihang Liu (b), Korakit Seemakhupt (c), Tajana Rosing (f), and Samira Khan (d)
University of Virginia, †University of California San Diego
(a) yizhouwei@virginia.edu
(b) sihangliu@virginia.edu
(c) korakit@virginia.edu
(d) samirakhan@virginia.edu
(e) miz087@ucsd.edu
(f) tajana@ucsd.edu

ABSTRACT


Processing-in-memory (PIM) architectures reduce data movement overhead by bringing computation closer to memory. However, a key challenge is deciding which code regions of a program should be offloaded to PIM for the best performance. The goal of this work is to help programmers leverage PIM architectures by automatically profiling legacy workloads to find PIM-friendly code regions for offloading. We propose PIMProf, an automated profiling and offloading tool that determines PIM offloading regions for CPU-PIM hybrid architectures. PIMProf efficiently models the comprehensive cost of PIM offloading and makes the offloading decision with an effective and computationally tractable algorithm. We demonstrate the effectiveness of PIMProf by evaluating the GAP graph benchmark suite and the PARSEC benchmark suite under different PIM and CPU configurations. Our evaluation shows that, compared to the CPU baseline and a PIM-only configuration, the offloading decisions made by PIMProf provide 5.33× and 1.39× speedups, respectively, on the GAP graph workloads, and 2.22× and 1.74× speedups, respectively, on the PARSEC benchmarks.
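
To make the offloading problem concrete, the following is a minimal sketch of a cost-based placement decision. It is not PIMProf's algorithm, which the abstract does not describe. It assumes a simplified setting in which the program is a linear sequence of code regions, each with a profiled CPU time and PIM time, and in which every switch between CPU and PIM incurs a fixed cost. The function name `choose_placement`, its parameters, and the numbers in the usage note are all hypothetical.

```python
from typing import List, Sequence, Tuple


def choose_placement(
    cpu_time: Sequence[float],
    pim_time: Sequence[float],
    switch_cost: float,
) -> Tuple[float, List[str]]:
    """Pick CPU or PIM for each region of a linear region sequence.

    Illustrative toy model only (assumption, not the paper's method):
    total cost = sum of per-region execution times on the chosen device
                 + switch_cost for every CPU<->PIM transition.
    Solved exactly by dynamic programming in O(n) time.
    """
    n = len(cpu_time)
    if n != len(pim_time):
        raise ValueError("cpu_time and pim_time must have the same length")
    if n == 0:
        return 0.0, []

    times = (cpu_time, pim_time)  # index 0 = CPU, index 1 = PIM
    # best[d]: minimum cost of regions 0..i with region i placed on device d
    best = [times[0][0], times[1][0]]
    back: List[List[int]] = []  # back[i-1][d]: device of region i-1 on the best path

    for i in range(1, n):
        new_best = []
        choice = []
        for d in (0, 1):
            stay = best[d]                    # previous region on the same device
            move = best[1 - d] + switch_cost  # previous region on the other device
            if stay <= move:
                prev, base = d, stay
            else:
                prev, base = 1 - d, move
            new_best.append(base + times[d][i])
            choice.append(prev)
        back.append(choice)
        best = new_best

    # Recover the placement by walking the back-pointers from the last region.
    d = 0 if best[0] <= best[1] else 1
    total = best[d]
    placement = [d]
    for choice in reversed(back):
        d = choice[d]
        placement.append(d)
    placement.reverse()
    return total, ["CPU" if p == 0 else "PIM" for p in placement]
```

With hypothetical inputs `cpu_time=[1, 10, 10, 1]`, `pim_time=[5, 2, 3, 4]`, and `switch_cost=2`, the sketch places the two middle regions on PIM and the outer regions on the CPU, which beats both the CPU-only and the PIM-only placements. Raising the switch cost far enough makes a single-device placement optimal instead, which is why a cost model that covers more than per-region execution time matters.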


