LSP: Collective Cross-Page Prefetching for NVM
Haiyang Pan1,2,a, Yuhang Liu1,2,3,b, Tianyue Lu1,2,3,c and Mingyu Chen1,2,3,d
1University of Chinese Academy of Sciences, Beijing, China
2State Key Laboratory of Computer Architecture, Institute of Computing Technology, CAS, Beijing, China
3Peng Cheng Laboratory, Shenzhen, China
apanhaiyang@ict.ac.cn
bliuyuhang@ict.ac.cn
clutianyue@ict.ac.cn
dcmy@ict.ac.cn
ABSTRACT
As an emerging technology, non-volatile memory (NVM) offers valuable opportunities for boosting the memory system, which is vital to overall computing system performance. However, one challenge preventing NVM from replacing DRAM as main memory is that NVM row activation latency is much longer (approximately 10x) than that of DRAM. To address this issue, we present a collective cross-page prefetching scheme that accurately opens an NVM row in advance and then prefetches data blocks from the opened row with low overhead. We identify a memory access pattern (referred to as a ladder stream) that facilitates prefetching across page boundaries, and propose the Ladder Stream Prefetcher (LSP) for NVM. LSP comprises two carefully designed components. The Collective Prefetch Table reduces the interference of prefetching with demand requests by speculatively scheduling prefetches according to the state of the memory queue; it is implemented with low overhead by using a single entry to track multiple prefetches. The Memory Mapping Table enables accurate prefetching of future pages by maintaining the mapping between physical and virtual addresses. Experimental evaluations show that LSP improves memory system performance by 66% over no prefetching, and by 26.6%, 21.7% and 27.4% over the state-of-the-art prefetchers Access Map Pattern Matching (AMPM), Best-Offset Prefetcher (BOP) and Signature Path Prefetcher (SPP), respectively.
Keywords: Prefetching, DRAM Cache, Non-Volatile Memory.