6.2 Secure and fast memory and storage


Date: Wednesday 11 March 2020
Time: 11:00 - 12:30
Location / Room: Chamrousse

Chair:
Hao Yu, SUSTech, CN

Co-Chair:
Chengmo Yang, University of Delaware, US

As memories become persistent, traditional data structures such as trees and hash tables, as well as file systems, must be revisited to cope with the challenges posed by new memory devices. In this context, the main focus of this session is on improving the performance, security, and energy efficiency of memory and storage. The specific techniques include the design of integrity trees and hash tables, the management of superpages in file systems, data prefetching in solid-state drives (SSDs), and energy-efficient carbon-nanotube cache design.

Time  Label  Presentation Title / Authors
11:00  6.2.1  AN EFFICIENT PERSISTENCY AND RECOVERY MECHANISM FOR SGX-STYLE INTEGRITY TREE IN SECURE NVM
Speaker:
Mengya Lei, Huazhong University of Science & Technology, CN
Authors:
Mengya Lei, Fang Wang, Dan Feng, Fan Li and Jie Xu, Huazhong University of Science & Technology, CN
Abstract
The integrity tree is a crucial part of secure non-volatile memory (NVM) system design. For NVM with large capacity, the SGX-style integrity tree (SIT) is practical due to its parallel updates and variable arity. However, employing SIT in secure NVM is not easy, because the SIT secure metadata must either be strictly persisted at run time or be restored after a sudden power loss, which incurs unacceptable run-time overhead or recovery time, respectively. In this paper, we propose PSIT, a metadata persistency solution for SIT-protected secure NVM with high performance and fast restoration. PSIT exploits the observation that, for a lazily updated SIT, the tree nodes lost in a crash can be recovered from their child nodes in the NVM. It reduces the persistency overhead of SIT nodes through a restrained write-back meta-cache and leverages the inter-layer dependency of the SIT for recovery. Experiments show that, compared to ASIT, a state-of-the-art secure NVM design using SIT, PSIT decreases write traffic by 47% and improves performance by 18% on average while maintaining a comparable recovery time.
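As a rough illustration of the recovery idea, the toy Python sketch below rebuilds a lost parent node from version counters persisted with its children. It assumes, purely for illustration, that each surviving child records the counter value its parent slot must hold; the Node class, the MAC construction, and rebuild_parents are hypothetical stand-ins, not PSIT's actual design.

# Toy model of crash recovery for a lazily persisted integrity tree.
# Assumption (not from the paper): each persisted child carries the version
# counter its parent slot must hold, so a lost parent can be rebuilt
# bottom-up from its children in NVM.
import hashlib

class Node:
    def __init__(self, arity):
        self.counters = [0] * arity   # one version counter per child
        self.mac = b""

def mac_of(node, key=b"k"):
    data = b"".join(c.to_bytes(8, "little") for c in node.counters)
    return hashlib.sha256(key + data).digest()[:8]

def rebuild_parent(children_versions, arity=8):
    """Rebuild a lost parent from the persisted versions of its children."""
    parent = Node(arity)
    for i, v in enumerate(children_versions):
        parent.counters[i] = v        # restore each slot from the child
    parent.mac = mac_of(parent)       # re-derive the integrity metadata
    return parent

# After a crash, only the children survived in NVM; the cached parent is lost.
surviving = [5, 7, 7, 6, 5, 5, 8, 7]  # per-child versions read back from NVM
parent = rebuild_parent(surviving)
print(parent.counters, parent.mac.hex())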

11:30  6.2.2  REVISITING PERSISTENT HASH TABLE DESIGN FOR COMMERCIAL NON-VOLATILE MEMORY
Speaker:
Kaixin Huang, Shanghai Jiao Tong University, CN
Authors:
Kaixin Huang, Yan Yan and Linpeng Huang, Shanghai Jiao Tong University, CN
Abstract
Emerging non-volatile memory technologies are driving the evolution of storage systems and durable data structures. In particular, a growing body of research on persistent hash tables employs NVM as the storage layer for both fast access and efficient persistence. Most of this work assumes that NVM has byte access granularity, poor write endurance, DRAM-comparable read latency, and much higher write latency. However, a commercial non-volatile memory product, Intel Optane DC Persistent Memory (AEP), has several characteristics that differ from these assumptions: 1) block access granularity, 2) little concern for software-layer write endurance, and 3) much higher read latency than DRAM but DRAM-comparable write latency. Confronted with the new challenges posed by AEP, we propose Rewo-Hash, a novel read-efficient and write-optimized hash table for commercial non-volatile memory. Our design can be summarized in three key points. First, we keep a copy of the hash table in DRAM as a cached table to speed up search requests. Second, we design a log-free atomic mechanism to support fast writes. Third, we devise an efficient synchronization scheme between the persistent table and the cached table to mask the data synchronization overhead. We conduct extensive experiments on a real NVM platform, and the results show that, compared with state-of-the-art NVM-optimized hash tables, Rewo-Hash improves read latency by 1.73x-2.70x and write latency by 1.46x-3.11x. Rewo-Hash also outperforms its counterparts by 1.86x-4.24x in throughput for various YCSB workloads.
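The Python sketch below illustrates the general shape of the three design points: reads served from a DRAM-cached copy, log-free writes published by a single flag store, and lazy synchronization between the two tables. The RewoLikeTable class and its dict-based "NVM" are illustrative assumptions, not the paper's implementation.

# Minimal sketch of a read-cached, write-optimized persistent hash table.
# Assumption: a Python dict stands in for the persistent NVM table, and the
# 8-byte atomic commit is modeled by storing the valid flag last.

class RewoLikeTable:
    def __init__(self):
        self.nvm = {}        # persistent table: key -> (value, valid_flag)
        self.dram = {}       # cached table in DRAM, serves reads
        self.dirty = set()   # keys whose DRAM copy lags the NVM copy

    def put(self, key, value):
        # Log-free write: stage the value first, then publish it with one
        # (conceptually atomic) flag store, so no undo/redo log is needed.
        self.nvm[key] = (value, False)   # staged, not yet visible
        self.nvm[key] = (value, True)    # atomic publish point
        self.dirty.add(key)              # DRAM copy is synced lazily

    def get(self, key):
        if key in self.dirty:            # lazily repair the cached entry
            self.dram[key] = self.nvm[key][0]
            self.dirty.discard(key)
        if key in self.dram:
            return self.dram[key]        # fast path: DRAM read
        value, valid = self.nvm.get(key, (None, False))
        if valid:
            self.dram[key] = value       # warm the cached table
            return value
        return None

t = RewoLikeTable()
t.put("a", 1)
print(t.get("a"))   # -> 1, served after a one-time lazy sync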

12:00  6.2.3  OPTIMIZING PERFORMANCE OF PERSISTENT MEMORY FILE SYSTEMS USING VIRTUAL SUPERPAGES
Speaker:
Chaoshu Yang, Chongqing University, CN
Authors:
Chaoshu Yang1, Duo Liu1, Runyu Zhang1, Xianzhang Chen1, Shun Nie1, Qingfeng Zhuge1 and Edwin H.-M. Sha2
1Chongqing University, CN; 2East China Normal University, CN
Abstract
Existing persistent memory file systems can significantly improve performance by exploiting the advantages of emerging Persistent Memories (PMs). In particular, they can employ superpages (e.g., 2MB pages) of PMs to reduce the overhead of locating file data and to lower TLB miss rates. Unfortunately, superpages also induce two critical problems. First, ensuring the data consistency of a file system that uses superpages causes severe write amplification when file data is overwritten. Second, existing superpage management can waste a large amount of PM space. In this paper, we propose a Virtual Superpage Mechanism (VSM) that solves these problems by taking advantage of the virtual address space. On the one hand, VSM adopts a multi-grained copy-on-write mechanism to reduce write amplification while ensuring data consistency. On the other hand, VSM provides a zero-copy file data migration mechanism to eliminate the loss of space utilization caused by superpages. We implement VSM in the Linux kernel based on PMFS. Compared with the original PMFS and NOVA, the experimental results show that VSM improves write and read performance by 36% and 14% on average, respectively. Meanwhile, VSM achieves the same space utilization as a file system that organizes files with normal 4KB pages.
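To make the multi-grained copy-on-write idea concrete, here is a minimal Python sketch that copies only the 4KB slots touched by an overwrite and republishes the superpage through a shadow mapping table. The frame allocator and mapping list are hypothetical stand-ins for the file system's PM pages and virtual-address remapping, not PMFS/VSM internals.

# Sketch of multi-grained copy-on-write over a "virtual superpage".
# Assumption: a superpage is 512 virtual slots of 4KB each, and swapping in
# the new mapping table models the file system's atomic commit.
PAGE = 4096
SLOTS = 512            # 512 * 4KB = one 2MB superpage

frames = {}            # frame_id -> bytes, our stand-in for PM pages
next_frame = [0]

def alloc(data):
    frames[next_frame[0]] = data
    next_frame[0] += 1
    return next_frame[0] - 1

def overwrite(mapping, offset, data):
    """CoW only the 4KB slots the write touches; share all the rest."""
    new_map = list(mapping)                    # shadow mapping table
    end = offset + len(data)
    first, last = offset // PAGE, (end - 1) // PAGE
    for slot in range(first, last + 1):
        base = slot * PAGE
        old = frames[mapping[slot]]
        lo, hi = max(offset, base), min(end, base + PAGE)
        page = old[:lo - base] + data[lo - offset:hi - offset] + old[hi - base:]
        new_map[slot] = alloc(page)            # write the new 4KB frame
    return new_map                             # swapping this in = commit

mapping = [alloc(b"\x00" * PAGE) for _ in range(SLOTS)]
mapping = overwrite(mapping, 5000, b"hello")   # touches only slot 1
print(frames[mapping[1]][904:909])             # -> b'hello'; other slots shared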

12:15  6.2.4  FREQUENT ACCESS PATTERN-BASED PREFETCHING INSIDE OF SOLID-STATE DRIVES
Speaker:
Jianwei Liao, Southwest University of China, CN
Authors:
Xiaofei Xu1, Zhigang Cai2, Jianwei Liao2 and Yutaka Ishikawa3
1Southwest University, CN; 2Southwest University of China, CN; 3RIKEN, Japan, JP
Abstract
This paper proposes a data prefetching scheme that runs inside the SSD and is therefore OS-independent and transparent to applications. Specifically, it first mines frequent block access patterns that reflect correlations among past requests. It then compares the requests in the current time window with the identified patterns to direct prefetching. Furthermore, to maximize cache efficiency, we construct a mathematical model that adaptively partitions the cache according to I/O workload characteristics, buffering prefetched data and write data separately. Experimental results demonstrate that, compared with conventional SSD-inside prefetching schemes, our proposal improves average read latency by 6.3% to 9.3% without noticeably increasing write latency.
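A minimal Python sketch of the pattern-mining idea follows, assuming patterns are fixed-length k-grams mined by simple counting over a logical block address (LBA) trace; the paper's actual mining algorithm and its adaptive cache-partition model are more involved, and all names below are illustrative.

# Sketch of frequent-pattern prefetching over an LBA stream.
# Assumption: K-1 context blocks predict the K-th, and "prefetch" is just
# a table lookup rather than a real flash read.
from collections import Counter, deque

K = 3              # pattern length
MIN_SUPPORT = 2    # a gram must occur this often to count as frequent

def mine_patterns(trace):
    grams = Counter(tuple(trace[i:i + K]) for i in range(len(trace) - K + 1))
    table = {}     # context -> (most frequent successor, its support)
    for gram, n in grams.items():
        if n >= MIN_SUPPORT:
            ctx, nxt = gram[:-1], gram[-1]
            if n > table.get(ctx, (None, 0))[1]:
                table[ctx] = (nxt, n)
    return {ctx: nxt for ctx, (nxt, n) in table.items()}

def run(trace, patterns):
    window, hits = deque(maxlen=K - 1), 0
    for lba in trace:
        if patterns.get(tuple(window)) == lba:
            hits += 1                 # this access was prefetched in time
        window.append(lba)
    return hits

history = [10, 11, 12, 10, 11, 12, 10, 11, 12, 40, 41]
patterns = mine_patterns(history)
print(patterns, run(history, patterns))   # 7 of 11 accesses covered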

12:30  IP3-1, 594  CNT-CACHE: AN ENERGY-EFFICIENT CARBON NANOTUBE CACHE WITH ADAPTIVE ENCODING
Speaker:
Kexin Chu, School of Electronic Science & Applied Physics, Hefei University of Technology, CN
Authors:
Dawen Xu1, Kexin Chu1, Cheng Liu2, Ying Wang2, Lei Zhang2 and Huawei Li2
1School of Electronic Science & Applied Physics, Hefei University of Technology, CN; 2Chinese Academy of Sciences, CN
Abstract
Carbon nanotube field-effect transistors (CNFETs), which promise both higher clock speeds and better energy efficiency, are an attractive alternative to conventional power-hungry CMOS for caches. We observe that a CNFET-based cache constructed with typical 9T SRAM cells has distinct energy costs when reading/writing 0 and 1: reading a 0 consumes around 3X more energy than reading a 1, and writing a 1 consumes almost 10X more energy than writing a 0. Based on this observation, we propose an energy-efficient cache design called CNT-Cache that takes advantage of this asymmetry. It includes an adaptive data encoding module that converts the coding of each cache line to match these read and write preferences, and a cache line encoding direction predictor that chooses the encoding direction according to the cache line's access history. Together, the two optimizations significantly reduce the overall dynamic power consumption. According to our experiments, the optimized CNFET-based L1 D-Cache reduces dynamic power consumption by 22% on average compared to a baseline CNFET cache.
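The asymmetry lends itself to a simple encoding rule: store a line inverted whenever that lowers its predicted energy. The Python sketch below folds the abstract's 3X/10X ratios into per-bit costs and reduces the predictor to expected read/write counts per line; the flag-bit storage and the 9T SRAM cell details are abstracted away, so this is an illustration of the principle, not the paper's circuit.

# Sketch of adaptive cache-line encoding under asymmetric bit costs.
E_R0, E_R1 = 3.0, 1.0     # reading a 0 costs ~3X reading a 1
E_W1, E_W0 = 10.0, 1.0    # writing a 1 costs ~10X writing a 0

def line_energy(ones, zeros, reads, writes):
    return reads * (ones * E_R1 + zeros * E_R0) + \
           writes * (ones * E_W1 + zeros * E_W0)

def encode(line, bits, reads, writes):
    """Store the line inverted iff that lowers its predicted energy."""
    ones = bin(line).count("1")
    zeros = bits - ones
    plain = line_energy(ones, zeros, reads, writes)
    flipped = line_energy(zeros, ones, reads, writes)  # inversion swaps 0s/1s
    if flipped < plain:
        return line ^ ((1 << bits) - 1), True   # inverted payload + flag bit
    return line, False

# A write-heavy line full of 1s is inverted to mostly 0s before storage.
stored, inverted = encode(0xFFFFFF00, 32, reads=1, writes=10)
print(hex(stored), inverted)   # -> 0xff True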

12:30  End of session