6.2 Secure and fast memory and storage


Date: Wednesday 11 March 2020
Time: 11:00 - 12:30
Location / Room: Chamrousse

Chair:
Hao Yu, SUSTech, CN

Co-Chair:
Chengmo Yang, University of Delaware, US

As memories become persistent, traditional data structures such as trees and hash tables, as well as file systems, must be revisited to cope with the challenges posed by new memory devices. In this context, the main focus of this session is on improving the performance, security, and energy efficiency of memory and storage. The specific techniques include the design of integrity trees and hash tables, the management of superpages in file systems, data prefetching in solid-state drives (SSDs), and energy-efficient carbon-nanotube cache design.

Time  Label  Presentation Title / Authors
11:00  6.2.1  AN EFFICIENT PERSISTENCY AND RECOVERY MECHANISM FOR SGX-STYLE INTEGRITY TREE IN SECURE NVM
Speaker:
Mengya Lei, Huazhong University of Science & Technology, CN
Authors:
Mengya Lei, Fang Wang, Dan Feng, Fan Li and Jie Xu, Huazhong University of Science & Technology, CN
Abstract
The integrity tree is a crucial part of secure non-volatile memory (NVM) system design. For NVM with large capacity, the SGX-style integrity tree (SIT) is practical due to its parallel updates and variable arity. However, employing SIT in secure NVM is not easy, because the SIT secure metadata must either be strictly persisted at run time or be restored after a sudden power loss, which incurs unacceptable run-time overhead or recovery time, respectively. In this paper, we propose PSIT, a metadata persistency solution for SIT-protected secure NVM with high performance and fast restoration. PSIT exploits the observation that, for a lazily updated SIT, the tree nodes lost in a crash can be recovered from their child nodes in the NVM. It reduces the persistency overhead of SIT nodes through a restrained write-back meta-cache and leverages the inter-layer dependency of the SIT for recovery. Experiments show that, compared to ASIT, a state-of-the-art secure NVM design using SIT, PSIT decreases write traffic by 47% and improves performance by 18% on average while maintaining a comparable recovery time.
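As a rough illustration of the recovery idea, the toy Python sketch below rebuilds a lost parent node from version counters persisted with its children. It assumes, purely for illustration, that each surviving child records the counter value its parent slot must hold; the Node class, the MAC construction, and rebuild_parents are hypothetical stand-ins, not PSIT's actual design.

# Toy model of crash recovery for a lazily persisted integrity tree.
# Assumption (not from the paper): each persisted child carries the version
# counter its parent slot must hold, so a lost parent can be rebuilt
# bottom-up from its children in NVM.
import hashlib

class Node:
    def __init__(self, arity):
        self.counters = [0] * arity   # one version counter per child
        self.mac = b""

def mac_of(node, key=b"k"):
    data = b"".join(c.to_bytes(8, "little") for c in node.counters)
    return hashlib.sha256(key + data).digest()[:8]

def rebuild_parent(children_versions, arity=8):
    """Rebuild a lost parent from the persisted versions of its children."""
    parent = Node(arity)
    for i, v in enumerate(children_versions):
        parent.counters[i] = v        # restore each slot from the child
    parent.mac = mac_of(parent)       # re-derive the integrity metadata
    return parent

# After a crash, only the children survived in NVM; the cached parent is lost.
surviving = [5, 7, 7, 6, 5, 5, 8, 7]  # per-child versions read back from NVM
parent = rebuild_parent(surviving)
print(parent.counters, parent.mac.hex())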

11:30  6.2.2  REVISITING PERSISTENT HASH TABLE DESIGN FOR COMMERCIAL NON-VOLATILE MEMORY
Speaker:
Kaixin Huang, Shanghai Jiao Tong University, CN
Authors:
Kaixin Huang, Yan Yan and Linpeng Huang, Shanghai Jiao Tong University, CN
Abstract
Emerging non-volatile memory technologies are driving the evolution of storage systems and durable data structures. In particular, a growing body of research on persistent hash tables employs NVM as the storage layer for both fast access and efficient persistence. Most of this work assumes that NVM has byte access granularity, poor write endurance, DRAM-comparable read latency, and much higher write latency. However, a commercial non-volatile memory product, Intel Optane DC Persistent Memory (AEP), has several characteristics that differ from these assumptions: 1) block access granularity, 2) little concern for software-layer write endurance, and 3) much higher read latency than DRAM but DRAM-comparable write latency. Confronted with the new challenges posed by AEP, we propose Rewo-Hash, a novel read-efficient and write-optimized hash table for commercial non-volatile memory. Our design can be summarized in three key points. First, we keep a copy of the hash table in DRAM as a cached table to speed up search requests. Second, we design a log-free atomic mechanism to support fast writes. Third, we devise an efficient synchronization scheme between the persistent table and the cached table to mask the data synchronization overhead. We conduct extensive experiments on a real NVM platform, and the results show that, compared with state-of-the-art NVM-optimized hash tables, Rewo-Hash improves read latency by 1.73x-2.70x and write latency by 1.46x-3.11x. Rewo-Hash also outperforms its counterparts by 1.86x-4.24x in throughput for various YCSB workloads.
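The Python sketch below illustrates the general shape of the three design points: reads served from a DRAM-cached copy, log-free writes published by a single flag store, and lazy synchronization between the two tables. The RewoLikeTable class and its dict-based "NVM" are illustrative assumptions, not the paper's implementation.

# Minimal sketch of a read-cached, write-optimized persistent hash table.
# Assumption: a Python dict stands in for the persistent NVM table, and the
# 8-byte atomic commit is modeled by storing the valid flag last.

class RewoLikeTable:
    def __init__(self):
        self.nvm = {}        # persistent table: key -> (value, valid_flag)
        self.dram = {}       # cached table in DRAM, serves reads
        self.dirty = set()   # keys whose DRAM copy lags the NVM copy

    def put(self, key, value):
        # Log-free write: stage the value first, then publish it with one
        # (conceptually atomic) flag store, so no undo/redo log is needed.
        self.nvm[key] = (value, False)   # staged, not yet visible
        self.nvm[key] = (value, True)    # atomic publish point
        self.dirty.add(key)              # DRAM copy is synced lazily

    def get(self, key):
        if key in self.dirty:            # lazily repair the cached entry
            self.dram[key] = self.nvm[key][0]
            self.dirty.discard(key)
        if key in self.dram:
            return self.dram[key]        # fast path: DRAM read
        value, valid = self.nvm.get(key, (None, False))
        if valid:
            self.dram[key] = value       # warm the cached table
            return value
        return None

t = RewoLikeTable()
t.put("a", 1)
print(t.get("a"))   # -> 1, served after a one-time lazy sync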

12:00  6.2.3  OPTIMIZING PERFORMANCE OF PERSISTENT MEMORY FILE SYSTEMS USING VIRTUAL SUPERPAGES
Speaker:
Chaoshu Yang, Chongqing University, CN
Authors:
Chaoshu Yang1, Duo Liu1, Runyu Zhang1, Xianzhang Chen1, Shun Nie1, Qingfeng Zhuge1 and Edwin H.-M. Sha2
1Chongqing University, CN; 2East China Normal University, CN
Abstract
Existing persistent memory file systems can significantly improve performance by exploiting the advantages of emerging Persistent Memories (PMs). In particular, they can employ superpages (e.g., 2MB pages) of PMs to reduce the overhead of locating file data and to lower TLB miss rates. Unfortunately, superpages also induce two critical problems. First, ensuring the data consistency of a file system that uses superpages causes severe write amplification when file data is overwritten. Second, existing superpage management can waste a large amount of PM space. In this paper, we propose a Virtual Superpage Mechanism (VSM) that solves these problems by taking advantage of the virtual address space. On the one hand, VSM adopts a multi-grained copy-on-write mechanism to reduce write amplification while ensuring data consistency. On the other hand, VSM provides a zero-copy file data migration mechanism to eliminate the loss of space utilization caused by superpages. We implement VSM in the Linux kernel based on PMFS. Compared with the original PMFS and NOVA, the experimental results show that VSM improves write and read performance by 36% and 14% on average, respectively. Meanwhile, VSM achieves the same space utilization as a file system that organizes files with normal 4KB pages.
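To make the multi-grained copy-on-write idea concrete, here is a minimal Python sketch that copies only the 4KB slots touched by an overwrite and republishes the superpage through a shadow mapping table. The frame allocator and mapping list are hypothetical stand-ins for the file system's PM pages and virtual-address remapping, not PMFS/VSM internals.

# Sketch of multi-grained copy-on-write over a "virtual superpage".
# Assumption: a superpage is 512 virtual slots of 4KB each, and swapping in
# the new mapping table models the file system's atomic commit.
PAGE = 4096
SLOTS = 512            # 512 * 4KB = one 2MB superpage

frames = {}            # frame_id -> bytes, our stand-in for PM pages
next_frame = [0]

def alloc(data):
    frames[next_frame[0]] = data
    next_frame[0] += 1
    return next_frame[0] - 1

def overwrite(mapping, offset, data):
    """CoW only the 4KB slots the write touches; share all the rest."""
    new_map = list(mapping)                    # shadow mapping table
    end = offset + len(data)
    first, last = offset // PAGE, (end - 1) // PAGE
    for slot in range(first, last + 1):
        base = slot * PAGE
        old = frames[mapping[slot]]
        lo, hi = max(offset, base), min(end, base + PAGE)
        page = old[:lo - base] + data[lo - offset:hi - offset] + old[hi - base:]
        new_map[slot] = alloc(page)            # write the new 4KB frame
    return new_map                             # swapping this in = commit

mapping = [alloc(b"\x00" * PAGE) for _ in range(SLOTS)]
mapping = overwrite(mapping, 5000, b"hello")   # touches only slot 1
print(frames[mapping[1]][904:909])             # -> b'hello'; other slots shared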

12:15  6.2.4  FREQUENT ACCESS PATTERN-BASED PREFETCHING INSIDE OF SOLID-STATE DRIVES
Speaker:
Jianwei Liao, Southwest University of China, CN
Authors:
Xiaofei Xu1, Zhigang Cai2, Jianwei Liao2 and Yutaka Ishikawa3
1Southwest University, CN; 2Southwest University of China, CN; 3RIKEN, Japan, JP
Abstract
This paper proposes a data prefetching scheme that runs inside the SSD and is therefore OS-independent and transparent to applications. Specifically, it first mines frequent block access patterns that reflect correlations among past requests. It then compares the requests in the current time window with the identified patterns to direct prefetching. Furthermore, to maximize cache efficiency, we construct a mathematical model that adaptively partitions the cache according to I/O workload characteristics, buffering prefetched data and write data separately. Experimental results demonstrate that, compared with conventional SSD-inside prefetching schemes, our proposal improves average read latency by 6.3% to 9.3% without noticeably increasing write latency.
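A minimal Python sketch of the pattern-mining idea follows, assuming patterns are fixed-length k-grams mined by simple counting over a logical block address (LBA) trace; the paper's actual mining algorithm and its adaptive cache-partition model are more involved, and all names below are illustrative.

# Sketch of frequent-pattern prefetching over an LBA stream.
# Assumption: K-1 context blocks predict the K-th, and "prefetch" is just
# a table lookup rather than a real flash read.
from collections import Counter, deque

K = 3              # pattern length
MIN_SUPPORT = 2    # a gram must occur this often to count as frequent

def mine_patterns(trace):
    grams = Counter(tuple(trace[i:i + K]) for i in range(len(trace) - K + 1))
    table = {}     # context -> (most frequent successor, its support)
    for gram, n in grams.items():
        if n >= MIN_SUPPORT:
            ctx, nxt = gram[:-1], gram[-1]
            if n > table.get(ctx, (None, 0))[1]:
                table[ctx] = (nxt, n)
    return {ctx: nxt for ctx, (nxt, n) in table.items()}

def run(trace, patterns):
    window, hits = deque(maxlen=K - 1), 0
    for lba in trace:
        if patterns.get(tuple(window)) == lba:
            hits += 1                 # this access was prefetched in time
        window.append(lba)
    return hits

history = [10, 11, 12, 10, 11, 12, 10, 11, 12, 40, 41]
patterns = mine_patterns(history)
print(patterns, run(history, patterns))   # 7 of 11 accesses covered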

12:30  IP3-1, 594  CNT-CACHE: AN ENERGY-EFFICIENT CARBON NANOTUBE CACHE WITH ADAPTIVE ENCODING
Speaker:
Kexin Chu, School of Electronic Science & Applied Physics, Hefei University of Technology, CN
Authors:
Dawen Xu1, Kexin Chu1, Cheng Liu2, Ying Wang2, Lei Zhang2 and Huawei Li2
1School of Electronic Science & Applied Physics, Hefei University of Technology, CN; 2Chinese Academy of Sciences, CN
Abstract
Carbon nanotube field-effect transistors (CNFETs), which promise both higher clock speeds and better energy efficiency, are an attractive alternative to conventional power-hungry CMOS for caches. We observe that a CNFET-based cache constructed with typical 9T SRAM cells has distinct energy costs when reading/writing 0 and 1: reading a 0 consumes around 3X more energy than reading a 1, and writing a 1 consumes almost 10X more energy than writing a 0. Based on this observation, we propose an energy-efficient cache design called CNT-Cache that takes advantage of this asymmetry. It includes an adaptive data encoding module that converts the coding of each cache line to match these read and write preferences, and a cache line encoding direction predictor that chooses the encoding direction according to the cache line's access history. Together, the two optimizations significantly reduce the overall dynamic power consumption. According to our experiments, the optimized CNFET-based L1 D-Cache reduces dynamic power consumption by 22% on average compared to a baseline CNFET cache.
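The asymmetry lends itself to a simple encoding rule: store a line inverted whenever that lowers its predicted energy. The Python sketch below folds the abstract's 3X/10X ratios into per-bit costs and reduces the predictor to expected read/write counts per line; the flag-bit storage and the 9T SRAM cell details are abstracted away, so this is an illustration of the principle, not the paper's circuit.

# Sketch of adaptive cache-line encoding under asymmetric bit costs.
E_R0, E_R1 = 3.0, 1.0     # reading a 0 costs ~3X reading a 1
E_W1, E_W0 = 10.0, 1.0    # writing a 1 costs ~10X writing a 0

def line_energy(ones, zeros, reads, writes):
    return reads * (ones * E_R1 + zeros * E_R0) + \
           writes * (ones * E_W1 + zeros * E_W0)

def encode(line, bits, reads, writes):
    """Store the line inverted iff that lowers its predicted energy."""
    ones = bin(line).count("1")
    zeros = bits - ones
    plain = line_energy(ones, zeros, reads, writes)
    flipped = line_energy(zeros, ones, reads, writes)  # inversion swaps 0s/1s
    if flipped < plain:
        return line ^ ((1 << bits) - 1), True   # inverted payload + flag bit
    return line, False

# A write-heavy line full of 1s is inverted to mostly 0s before storage.
stored, inverted = encode(0xFFFFFF00, 32, reads=1, writes=10)
print(hex(stored), inverted)   # -> 0xff True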

12:30  End of session