7.2 In-memory Computing and Security for Non-volatile Memory Technologies

Time

Label

Presentation Title
Authors

14:30

7.2.1

AUTOMATED SYNTHESIS OF COMPACT CROSSBARS FOR SNEAK-PATH BASED IN-MEMORY COMPUTING
Speaker:
Sumit Kumar Jha, University of Central Florida, US
Authors:
Dwaipayan Chakraborty and Sumit Kumar Jha, University of Central Florida, US
Abstract
The rise of data-intensive computational loads has exposed the processor-memory bottleneck in von Neumann architectures and has intensified the need for in-memory computing. Existing literature on computing Boolean formulas using sneak paths in nanoscale memristor crossbars has focused only on short Boolean formulas, such as 1-bit addition. There are two open questions: (i) Can one synthesize sneak-path based crossbars for computing large Boolean formulas? (ii) What is the size of a memristor crossbar that can compute a given Boolean formula using sneak paths? In this paper, we make progress on both of these open problems. First, we show that the number of rows and columns required to compute a Boolean formula is at most linear in the size of the Reduced Ordered Binary Decision Diagram representing the Boolean function. Second, we demonstrate how Binary Decision Diagrams can be used to synthesize nanoscale crossbars that can compute a given Boolean formula using naturally occurring sneak paths. In particular, we synthesize large logical circuits such as 128-bit adders for the first time using sneak-path based crossbar computing.
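As a rough illustration of the size measure named in this abstract, the sketch below builds a reduced ordered BDD by Shannon expansion and counts its nodes. It is not the authors' synthesis procedure: the helper names (`build_robdd`, `carry_out`), the interleaved variable order, and the choice of an adder carry-out as the test function are assumptions made for this example only.

```python
from itertools import count


def build_robdd(func, num_vars):
    """Build a reduced ordered BDD for func by Shannon expansion.

    func takes a tuple of num_vars bits (0/1) and returns 0 or 1.
    Returns (root, unique), where unique maps (var, low, high) -> node id.
    Terminals are the ids 0 and 1. Runs in time exponential in num_vars,
    so this is a toy for small inputs only.
    """
    unique = {}          # (var, low, high) -> node id; enforces sharing
    next_id = count(2)   # ids 0 and 1 are reserved for the terminals

    def expand(assignment):
        var = len(assignment)
        if var == num_vars:
            return int(bool(func(assignment)))
        low = expand(assignment + (0,))
        high = expand(assignment + (1,))
        if low == high:              # redundant test: no node needed
            return low
        key = (var, low, high)
        if key not in unique:        # merge identical subgraphs
            unique[key] = next(next_id)
        return unique[key]

    return expand(()), unique


def carry_out(bits):
    """Carry-out of a ripple adder; bits are interleaved a0, b0, a1, b1, ..."""
    carry = 0
    for a, b in zip(bits[0::2], bits[1::2]):
        carry = (a & b) | (carry & (a ^ b))
    return carry


# Count non-terminal ROBDD nodes for 1-bit to 4-bit operands.
for width in range(1, 5):
    _, nodes = build_robdd(carry_out, 2 * width)
    print(width, len(nodes))
```

For this toy function and variable order the node count grows linearly with the operand width, which is the kind of quantity the stated bound on rows and columns is expressed in.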

15:00

7.2.2

HYBRID SPIKING-BASED MULTI-LAYERED SELF-LEARNING NEUROMORPHIC SYSTEM BASED ON MEMRISTOR CROSSBAR ARRAYS
Speaker:
Yiran Chen, Professor, University of Pittsburgh, US
Authors:
Amr Hassan, Chaofei Yang, Chenchen Liu, Hai (Helen) Li and Yiran Chen, University of Pittsburgh, US
Abstract
Neuromorphic computing systems are under heavy investigation as a potential substitute for traditional von Neumann systems in high-speed, low-power applications. Recently, memristor crossbar arrays have been used to realize spiking-based neuromorphic systems, where memristor conductance values correspond to synaptic weights. Most of these systems are composed of a single crossbar layer: training is done off-chip using computer-based simulations, and the trained weights are then pre-programmed into the memristor crossbar array. However, multi-layered, on-chip trained systems become crucial for handling massive amounts of data and for overcoming the resistance shift that memristors undergo over time. In this work, we propose a spiking-based multi-layered neuromorphic computing system capable of online training. The system performance is evaluated using three different datasets, showing improved results versus previous work. In addition, a study of system accuracy versus memristor resistance shift shows promising results.
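The correspondence between conductances and synaptic weights can be sketched with an idealized crossbar read-out, in which each column current is the conductance-weighted sum of the row voltages (Ohm's law plus Kirchhoff's current law). The function name, the numeric values, and the omission of wire resistance, sneak paths, and spiking dynamics are all assumptions of this sketch, not details taken from the paper.

```python
def crossbar_currents(voltages, conductances):
    """Ideal crossbar read-out: I_j = sum_i V_i * G[i][j].

    voltages: one input voltage per row (volts).
    conductances: rows x columns matrix of memristor conductances (siemens).
    Returns one output current per column (amperes).
    """
    num_cols = len(conductances[0])
    if any(len(row) != num_cols for row in conductances):
        raise ValueError("conductance matrix must be rectangular")
    if len(voltages) != len(conductances):
        raise ValueError("need exactly one voltage per crossbar row")
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(num_cols)
    ]


# Hypothetical 3x2 crossbar; the conductance values are illustrative only.
G = [[1e-3, 2e-3],
     [0.0, 1e-3],
     [2e-3, 0.0]]
print(crossbar_currents([1.0, 0.5, 0.0], G))
```

In this idealized model a drift in any conductance changes the column currents proportionally, which is one way to see why resistance shift affects accuracy when weights are programmed once and never retrained.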

15:30

7.2.3

REVAMP: RERAM BASED VLIW ARCHITECTURE FOR IN-MEMORY COMPUTING
Speaker:
Anupam Chattopadhyay, School of Computer Science and Engineering, Nanyang Technological University, SG
Authors:
Debjyoti Bhattacharjee, Rajeswari Devadoss and Anupam Chattopadhyay, Nanyang Technological University, SG
Abstract
With diverse types of emerging devices offering simultaneous storage and logic capability, researchers have proposed novel platforms that promise gains in energy efficiency. Such platforms can be classified into two domains: application-specific and general-purpose. The application-specific in-memory computing platforms include machine learning accelerators, arithmetic units, and Content Addressable Memory (CAM)-based structures. The general-purpose computing platforms, on the other hand, stem from the idea that several in-memory computing logic devices support a universal set of Boolean logic operations and can therefore be used for mapping arbitrary Boolean functions efficiently. In this direction, researchers have so far concentrated on challenges in logic synthesis (e.g., depth optimization) and technology mapping (e.g., device count reduction). The important problem of efficient technology mapping of arbitrary logic networks onto a crossbar array structure has been overlooked so far. In this paper, we propose ReVAMP, a general-purpose computing platform based on a Resistive RAM crossbar array, which exploits the parallelism in computing multiple logic operations in the same word. Further, we study the problem of instruction generation and scheduling for such a platform. We benchmark the performance of ReVAMP against the state-of-the-art architecture.

16:00

IP3-7, 462

COVERT: COUNTER OVERFLOW REDUCTION FOR EFFICIENT ENCRYPTION OF NON-VOLATILE MEMORIES
Speaker:
Kartik Mohanram, ECE Dept, University of Pittsburgh, US
Authors:
Shivam Swami and Kartik Mohanram, University of Pittsburgh, US
Abstract
Security vulnerabilities arising from data persistence in emerging non-volatile memories (NVMs) necessitate memory encryption to ensure data security. Whereas counter mode encryption (CME) is a practical stop-gap approach to address this concern, it suffers from frequent memory re-encryption (system freeze) for small-sized counters and poor system performance for large-sized counters. CME thus imposes heavy overheads on memory, system performance, and system availability in practice. We propose Counter OVErflow ReducTion (COVERT), a CME-based memory encryption solution that performs on-demand memory allocation to reduce the memory re-encryption frequency caused by fast-growing counters, while also retaining the area/performance benefits of small-sized counters. Our full-system simulations of a phase change memory (PCM) architecture across SPEC CPU2006 benchmarks show that, for equivalent overhead and no impact on performance, COVERT simultaneously lengthens the interval between full memory re-encryptions from 6 minutes to 25 hours and doubles memory lifetime in comparison to state-of-the-art CME techniques.
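The counter-size trade-off behind this abstract can be sketched with a toy counter-mode memory: each line is XORed with a pad derived from its address and a per-line write counter, and a counter overflow forces every line to be re-encrypted. This is a generic illustration of CME, not COVERT itself. The class name, the use of HMAC-SHA256 as a stand-in for a block cipher such as AES, and the `epoch` value that stands in for a fresh key are all assumptions of this sketch.

```python
import hashlib
import hmac


class CounterModeMemory:
    """Toy counter-mode encrypted memory with per-line write counters."""

    def __init__(self, key, counter_bits, line_size=32):
        if line_size > 32:
            raise ValueError("line_size is limited to one SHA-256 digest")
        self.key = key
        self.counter_max = (1 << counter_bits) - 1
        self.line_size = line_size
        self.epoch = 0           # bumped on every full re-encryption
        self.counters = {}       # line address -> write counter
        self.cipher = {}         # line address -> stored ciphertext
        self.reencryptions = 0   # number of full-memory re-encryptions

    def _pad(self, addr, ctr):
        # Stand-in PRF (assumption): hardware designs would use AES here.
        msg = f"{self.epoch}:{addr}:{ctr}".encode()
        return hmac.new(self.key, msg, hashlib.sha256).digest()[: self.line_size]

    @staticmethod
    def _xor(data, pad):
        return bytes(d ^ p for d, p in zip(data, pad))

    def read(self, addr):
        return self._xor(self.cipher[addr], self._pad(addr, self.counters[addr]))

    def write(self, addr, data):
        if len(data) != self.line_size:
            raise ValueError("data must be exactly one line long")
        ctr = self.counters.get(addr, 0) + 1
        if ctr > self.counter_max:
            # Reusing a pad would break security, so re-encrypt everything.
            self._reencrypt_all()
            ctr = 1
        self.counters[addr] = ctr
        self.cipher[addr] = self._xor(data, self._pad(addr, ctr))

    def _reencrypt_all(self):
        plain = {addr: self.read(addr) for addr in self.cipher}
        self.epoch += 1
        self.reencryptions += 1
        for addr, data in plain.items():
            self.counters[addr] = 0
            self.cipher[addr] = self._xor(data, self._pad(addr, 0))


# Compare a 2-bit and an 8-bit counter under the same write pattern.
for bits in (2, 8):
    mem = CounterModeMemory(b"demo-key", counter_bits=bits)
    for i in range(10):
        mem.write(0, bytes([i]) * 32)
    print(bits, mem.reencryptions)
```

Small counters overflow quickly and trigger repeated full re-encryptions, while larger counters avoid that at the cost of more counter storage per line; this is the tension the abstract describes.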

16:01

IP3-8, 79

A WEAR-LEVELING-AWARE COUNTER MODE FOR DATA ENCRYPTION IN NON-VOLATILE MEMORIES
Speaker:
Fangting Huang, Huazhong University of Science and Technology, CN
Authors:
Fangting Huang¹, Dan Feng², Yu Hua² and Wen Zhou²
¹Huazhong University of Science and Technology, CN; ²Wuhan National Lab for Optoelectronics, School of Computer Science and Technology, Huazhong University of Science and Technology, CN
Abstract
Counter-mode encryption has been widely used to protect NVMs from malicious attacks, due to its proven security and high performance. However, this scheme suffers from the counter size versus re-encryption problem: either per-line counters must be relatively large to avoid counter overflow, or the entire memory must be re-encrypted to ensure security. To address this problem, we propose a novel wear-leveling-aware counter mode for data encryption, called Resetting Counter via Remapping (RCR). The basic idea behind RCR is to leverage wear-leveling remappings to reset the line counter. With a carefully designed procedure, RCR avoids counter overflow with a much smaller counter size. The salient features of RCR include low counter storage overhead, a high counter cache hit ratio, and no extra re-encryption overhead. Compared with state-of-the-art works, RCR obtains significant performance improvements, e.g., up to a 57% reduction in IPC degradation, in an evaluation with 8 memory-intensive benchmarks from SPEC 2006.

16:02

IP3-9, 552

(Best Paper Award Candidate)
TUNNEL FET BASED REFRESH-FREE-DRAM
Speaker:
Navneet Gupta, ISEP-Paris, FR
Authors:
Navneet Gupta¹, Adam Makosiej², Andrei Vladimirescu³, Amara Amara³ and Costin Anghel³
¹Institut supérieur d'électronique de Paris (ISEP), FR and LETI, Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA-Leti), FR; ²LETI, Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA-Leti), FR; ³Institut supérieur d'électronique de Paris (ISEP), FR
Abstract
A refresh-free and scalable ultimate DRAM (uDRAM) bitcell and architecture are proposed for embedded applications. The uDRAM 1T1C bitcell is designed using access Tunnel FETs (TFETs). The proposed design stores data statically during retention, eliminating the need for refresh. This is achieved using the negative differential resistance property of TFETs and storage capacitor leakage. uDRAM allows scaling of the storage capacitor by 87% and 80% in comparison to DDR and eDRAMs, respectively. The implemented design has sub-array read/write access times of < 4 ns. A bitcell area of 0.0275 μm² is achieved in 28 nm FDSOI-CMOS and is scalable further with technology shrink. The estimated throughput gain from refresh removal is 3.8% to 18% in comparison to CMOS DRAMs.

16:00

End of session
Coffee Break in Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Tuesday, March 28, 2017