IP5 Interactive Presentations

Date: Thursday 12 March 2020
Time: 15:30 - 16:00
Location / Room: Poster Area

Interactive Presentations run simultaneously during a 30-minute slot. Additionally, each IP paper is briefly introduced in a one-minute presentation in a corresponding regular session.

Label / Presentation Title
Authors
IP5-1 STATISTICAL MODEL CHECKING OF APPROXIMATE CIRCUITS: CHALLENGES AND OPPORTUNITIES
Speaker and Author:
Josef Strnadel, Brno University of Technology, CZ
Abstract
Many works have shown that approximate circuits may play an important role in the development of resource-efficient electronic systems. This motivates many researchers to propose new approaches for finding an optimal trade-off between the approximation error and resource savings for predefined applications of approximate circuits. These works and approaches, however, focus mainly on design aspects regarding relaxed functional requirements while neglecting further aspects such as signal and parameter dynamics/stochasticity, relaxed/non-functional equivalence, testing or formal verification. This paper aims to take a step ahead by moving towards the formal verification of time-dependent properties of systems based on approximate circuits. Firstly, it presents our approach to modeling such systems by means of stochastic timed automata; the approach goes beyond digital, combinational and/or synchronous circuits and is also applicable to sequential, analog and/or asynchronous circuits. Secondly, the paper shows the principle and advantage of verifying properties of modeled approximate systems by the statistical model checking technique. Finally, the paper evaluates our approach and outlines future research perspectives.
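
As an illustration of the statistical model checking idea mentioned in this abstract, the following minimal Python sketch estimates the probability that a property holds by sampling random runs, with the number of runs chosen from the Chernoff-Hoeffding bound. It is not the authors' tool flow or their stochastic timed automata model: the approximate adder, the property, and all function names are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import math
import random


def required_runs(epsilon, delta):
    """Chernoff-Hoeffding bound: number of runs so that the estimate lies
    within +/- epsilon of the true probability with confidence 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def approx_add(a, b, k=2):
    """Hypothetical approximate adder: ignores the k low-order bits."""
    mask = ~((1 << k) - 1)
    return (a & mask) + (b & mask)


def run_satisfies_property(rng, steps=10, max_error=4):
    """One random run of the toy system. The (assumed) property holds if
    the adder error never exceeds max_error during the run."""
    for _ in range(steps):
        a, b = rng.randrange(256), rng.randrange(256)
        if abs((a + b) - approx_add(a, b)) > max_error:
            return False
    return True


def estimate_probability(epsilon=0.05, delta=0.01, seed=0):
    """Monte Carlo estimate of P(property holds) and the number of runs used."""
    rng = random.Random(seed)
    n = required_runs(epsilon, delta)
    hits = sum(run_satisfies_property(rng) for _ in range(n))
    return hits / n, n


if __name__ == "__main__":
    p, n = estimate_probability()
    print(f"estimated probability {p:.3f} from {n} runs")
```

The point of the sketch is that the verdict is a probability estimate with a stated error bound rather than an exhaustive proof, which is what makes the technique applicable to stochastic, time-dependent models.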

Download Paper (PDF; Only available from the DATE venue WiFi)
IP5-2 RUNTIME ACCURACY-CONFIGURABLE APPROXIMATE HARDWARE SYNTHESIS USING LOGIC GATING AND RELAXATION
Speaker:
Tanfer Alan, Karlsruhe Institute of Technology, DE
Authors:
Tanfer Alan1, Andreas Gerstlauer2 and Joerg Henkel1
1Karlsruhe Institute of Technology, DE; 2University of Texas at Austin, US
Abstract
Approximate computing trades off computation accuracy against energy efficiency. Algorithms from several modern application domains such as decision making and computer vision are tolerant to approximations while still meeting their requirements. The extent of approximation tolerance, however, varies significantly with changes in input characteristics and applications. We propose a novel hybrid approach for the synthesis of runtime accuracy-configurable hardware that minimizes energy consumption at the expense of area. To that end, we first explore instantiating multiple hardware blocks with different fixed approximation levels. These blocks can be selected dynamically and thus allow the accuracy to be configured at runtime. They benefit from having fewer transistors and from synthesis relaxations, in contrast to state-of-the-art gating mechanisms, which only switch off a group of logic. Our hybrid approach combines instantiating such blocks with area-efficient gating mechanisms that reduce toggling activity, creating a fine-grained design-time knob on energy vs. area. Examining total energy savings for a Sobel filter under different workloads and accuracy tolerances shows that our method finds Pareto-optimal solutions providing up to 16% and 44% energy savings compared to a state-of-the-art accuracy-configurable gating mechanism and an exact hardware block, respectively, at 2x area cost.

IP5-3 POST-QUANTUM SECURE BOOT
Speaker:
Vinay B. Y. Kumar, Nanyang Technological University, SG
Authors:
Vinay B. Y. Kumar1, Naina Gupta2, Anupam Chattopadhyay1, Michael Kasper3, Christoph Krauss4 and Ruben Niederhagen4
1Nanyang Technological University, SG; 2Indraprastha Institute of Information Technology, IN; 3Fraunhofer Singapore, SG; 4Fraunhofer SIT, DE
Abstract
A secure boot protocol is fundamental to ensuring the integrity of the trusted computing base of a secure system. The use of digital signature algorithms (DSAs) based on traditional asymmetric cryptography, particularly for secure boot, leaves such systems vulnerable to the threat of quantum computers. This paper presents the first post-quantum secure boot solution, implemented fully as hardware for reasons of security and performance. In particular, this work uses the eXtended Merkle Signature Scheme (XMSS), a hash-based scheme that has been specified as an IETF RFC. The solution has been integrated into a secure SoC platform around RISC-V cores and evaluated on an FPGA. It is shown to be orders of magnitude faster than corresponding hardware/software implementations and to compare competitively with a fully hardware elliptic-curve-DSA-based solution.
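
To give a feel for the Merkle-tree component that hash-based schemes such as XMSS build on, here is a deliberately simplified Python sketch of authentication-path verification. It is not XMSS as specified in the RFC and not the paper's hardware design: real XMSS uses WOTS+ one-time signatures and keyed, bitmasked hashing, whereas this sketch assumes plain SHA-256, a power-of-two number of leaves, and placeholder byte strings standing in for one-time public keys. All function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Plain SHA-256 (assumption; XMSS uses keyed, bitmasked hashing)."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all tree levels: level 0 holds the hashed leaves, the last
    level holds the single root. Assumes len(leaves) is a power of two."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling at this level
        index >>= 1
    return path


def verify(leaf, index, path, root):
    """Recompute the root from a leaf and its path; compare with the trusted root."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index >>= 1
    return node == root


if __name__ == "__main__":
    keys = [b"otk-0", b"otk-1", b"otk-2", b"otk-3"]  # placeholder one-time keys
    levels = build_tree(keys)
    root = levels[-1][0]
    print(verify(keys[2], 2, auth_path(levels, 2), root))
```

In a secure boot setting, only the root would need to be stored as the trusted public key; the verifier recomputes it from the signature material using nothing but hash evaluations.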

IP5-4 ROQ: A NOISE-AWARE QUANTIZATION SCHEME TOWARDS ROBUST OPTICAL NEURAL NETWORKS WITH LOW-BIT CONTROLS
Speaker:
Jiaqi Gu, University of Texas at Austin, US
Authors:
Jiaqi Gu1, Zheng Zhao1, Chenghao Feng1, Hanqing Zhu2, Ray T. Chen1 and David Z. Pan1
1University of Texas at Austin, US; 2Shanghai Jiao Tong University, CN
Abstract
Optical neural networks (ONNs) demonstrate orders-of-magnitude higher speed in deep learning acceleration than their electronic counterparts. However, limited control precision and device variations induce accuracy degradation in practical ONN implementations. To tackle this issue, we propose a quantization scheme that adapts a full-precision ONN to low-resolution voltage controls. Moreover, we propose a protective regularization technique that dynamically penalizes quantized weights based on their estimated noise-robustness, leading to an improvement in noise robustness. Experimental results show that the proposed scheme effectively adapts ONNs to limited-precision controls and device variations. The resultant four-layer ONN demonstrates higher inference accuracy with lower variances than baseline methods under various control precisions and device noises.

IP5-5 STATISTICAL TRAINING FOR NEUROMORPHIC COMPUTING USING MEMRISTOR-BASED CROSSBARS CONSIDERING PROCESS VARIATIONS AND NOISE
Speaker:
Ying Zhu, TU Munich, DE
Authors:
Ying Zhu1, Grace Li Zhang1, Tianchen Wang2, Bing Li1, Yiyu Shi2, Tsung-Yi Ho3 and Ulf Schlichtmann1
1TU Munich, DE; 2University of Notre Dame, US; 3National Tsing Hua University, TW
Abstract
Memristor-based crossbars are an attractive platform to accelerate neuromorphic computing. However, process variations during manufacturing and noise in memristors cause significant accuracy loss if not addressed. In this paper, we propose to model process variations and noise as correlated random variables and incorporate them into the cost function during training. Consequently, the weights after this statistical training become more robust and, together with global variation compensation, provide a stable inference accuracy. Simulation results demonstrate that the mean value and the standard deviation of the inference accuracy can be improved significantly, by up to 54% and 31%, respectively, in a two-layer fully connected neural network.

IP5-6 COMPUTATIONAL RESTRUCTURING: RETHINKING IMAGE PROCESSING USING MEMRISTOR CROSSBAR ARRAYS
Speaker:
Rickard Ewetz, University of Central Florida, US
Authors:
Baogang Zhang, Necati Uysal and Rickard Ewetz, University of Central Florida, US
Abstract
Image processing is a core operation performed on billions of sensor devices in the Internet of Things (IoT). Emerging memristor crossbar arrays (MCAs) promise to perform matrix-vector multiplication (MVM), the dominating computation within the two-dimensional Discrete Cosine Transform (2D DCT), with an extremely small energy-delay product. Earlier studies have directly mapped the digital implementation to MCA-based hardware. The drawback is that the series computation is vulnerable to errors. Moreover, the implementation requires the use of large image block sizes, which is known to degrade the image quality. In this paper, we propose to restructure the 2D DCT into an equivalent single linear transformation (or MVM operation). The restructuring eliminates the series computation and reduces the processed block sizes from NxN to √Nx√N. Consequently, both the robustness to errors and the image quality are improved. Moreover, the latency, power, and area are reduced by 2X while eliminating the storage of intermediate data, and the power and area can be further reduced by up to 62% and 74% using frequency spectrum optimization.
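
The claim that a 2D DCT can be written as one matrix-vector multiplication can be checked with the standard Kronecker-product identity: for a row-major flattening, vec(C X Cᵀ) = (C ⊗ C) vec(X). The pure-Python sketch below compares the usual two-stage (series) computation with the single-MVM form. It illustrates the identity only, under the assumption of an orthonormal DCT-II matrix; whether the paper's restructuring takes exactly this form is not stated in the abstract, and the helper names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product: entry at row (i, j), column (k, l) is a[i][k] * b[j][l]."""
    return [[a[i][k] * b[j][l]
             for k in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for j in range(len(b))]


def dct2_series(x):
    """Two-stage 2D DCT: transform along one axis, then the other."""
    c = dct_matrix(len(x))
    return matmul(matmul(c, x), transpose(c))


def dct2_single_mvm(x):
    """Same transform as one matrix-vector product on the flattened block."""
    n = len(x)
    c = dct_matrix(n)
    big = kron(c, c)                      # (n*n) x (n*n) matrix
    vec = [v for row in x for v in row]   # row-major flattening
    out = [sum(w * v for w, v in zip(row, vec)) for row in big]
    return [out[i * n:(i + 1) * n] for i in range(n)]
```

Note the trade-off the identity exposes: the single transformation needs a matrix whose side length is the square of the block side, which is consistent with a fixed-size crossbar handling a smaller image block than in the two-stage mapping.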

IP5-7 SCRIMP: A GENERAL STOCHASTIC COMPUTING ACCELERATION ARCHITECTURE USING RERAM IN-MEMORY PROCESSING
Speaker:
Saransh Gupta, University of California, San Diego, US
Authors:
Saransh Gupta1, Mohsen Imani1, Joonseop Sim1, Andrew Huang1, Fan Wu1, M. Hassan Najafi2 and Tajana Rosing1
1University of California, San Diego, US; 2University of Louisiana, US
Abstract
Stochastic computing (SC) reduces the complexity of computation by representing numbers with long independent bit-streams. However, increasing performance in SC comes with an increase in area and loss in accuracy. Processing in memory (PIM) with non-volatile memories (NVMs) computes data in-place, while having high memory density and supporting bit-parallel operations with low energy. In this paper, we propose SCRIMP for stochastic computing acceleration with resistive RAM (ReRAM) in-memory processing, which enables SC in memory. SCRIMP can be used for a wide range of applications. It supports all SC encodings and operations in memory. It maximizes the performance and energy efficiency of implementing SC by introducing novel in-memory parallel stochastic number generation and efficient implication-based logic in memory. To show the efficiency of our stochastic architecture, we implement image processing on the proposed hardware.
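
For readers unfamiliar with stochastic computing, the following small Python sketch shows the textbook unipolar encoding, in which a value in [0, 1] is the fraction of ones in a random bit-stream and multiplication reduces to a bitwise AND of two independent streams. It is a software illustration of the general SC principle only, not of SCRIMP's in-memory implementation; the function names, stream length, and seed are assumptions.

```python
import random


def to_stream(p, length, rng):
    """Encode a probability p in [0, 1] as a random bit-stream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_stream(bits):
    """Decode a bit-stream back to a value: the fraction of ones."""
    return sum(bits) / len(bits)


def sc_multiply(a, b, length=4096, seed=0):
    """Multiply two values in [0, 1] by AND-ing independent bit-streams."""
    rng = random.Random(seed)
    stream_a = to_stream(a, length, rng)
    stream_b = to_stream(b, length, rng)
    return from_stream([x & y for x, y in zip(stream_a, stream_b)])


if __name__ == "__main__":
    print(sc_multiply(0.5, 0.6))  # close to 0.3, with sampling noise
```

The result is only approximate, and longer streams reduce the error at the cost of more bit operations, which reflects the accuracy/performance tension the abstract describes.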

IP5-8 TDO-CIM: TRANSPARENT DETECTION AND OFFLOADING FOR COMPUTATION IN-MEMORY
Speaker:
Lorenzo Chelini, Eindhoven University of Technology, NL
Authors:
Kanishkan Vadivel1, Lorenzo Chelini2, Ali BanaGozar1, Gagandeep Singh2, Stefano Corda2, Roel Jordans1 and Henk Corporaal1
1Eindhoven University of Technology, NL; 2IBM Research, CH
Abstract
Computation in-memory is a promising non-von Neumann approach that aims to eliminate data transfer to and from the memory subsystem. Although many architectures have been proposed, compiler support for such architectures still lags behind. In this paper, we close this gap by proposing an end-to-end compilation flow for in-memory computing based on the LLVM compiler infrastructure. Starting from sequential code, our approach automatically detects, optimizes, and offloads kernels suitable for in-memory acceleration. We demonstrate our compiler tool-flow on the PolyBench/C benchmark suite and evaluate the benefits of our proposed in-memory architecture simulated in Gem5 by comparing it with a state-of-the-art von Neumann architecture.

IP5-9 BACKFLOW: BACKWARD EDGE CONTROL FLOW ENFORCEMENT FOR LOW END ARM MICROCONTROLLERS
Speaker:
Cyril Bresch, LCIS, FR
Authors:
Cyril Bresch1, David Hély2 and Roman Lysecky3
1LCIS, FR; 2LCIS - Grenoble INP, FR; 3University of Arizona, US
Abstract
This paper presents BackFlow, a compiler-based toolchain that enforces indirect backward edge control flow integrity for low-end ARM Cortex-M microprocessors. BackFlow is implemented within the Clang/LLVM compiler and supports the ARM instruction set and its subset Thumb. The control flow integrity generated by the compiler relies on a bitmap, where each set bit indicates a valid pointer destination. The efficiency of the framework is benchmarked using an STM32 NUCLEO F446RE microcontroller. The obtained results show that the control flow integrity solution incurs an execution time overhead ranging from 1.5 to 4.5%.
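
The bitmap idea described in this abstract (each set bit marks a valid destination) can be modelled in a few lines of Python. This is a conceptual sketch only: BackFlow itself is implemented as compiler instrumentation in Clang/LLVM, and the class name, the 2-byte granularity, the address range, and the example return sites below are all hypothetical choices for illustration.

```python
class ReturnBitmap:
    """One bit per aligned code address; a set bit marks a valid return target."""

    def __init__(self, code_base, code_size, granularity=2):
        self.base = code_base
        self.size = code_size
        self.gran = granularity            # assumed instruction alignment
        nbits = (code_size + granularity - 1) // granularity
        self.bits = bytearray((nbits + 7) // 8)

    def _index(self, addr):
        """Bit index for addr, or None if addr is out of range or misaligned."""
        offset = addr - self.base
        if offset < 0 or offset >= self.size or offset % self.gran:
            return None
        return offset // self.gran

    def mark_valid(self, addr):
        i = self._index(addr)
        if i is None:
            raise ValueError("address outside the protected code region")
        self.bits[i >> 3] |= 1 << (i & 7)

    def is_valid(self, addr):
        i = self._index(addr)
        return i is not None and bool(self.bits[i >> 3] & (1 << (i & 7)))


def checked_return(bitmap, return_addr):
    """Model of the check inserted before a function return."""
    if not bitmap.is_valid(return_addr):
        raise RuntimeError("control-flow violation: invalid return address")
    return return_addr


if __name__ == "__main__":
    bm = ReturnBitmap(code_base=0x08000000, code_size=0x1000)  # illustrative range
    bm.mark_valid(0x08000104)          # hypothetical instruction after a call site
    print(checked_return(bm, 0x08000104) == 0x08000104)
```

A bitmap lookup costs a shift, a mask, and one memory read, which is one reason such a structure suits low-end microcontrollers with tight cycle and memory budgets.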

IP5-10 DELAY SENSITIVITY POLYNOMIALS BASED DESIGN-DEPENDENT PERFORMANCE MONITORS FOR WIDE OPERATING RANGES
Speaker:
Ruikai Shi, Chinese Academy of Sciences, CN
Authors:
Ruikai Shi1, Liang Yang2 and Hao Wang2
1Chinese Academy of Sciences / University of Chinese Academy of Sciences, CN; 2Loongson Technology Corporation Ltd., CN
Abstract
The downsizing of CMOS technology makes circuit performance more sensitive to on-chip parameter variations. The previously proposed design-dependent ring oscillator (DDRO) method provides an efficient way to monitor circuit performance at runtime. However, its linear delay sensitivity expression may be inadequate, especially over a wide range of operating conditions. To overcome this limitation, a new design-dependent performance monitor (DDPM) method is proposed in this work, which formulates the delay sensitivity as high-order polynomials, making it possible to accurately track the nonlinear timing behavior over wide operating ranges. A 28nm technology is used for design evaluation, and quite a low error rate is achieved in the circuit performance monitoring comparison.

IP5-11 MITIGATION OF SENSE AMPLIFIER DEGRADATION USING SKEWED DESIGN
Speaker:
Daniel Kraak, TU Delft, NL
Authors:
Daniel Kraak1, Mottaqiallah Taouil1, Said Hamdioui1, Pieter Weckx2, Stefan Cosemans2 and Francky Catthoor2
1TU Delft, NL; 2IMEC, BE
Abstract
Designers typically add design margins to semiconductor memories to compensate for aging. However, the aging impact increases with technology downscaling, leading to the need for higher margins. This results in a negative impact on area, yield, performance, and power consumption. As an alternative, mitigation schemes can be developed to reduce such impact. This paper proposes a mitigation scheme for the memory's sense amplifier (SA); the scheme is based on creating a skew in the relative strengths of the SA's cross-coupled inverters during design. The skew is compensated by aging due to unbalanced workloads. As a result, the impact of aging on the SA is reduced. To validate the mitigation scheme, the degradation of the sense amplifier is analyzed for several workloads. The experimental results show that the proposed mitigation scheme reduces the degradation of the sense amplifier's critical figure-of-merit, the offset voltage, by up to 26%.

IP5-12 BLOCKCHAIN TECHNOLOGY ENABLED PAY PER USE LICENSING APPROACH FOR HARDWARE IPS
Speaker:
Krishnendu Guha, University of Calcutta, IN
Authors:
Krishnendu Guha, Debasri Saha and Amlan Chakrabarti, University of Calcutta, IN
Abstract
The present era is witnessing a reuse of hardware IPs to reduce cost. As trustworthiness is an essential factor, designers prefer to use hardware IPs that have performed effectively in the past but, at the same time, are still active and have not aged. In such scenarios, pay per use licensing schemes best suit both producers and users. Existing pay per use licensing mechanisms rely on a centralized third party, which may not be trustworthy. Hence, we turn to blockchain technology to eliminate such third parties and facilitate a transparent and automated pay per use licensing mechanism. A blockchain is a distributed public ledger whose records are added based on peer review and majority consensus of its participants and cannot be tampered with or modified later. Smart contracts are deployed to facilitate the mechanism. This work also addresses dynamic pricing of the hardware IPs based on trustworthiness and aging, factors that are not considered in existing literature. A security analysis of the proposed mechanism is provided. Performance evaluation is carried out based on the gas usage in an Ethereum Solidity test environment, along with a cost analysis based on lifetime and related user ratings.
