6.3 When Approximation Meets Dependability

Time

Label

Presentation Title
Authors

11:00

6.3.1

SENSOR-BASED APPROXIMATE ADDER DESIGN FOR ACCELERATING ERROR-TOLERANT AND DEEP-LEARNING APPLICATIONS
Speaker:
Ning-Chi Huang, National Chiao Tung University, TW
Authors:
Ning-Chi Huang, Szu-Ying Chen and Kai-Chiang Wu, Department of Computer Science, National Chiao Tung University, TW
Abstract
Approximate computing is an emerging strategy that trades computational accuracy for computational cost in terms of performance, energy, and/or area. In this paper, we propose a novel sensor-based approximate adder for high-performance energy-efficient arithmetic computation, while considering the accuracy requirement of error-tolerant applications. This is the first work using in-situ sensors for approximate adder design, based on monitoring online transition activity on the carry chain and speculating on carry propagation/truncation. On top of a fully optimized ripple-carry adder, the performance of our adder is enhanced by 2.17X. When applied in error-tolerant applications such as image processing and handwritten digit recognition, our approximate adder leads to very promising quality of results compared to the case when an accurate adder is used.
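The idea of truncating carry propagation can be illustrated with a minimal sketch. It is a generic segmented adder that always drops the carry between fixed-size segments. It is not the sensor-based design of the paper, which decides on propagation or truncation from online carry-chain activity. The function names, the 16-bit width and the 4-bit segment size are assumptions chosen for illustration.

```python
def exact_add(a: int, b: int, width: int = 16) -> int:
    """Reference adder: exact unsigned sum, wrapped to `width` bits."""
    return (a + b) & ((1 << width) - 1)


def segmented_add(a: int, b: int, width: int = 16, segment: int = 4) -> int:
    """Approximate adder: each segment is summed on its own and its
    carry-out is discarded, so no carry crosses a segment boundary."""
    mask = (1 << segment) - 1
    result = 0
    for shift in range(0, width, segment):
        partial = ((a >> shift) & mask) + ((b >> shift) & mask)
        # Keep only the low `segment` bits; the carry-out is truncated.
        result |= (partial & mask) << shift
    return result


if __name__ == "__main__":
    for a, b in [(0x0101, 0x0202), (0x000F, 0x0001)]:
        print(hex(a), hex(b), hex(exact_add(a, b)), hex(segmented_add(a, b)))
```

The result is exact whenever no carry would cross a segment boundary, and wrong otherwise, which is why such adders are only suited to error-tolerant workloads.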

11:30

6.3.2

LOW-POWER VARIATION-AWARE CORES BASED ON DYNAMIC DATA-DEPENDENT BITWIDTH TRUNCATION
Speaker:
Ioannis Tsiokanos, Queen's University Belfast, GB
Authors:
Ioannis Tsiokanos, Lev Mukhanov and Georgios Karakonstantis, Queen's University Belfast, GB
Abstract
Increasing variability of transistor parameters in the nanoscale era renders modern circuits prone to timing failures. To address such failures, designers adopt pessimistic timing/voltage guardbands, which are estimated under rare worst-case conditions, thus leading to power and performance overheads. Recent approximation schemes based on precision reduction may help to limit the incurred overheads, but the precision is reduced statically in all operations. This results in unnecessary quality loss, since these schemes neglect the fact that only a few long latency paths (LLPs) may be prone to failures, and such paths may be activated rarely. In this paper, we propose a variation-aware framework that minimizes any quality loss by dynamically truncating the bitwidth only for operands triggering the LLPs. This is achieved by predicting at runtime the excitation of the LLPs based on the processed operands. The applied truncation, which we implement by setting a number of least-significant bits to a constant value of zero, can effectively reduce the delay of the excited LLPs, providing sufficient timing slack to avoid any failure without using conservative guardbands. To facilitate the adoption of such a scheme within pipelined cores and limit the incurred overheads, we also shape the path distribution appropriately for isolating the LLPs in a single pipeline stage. Additionally, to evaluate the efficacy of our framework, we perform post-layout dynamic timing analysis based on real operands that we extract from a variety of applications. When applied to the implementation of an IEEE-754 compatible double precision floating-point unit (FPU) in a 45nm technology, our approach eliminates timing failures under 8% delay variations with no performance loss. Our design comes at a cost of up to 4.48% power and 0.34% area overheads, while the occasional operand truncation incurs minimal quality loss in terms of relative error, up to 4.1 · 10^−6. Finally, when compared to an FPU with pessimistic margins, our technique can save up to 44.3% power.
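The truncation step described in the abstract (forcing a number of least-significant bits to zero, and only for operands predicted to excite a long path) can be sketched roughly as follows. This is an integer toy model under stated assumptions, not the authors' FPU implementation: the predictor `long_carry_placeholder` is a hypothetical stand-in, and the names and bit counts are invented for illustration.

```python
from typing import Callable


def truncate_lsbs(value: int, k: int) -> int:
    """Force the k least-significant bits of a non-negative integer to zero."""
    return value & ~((1 << k) - 1)


def long_carry_placeholder(a: int, b: int) -> bool:
    """Hypothetical predictor (assumption): flags operand pairs whose low
    byte has a propagate bit in every position, i.e. a long carry chain."""
    return ((a ^ b) & 0xFF) == 0xFF


def add_with_dynamic_truncation(
    a: int, b: int, k: int, excites_long_path: Callable[[int, int], bool]
) -> int:
    """Truncate both operands only when the predictor fires; else add exactly."""
    if excites_long_path(a, b):
        a, b = truncate_lsbs(a, k), truncate_lsbs(b, k)
    return a + b


if __name__ == "__main__":
    print(add_with_dynamic_truncation(3, 4, 4, long_carry_placeholder))
    print(add_with_dynamic_truncation(0xF0, 0x0F, 4, long_carry_placeholder))
```

Operands that do not trigger the predictor are processed at full precision, which is the point of truncating dynamically rather than statically.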

12:00

6.3.3

A SMART FAULT DETECTION SCHEME FOR RELIABLE IMAGE PROCESSING APPLICATIONS
Speaker:
Luca Cassano, Politecnico di Milano, IT
Authors:
Matteo Biasielli, Cristiana Bolchini, Luca Cassano and Antonio Miele, Politecnico di Milano, IT
Abstract
Traditional fault detection/tolerance techniques exploit multiple instances of the nominal processing and then perform a bit-wise comparison of the outputs to detect the occurrence of faults. In specific application scenarios, e.g., image/signal processing, the elaboration has an inherent degree of fault tolerance because it is possible to use the output even in the presence of slight alterations. In these contexts, the classical bit-wise comparison may be inefficient. Indeed, it may lead to conservatively discarding outputs that have been only slightly altered by the fault and that could still be usefully exploited. In this paper, we propose a smart checking scheme based on Convolutional Neural Networks that, rather than distinguishing between faulty and non-faulty images, discriminates between usable and unusable images according to the ability of the end user to correctly process the output. The experimental evaluation shows that this solution enables an execution time saving of about 6.35% with a 99.42% accuracy, on average.
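The contrast between strict bit-wise comparison and a usability-oriented check can be sketched as below. The paper's checker is a Convolutional Neural Network, which is not reproduced here. A simple mean-absolute-difference threshold is swapped in as a stand-in, and the function names and the tolerance value are assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def bitwise_equal(out_a: Image, out_b: Image) -> bool:
    """Classical duplication check: any differing pixel counts as a fault."""
    return out_a == out_b


def mean_abs_diff(out_a: Image, out_b: Image) -> float:
    """Average absolute pixel difference between two equally sized images."""
    pairs = [(p, q) for row_a, row_b in zip(out_a, out_b)
             for p, q in zip(row_a, row_b)]
    return sum(abs(p - q) for p, q in pairs) / len(pairs)


def is_usable(out_a: Image, out_b: Image, tolerance: float = 1.0) -> bool:
    """Stand-in usability check (assumption): accept slightly altered outputs."""
    return mean_abs_diff(out_a, out_b) <= tolerance


if __name__ == "__main__":
    nominal = [[10, 20], [30, 40]]
    altered = [[10, 21], [30, 40]]  # one pixel off by one
    print(bitwise_equal(nominal, altered), is_usable(nominal, altered))
```

With the bit-wise check the slightly altered output would be discarded and recomputed, while the tolerance-based check keeps it.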

12:30

IP3-1, 662

NON-INTRUSIVE SELF-TEST LIBRARY FOR AUTOMOTIVE CRITICAL APPLICATIONS: CONSTRAINTS AND SOLUTIONS
Speaker:
Davide Piumatti, Politecnico di Torino, IT
Authors:
Paolo Bernardi¹, Riccardo Cantoro¹, Andrea Floridia¹, Davide Piumatti¹, Cozmin Pogonea¹, Annachiara Ruospo¹, Ernesto Sanchez¹, Sergio De Luca² and Alessandro Sansonetti²
¹Politecnico di Torino, IT; ²STMicroelectronics, IT
Abstract
Today, safety-critical applications require self-test and self-diagnosis approaches to be applied during the lifetime of the device. In general, the fault coverage values required by the standards (like ISO 26262) in the whole System-on-Chip (SoC) are very high. Therefore, different strategies are adopted. In the case of the processor core, the required fault coverage can be achieved by scheduling the periodic execution of a set of test programs, or Software-Test Library (STL). However, the STL for in-field testing should be able to comply with the operating system specifications without affecting the mission operation of the device application. In this paper, the most relevant problems for the development of the STL are first discussed. Then, a set of strategies and solutions is presented, oriented to produce an efficient and non-intrusive STL to be used exclusively during the in-field testing of automotive processor cores. The proposed approach was evaluated on an automotive SoC developed by STMicroelectronics.

12:30

End of session
Lunch Break in Lunch Area

Coffee Breaks in the Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Lunch Breaks (Lunch Area)

On all conference days (Tuesday to Thursday), a seated lunch (lunch buffet) will be offered in the Lunch Area to fully registered conference delegates only. There will be badge control at the entrance to the lunch break area.

Tuesday, March 26, 2019

Coffee Break 10:30 - 11:30
Lunch Break 13:00 - 14:30
Keynote Lecture "Leonardo da Vinci, Humanism and Engineering between Florence and Milan" by Claudio Giorgione in room 1 13:50 - 14:20
Coffee Break 16:00 - 17:00

Wednesday, March 27, 2019

Coffee Break 10:00 - 11:00
Lunch Break 12:30 - 14:30
Keynote Lecture "Heterogeneous, High Scale Computing in the Era of Intelligent, Cloud-Connected" by David Pellerin, Amazon, US in room 1 13:50 - 14:20
Coffee Break 16:00 - 17:00

Thursday, March 28, 2019

Coffee Break 10:00 - 11:00
University Booth Best Demo Award Presentation at the University Booth 10:30
Lunch Break 12:30 - 14:00
Keynote Lecture "A Fundamental Look at Models and Intelligence" by Edward A. Lee, University of California, Berkeley, US in room 1 13:20 - 13:50
Coffee Break 15:30 - 16:00