10.4 Design Methodologies for Hardware Approximation

Time

Label

Presentation Title
Authors

11:00

10.4.1

REALM: REDUCED-ERROR APPROXIMATE LOG-BASED INTEGER MULTIPLIER
Speaker:
Hassaan Saadat, University of New South Wales, AU
Authors:
Hassaan Saadat¹, Haris Javaid², Aleksandar Ignjatovic¹ and Sri Parameswaran¹
¹University of New South Wales, AU; ²Xilinx, SG
Abstract
We propose a new error-configurable approximate unsigned integer multiplier named REALM. It incorporates a novel error-reduction method into the classical approximate log-based multiplier. Each power-of-two interval of the input operands is partitioned into M×M segments, and an error-reduction factor for each segment is analytically determined. These error-reduction factors can be used across any power-of-two interval, so we quantize only M² factors and store them in read-only hardwired lookup tables to keep the resource overhead to a minimum. Error characterization of REALM shows that it achieves very low error bias (mostly less than or equal to 0.05%), along with lower mean error (from 0.4% to 1.6%) and lower peak error (from 2.08% to 7.4%) than the classical approximate log-based multiplier and its state-of-the-art derivatives (mean errors greater than or equal to 2.6% and peak errors greater than or equal to 7.8%). Synthesis results using a TSMC 45 nm standard-cell library show that REALM reduces power by 66% to 86% and area by 50% to 76% compared with the accurate integer multiplier. We show that REALM produces Pareto-optimal design trade-offs in the design space of state-of-the-art approximate multipliers. Application-level evaluation of REALM demonstrates that it has negligible effect on the output quality.
Download Paper (PDF; Only available from the DATE venue WiFi)
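As a point of reference for 10.4.1, the classical approximate log-based multiplier (commonly attributed to Mitchell) that the abstract builds on can be sketched in a few lines. This is only an illustration of the uncorrected baseline, not the REALM design: the function name `mitchell_multiply` is an assumption, and the segment-wise error-reduction factors are not reproduced because the abstract does not list them.

```python
def mitchell_multiply(a: int, b: int) -> int:
    """Approximate a * b for unsigned integers with the classical log-based scheme.

    Each operand is written as 2**k * (1 + x) with 0 <= x < 1, and
    log2(1 + x) is approximated by x. Hypothetical helper for illustration.
    """
    if a == 0 or b == 0:
        return 0
    k1 = a.bit_length() - 1  # position of the leading one of a
    k2 = b.bit_length() - 1  # position of the leading one of b
    # (x1 + x2) scaled by 2**(k1 + k2), kept as an exact integer
    frac_sum = ((a - (1 << k1)) << k2) + ((b - (1 << k2)) << k1)
    one = 1 << (k1 + k2)  # the value 1.0 in the same scaling
    if frac_sum < one:
        # x1 + x2 < 1: result is 2**(k1 + k2) * (1 + x1 + x2)
        return one + frac_sum
    # x1 + x2 >= 1: result is 2**(k1 + k2 + 1) * (x1 + x2)
    return frac_sum << 1


print(mitchell_multiply(5, 6), 5 * 6)
print(mitchell_multiply(3, 3), 3 * 3)
```

The uncorrected scheme never overestimates, and its relative error peaks at 1/9 (about 11.1%), for instance at 3 × 3, where it returns 8. Per-segment correction factors of the kind the abstract describes are aimed at shrinking that gap.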

11:30

10.4.2

A FAST BDD MINIMIZATION FRAMEWORK FOR APPROXIMATE COMPUTING
Speaker:
Oliver Keszocze, Friedrich-Alexander-Universität Erlangen-Nürnberg, DE
Authors:
Andreas Wendler and Oliver Keszocze, Friedrich-Alexander-Universität Erlangen-Nürnberg, DE
Abstract
Approximate computing is a design paradigm that trades off computational accuracy for gains in non-functional aspects such as reduced area, increased computation speed, or power reduction. Computing the error of the approximated design is an essential step in determining its quality. The computation time for determining the error can become very large, effectively rendering the entire logic approximation procedure infeasible. As a remedy, we present methods to accelerate error metric computation by (a) exploiting structural information and (b) computing estimates of the metrics for multi-output Boolean functions represented as BDDs. We further present a novel greedy, bucket-based BDD minimization framework that employs the newly proposed error metric computations to produce Pareto-optimal solutions with respect to BDD size and multiple error metrics. The applicability of the proposed minimization framework is demonstrated by an experimental evaluation. We report considerable speedups while, at the same time, creating high-quality approximated BDDs.
Download Paper (PDF; Only available from the DATE venue WiFi)
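To make concrete what "computing the error" of an approximated multi-output function involves in 10.4.2, the sketch below evaluates three common metrics by exhaustive enumeration. The abstract does not name its metrics, so error rate, worst-case error, and mean absolute error are assumptions, as are all function names. This brute-force loop is the slow baseline whose cost grows as 2^n, not the BDD-based method the paper proposes.

```python
def error_metrics(exact, approx, n_inputs):
    """Compare two integer-valued functions over all 2**n_inputs input patterns."""
    total = 1 << n_inputs
    mismatches = 0
    worst = 0
    abs_sum = 0
    for x in range(total):
        diff = abs(exact(x) - approx(x))
        mismatches += diff != 0      # counts patterns with any output mismatch
        worst = max(worst, diff)     # largest numeric deviation seen so far
        abs_sum += diff
    return {
        "error_rate": mismatches / total,
        "worst_case": worst,
        "mean_abs": abs_sum / total,
    }


def exact_add(x):
    """2-bit + 2-bit adder; the operands are packed into one 4-bit input word."""
    a, b = x & 0b11, x >> 2
    return a + b


def truncated_add(x):
    """Hypothetical approximation that ignores the least significant bit of each operand."""
    a, b = x & 0b11, x >> 2
    return ((a >> 1) + (b >> 1)) << 1


metrics = error_metrics(exact_add, truncated_add, n_inputs=4)
print(metrics)  # {'error_rate': 0.75, 'worst_case': 2, 'mean_abs': 1.0}
```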

12:00

10.4.3

ON THE DESIGN OF HIGH PERFORMANCE HW ACCELERATOR THROUGH HIGH-LEVEL SYNTHESIS SCHEDULING APPROXIMATIONS
Speaker:
Benjamin Carrion Schaefer, University of Texas at Dallas, US
Authors:
Siyuan Xu and Benjamin Carrion Schaefer, University of Texas at Dallas, US
Abstract
High-level synthesis (HLS) takes as input a behavioral description (e.g. C/C++) and generates efficient hardware through three main steps: allocation, scheduling, and binding. The scheduling step times the operations in the behavioral description by assigning different portions of the code to unique clock steps (control steps). The code portions assigned to each clock step depend mainly on the target synthesis frequency and target technology. This work exploits this to generate smaller and faster circuits by approximating the program portions scheduled in each clock step and by exploiting the slack between different scheduling steps to further increase the performance and reduce the latency of the resultant circuit. In particular, each individual scheduling step is approximated given a maximum error boundary and a library of different approximation techniques. To further optimize the resultant circuit, different scheduling steps are merged based on the timing slack of different control steps without violating the given timing constraint (target frequency). Experimental results from different domain-specific applications show that our method increases the throughput by 82% on average while at the same time reducing the area by 21% for a given maximum allowable error.
Download Paper (PDF; Only available from the DATE venue WiFi)
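The step-merging idea in 10.4.3 can be illustrated with a deliberately simplified model. The sketch assumes that each control step is summarized by a single critical-path delay (after approximation) and that consecutive steps may be chained whenever their summed delay still fits in the clock period. The function name, the greedy policy, and the delay values are all hypothetical; a real HLS flow would also have to respect data dependencies and resource constraints.

```python
def merge_steps(step_delays, clock_period):
    """Greedily merge consecutive control steps while the chained delay fits the clock."""
    merged = []
    current = []
    used = 0.0
    for delay in step_delays:
        if delay > clock_period:
            raise ValueError("a single step already violates the timing constraint")
        if current and used + delay > clock_period:
            # No slack left in this merged step: close it and start a new one.
            merged.append(current)
            current = []
            used = 0.0
        current.append(delay)
        used += delay
    if current:
        merged.append(current)
    return merged


# Hypothetical per-step delays in ns after approximation, with a 10 ns clock.
schedule = merge_steps([4, 3, 6, 2, 1], clock_period=10)
print(schedule)  # [[4, 3], [6, 2, 1]]
```

In this toy case five control steps collapse into two, which is the kind of latency reduction the abstract attributes to exploiting slack.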

12:15

10.4.4

FAST KRIGING-BASED ERROR EVALUATION FOR APPROXIMATE COMPUTING SYSTEMS
Speaker:
Daniel Menard, INSA Rennes, FR
Authors:
Justine Bonnot¹, Karol Desnos¹ and Daniel Menard²
¹Université de Rennes / Inria / IRISA, FR; ²INSA Rennes, FR
Abstract
Approximate computing techniques trade off the performance of an application for its accuracy. The challenge when implementing approximate computing in an application is to efficiently evaluate the quality at the output of the application in order to optimize the noise budgeting of the different approximation sources. This is commonly achieved with an optimization algorithm that minimizes the implementation cost of the application subject to a quality constraint. During the optimization process, numerous approximation configurations are tested, and the quality at the output of the application is measured for each configuration with simulations, which makes the optimization process time-consuming. We propose a new method for inferring the accuracy or quality metric at the output of an application using kriging, a geostatistical method.
Download Paper (PDF; Only available from the DATE venue WiFi)
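The kriging idea in 10.4.4 is to predict the quality of an untested configuration from configurations that were already simulated. The sketch below is a minimal simple-kriging interpolator under stated assumptions: a Gaussian covariance with a hand-picked length scale, the sample mean used as the trend, and made-up quality values for four word-length configurations. None of these choices comes from the abstract, and all names are hypothetical.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def cov(p, q, length=2.0):
    """Gaussian covariance between two configurations (assumed model)."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / (2 * length ** 2))


def krige(points, values, query, length=2.0):
    """Predict the quality at `query` from already simulated configurations."""
    mean = sum(values) / len(values)
    K = [[cov(p, q, length) for q in points] for p in points]
    weights = solve(K, [cov(p, query, length) for p in points])
    return mean + sum(w * (v - mean) for w, v in zip(weights, values))


# Made-up data: (word length of operator 1, word length of operator 2) -> noise power in dB.
points = [(8, 8), (8, 12), (12, 8), (12, 12)]
values = [-40.0, -46.0, -47.0, -60.0]
print(krige(points, values, (10, 10)))
```

Without a noise (nugget) term the predictor reproduces the simulated values exactly at the sampled configurations, so only configurations that were never simulated are actually estimated.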

12:30

IP5-1, 21

STATISTICAL MODEL CHECKING OF APPROXIMATE CIRCUITS: CHALLENGES AND OPPORTUNITIES
Speaker and Author:
Josef Strnadel, Brno University of Technology, CZ
Abstract
Many works have shown that approximate circuits may play an important role in the development of resource-efficient electronic systems. This motivates many researchers to propose new approaches for finding an optimal trade-off between the approximation error and resource savings for predefined applications of approximate circuits. These works and approaches, however, focus mainly on design aspects regarding relaxed functional requirements while neglecting further aspects such as signal and parameter dynamics/stochasticity, relaxed/non-functional equivalence, testing or formal verification. This paper aims to take a step ahead by moving towards the formal verification of time-dependent properties of systems based on approximate circuits. Firstly, it presents our approach to modeling such systems by means of stochastic timed automata; the approach goes beyond digital, combinational and/or synchronous circuits and is also applicable to sequential, analog and/or asynchronous circuits. Secondly, the paper shows the principle and advantage of verifying properties of modeled approximate systems by the statistical model checking technique. Finally, the paper evaluates our approach and outlines future research perspectives.
Download Paper (PDF; Only available from the DATE venue WiFi)
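Statistical model checking, as referred to in IP5-1, estimates the probability that a time-dependent property holds by simulating many random runs instead of exploring the state space exhaustively. The sketch below uses the standard Chernoff-Hoeffding sample-size bound and a toy stochastic model, an accumulated error that grows by 0, 1 or 2 per time step. The model, the property, and all names are hypothetical stand-ins; the paper itself models systems as stochastic timed automata.

```python
import math
import random


def required_runs(eps, delta):
    """Runs needed so that the estimate is within eps of the true probability
    with confidence at least 1 - delta (Chernoff-Hoeffding bound)."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))


def estimate_probability(simulate_run, eps, delta, rng):
    """Monte Carlo estimate of P(property holds) over independent runs."""
    n = required_runs(eps, delta)
    hits = sum(1 for _ in range(n) if simulate_run(rng))
    return hits / n, n


def run_stays_within_bound(rng, steps=10, bound=12):
    """Toy time-bounded property: the accumulated error never exceeds `bound`
    during `steps` time steps (hypothetical model of an approximate datapath)."""
    error = 0
    for _ in range(steps):
        error += rng.choice((0, 1, 2))
        if error > bound:
            return False
    return True


p_hat, runs = estimate_probability(
    run_stays_within_bound, eps=0.05, delta=0.05, rng=random.Random(1)
)
print(f"estimated probability {p_hat:.3f} from {runs} runs")
```

The number of runs depends only on the requested precision and confidence (738 runs for eps = delta = 0.05), not on the size of the modeled system.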

12:31

IP5-2, 912

RUNTIME ACCURACY-CONFIGURABLE APPROXIMATE HARDWARE SYNTHESIS USING LOGIC GATING AND RELAXATION
Speaker:
Tanfer Alan, Karlsruhe Institute of Technology, DE
Authors:
Tanfer Alan¹, Andreas Gerstlauer² and Joerg Henkel¹
¹Karlsruhe Institute of Technology, DE; ²University of Texas at Austin, US
Abstract
Approximate computing trades off computation accuracy against energy efficiency. Algorithms from several modern application domains such as decision making and computer vision are tolerant to approximations while still meeting their requirements. The extent of approximation tolerance, however, varies significantly with input characteristics and applications. We propose a novel hybrid approach for the synthesis of runtime accuracy-configurable hardware that minimizes energy consumption at the expense of area. To that end, we first explore instantiating multiple hardware blocks with different fixed approximation levels. These blocks can be selected dynamically, which allows the accuracy to be configured at runtime. They benefit from having fewer transistors and from synthesis relaxations, in contrast to state-of-the-art gating mechanisms, which only switch off a group of logic. Our hybrid approach combines instantiating such blocks with area-efficient gating mechanisms that reduce toggling activity, creating a fine-grained design-time knob on energy vs. area. An examination of total energy savings for a Sobel filter under different workloads and accuracy tolerances shows that our method finds Pareto-optimal solutions providing up to 16% and 44% energy savings compared to a state-of-the-art accuracy-configurable gating mechanism and an exact hardware block, respectively, at 2x area cost.
Download Paper (PDF; Only available from the DATE venue WiFi)

12:30

End of session