IP1 Interactive Presentations


Date: Tuesday 10 March 2015
Time: 16:00 - 16:30
Location / Room: Exhibition Area

Interactive Presentations run simultaneously during a 30-minute slot. A poster associated with each IP paper is on display throughout the afternoon. Additionally, each IP paper is briefly introduced in a one-minute presentation in a corresponding regular session, prior to the actual Interactive Presentation. At the end of each afternoon's Interactive Presentations session, the 'Best IP of the Day' award is given.

Label Presentation Title
Authors
IP1-1 HIGH-RESOLUTION ONLINE POWER MONITORING FOR MODERN MICROPROCESSORS
Speakers:
Fabian Oboril, Jos Ewert and Mehdi Tahoori, Karlsruhe Institute of Technology, DE
Abstract
The power consumption of computing systems is nowadays a major design constraint that affects performance and reliability. To co-optimize these aspects, fine-grained adaptation techniques at runtime are of growing importance. However, to use these tools efficiently, fine-grained information about the power consumption of various on-chip components at runtime is required. Therefore, here we propose a novel software-implemented high-resolution (spatial and temporal) power monitoring approach that relies on micro-models to estimate the power consumption of all microarchitectural components inside a processor core. Combined with a self-calibration technique that uses an available on-chip power sensor, our power estimation approach can achieve an accuracy of more than 99 % and provides deep insights about the power dissipation inside a processor core during workload execution.

Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-2 REDUCING ENERGY CONSUMPTION IN MICROCONTROLLER-BASED PLATFORMS WITH LOW DESIGN MARGIN CO-PROCESSORS
Speakers:
Andres Gomez1, Christian Pinto2, Andrea Bartolini3, Davide Rossi2, Hamed Fatemi4, Jose Pineda de Gyvez4 and Luca Benini5
1Swiss Federal Institute of Technology in Zurich (ETHZ), CH; 2Università di Bologna, IT; 3Università di Bologna, IT / ETH Zürich, CH; 4NXP Semiconductors, NL; 5Università di Bologna / Swiss Federal Institute of Technology in Zurich (ETHZ), IT
Abstract
Advanced energy minimization techniques (e.g. DVFS and thermal management) and their high-level HW/SW requirements are well established in high-throughput multi-core systems. These techniques would have an intolerable overhead in low-cost, performance-constrained microcontroller units (MCUs). These devices can further reduce power by operating at a lower voltage, at the cost of increased sensitivity to PVT variation and increased design margins. In this paper, we propose a runtime environment for next-generation dual-core MCU platforms. These platforms complement a single core with a low-area-overhead, reduced-design-margin shadow processor. The runtime decreases the overall energy consumption by exploiting design corner heterogeneity between the two cores, rather than increasing the throughput. This allows the platform's power envelope to be dynamically adjusted to application-specific requirements. Our simulations show that, depending on the ratio of core to platform energy, total energy savings can be up to 20%.

IP1-3 DE-ELASTISATION: FROM ASYNCHRONOUS DATAFLOWS TO SYNCHRONOUS CIRCUITS
Speakers:
Mahdi Jelodari Mamaghani, Jim Garside and Doug Edwards, University of Manchester, GB
Abstract
Whilst asynchronous VLSI programming provides a flexible abstract formalism to realise concurrent systems, the resulting performance is still an issue when adapting the flow in the industrial context. The asynchronous design paradigm provides 'elasticity', which enables the system to tolerate delays in communication and computation; the drawback is that it imposes a communication overhead on the system, which becomes prohibitively expensive when applied at a fine-grained level. This paper proposes a 'de-elastisation' technique in a CAD flow for asynchronous dataflow networks to improve the circuits' performance and area. To preserve the architectural advantages of asynchronous design (e.g. short cycles), circuits are classified into blocking and non-blocking loops, a classification upon which our de-elastisation scheme relies. The technique is incorporated in the Teak CAD flow. Experimental results on several substantial case studies show significant performance and area improvement. This work shows a 3x improvement for the first category of circuits, which are suitable for iterative realisations and DSP-like architectures, and 4x for the second category, which are suitable for concurrent realisations.

IP1-4 AUTOMATED FEATURE LOCALIZATION FOR DYNAMICALLY GENERATED SYSTEMC DESIGNS
Speakers:
Jannis Stoppe1, Robert Wille1 and Rolf Drechsler2
1University of Bremen, DE; 2University of Bremen/DFKI GmbH, DE
Abstract
Due to the large complexity of today's circuits and systems, not all components, e.g. in a System on Chip (SoC), can be designed from scratch anymore. As a consequence, designers frequently work on components which they did not create themselves and, hence, design understanding becomes a critical issue. Approaches for feature localization may help here by pinpointing distinguishing characteristics of a design. However, existing approaches for feature localization of SoCs mainly focus on the Register Transfer Level; existing solutions for the Electronic System Level (using languages such as SystemC) have severe limits. In this work, we propose an approach for advanced feature localization in SystemC designs. With it, we overcome the main limitations of previously proposed solutions, in particular the missing support for dynamic descriptions, while keeping the proposed solution as non-intrusive as possible. The benefits of the proposed approach are confirmed by means of a case study.

IP1-5 INDUCTOR OPTIMIZATION FOR ACTIVE CELL BALANCING USING GEOMETRIC PROGRAMMING
Speakers:
Matthias Kauer1, Swaminathan Narayanaswamy1, Martin Lukasiewycz1, Sebastian Steinhorst1 and Samarjit Chakraborty2
1TUM CREATE, SG; 2TU Munich, DE
Abstract
This paper proposes an optimization methodology for inductor components in active cell balancing architectures of electric vehicle battery packs. For this purpose, we introduce a new mathematical model to quantitatively describe the charge transfer of a family of inductor-based circuits. Utilizing worst case assumptions, this model yields a nonlinear program for designing the inductor and selecting the transfer current. In the next step, we transform this problem into a geometric program that can be efficiently solved. The optimized inductor reduces energy dissipation by at least 20% in various scenarios compared to a previous approach which selected an optimal off-the-shelf inductor.

IP1-6 LIGHTWEIGHT AUTHENTICATION FOR SECURE AUTOMOTIVE NETWORKS
Speakers:
Philipp Mundhenk1, Sebastian Steinhorst1, Martin Lukasiewycz1, Suhaib A. Fahmy2 and Samarjit Chakraborty3
1TUM CREATE, SG; 2School of Computer Engineering, Nanyang Technological University, SG; 3TU Munich, DE
Abstract
We propose a framework to bridge the gap between secure authentication in automotive networks and on the internet. Our proposed framework allows runtime key exchanges with minimal overhead for resource-constrained in-vehicle networks. It combines symmetric and asymmetric cryptography to establish secure communication and enable secure updates of keys and software throughout the lifetime of the vehicle. For this purpose, we tailor authentication protocols for devices and authorization protocols for streams to the automotive domain. As a result, our framework natively supports multicast and broadcast communication. We show that our lightweight framework is able to initiate secure message streams over 15 times faster than conventional frameworks, for the first time meeting the real-time requirements of automotive networks.

IP1-7 MINIMIZING THE NUMBER OF PROCESS CORNER SIMULATIONS DURING DESIGN VERIFICATION
Speakers:
Michael Shoniker, Bruce Cockburn, Jie Han and Witold Pedrycz, University of Alberta, CA
Abstract
Integrated circuit designs need to be verified in simulation over a large number of process corners that represent the expected range of transistor properties, supply voltages, and die temperatures. Each process corner can require substantial simulation time. Unfortunately, the required number of corners has been growing rapidly in the latest semiconductor technologies. We consider the problem of minimizing the required number of process corner simulations by iteratively learning a model of the output functions in order to confidently estimate key maximum and/or minimum properties of those functions. Depending on the output function, the required number of corner simulations can be reduced by up to 95%.

IP1-8 AN APPROXIMATE VOTING SCHEME FOR RELIABLE COMPUTING
Speakers:
Ke Chen1, Jie Han2 and Fabrizio Lombardi1
1Northeastern University, US; 2University of Alberta, CA
Abstract
This paper relies on the principles of inexact computing to alleviate the issues arising in static masking by voting for reliable computing. A scheme that utilizes approximate voting is proposed; it is referred to as inexact double modular redundancy (IDMR). IDMR does not resort to triplication, thus saving overhead due to modular replication; moreover, this scheme is adaptive in its operation, i.e., it allows a threshold to determine the validity of the module outputs. IDMR operates by first establishing the difference between the values of the outputs of the two modules; only if the difference is below a preset threshold does the voter calculate the average value of the two module outputs. An extensive analysis of the voting circuits and an application to image processing are presented.
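
The voting rule described in this abstract can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `idmr_vote` is hypothetical, and the abstract does not say what the voter does when the two outputs differ by the threshold or more, so returning `None` to flag an invalid result is an assumed behaviour.

```python
from typing import Optional


def idmr_vote(out_a: float, out_b: float, threshold: float) -> Optional[float]:
    """Approximate two-module voter (hypothetical sketch, not the paper's circuit).

    If the two module outputs differ by less than the preset threshold,
    the voter returns their average. Otherwise it returns None; this
    "invalid" signal is an assumption, since the abstract leaves that
    case unspecified.
    """
    if abs(out_a - out_b) < threshold:
        return (out_a + out_b) / 2
    return None


# Two pixel intensities that agree within the threshold are averaged.
print(idmr_vote(100, 104, threshold=8))
# Outputs that diverge too far are flagged rather than averaged.
print(idmr_vote(100, 120, threshold=8))
```

The threshold is what makes the scheme adaptive: a larger value accepts more disagreement between the modules at the cost of a less exact result.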

IP1-9 FLINT: LAYOUT-ORIENTED FPGA-BASED METHODOLOGY FOR FAULT TOLERANT ASIC DESIGN
Speakers:
Rochus Nowosielski, Lukas Gerlach, Stephan Bieband, Guillermo Paya-Vaya and Holger Blume, Leibniz Universität Hannover, Institute of Microelectronic Systems, DE
Abstract
Research into efficient fault tolerance techniques for digital systems requires insight into the fault propagation mechanism inside the ASIC design. Radiation, high temperature, or charge sharing effects in ultra-deep submicron technologies influence fault generation and propagation depending on die location. The proposed methodology links efficient fault injection to fault propagation in the floorplan view of a standard cell ASIC. This is achieved by instrumentation of the gate netlist after place and route, emulation in an FPGA system, and experiment control via an interactive user interface. Further, automated fault injection campaigns allow exhaustive fault tolerance evaluations taking single faults as well as adjacent cell faults into account. The proposed methodology can be used to identify vulnerable cell nodes in the design and allows the classification of placement strategies of fault tolerant ASIC designs.

IP1-10 A UNIFIED HARDWARE/SOFTWARE MPSOC SYSTEM CONSTRUCTION AND RUN-TIME FRAMEWORK
Speakers:
Sam Skalicky1, Andrew Schmidt2, Matthew French2 and Sonia Lopez1
1Rochester Institute of Technology, US; 2USC/ISI, US
Abstract
With the continual enhancement of heterogeneous resources in FPGA devices, utilizing these resources becomes a challenging burden for developers. Especially with the inclusion of sophisticated multiple processor system-on-chips, the necessary skill set to effectively leverage these resources spans both hardware and software expertise. The maturation of high level synthesis tools and programming languages aims to alleviate these complexities, yet there still exist systematic gaps that must be bridged to provide a more cohesive hardware/software development environment. High level MPSoC design initiatives such as Redsharc have reduced the costs of entry, simplifying application implementation. We propose a unified hardware/software framework for system construction, leveraging Redsharc's APIs, efficient on-chip interconnects, and run-time controllers. We present system level abstractions that enable compilation and implementation tools for hardware and software to be merged into a single configurable system development environment. Finally, we demonstrate our proposed framework with Redsharc, using AES encryption/decryption spanning software implementations on ARM and MicroBlaze processors and hardware kernels.

IP1-11 (AS)^2: ACCELERATOR SYNTHESIS USING ALGORITHMIC SKELETONS FOR RAPID DESIGN SPACE EXPLORATION
Speakers:
Shakith Fernando1, Mark Wijtvliet1, Cedric Nugteren1, Akash Kumar2 and Henk Corporaal3
1Eindhoven University of Technology, NL; 2National University of Singapore, SG; 3TU/e (Eindhoven University of Technology), NL
Abstract
Hardware accelerators in heterogeneous multiprocessor system-on-chips are becoming popular as a means of meeting performance and energy efficiency requirements of modern embedded systems. Current design methods for accelerator synthesis, such as High-Level Synthesis, are not fully automated. Therefore, time-consuming manual iterations are required to explore efficient accelerator alternatives: the programmer is still required to think in terms of the underlying architecture. In this paper, we present (AS)^2: a design flow for Accelerator Synthesis using Algorithmic Skeletons. Skeletonization separates the structure of a parallel computation from an algorithm's functionality, enabling efficient implementations without requiring the programmer to have hardware knowledge. We define three such skeletons (for three image processing kernels), enabling FPGA-specific parallelization techniques and optimizations. As a case study, we present a design space exploration of these skeletons and show how multiple design points with area-performance trade-offs for the accelerators can be efficiently and rapidly synthesized. We show that (AS)^2 is a promising direction for accelerator synthesis as it generates a Pareto front of 8 design points in under half an hour for each of the three image processing kernels.

IP1-12 ASSISTED GENERATION OF FRAME CONDITIONS FOR FORMAL MODELS
Speakers:
Philipp Niemann, Frank Hilken, Martin Gogolla and Robert Wille, University of Bremen, DE
Abstract
Modeling languages such as UML or SysML allow for the validation and verification of the structure and the behavior of designs even in the absence of a specific implementation. However, formal models inherit a severe drawback: Most of them hardly provide a comprehensive and determinate description of transitions from one system state to another. This problem can be addressed by additionally specifying so-called frame conditions. However, only naive "workarounds" based on trivial heuristics or completely relying on a manual creation have been proposed for their generation thus far. In this work, we aim for a solution which neither leaves the burden of generating frame conditions entirely on the designer (avoiding the introduction of another time-consuming and expensive design step) nor is completely automatic (which, due to ambiguities, is not possible anyway). For this purpose, a systematic design methodology for the assisted generation of frame conditions is proposed.

IP1-13 TOWARDS A META-LANGUAGE FOR THE CONCURRENCY CONCERN IN DSLS
Speakers:
Julien Deantoni1, Papa Issa Diallo2, Ciprian Teodorov2, Joel Champeau2 and Benoit Combemale3
1I3S, University of Nice Sophia Antipolis, FR; 2Lab-STICC - ENSTA Bretagne, FR; 3IRISA, University of Rennes 1, FR
Abstract
Concurrency is of primary interest in the development of complex software-intensive systems, as well as the deployment on modern platforms. Furthermore, Domain-Specific Languages (DSLs) are increasingly used in industrial processes to separate and abstract the various concerns of complex systems. However, reifying the definition of the DSL concurrency remains a challenge. This not only prevents leveraging the concurrency concern of a particular domain or platform, but it also hinders: (1) the development of a complete understanding of the DSL semantics; (2) the effectiveness of concurrency-aware analysis techniques; (3) the analysis of the deployment on parallel architectures. In this paper, we introduce the key ideas leading toward MoCCML, a dedicated meta-language for formally specifying the concurrency concern within the definition of a DSL. The concurrency constraints can reflect the knowledge in a particular domain, but also the constraints of a particular platform. MoCCML comes with a complete language workbench to help a DSL designer in the definition of the concurrency directly within the concepts of the DSL itself, and a generic workbench to simulate and analyze any model conforming to this DSL. MoCCML is illustrated on the definition of a lightweight extension of SDF (Synchronous Data Flow).

IP1-14 FAST AND ACCURATE BRANCH PREDICTOR SIMULATION
Speakers:
Antoine Faravelon, Nicolas Fournel and Frédéric Pétrot, TIMA Laboratory, Université de Grenoble-Alpes/CNRS, FR
Abstract
Embedded processor complexity has risen dramatically, due to the addition of architectural add-ons which improve performance significantly. High level models used in system simulation usually ignore these additions as the major issue is functional correctness. However, accurate estimates of software execution are sometimes required; therefore we focus in this paper on one of these architectural features, the branch predictor. Unfortunately, advanced branch predictors use large tables, and a direct implementation of the scheme slows down simulation dramatically. To limit the simulation overhead, we define a modeling approach that we demonstrate on a state-of-the-art predictor. We implemented the model in a dynamic binary translation based instruction set simulator and measured a prediction accuracy of about 95% for a run-time overhead below 5%.
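
To make the notion of a table-based predictor concrete, the following Python sketch directly implements a classic bimodal predictor with 2-bit saturating counters. It is an assumption chosen for illustration only: the abstract does not name the state-of-the-art predictor that was modeled, and this sketch is neither that predictor nor the paper's fast modeling approach. The class and function names, the table size, and the synthetic trace are all hypothetical.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by the low bits of the PC."""

    def __init__(self, table_bits: int = 10):
        self.mask = (1 << table_bits) - 1
        # Counter values 0..3; values >= 2 predict "taken".
        # Every entry starts at 1 (weakly not taken).
        self.counters = [1] * (1 << table_bits)

    def predict(self, pc: int) -> bool:
        return self.counters[pc & self.mask] >= 2

    def update(self, pc: int, taken: bool) -> None:
        index = pc & self.mask
        if taken:
            self.counters[index] = min(3, self.counters[index] + 1)
        else:
            self.counters[index] = max(0, self.counters[index] - 1)


def prediction_accuracy(trace, predictor) -> float:
    """Replay (pc, taken) pairs and return the fraction predicted correctly."""
    hits = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits / len(trace)


# Synthetic trace: one loop branch taken 9 times, then not taken, repeated 10 times.
loop_trace = ([(0x400, True)] * 9 + [(0x400, False)]) * 10
print(prediction_accuracy(loop_trace, BimodalPredictor()))
```

Every simulated branch costs one table lookup and one update here, which hints at why a direct implementation of a much larger predictor can slow an instruction set simulator down.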

IP1-15 COMPARATIVE STUDY OF TEST GENERATION METHODS FOR SIMULATION ACCELERATORS
Speakers:
Wisam Kadry1, Dimtry Krestyashyn1, Arkadiy Morgenshtein1, Amir Nahir1, Vitali Sokhin1, Jae Cheol Son2, Wookyeong Jeong2, Sung-Boem Park2 and Jin Sung Park2
1IBM Research - Haifa, IL; 2Samsung, KR
Abstract
Hardware-accelerated simulation platforms are quickly becoming a major vehicle for the functional verification of modern systems and processors. Accelerator platforms provide functional verification with valuable simulation cycles. Yet, the high cost and limited bandwidth of accelerator platforms dictate a requirement for continuous utilization improvement. In this work, we perform a comparative analysis of two approaches to test generation for accelerator platforms. An exerciser tool is used as the experimental vehicle for the study. An off-platform test generation methodology is implemented and compared to the on-platform test generation typically used in exercisers. We present experimental results from simulation of the latest IBM POWER8 processor on the Awan accelerator platform, as well as from simulation of an eight-core ARMv8-based design on the Veloce emulation platform. Our results indicate that the utilization of accelerator platforms can be improved by a factor of up to 7 when using off-platform test generation. In addition, an increase of up to 24% is observed in test coverage. Off-platform mode features a significantly bigger image size, but maintains tolerable build and load times.

IP1-16 USING STRUCTURAL RELATIONS FOR CHECKING COMBINATIONALITY OF CYCLIC CIRCUITS
Speakers:
Wan-Chen Weng1, Yung-Chih Chen2, Jui-Hung Chen1, Ching-Yi Huang1 and Chun-Yao Wang1
1National Tsing Hua University, TW; 2Yuan Ze University, TW
Abstract
Functionality and combinationality are two main issues that have to be dealt with in cyclic combinational circuits, which are combinational circuits containing loops. Cyclic circuits are combinational if nodes within the circuits have definite values under all input assignments. For a cyclified circuit, we have to check whether it is combinational or not. Thus, this paper proposes an efficient two-stage algorithm to verify the combinationality of cyclic circuits. Experiments on a set of cyclified IWLS 2005 benchmarks demonstrate the efficiency of the proposed algorithm. Compared to the state-of-the-art algorithm, our approach achieves a speedup of about 4000 times on average.
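
The definition of combinationality given above can be checked by brute force on small circuits using three-valued (0, 1, unknown) simulation. The Python sketch below is an illustrative baseline only: it enumerates every input assignment and is therefore exponential in the number of inputs, and it is not the two-stage algorithm proposed in the paper. The netlist encoding (a dict mapping each node to a gate type and fan-in list) and all names are assumptions made for this example.

```python
from itertools import product

X = None  # "unknown" value in three-valued simulation


def t_and(vals):
    if any(v == 0 for v in vals):
        return 0
    return 1 if all(v == 1 for v in vals) else X


def t_or(vals):
    if any(v == 1 for v in vals):
        return 1
    return 0 if all(v == 0 for v in vals) else X


def t_not(vals):
    (v,) = vals
    return X if v is X else 1 - v


GATES = {"AND": t_and, "OR": t_or, "NOT": t_not}


def is_combinational(inputs, gates):
    """True if every gate output settles to 0 or 1 for every input assignment."""
    for bits in product((0, 1), repeat=len(inputs)):
        values = dict(zip(inputs, bits))
        values.update({node: X for node in gates})  # internal nodes start unknown
        changed = True
        while changed:  # iterate to a fixed point; values only move from X to 0/1
            changed = False
            for node, (kind, fanin) in gates.items():
                new = GATES[kind]([values[f] for f in fanin])
                if new != values[node]:
                    values[node] = new
                    changed = True
        if any(values[node] is X for node in gates):
            return False
    return True


# f and g form a loop, but the input a always breaks it, so all nodes are definite.
good = {"na": ("NOT", ["a"]), "f": ("AND", ["a", "g"]), "g": ("AND", ["na", "f"])}
# With a=1 and b=0 the loop between f and g is never resolved.
bad = {"f": ("AND", ["a", "g"]), "g": ("OR", ["b", "f"])}
print(is_combinational(["a"], good), is_combinational(["a", "b"], bad))
```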

IP1-17 NFRS EARLY ESTIMATION THROUGH SOFTWARE METRICS
Speakers:
Andrws Vieira1, Pedro Faustini1, Luigi Carro2 and Erika Cota1
1Federal University of Rio Grande do Sul (UFRGS), BR; 2Federal University of Rio Grande do Sul (UFRGS), BR
Abstract
We propose the use of regression analysis to generate accurate predictive models for physical metrics using design metrics as input. We validate our approach with 40+ implementations of three systems in two development scenarios: system evolution and first design. Results show maximum prediction errors of 1.66% during system evolution. In a first design scenario, the average error is 15% with the maximum error still below 20% for all physical metrics. This approach provides a fast and accurate strategy to boost embedded software productivity and quality, by estimating Non-Functional Requirements (NFRs) during the first design stages.
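
As a minimal illustration of the regression idea, the Python sketch below fits a least-squares line that predicts one physical metric from one design metric and reports the relative prediction error. Everything specific here is an assumption: the abstract does not state which metrics or which regression form were used, the function names are hypothetical, and the numbers are synthetic placeholders rather than data from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def relative_error(predicted, actual):
    """Prediction error as a fraction of the measured value."""
    return abs(predicted - actual) / abs(actual)


# Synthetic placeholder data: a design metric (x) and a physical metric (y)
# measured on earlier implementations of a system.
design_metric = [1.0, 2.0, 3.0, 4.0]
physical_metric = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(design_metric, physical_metric)

# Predict the physical metric of a new implementation from its design metric,
# then compare with a (synthetic) measurement of 10.5.
predicted = slope * 5.0 + intercept
print(predicted, relative_error(predicted, actual=10.5))
```

With several design metrics as inputs, the same idea extends to multiple linear regression.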
