IP1 Interactive Presentations

Date: Tuesday 25 March 2014
Time: 16:00 - 16:30
Location / Room: Conference Level, foyer

Interactive Presentations run simultaneously during a 30-minute slot. A poster associated to the IP paper is on display throughout the afternoon. Additionally, each IP paper is briefly introduced in a one-minute presentation in a corresponding regular session, prior to the actual Interactive Presentation. At the end of each afternoon Interactive Presentations session the award 'Best IP of the Day' is given.

Label	Presentation Title Authors
IP1-1	SAFE: SECURITY-AWARE FLEXRAY SCHEDULING ENGINE Speakers: Gang Han¹, Haibo Zeng², Yaping Li³ and Wenhua Dou¹ ¹National University of Defense Technology, CN; ²McGill University, CA; ³The Chinese University of Hong Kong, CN Abstract In this paper, we propose SAFE (Security Aware FlexRay scheduling Engine), to provide a problem definition and a design framework for FlexRay static segment schedule to address the new challenge on security. From a high level specification of the application, the architecture and communication middleware are synthesized to satisfy security requirements, in addition to extensibility, costs, and end-to-end latencies. The proposed design process is applied to two industrial case studies consisting of a set of active safety functions and an X-by-wire system respectively.
IP1-2	TRANSIENT ERRORS RESILIENCY ANALYSIS TECHNIQUE FOR AUTOMOTIVE SAFETY CRITICAL APPLICATIONS Speakers: Sujan Pandey and Bart Vermeulen, NXP Semiconductors, NL Abstract When a single bit is flipped as a result of a transient error in an electronic circuit, its effect can have a severe impact if the circuit is deployed in safety critical domains such as automotive, aeronautics, and industrial automation. In the design phase it is therefore essential to evaluate, and where necessary improve, the resilience of a circuit to all possible transient errors. In this paper, we present a method to analyze the transient error resiliency of a digital circuit. This method is based on an analytical model. It models a transient error as a random function and finds the vulnerable number of bits for each node. We perform a case study on a circuit implementation of a well-known adaptive filter algorithm. The results from the analytical and simulation models show that the analytical model is accurate enough to estimate the effects of transient errors on the performance of a digital circuit. Our analytical method also reduces the run time significantly in a design phase.
IP1-3	MODEL BASED HIERARCHICAL OPTIMIZATION STRATEGIES FOR ANALOG DESIGN AUTOMATION Speakers: Engin Afacan¹, Gunhan Dundar¹, Faik Baskaya¹, Simge Ay¹ and Francisco Fernandez² ¹Bogazici University, TR; ²Universidad de Sevilla, TR Abstract The design of complex analog circuits by using flat optimization-based approaches is inefficient, even impossible, due to the high number of design variables and the growth of the cost of performance evaluation with the circuit size. Over the past two decades, top-down hierarchical design approaches have been developed and applied. They are based on hierarchical circuit decomposition and specification transmission from top-level to lower level blocks. However, such specification transmission is usually performed with little knowledge on the feasibility of the specifications, leading, therefore, to costly redesign iterations. Even if the specification transmission is successful, there is no guarantee that it is optimal in terms of e.g., power consumption or area occupation. To palliate this problem, two novel model-based hierarchical synthesis methods are proposed in this paper: Model-Based Hierarchical Optimization (MBHO) and Improved Model-Based Hierarchical Optimization (IMBHO). They are based on the concurrent design at higher and lower hierarchical levels and appropriate communication between the different processes. Experimental results on a filter example comparing the new approaches and the conventional top-down design approach are provided.
IP1-4	A NOVEL LOW POWER 11-BIT HYBRID ADC USING FLASH AND DELAY LINE ARCHITECTURES Speakers: Hsun-Cheng Lee and Jacob Abraham, the University of Texas at Austin, US Abstract This paper presents a novel low power 11-bit hybrid ADC using flash and delay line architectures, where a 4-bit flash ADC is followed by a 7-bit delay-line ADC. This hybrid ADC inherits accuracy and power efficiency from flash ADCs and delay-line ADCs, respectively. Also, in order to reduce the power of the first stage flash ADC, a power-saving technique is adopted by biasing the DC tail current of the pre-amplifiers at 5μA instead of the operational current, 47μA in stand-by mode. The hybrid ADC was designed and simulated in a commercial 65nm process. With a 1.1 V supply and 100 MS/s, the ADC achieves an SNDR of 60 dB and consumes 1.6 mW, which results in a figure of merit (FOM) of 19.4 fJ/conversion-step without any calibration technique. Also, Monte Carlo simulations are performed with a 3σ device mismatch for the SNDR estimation, and the SNDR is observed to be better than 58.5 dB.
IP1-5	SEMI-SYMBOLIC ANALYSIS OF MIXED-SIGNAL SYSTEMS INCLUDING DISCONTINUITIES Speakers: Carna Radojicic, Christoph Grimm, Javier Moreno and Xiao Pan, TU Kaiserslautern, DE Abstract The paper describes an approach for semi-symbolic analysis of mixed-signal systems that contain discontinuous functions, e.g. due to modeling comparators. For modeling and semi- symbolic simulation, we use extended Affine Arithmetic. Affine Arithmetic is currently limited to accurate analysis of linear func- tions and mild non-linear functions, but not yet discontinuities. In this paper we extend the approach to also handle discontinuities. For demonstration, we symbolically analyze a Σ∆-modulator.
IP1-6	(Best Paper Award Candidate) NOVEL CIRCUIT TOPOLOGY SYNTHESIS METHOD USING CIRCUIT FEATURE MINING AND SYMBOLIC COMPARISON Speakers: Cristian Ferent and Alex Doboli, Stony Brook University, US Abstract This paper presents a reasoning-based approach to analog circuit synthesis using ordered node clustering representations (ONCR) to describe alternative circuit features and symbolic circuit comparison to characterize performance trade-offs of synthesized solutions. Case studies illustrate application of the proposed methods to topology selection and refinement.
IP1-7	AN EMBEDDED OFFSET AND GAIN INSTRUMENT FOR OPAMP IPS Speakers: Jinbo Wan and Hans KerkHoff, CAES-TDT, CTIT, University of Twente, NL Abstract Analog and mixed-signal IPs are increasingly required to use digital fabrication technologies and are deeply embedded into system-on-chips (SoC). These developments append more requirements and challenges on analog testing methodologies. Traditional analog testing methods suffer from less accessibility and control with regard to these embedded analog circuits in SoCs. As an alternative, an embedded instrument for analog OpAmp IP tests is proposed in this paper. It can provide the exact gain and offset values of OpAmps instead of only pass/fail result. What's more, it is an non-invasive monitor and can work online without isolating the DUT Opamp from its surrounding feedback networks. Nor does it require accurate test stimulations. In addition, the monitor can remove its own offsets without additional complex self-calibration circuits. All self-calibrations are completed in the digital domain after each measurement in real time. Therefore it is also suitable for aging-sensitive applications, in which the monitor may suffer from aging mechanisms and has additional offset drifts as well. The monitor measurement range for offset is from 0.2mV to 70mV, and for gain it is from 0dB to 40dB. The error for offset measurements can be 10% of the measurement value with plus/minus 0.1mV, and -2.5dB for gain measurements.
IP1-8	EVX: VECTOR EXECUTION ON LOW POWER EDGE CORES Speakers: Milovan Duric¹, Oscar Palomar¹, Aaron Smith², Osman Unsal¹, Adrian Cristal¹, Mateo Valero¹ and Doug Burger² ¹Barcelona Supercomputing Center, ES; ²Microsoft Research, US Abstract In this paper, we present a vector execution model that provides the advantages of vector processors on low power, general purpose cores, with limited additional hardware. While accelerating data-level parallel (DLP) workloads, the vector model increases the efficiency and hardware resources utilization. We use a modest dual issue core based on an Explicit Data Graph Execution (EDGE) architecture to implement our approach, called EVX. Unlike most DLP accelerators which utilize additional hardware and increase the complexity of low power processors, EVX leverages the available resources of EDGE cores, and with minimal costs allows for specialization of the resources. EVX adds a control logic that increases the core area by 2.1%. We show that EVX yields an average speedup of 3x compared to a scalar baseline and outperforms multimedia SIMD extensions.
IP1-9	PROGRAM AFFINITY PERFORMANCE MODELS FOR PERFORMANCE AND UTILIZATION Speakers: Ryan Moore and Bruce Childers, University of Pittsburgh, US Abstract Multithreaded applications have a wide variety of behavior, causing complex interactions with today's chip multiprocessor machines. Application threads may have large private working sets, and may compete for cache space and memory bandwidth. These threads benefit from large private caches. Other threads may share data or communicate, and thus, execute more quickly if using shared caches. Many applications fall somewhere in between, requiring careful thread-to-core assignments to maximize performance. Yet because of the large number of thread-to-core assignments on today's chip multiprocessors, it is time and energy prohibitive to exhaustively try and determine the best assignment. In this paper, we present and demonstrate application performance models that predict application performance given a proposed thread-to-core assignment. We show how these models can be quickly built and used to select thread-to-core assignments for multiple programs and to improve system utilization.
IP1-10	ADVANCED SIMD: EXTENDING THE REACH OF CONTEMPORARY SIMD ARCHITECTURES Speakers: Matthias Boettcher¹, Giacomo Gabrielli², Mbou Eyole², Alastair Reid² and Bashir M. Al-Hashimi¹ ¹University of Southampton, GB; ²ARM Ltd., GB Abstract SIMD extensions have gained widespread acceptance in modern microprocessors as a way to exploit data-level parallelism in general-purpose cores. Popular SIMD architectures (e.g. Intel SSE/AVX) have evolved by adding support for wider registers and datapaths, and advanced features like indexed memory accesses, per-lane predication and inter-lane instructions, at the cost of additional silicon area and design complexity. This paper evaluates the performance impact of such advanced features on a set of workloads considered hard to vectorize for traditional SIMD architectures. Their sensitivity to the most relevant design parameters (e.g. register/datapath width and L1 data cache configuration) is quantified and discussed. We developed an ARMv7 NEON based ISA extension (ARGON), augmented a cycle accurate simulation framework for it, and derived a set of benchmarks from the Berkeley dwarfs. Our analyses demonstrate how ARGON can, depending on the structure of an algorithm, achieve speedups of 1.5x to 16x.
IP1-11	A TIGHTLY-COUPLED HARDWARE CONTROLLER TO IMPROVE SCALABILITY AND PROGRAMMABILITY OF SHARED-MEMORY HETEROGENEOUS CLUSTERS Speakers: Paolo Burgio¹, Robin Danilo², Andrea Marongiu³, Philippe Coussy⁴ and Luca Benini⁵ ¹University of Bologna, Université de Bretagne-Sud, IT; ²Université de Bretagne-Sud, FR; ³University of Bologna, IT; ⁴Universite de Bretagne-Sud / Lab-STICC, FR; ⁵Università di Bologna, IT Abstract Modern designs for embedded many-core systems increasingly include application-specific units to accelerate key computational kernels with orders-of-magnitude higher execution speed and energy efficiency compared to software counterparts. A promising architectural template is based on heterogeneous clusters, where simple RISC cores and specialized HW units (HWPU) communicate in a tightly-coupled manner via L1 shared memory. Efficiently integrating processors and a high number of HW Processing Units (HWPUs) in such an system poses two main challenges, namely, architectural scalability and programmability. In this paper we describe an optimized Data Pump (DP) which connects several accelerators to a restricted set of communication ports, and acts as a virtualization layer for programming, exposing FIFO queues to offload "HW tasks" to them through a set of lightweight APIs. In this work, we aim at optimizing both these mechanisms, for respectively reducing modules area and making programming sequence easier and lighter.
IP1-12	INFORMER: AN INTEGRATED FRAMEWORK FOR EARLY-STAGE MEMORY ROBUSTNESS ANALYSIS Speakers: Shrikanth Ganapathy¹, Ramon Canal¹, Dan Alexandrescu², Enrico Costenaro², Antonio Gonzalez³ and Antonio Rubio¹ ¹Universitat Politecnica de Catalunya, ES; ²iRoC Technologies, FR; ³Intel and Universitat Politecnica de Catalunya, ES Abstract With the growing importance of parametric (process and environmental) variations in advanced technologies, it has become a serious challenge to design reliable, fast and low-power embedded memories. Adopting a variation-aware design paradigm requires a holistic perspective of memory-wide metrics such as yield, power and performance. However, accurate estimation of such metrics is largely dependent on circuit implementation styles, technology parameters and architecture-level specifics. In this paper, we propose a fully automated tool - INFORMER that helps high-level designers estimate memory reliability metrics rapidly and accurately. The tool relies on accurate circuit-level simulations of failure mechanisms such as ageing, soft-errors and parametric failures. The obtained statistics can then help couple low-level metrics with higher-level design choices. A new technique for rapid estimation of low-probability failure events is also proposed. We present three use-cases of our prototype tool to demonstrate its diverse capabilities in autonomously guiding large SRAM based robust memory designs.
IP1-13	WEAR-OUT ANALYSIS OF ERROR CORRECTION TECHNIQUES IN PHASE-CHANGE MEMORY Speakers: Caio Hoffman, Luiz Ramos, Rodolfo Azevedo and Guido Araújo, University of Campinas, BR Abstract Phase-Change Memory (PCM) is new memory technology and a possible replacement for DRAM, whose scaling limitations require new lithography technologies. Despite being promising, PCM has limited endurance (its cells withstand roughly 10^8 bit-flips before failing), which prompted the adoption of Error Correction Techniques (ECTs). However, previous lifetime analyses of ECTs did not consider the difference between the bit-flip frequencies of data and code bits, which may lead to inaccurate wear-out analyses for the ECTs. In this work, we improve the wear-out analysis of PCM by modeling and analyzing the bit-flip probabilities of five ECTs. Our models also enable an accurate estimation of energy consumption and analysis of the endurance-energy trade-off for each ECT.
IP1-14	APPROXIMATING THE AGE OF RF/ANALOG CIRCUITS THROUGH RE-CHARACTERIZATION AND STATISTICAL ESTIMATION Speakers: Doohwang Chang¹, Sule Ozev¹, Ozgur Sinanoglu² and Ramesh Karri³ ¹Arizona State University, US; ²New York University Abu Dhabi, AE; ³Polytechnic Institute of New York University, US Abstract Counterfeit ICs have become an issue for semiconductor manufacturers due to impacts on their reputation and lost revenue. Counterfeit ICs are either products that are intentionally mislabeled or legitimate products that are extracted from electronic waste. The former is easier to detect whereas the latter is harder since they are identical to new devices but display degraded performance due to environmental and use stress conditions. Detecting counterfeit ICs that are extracted from electronic waste requires an approach that can approximate the age of manufactured devices based on their parameters. In this paper, we present a methodology that uses information on both fresh and aged ICs and tries to distinguish between the fresh and aged population based on an estimate of the age. Since analog devices age mainly due to their bias stress, input signals play less of a role. Hence, it is possible to use simulation models to approximate the aging process, which would give us access to a large population of aged devices. Using this information, we can construct a statistical model that approximates the age of a given circuit. We use a Low noise amplifier (LNA) and an NMOS LC oscillator to demonstrate that individual aged devices can be accurately classified using the proposed method.
IP1-15	PACKAGE GEOMETRIC AWARE THERMAL ANALYSIS BY INFRARED-RADIATION THERMAL IMAGES Speakers: Jui-Hung Chien¹, Hao Yu², Ruei-Siang Hsu³, Hsueh-Ju Lin³ and Shih-Chieh Chang³ ¹Industrial Technology Research Institute, TW; ²None, TW; ³NTHU, TW Abstract Since packages affect the amount of heat transfer, it is important to include package and heat sink in thermal analysis. In this paper, we study the full-chip thermal response with different packages. We first discuss the difficulties of obtaining accurate package models for simulation. To facilitate a designer to perform thermal simulation with different packages, we propose to use a matrix called the package-transfer matrix which can transform a temperature profile of one package to another temperature profile of the desired package. To estimate and verify a package-transfer matrix, we propose an efficient method which uses Infrared Radiation (IR) images from two carefully design test chips with PBGA packages. Our experimental results show that the default package model CBGA in HotSpot can be accurately transferred to any other package through the package-transfer matrix.
IP1-16	COST-EFFECTIVE DECAP SELECTION FOR BEYOND DIE POWER INTEGRITY Speakers: Yi-En Chen¹, Tu-Hsung Tsai¹, Shi-Hao Chen² and Hung-Ming Chen¹ ¹Department of Electronics Engineering National Chiao Tung University Hsinchu, Taiwan 300, R.O.C., TW; ²Global Unichip Corp, Hsinchu, Taiwan, TW Abstract In designing reliable power distribution networks (PDN) for power integrity (PI), it is essential to stabilize voltage supply to devices on chip. We usually employ decoupling capacitor (decap) to suppress the noise generated by the switching of devices. There have been numerous prior works on how to select/insert decaps in chip, package, or board to maintain PI, however optimal decap selection is usually not applicable due to design budget and manufacturability. Moreover, design cost is seldom touched or mentioned. In this research, we propose an efficientmethodology "PDCPSO" to automatically optimizing the selection of available decaps. This algorithm not only takes advantage of particle swarm optimization (PSO) to stochastically search the design space, but takes the most effective range of decaps into consideration to outperform the basic PSO. We apply this to three real package designs and the results show that, compared to the original decap selection by rules of thumb, our approach could shorten the design period and we have better combination of decaps at the same or lower cost. In addition, our methodology can also consider package-board co-design in optimizing different operation frequencies.
IP1-17	CHARACTERIZING POWER DELIVERY SYSTEMS WITH ON/OFF-CHIP VOLTAGE REGULATORS FOR MANY-CORE PROCESSORS Speakers: Xuan Wang, Jiang Xu, Zhe Wang, Kevin J. Chen, Xiaowen Wu and Zhehui Wang, HKUST, HK Abstract Design of power delivery system has great influence on the power management in many-core processor systems. Moving voltage regulators from off-chip to on-chip gains more and more interest in the power delivery system design, because it is able to provide fast voltage scaling and multiple power domains. Previous works are proposed to implement power efficient on-chip regulators. It is also important to analyze the characteristics of the entire power delivery system to explore the tradeoff between the promising properties and costs of employing on-chip regulators. In this work, we develop an analytical model to evaluate important characteristics of the power delivery system, including on-chip/off-chip voltage regulators and the passive on-chip/on-board parasitic. Compared with SPICE simulations, our model achieves a fast system-level evaluation with comparable accuracy. Based on the model, geometric programming is utilized to find the optimal power efficiency of different architectures of power delivery systems under constraints of output voltage stability and area. Experiments show that compared with the conventional architecture using off-chip regulators, the hybrid one using both on-chip and off-chip voltage regulators achieves 1.0% power efficiency improvement and 68% area reduction of voltage regulators on average. We conclude that the hybrid architecture has potential for high power efficiency and small area at heavy workload, but careful account for the overhead of on-chip regulators is needed.
IP1-18	MASK-COST-AWARE ECO ROUTING Speakers: Hsi-An Chien¹, Zhen-Yu Peng¹, Yun-Ru Wu², Ting-Hsiung Wang², Hsin-Chang Lin², Chi-Feng Wu² and Ting-Chi Wang¹ ¹National Tsing Hua University, TW; ²Realtek Semiconductor Corp., TW Abstract In this paper, we study a mask-cost-aware routing problem for engineering change order (ECO). By taking into account old routes for possible reuse, we present an approach for the problem. Encouraging experimental results are reported to demonstrate the effectiveness of our approach.
IP1-19	EXPLOITING NARROW-WIDTH VALUES FOR IMPROVING NON-VOLATILE CACHE LIFETIME Speakers: Guangshan Duan and Shuai Wang, Nanjing University, CN Abstract Due to the high cell density, low leakage power consumption, and less vulnerability to soft errors, the non-volatile memory technologies are among the most promising alternatives for replacing the traditional DRAM and SRAM technologies used in implementing main memory and caches in the modern microprocessor. However, one of the difficulties is the limited write endurance of most non-volatile memory technologies. In this paper, we propose to exploit the narrow-width values to improve the lifetime of the non-volatile last level caches. Leading zeros masking scheme is first proposed to reduce the write stress to the upper half of the narrow-width data. To balance the write variations between the upper half and the lower half of the narrow-width data, two swap schemes, the swap on write (SW) and swap on replacement (SRepl), are proposed. To further reduce the write stress to the non-volatile cache, we adopt two optimization schemes, the multiple dirty bit (MDB) and read before write (RBW), to improve its lifetime. Our experimental results show that by combining all our proposed schemes, the lifetime of the non-volatile caches can be improved by 245% on average.
IP1-20	PARTIAL-SET: WRITE SPEEDUP OF PCM MAIN MEMORY Speakers: Li Bing¹, Shan Shuchang², Hu Yu² and Li XiaoWei³ ¹ICT,UCAS, CN; ²ICT,CAS, CN; ³ICT.CAS, CN Abstract Phase change memory (PCM) is a promising nonvolatile memory technology developed as a possible DRAM replacement. Although it offers the read latency close to that of DRAM, PCM generally suffers from the long write latency. Long write request may block the read requests on the critical path of cache/memory access, incurring adverse impact on the system performance. Besides, the write performance of PCM is very asymmetric, i.e, the SET operation (writing '1') is much slower than that of the RESET operation (writing '0'). In this work, we re-examine the resistance transform process during the SET operation of PCM and propose a novel Partial-SET scheme to alleviate the long write latency issue of PCM. During a write access to a memory line, a short Partial-SET pulse is applied first to program the PCM cells to a pre-stable state, achieving the same write latency as RESET. The partially-SET cells are then fully programmed within the retention window to preserve the data integrity. Experimental results show that our Partial-SET scheme can improve the memory access performance of PCM by more than 45% averagely with very marginal storage overhead.
IP1-21	GARBAGE COLLECTION FOR MULTI-VERSION INDEX ON FLASH MEMORY Speakers: Kam-Yiu Lam¹, Jian-Tao Wang¹, Yuan-Hao Chang², Jen-Wei Hsieh³, Po-Chun Huang⁴, Chung Keung Poon⁵ and ChunJiang Zhu¹ ¹City University of Hong Kong, HK; ²Academia Sinica, TW; ³National Taiwan University of Science and Technology, TW; ⁴Acadmia Sinica, TW; ⁵City University of Hong Kong, TW Abstract In this paper, we study the important performance issues in using the purging-range query to reclaim old data versions to be free blocks in a flash-based multi-version database. To reduce the overheads for using the purging-range query in garbage collection, the physical block labeling (PBL) scheme is proposed to provide a better estimation on the purging version number to be used for purging old data versions. With the use of the frequency-based placement (FBP) scheme to place data versions in a block, the efficiency in garbage collection can be further enhanced by increasing the deadspans of data versions and reducing reallocation cost especially when the spaces of the flash memory for the databases are limited.
IP1-22	D2CYBER: A DESIGN AUTOMATION TOOL FOR DEPENDABLE CYBERCARS Speakers: Arslan Munir and Farinaz Koushanfar, Rice University, US Abstract The next generation of automobiles (also known as cybercars) will increasingly incorporate electronic control units (ECUs) in novel automotive control applications. Recent work has demonstrated vulnerability of modern car control systems to security attacks that directly impacts the cybercar's physical safety and dependability. In this paper, we provide an integrated approach for the design of secure and dependable cybercars using a case study: a steer-by-wire (SBW) application over controller area network (CAN). The challenge is to embed both security and dependability over CAN while ensuring that the real-time constraints of the cybercar applications are not violated. Our approach enables early design feasibility analysis by embedding essential security primitives (i.e., confidentiality, integrity, and authentication) over CAN subject to the real-time constraints imposed by the desired quality of service and behavioral reliability. Our method leverages multi-core ECUs for providing fault-tolerance by redundant multi-threading (RMT) and also further enhances RMT for quick error detection. We quantify the error resilience of our approach and evaluate the interplay of performance, fault-tolerance, security, and scalability for our SBW case study.
IP1-23	CONTRACT-BASED DESIGN OF CONTROL PROTOCOLS FOR SAFETY-CRITICAL CYBER-PHYSICAL SYSTEMS Speakers: Pierluigi Nuzzo, John Finn, Antonio Iannopollo and Alberto Sangiovanni-Vincentelli, University of California at Berkeley, US Abstract We introduce a platform-based design methodology that addresses the complexity and heterogeneity of cyber-physical systems by using assume-guarantee contracts to formalize the design process and enable realization of control protocols in a hierarchical and compositional manner. Given the architecture of the physical plant to be controlled, the design is carried out as a sequence of refinement steps from an initial specification to a final implementation, including synthesis from requirements and mapping of higher-level functional and non-functional models into a set of candidate solutions built out of a library of components at the lower level. Initial top-level requirements are captured as contracts and expressed using linear temporal logic (LTL) and signal temporal logic (STL) formulas to enable requirement analysis and early detection of inconsistencies. Requirements are then refined into a controller architecture by combining reactive synthesis steps from LTL specifications with simulation-based design space exploration steps. We demonstrate our approach on the design of embedded controllers for aircraft electric power distribution.
IP1-24	A FAULT DETECTION MECHANISM IN A DATA-FLOW SCHEDULED MULTITHREADED PROCESSOR Speakers: Jian Fu¹, Qiang Yang¹, Raphael Poss¹, Chris Jesshope¹ and Chunyuan Zhang² ¹University of Amsterdam, NL; ²National University of Defense Technology, CN Abstract This paper designs and implements the Redundant Multi-Threading (RMT) in a Data-flow scheduled Multi-Threaded (DMT) multicore processor, called Data-flow scheduled Redundant Multi-Threading (DRMT). Meanwhile, It presents Asynchronous Output Comparison (AOC) for RMT techniques to avoid fault detection related inter-core communication and alleviate the performance and hardware overheads induced by output comparison. Results show that the performance overhead of DRMT is less than 60% even when the number of threads is four times the number of processing elements. Also the performance and hardware overheads of AOC are insignificant.

< Return to last page

Submissions

IP1 Interactive Presentations