Date: Tuesday 25 March 2014
Time: 16:00 - 16:30
Location / Room: Conference Level, foyer
Interactive Presentations run simultaneously during a 30-minute slot. A poster associated with each IP paper is on display throughout the afternoon. Additionally, each IP paper is briefly introduced in a one-minute presentation in a corresponding regular session, prior to the actual Interactive Presentation. At the end of each afternoon Interactive Presentations session, the 'Best IP of the Day' award is given.
Label | Presentation Title, Speakers and Abstract |
---|---|
IP1-1 | SAFE: SECURITY-AWARE FLEXRAY SCHEDULING ENGINE Speakers: Gang Han1, Haibo Zeng2, Yaping Li3 and Wenhua Dou1 1National University of Defense Technology, CN; 2McGill University, CA; 3The Chinese University of Hong Kong, CN Abstract In this paper, we propose SAFE (Security Aware FlexRay scheduling Engine), to provide a problem definition and a design framework for FlexRay static segment schedule to address the new challenge on security. From a high level specification of the application, the architecture and communication middleware are synthesized to satisfy security requirements, in addition to extensibility, costs, and end-to-end latencies. The proposed design process is applied to two industrial case studies consisting of a set of active safety functions and an X-by-wire system respectively. |
IP1-2 | TRANSIENT ERRORS RESILIENCY ANALYSIS TECHNIQUE FOR AUTOMOTIVE SAFETY CRITICAL APPLICATIONS Speakers: Sujan Pandey and Bart Vermeulen, NXP Semiconductors, NL Abstract When a single bit is flipped as a result of a transient error in an electronic circuit, its effect can have a severe impact if the circuit is deployed in safety critical domains such as automotive, aeronautics, and industrial automation. In the design phase it is therefore essential to evaluate, and where necessary improve, the resilience of a circuit to all possible transient errors. In this paper, we present a method to analyze the transient error resiliency of a digital circuit. This method is based on an analytical model. It models a transient error as a random function and finds the vulnerable number of bits for each node. We perform a case study on a circuit implementation of a well-known adaptive filter algorithm. The results from the analytical and simulation models show that the analytical model is accurate enough to estimate the effects of transient errors on the performance of a digital circuit. Our analytical method also reduces the run time significantly in a design phase. |
IP1-3 | MODEL BASED HIERARCHICAL OPTIMIZATION STRATEGIES FOR ANALOG DESIGN AUTOMATION Speakers: Engin Afacan1, Gunhan Dundar1, Faik Baskaya1, Simge Ay1 and Francisco Fernandez2 1Bogazici University, TR; 2Universidad de Sevilla, ES Abstract The design of complex analog circuits by using flat optimization-based approaches is inefficient, even impossible, due to the high number of design variables and the growth of the cost of performance evaluation with the circuit size. Over the past two decades, top-down hierarchical design approaches have been developed and applied. They are based on hierarchical circuit decomposition and specification transmission from top-level to lower-level blocks. However, such specification transmission is usually performed with little knowledge of the feasibility of the specifications, leading, therefore, to costly redesign iterations. Even if the specification transmission is successful, there is no guarantee that it is optimal in terms of, e.g., power consumption or area occupation. To alleviate this problem, two novel model-based hierarchical synthesis methods are proposed in this paper: Model-Based Hierarchical Optimization (MBHO) and Improved Model-Based Hierarchical Optimization (IMBHO). They are based on the concurrent design at higher and lower hierarchical levels and appropriate communication between the different processes. Experimental results on a filter example comparing the new approaches and the conventional top-down design approach are provided. |
IP1-4 | A NOVEL LOW POWER 11-BIT HYBRID ADC USING FLASH AND DELAY LINE ARCHITECTURES Speakers: Hsun-Cheng Lee and Jacob Abraham, the University of Texas at Austin, US Abstract This paper presents a novel low power 11-bit hybrid ADC using flash and delay line architectures, where a 4-bit flash ADC is followed by a 7-bit delay-line ADC. This hybrid ADC inherits accuracy and power efficiency from flash ADCs and delay-line ADCs, respectively. Also, in order to reduce the power of the first stage flash ADC, a power-saving technique is adopted: in stand-by mode, the DC tail current of the pre-amplifiers is biased at 5 μA instead of the operational current of 47 μA. The hybrid ADC was designed and simulated in a commercial 65 nm process. With a 1.1 V supply and 100 MS/s, the ADC achieves an SNDR of 60 dB and consumes 1.6 mW, which results in a figure of merit (FOM) of 19.4 fJ/conversion-step without any calibration technique. Also, Monte Carlo simulations are performed with a 3σ device mismatch for the SNDR estimation, and the SNDR is observed to be better than 58.5 dB. |
IP1-5 | SEMI-SYMBOLIC ANALYSIS OF MIXED-SIGNAL SYSTEMS INCLUDING DISCONTINUITIES Speakers: Carna Radojicic, Christoph Grimm, Javier Moreno and Xiao Pan, TU Kaiserslautern, DE Abstract The paper describes an approach for semi-symbolic analysis of mixed-signal systems that contain discontinuous functions, e.g. due to modeling comparators. For modeling and semi-symbolic simulation, we use extended Affine Arithmetic. Affine Arithmetic is currently limited to accurate analysis of linear functions and mild non-linear functions, but not yet discontinuities. In this paper we extend the approach to also handle discontinuities. For demonstration, we symbolically analyze a ΣΔ-modulator. |
IP1-6 | (Best Paper Award Candidate) NOVEL CIRCUIT TOPOLOGY SYNTHESIS METHOD USING CIRCUIT FEATURE MINING AND SYMBOLIC COMPARISON Speakers: Cristian Ferent and Alex Doboli, Stony Brook University, US Abstract This paper presents a reasoning-based approach to analog circuit synthesis using ordered node clustering representations (ONCR) to describe alternative circuit features and symbolic circuit comparison to characterize performance trade-offs of synthesized solutions. Case studies illustrate application of the proposed methods to topology selection and refinement. |
IP1-7 | AN EMBEDDED OFFSET AND GAIN INSTRUMENT FOR OPAMP IPS Speakers: Jinbo Wan and Hans Kerkhoff, CAES-TDT, CTIT, University of Twente, NL Abstract Analog and mixed-signal IPs are increasingly required to use digital fabrication technologies and are deeply embedded into system-on-chips (SoC). These developments add more requirements and challenges to analog testing methodologies. Traditional analog testing methods suffer from reduced accessibility and control with regard to these embedded analog circuits in SoCs. As an alternative, an embedded instrument for analog OpAmp IP tests is proposed in this paper. It can provide the exact gain and offset values of OpAmps instead of only a pass/fail result. Moreover, it is a non-invasive monitor and can work online without isolating the DUT OpAmp from its surrounding feedback networks. Nor does it require accurate test stimuli. In addition, the monitor can remove its own offsets without additional complex self-calibration circuits. All self-calibrations are completed in the digital domain after each measurement in real time. Therefore it is also suitable for aging-sensitive applications, in which the monitor may suffer from aging mechanisms and has additional offset drifts as well. The monitor measurement range for offset is from 0.2mV to 70mV, and for gain it is from 0dB to 40dB. The error for offset measurements can be 10% of the measurement value with plus/minus 0.1mV, and -2.5dB for gain measurements. |
IP1-8 | EVX: VECTOR EXECUTION ON LOW POWER EDGE CORES Speakers: Milovan Duric1, Oscar Palomar1, Aaron Smith2, Osman Unsal1, Adrian Cristal1, Mateo Valero1 and Doug Burger2 1Barcelona Supercomputing Center, ES; 2Microsoft Research, US Abstract In this paper, we present a vector execution model that provides the advantages of vector processors on low power, general purpose cores, with limited additional hardware. While accelerating data-level parallel (DLP) workloads, the vector model increases efficiency and hardware resource utilization. We use a modest dual issue core based on an Explicit Data Graph Execution (EDGE) architecture to implement our approach, called EVX. Unlike most DLP accelerators, which utilize additional hardware and increase the complexity of low power processors, EVX leverages the available resources of EDGE cores, and with minimal costs allows for specialization of the resources. EVX adds control logic that increases the core area by 2.1%. We show that EVX yields an average speedup of 3x compared to a scalar baseline and outperforms multimedia SIMD extensions. |
IP1-9 | PROGRAM AFFINITY PERFORMANCE MODELS FOR PERFORMANCE AND UTILIZATION Speakers: Ryan Moore and Bruce Childers, University of Pittsburgh, US Abstract Multithreaded applications have a wide variety of behavior, causing complex interactions with today's chip multiprocessor machines. Application threads may have large private working sets, and may compete for cache space and memory bandwidth. These threads benefit from large private caches. Other threads may share data or communicate, and thus, execute more quickly if using shared caches. Many applications fall somewhere in between, requiring careful thread-to-core assignments to maximize performance. Yet because of the large number of thread-to-core assignments on today's chip multiprocessors, it is time and energy prohibitive to exhaustively try and determine the best assignment. In this paper, we present and demonstrate application performance models that predict application performance given a proposed thread-to-core assignment. We show how these models can be quickly built and used to select thread-to-core assignments for multiple programs and to improve system utilization. |
IP1-10 | ADVANCED SIMD: EXTENDING THE REACH OF CONTEMPORARY SIMD ARCHITECTURES Speakers: Matthias Boettcher1, Giacomo Gabrielli2, Mbou Eyole2, Alastair Reid2 and Bashir M. Al-Hashimi1 1University of Southampton, GB; 2ARM Ltd., GB Abstract SIMD extensions have gained widespread acceptance in modern microprocessors as a way to exploit data-level parallelism in general-purpose cores. Popular SIMD architectures (e.g. Intel SSE/AVX) have evolved by adding support for wider registers and datapaths, and advanced features like indexed memory accesses, per-lane predication and inter-lane instructions, at the cost of additional silicon area and design complexity. This paper evaluates the performance impact of such advanced features on a set of workloads considered hard to vectorize for traditional SIMD architectures. Their sensitivity to the most relevant design parameters (e.g. register/datapath width and L1 data cache configuration) is quantified and discussed. We developed an ARMv7 NEON based ISA extension (ARGON), augmented a cycle accurate simulation framework for it, and derived a set of benchmarks from the Berkeley dwarfs. Our analyses demonstrate how ARGON can, depending on the structure of an algorithm, achieve speedups of 1.5x to 16x. |
IP1-11 | A TIGHTLY-COUPLED HARDWARE CONTROLLER TO IMPROVE SCALABILITY AND PROGRAMMABILITY OF SHARED-MEMORY HETEROGENEOUS CLUSTERS Speakers: Paolo Burgio1, Robin Danilo2, Andrea Marongiu3, Philippe Coussy4 and Luca Benini5 1University of Bologna, Université de Bretagne-Sud, IT; 2Université de Bretagne-Sud, FR; 3University of Bologna, IT; 4Universite de Bretagne-Sud / Lab-STICC, FR; 5Università di Bologna, IT Abstract Modern designs for embedded many-core systems increasingly include application-specific units to accelerate key computational kernels with orders-of-magnitude higher execution speed and energy efficiency compared to software counterparts. A promising architectural template is based on heterogeneous clusters, where simple RISC cores and specialized HW Processing Units (HWPUs) communicate in a tightly-coupled manner via L1 shared memory. Efficiently integrating processors and a high number of HWPUs in such a system poses two main challenges, namely, architectural scalability and programmability. In this paper we describe an optimized Data Pump (DP) which connects several accelerators to a restricted set of communication ports, and acts as a virtualization layer for programming, exposing FIFO queues to offload "HW tasks" to them through a set of lightweight APIs. In this work, we aim at optimizing both these mechanisms, to reduce module area and to make the programming sequence easier and lighter, respectively. |
IP1-12 | INFORMER: AN INTEGRATED FRAMEWORK FOR EARLY-STAGE MEMORY ROBUSTNESS ANALYSIS Speakers: Shrikanth Ganapathy1, Ramon Canal1, Dan Alexandrescu2, Enrico Costenaro2, Antonio Gonzalez3 and Antonio Rubio1 1Universitat Politecnica de Catalunya, ES; 2iRoC Technologies, FR; 3Intel and Universitat Politecnica de Catalunya, ES Abstract With the growing importance of parametric (process and environmental) variations in advanced technologies, it has become a serious challenge to design reliable, fast and low-power embedded memories. Adopting a variation-aware design paradigm requires a holistic perspective of memory-wide metrics such as yield, power and performance. However, accurate estimation of such metrics is largely dependent on circuit implementation styles, technology parameters and architecture-level specifics. In this paper, we propose a fully automated tool, INFORMER, that helps high-level designers estimate memory reliability metrics rapidly and accurately. The tool relies on accurate circuit-level simulations of failure mechanisms such as ageing, soft-errors and parametric failures. The obtained statistics can then help couple low-level metrics with higher-level design choices. A new technique for rapid estimation of low-probability failure events is also proposed. We present three use-cases of our prototype tool to demonstrate its diverse capabilities in autonomously guiding large SRAM based robust memory designs. |
IP1-13 | WEAR-OUT ANALYSIS OF ERROR CORRECTION TECHNIQUES IN PHASE-CHANGE MEMORY Speakers: Caio Hoffman, Luiz Ramos, Rodolfo Azevedo and Guido Araújo, University of Campinas, BR Abstract Phase-Change Memory (PCM) is a new memory technology and a possible replacement for DRAM, whose scaling limitations require new lithography technologies. Despite being promising, PCM has limited endurance (its cells withstand roughly 10^8 bit-flips before failing), which prompted the adoption of Error Correction Techniques (ECTs). However, previous lifetime analyses of ECTs did not consider the difference between the bit-flip frequencies of data and code bits, which may lead to inaccurate wear-out analyses for the ECTs. In this work, we improve the wear-out analysis of PCM by modeling and analyzing the bit-flip probabilities of five ECTs. Our models also enable an accurate estimation of energy consumption and analysis of the endurance-energy trade-off for each ECT. |
IP1-14 | APPROXIMATING THE AGE OF RF/ANALOG CIRCUITS THROUGH RE-CHARACTERIZATION AND STATISTICAL ESTIMATION Speakers: Doohwang Chang1, Sule Ozev1, Ozgur Sinanoglu2 and Ramesh Karri3 1Arizona State University, US; 2New York University Abu Dhabi, AE; 3Polytechnic Institute of New York University, US Abstract Counterfeit ICs have become an issue for semiconductor manufacturers due to impacts on their reputation and lost revenue. Counterfeit ICs are either products that are intentionally mislabeled or legitimate products that are extracted from electronic waste. The former is easier to detect whereas the latter is harder since they are identical to new devices but display degraded performance due to environmental and use stress conditions. Detecting counterfeit ICs that are extracted from electronic waste requires an approach that can approximate the age of manufactured devices based on their parameters. In this paper, we present a methodology that uses information on both fresh and aged ICs and tries to distinguish between the fresh and aged population based on an estimate of the age. Since analog devices age mainly due to their bias stress, input signals play less of a role. Hence, it is possible to use simulation models to approximate the aging process, which would give us access to a large population of aged devices. Using this information, we can construct a statistical model that approximates the age of a given circuit. We use a Low noise amplifier (LNA) and an NMOS LC oscillator to demonstrate that individual aged devices can be accurately classified using the proposed method. |
IP1-15 | PACKAGE GEOMETRIC AWARE THERMAL ANALYSIS BY INFRARED-RADIATION THERMAL IMAGES Speakers: Jui-Hung Chien1, Hao Yu2, Ruei-Siang Hsu3, Hsueh-Ju Lin3 and Shih-Chieh Chang3 1Industrial Technology Research Institute, TW; 2None, TW; 3NTHU, TW Abstract Since packages affect the amount of heat transfer, it is important to include the package and heat sink in thermal analysis. In this paper, we study the full-chip thermal response with different packages. We first discuss the difficulties of obtaining accurate package models for simulation. To help a designer perform thermal simulation with different packages, we propose to use a matrix called the package-transfer matrix, which can transform a temperature profile of one package to another temperature profile of the desired package. To estimate and verify a package-transfer matrix, we propose an efficient method which uses Infrared Radiation (IR) images from two carefully designed test chips with PBGA packages. Our experimental results show that the default package model CBGA in HotSpot can be accurately transferred to any other package through the package-transfer matrix. |
IP1-16 | COST-EFFECTIVE DECAP SELECTION FOR BEYOND DIE POWER INTEGRITY Speakers: Yi-En Chen1, Tu-Hsung Tsai1, Shi-Hao Chen2 and Hung-Ming Chen1 1Department of Electronics Engineering National Chiao Tung University Hsinchu, Taiwan 300, R.O.C., TW; 2Global Unichip Corp, Hsinchu, Taiwan, TW Abstract In designing reliable power distribution networks (PDN) for power integrity (PI), it is essential to stabilize voltage supply to devices on chip. We usually employ decoupling capacitors (decaps) to suppress the noise generated by the switching of devices. There have been numerous prior works on how to select/insert decaps in chip, package, or board to maintain PI; however, optimal decap selection is usually not applicable due to design budget and manufacturability. Moreover, design cost is seldom touched or mentioned. In this research, we propose an efficient methodology, "PDCPSO", to automatically optimize the selection of available decaps. This algorithm not only takes advantage of particle swarm optimization (PSO) to stochastically search the design space, but also takes the most effective range of decaps into consideration to outperform the basic PSO. We apply this to three real package designs and the results show that, compared to the original decap selection by rules of thumb, our approach could shorten the design period and we have a better combination of decaps at the same or lower cost. In addition, our methodology can also consider package-board co-design in optimizing different operation frequencies. |
IP1-17 | CHARACTERIZING POWER DELIVERY SYSTEMS WITH ON/OFF-CHIP VOLTAGE REGULATORS FOR MANY-CORE PROCESSORS Speakers: Xuan Wang, Jiang Xu, Zhe Wang, Kevin J. Chen, Xiaowen Wu and Zhehui Wang, HKUST, HK Abstract The design of the power delivery system has great influence on power management in many-core processor systems. Moving voltage regulators from off-chip to on-chip is gaining increasing interest in power delivery system design, because it is able to provide fast voltage scaling and multiple power domains. Previous works have proposed power-efficient on-chip regulators. It is also important to analyze the characteristics of the entire power delivery system to explore the tradeoff between the promising properties and costs of employing on-chip regulators. In this work, we develop an analytical model to evaluate important characteristics of the power delivery system, including on-chip/off-chip voltage regulators and the passive on-chip/on-board parasitics. Compared with SPICE simulations, our model achieves a fast system-level evaluation with comparable accuracy. Based on the model, geometric programming is utilized to find the optimal power efficiency of different architectures of power delivery systems under constraints of output voltage stability and area. Experiments show that compared with the conventional architecture using off-chip regulators, the hybrid one using both on-chip and off-chip voltage regulators achieves 1.0% power efficiency improvement and 68% area reduction of voltage regulators on average. We conclude that the hybrid architecture has potential for high power efficiency and small area at heavy workload, but careful accounting for the overhead of on-chip regulators is needed. |
IP1-18 | MASK-COST-AWARE ECO ROUTING Speakers: Hsi-An Chien1, Zhen-Yu Peng1, Yun-Ru Wu2, Ting-Hsiung Wang2, Hsin-Chang Lin2, Chi-Feng Wu2 and Ting-Chi Wang1 1National Tsing Hua University, TW; 2Realtek Semiconductor Corp., TW Abstract In this paper, we study a mask-cost-aware routing problem for engineering change order (ECO). By taking into account old routes for possible reuse, we present an approach for the problem. Encouraging experimental results are reported to demonstrate the effectiveness of our approach. |
IP1-19 | EXPLOITING NARROW-WIDTH VALUES FOR IMPROVING NON-VOLATILE CACHE LIFETIME Speakers: Guangshan Duan and Shuai Wang, Nanjing University, CN Abstract Due to the high cell density, low leakage power consumption, and less vulnerability to soft errors, the non-volatile memory technologies are among the most promising alternatives for replacing the traditional DRAM and SRAM technologies used in implementing main memory and caches in the modern microprocessor. However, one of the difficulties is the limited write endurance of most non-volatile memory technologies. In this paper, we propose to exploit the narrow-width values to improve the lifetime of the non-volatile last level caches. Leading zeros masking scheme is first proposed to reduce the write stress to the upper half of the narrow-width data. To balance the write variations between the upper half and the lower half of the narrow-width data, two swap schemes, the swap on write (SW) and swap on replacement (SRepl), are proposed. To further reduce the write stress to the non-volatile cache, we adopt two optimization schemes, the multiple dirty bit (MDB) and read before write (RBW), to improve its lifetime. Our experimental results show that by combining all our proposed schemes, the lifetime of the non-volatile caches can be improved by 245% on average. |
IP1-20 | PARTIAL-SET: WRITE SPEEDUP OF PCM MAIN MEMORY Speakers: Li Bing1, Shan Shuchang2, Hu Yu2 and Li XiaoWei3 1ICT, UCAS, CN; 2ICT, CAS, CN; 3ICT, CAS, CN Abstract Phase change memory (PCM) is a promising nonvolatile memory technology developed as a possible DRAM replacement. Although it offers a read latency close to that of DRAM, PCM generally suffers from long write latency. Long write requests may block the read requests on the critical path of cache/memory access, incurring adverse impact on the system performance. Besides, the write performance of PCM is very asymmetric, i.e., the SET operation (writing '1') is much slower than the RESET operation (writing '0'). In this work, we re-examine the resistance transform process during the SET operation of PCM and propose a novel Partial-SET scheme to alleviate the long write latency issue of PCM. During a write access to a memory line, a short Partial-SET pulse is applied first to program the PCM cells to a pre-stable state, achieving the same write latency as RESET. The partially-SET cells are then fully programmed within the retention window to preserve the data integrity. Experimental results show that our Partial-SET scheme can improve the memory access performance of PCM by more than 45% on average with very marginal storage overhead. |
IP1-21 | GARBAGE COLLECTION FOR MULTI-VERSION INDEX ON FLASH MEMORY Speakers: Kam-Yiu Lam1, Jian-Tao Wang1, Yuan-Hao Chang2, Jen-Wei Hsieh3, Po-Chun Huang4, Chung Keung Poon5 and ChunJiang Zhu1 1City University of Hong Kong, HK; 2Academia Sinica, TW; 3National Taiwan University of Science and Technology, TW; 4Academia Sinica, TW; 5City University of Hong Kong, HK Abstract In this paper, we study the important performance issues in using the purging-range query to reclaim old data versions as free blocks in a flash-based multi-version database. To reduce the overhead of using the purging-range query in garbage collection, the physical block labeling (PBL) scheme is proposed to provide a better estimate of the purging version number to be used for purging old data versions. With the use of the frequency-based placement (FBP) scheme to place data versions in a block, the efficiency of garbage collection can be further enhanced by increasing the deadspans of data versions and reducing reallocation cost, especially when the space of the flash memory for the databases is limited. |
IP1-22 | D2CYBER: A DESIGN AUTOMATION TOOL FOR DEPENDABLE CYBERCARS Speakers: Arslan Munir and Farinaz Koushanfar, Rice University, US Abstract The next generation of automobiles (also known as cybercars) will increasingly incorporate electronic control units (ECUs) in novel automotive control applications. Recent work has demonstrated the vulnerability of modern car control systems to security attacks that directly impact the cybercar's physical safety and dependability. In this paper, we provide an integrated approach for the design of secure and dependable cybercars using a case study: a steer-by-wire (SBW) application over controller area network (CAN). The challenge is to embed both security and dependability over CAN while ensuring that the real-time constraints of the cybercar applications are not violated. Our approach enables early design feasibility analysis by embedding essential security primitives (i.e., confidentiality, integrity, and authentication) over CAN subject to the real-time constraints imposed by the desired quality of service and behavioral reliability. Our method leverages multi-core ECUs for providing fault-tolerance by redundant multi-threading (RMT) and also further enhances RMT for quick error detection. We quantify the error resilience of our approach and evaluate the interplay of performance, fault-tolerance, security, and scalability for our SBW case study. |
IP1-23 | CONTRACT-BASED DESIGN OF CONTROL PROTOCOLS FOR SAFETY-CRITICAL CYBER-PHYSICAL SYSTEMS Speakers: Pierluigi Nuzzo, John Finn, Antonio Iannopollo and Alberto Sangiovanni-Vincentelli, University of California at Berkeley, US Abstract We introduce a platform-based design methodology that addresses the complexity and heterogeneity of cyber-physical systems by using assume-guarantee contracts to formalize the design process and enable realization of control protocols in a hierarchical and compositional manner. Given the architecture of the physical plant to be controlled, the design is carried out as a sequence of refinement steps from an initial specification to a final implementation, including synthesis from requirements and mapping of higher-level functional and non-functional models into a set of candidate solutions built out of a library of components at the lower level. Initial top-level requirements are captured as contracts and expressed using linear temporal logic (LTL) and signal temporal logic (STL) formulas to enable requirement analysis and early detection of inconsistencies. Requirements are then refined into a controller architecture by combining reactive synthesis steps from LTL specifications with simulation-based design space exploration steps. We demonstrate our approach on the design of embedded controllers for aircraft electric power distribution. |
IP1-24 | A FAULT DETECTION MECHANISM IN A DATA-FLOW SCHEDULED MULTITHREADED PROCESSOR Speakers: Jian Fu1, Qiang Yang1, Raphael Poss1, Chris Jesshope1 and Chunyuan Zhang2 1University of Amsterdam, NL; 2National University of Defense Technology, CN Abstract This paper designs and implements Redundant Multi-Threading (RMT) in a Data-flow scheduled Multi-Threaded (DMT) multicore processor, called Data-flow scheduled Redundant Multi-Threading (DRMT). It also presents Asynchronous Output Comparison (AOC) for RMT techniques to avoid fault-detection-related inter-core communication and alleviate the performance and hardware overheads induced by output comparison. Results show that the performance overhead of DRMT is less than 60% even when the number of threads is four times the number of processing elements. The performance and hardware overheads of AOC are also insignificant. |