2.6 Design and Analysis of Dependable Systems

Date: Tuesday 10 March 2015
Time: 11:30 - 13:00
Location / Room: Bayard

Chair:
Arne Hamann, Robert Bosch GmbH, DE

Co-Chair:
Viacheslav Izosimov, Semcon/KTH, SE

This section introduces new methods for uncertainty-aware reliability analysis and soft error vulnerability estimation as well as techniques for error recovery in safety-critical systems and security attacks through the JTAG port

Time	Label	Presentation Title Authors
11:30	2.6.1	LOW-COST CHECKPOINTING IN AUTOMOTIVE SAFETY-RELEVANT SYSTEMS Speakers: Carles Hernandez and Jaume Abella, Barcelona Supercomputing Center (BSC-CNS), ES Abstract The use of checkpointing and roll-back recovery (CRR) schemes is common practice to increase the likelihood of a task completing with the correct result despite the presence of faults. However, the use of CRR mechanisms is challenging in the severely constrained design space of safety-relevant embedded systems, such as those controlling critical functions in the automotive domain. CRR schemes introduce non-negligible time and memory overheads that may jeopardize the feasibility of their implementation. In this paper we propose a low-cost checkpointing mechanism suitable for safety-relevant embedded systems deploying light-lockstep architectures. The proposed checkpointing mechanism increases the reliability of the system while keeping timing and memory overhead low enough. Download Paper (PDF; Only available from the DATE venue WiFi)
12:00	2.6.2	UNCERTAINTY-AWARE RELIABILITY ANALYSIS AND OPTIMIZATION Speakers: Faramarz Khosravi, Malte Müller, Michael Glaß and Jürgen Teich, Friedrich-Alexander-Universität Erlangen-Nürnberg, DE Abstract Due to manufacturing tolerances and aging effects, future embedded systems have to cope with unreliable components. The intensity of such effects depends on uncertain aspects like environmental or usage conditions such that highly safety-critical systems are pessimistically designed for worst-case mission profiles. In this work, we propose to explicitly model the uncertain characteristics of system components, i.e. we model components using reliability functions with parameters distributed between a best and worst case. Since destructive effects like temperature may affect several components simultaneously (e.g. those in the same package), a correlation between uncertainties of components exists. The proposed uncertainty-aware method combines a formal analysis approach and a Monte Carlo simulation to consider uncertain characteristics and their different correlations. It delivers a holistic view on the system's reliability with best/worst/average-case behavior but also insights on variance and quantiles. However, existing optimization approaches typically assume design objectives to be single values or follow a predefined distribution known as noise. As a remedy, we propose a dominance criterion for meta-heuristic optimization approaches like evolutionary algorithms that enables the comparison of system implementations with arbitrarily distributed characteristics. Our presented experimental results show that (a) the proposed analysis comes at low overhead while capturing existing uncertainties with sufficient accuracy, and (b) the optimization process is significantly enhanced when guiding the search process by additional aspects like variance and the 95% quantile, delivering better system implementations as found by uncertainty-oblivious optimization approaches. Download Paper (PDF; Only available from the DATE venue WiFi)
12:30	2.6.3	EFFICIENT SOFT ERROR VULNERABILITY ESTIMATION OF COMPLEX DESIGNS Speakers: Shahrzad Mirkhani¹, Subhasish Mitra², Chen-Yong Cher³ and Jacob Abraham⁴ ¹University of Texas at Austin, US; ²Stanford University, US; ³IBM Research, US; ⁴University of Texas, US Abstract Analyzing design vulnerability for soft errors has become a challenging process in large systems with a large number of memory elements. Error injection in a complex system with a sufficiently large sample of error candidates for reasonable accuracy takes a large amount of time. In this paper we describe RAVEN, a statistical method to estimate the outcomes of a system in the presence of soft errors injected into flip-flops, as well as the vulnerability for each memory element. This method takes advantage of fast local simulations for each error injection, and calculates the probabilities for the system outcomes for every possible soft error in a period of time. Experimental results, on an out-of-order processor with SPECINT2000 workloads, show that RAVEN is an order of magnitude faster compared with traditional error injection while maintaining accuracy. Download Paper (PDF; Only available from the DATE venue WiFi)
12:45	2.6.4	DETECTION OF ILLEGITIMATE ACCESS TO JTAG VIA STATISTICAL LEARNING IN CHIP Speakers: Xuanle Ren¹, Vítor Grade Tavares² and Shawn Blanton¹ ¹Carnegie Mellon University, US; ²Faculdade de Engenharia da Universidade do Porto, PT Abstract IEEE 1149.1, commonly known as the joint test action group (JTAG), is the standard for the test access port and the boundary-scan architecture. The JTAG is primarily utilized at the time of the integrated circuit (IC) manufacture but also in the field, giving access to internal sub-systems of the IC, or for failure analysis and debugging. Because the JTAG needs to be left intact and operational for use, it inevitably provides a "backdoor" that can be exploited to undermine the security of the chip. Potential attackers can then use the JTAG to dump critical data or reverse engineer IP cores, for example. Since an attacker will use the JTAG differently from a legitimate user, it is possible to detect the difference using machine-learning algorithms. A JTAG protection scheme, SLIC-J, is proposed to monitor user behavior and detect illegitimate accesses to the JTAG. Specifically, JTAG access is characterized using a set of specifically-defined features, and then an on-chip classifier is used to predict whether the user is legitimate or not. To validate the effectiveness of the approach, both legitimate and illegitimate JTAG accesses are simulated using the OpenSPARC T2 benchmark. The results show that the detection accuracy is 99.2%, and the escape rate is 0.8%. Download Paper (PDF; Only available from the DATE venue WiFi)
13:00	IP1-8, 24	AN APPROXIMATE VOTING SCHEME FOR RELIABLE COMPUTING Speakers: Ke Chen¹, Jie Han² and Fabrizio Lombardi¹ ¹Northeastern University, US; ²University of Alberta, CA Abstract This paper relies on the principles of inexact computing to alleviate the issues arising in static masking by voting for reliable computing. A scheme that utilizes approximate voting is proposed; it is referred to as inexact double modular redundancy (IDMR). IDMR does not resort to triplication, thus saving overhead due to modular replication; moreover, this scheme is adaptive in its operation, i.e., it allows a threshold to determine the validity of the module outputs. IDMR operates by initially establishing the difference between the values of the outputs of the two modules; only if the difference is below a preset threshold, then the voter calculates the average value of the two module outputs. An extensive analysis of the voting circuits and an application to image processing are presented. Download Paper (PDF; Only available from the DATE venue WiFi)
13:01	IP1-9, 278	FLINT: LAYOUT-ORIENTED FPGA-BASED METHODOLOGY FOR FAULT TOLERANT ASIC DESIGN Speakers: Rochus Nowosielski, Lukas Gerlach, Stephan Bieband, Guillermo Paya-Vaya and Holger Blume, Leibniz Universität Hannover, Institute of Microelectronic Systems, DE Abstract Research of efficient fault tolerance techniques for digital systems requires insight into the fault propagation mechanism inside the ASIC design. Radiation, high temperature, or charge sharing effects in ultra-deep submicron technologies influence fault generation and propagation dependent on die location. The proposed methodology links efficient fault injection to fault propagation in the floorplan view of a standard cell ASIC. This is achieved by instrumentation of the gate netlist after place&route, emulation in an FPGA system and experiment control via interactive user interface. Further, automated fault injection campaigns allow exhaustive fault tolerance evaluations taking single faults as well as adjacent cell faults into account. The proposed methodology can be used to identify vulnerable cell nodes in the design and allow the classification of placement strategies of fault tolerant ASIC designs. Download Paper (PDF; Only available from the DATE venue WiFi)
13:00		End of session Lunch Break, Keynote session from 1320 - 1420 (Room Oisans) sponsored by Mentor Graphics in front of the session room Salle Oisans and in the Exhibition area Coffee Break in Exhibition Area On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area. Lunch Break On Tuesday and Wednesday, lunch boxes will be served in front of the session room Salle Oisans and in the exhibition area for fully registered delegates (a voucher will be given upon registration on-site). On Thursday, lunch will be served in Room Les Ecrins (for fully registered conference delegates only). Tuesday, March 10, 2015 Coffee Break 10:30 - 11:30 Lunch Break 13:00 - 14:30; Keynote session from 13:20 - 14:20 (Room Oisans) sponsored by Mentor Graphics Coffee Break 16:00 - 17:00 Wednesday, March 11, 2015 Coffee Break 10:00 - 11:00 Lunch Break 12:30 - 14:30, Keynote lectures from 12:50 - 14:20 (Room Oisans) Coffee Break 16:00 - 17:00 Thursday, March 12, 2015 Coffee Break 10:00 - 11:00 Lunch Break 12:30 - 14:00, Keynote lecture from 13:20 - 13:50 Coffee Break 15:30 - 16:00