Date: Thursday 12 March 2015
Time: 10:00 - 10:30
Location / Room: Exhibition Area
Interactive Presentations run simultaneously during a 30-minute slot. A poster associated to the IP paper is on display throughout the morning. Additionally, each IP paper is briefly introduced in a one-minute presentation in a corresponding regular session, prior to the actual Interactive Presentation. At the end of each afternoon Interactive Presentations session the award 'Best IP of the Day' is given.
Label | Presentation Title Authors |
---|---|
IP4-1 | PWL: A PROGRESSIVE WEAR LEVELING TO MINIMIZE DATA MIGRATION OVERHEADS FOR NAND FLASH DEVICES Speakers: Fu-Hsin Chen1, Ming-Chang Yang2, Yuan-Hao Chang3 and Tei-Wei Kuo4 1Department of Computer Science and Information Engineering, National Taiwan University, TW; 2Graduate Institute of Networking and Multimedia, National Taiwan University, TW; 3Institute of Information Science, Academia Sinica, TW; 4Academia Sinica & National Taiwan University, TW Abstract As the endurance of flash memory keeps deteriorating, exploiting wear leveling techniques to improve the lifetime/endurance of flash memory has become a critical issue in the design of flash storage devices. In contrast to existing wear-leveling techniques that aggressively distributes the erases to all flash blocks by a fixed threshold, we propose a progressive wear leveling design to perform wear leveling in a "progressive" way to prevent any block from being worn out prematurely, and thereby to ultimately minimize the performance overheads caused by the unnecessary data migration. The results reveal that, instead of sacrificing the device lifetime, performing wear leveling in such a progressive way can not only minimize the performance overheads but even have potentials to extend the device lifespan. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-2 | TOWARDS TRUSTABLE STORAGE USING SSDS WITH PROPRIETARY FTL Speakers: Xiaotong Cui1, Minhui Zou1, Liang Shi2 and Kaijie Wu1 1Chongqing University, CN; 2College of Computer Science, Chongqing University, CN Abstract In recent years, we have seen an increasing deployment of flash-based storage, such as SSD, in mission-critical applications due to its fast read/write speed, small form factor, strong shock resistance, and etc. SSD uses a host interface and a middle layer called flash translation layer (FTL) to maintain the compatibility with the traditional magnetic-based HDD. Unlike the traditional HDD where the host OS has the full control on where to access the data, SSD uses FTL to translate and implement all operations, and OS has no such control. Even worse, FTL, which is considered as one of most important intellectual property of SSD, is often proprietary. This brings up a security concern on design trustworthiness: what if the manufacturer either accidentally or intentionally implement those operations incorrectly or even maliciously? In this paper we analyze the possible threats and propose a simple yet effective countermeasure. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-3 | USER-SPECIFIC SKIN TEMPERATURE-AWARE DVFS FOR SMARTPHONES Speakers: Begum Birsen Egilmez1, Gokhan Memik1, Seda Ogrenci-Memik1 and Oğuz Ergin2 1Northwestern University, US; 2TOBB University of Economics and Technology, TR Abstract Skin temperature of mobile devices intimately affects the user experience. Power management schemes built into smartphones can lead to quickly crossing a user's threshold of tolerable skin temperature. Furthermore, there is a significant variation among users in terms of their sensitivity. Hence, controlling the skin temperature as part of the device's power management scheme is paramount. To achieve this, we first present a method for estimating skin and screen temperature at run-time using a combination of available on-device thermal sensors and performance indicators. In an Android-based smartphone, we achieve 99.05% and 99.14% accuracy in estimations of back cover and screen temperatures, respectively. Leveraging this run-time predictor, we develop User-specific Skin Temperature-Aware (USTA) DVFS mechanism to control the skin temperature. Performance of USTA is tested both with benchmarks and user tests comparing USTA to the standard Android governor. The results show that more users prefer to use USTA as opposed to the default DVFS mechanism. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-4 | FORMAL PROBABILISTIC ANALYSIS OF DISTRIBUTED DYNAMIC THERMAL MANAGEMENT Speaker: Muhammad Shafique, Karlsruhe Institute of Technology (KIT), DE Authors: Shafaq Iqtedar1, Osman Hasan2, Muhammad Shafique3 and Joerg Henkel3 1National University of Sciences and Technology (NUST), Islamabad, PK; 2National University of Sciences and Technology (NUST), Islamabad, ; 3Karlsruhe Institute of Technology (KIT), DE Abstract The prevalence of Dynamic Thermal Management (DTM) schemes coupled with demands for high reliability motivate the rigorous verification and testing of these schemes before deployment. Conventionally, these schemes are analyzed using either simulations or by running on real systems. But these traditional analysis techniques cannot exhaustively validate the distributed DTM schemes and thus compromise on the accuracy of the analysis results. Moreover, the randomness due to task assignments, task completion times and re-mappings, is often ignored in the analysis of distributed DTM schemes. We propose to overcome both of these limitations by using probabilistic model checking, which is a formal method for modeling and verifying concurrent systems with randomized behaviors. The paper presents a case study on the formal verification of a state-ofthe- art distributed DTM scheme using the PRISM model checker. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-5 | A HYBRID QUASI MONTE CARLO METHOD FOR YIELD AWARE ANALOG CIRCUIT SIZING TOOL Speakers: Engin Afacan, Günhan Dündar, Gonenc Berkol, Ali Emre Pusane and İsmail Faik Baskaya, Bogazici University, TR Abstract Efficient yield estimation methods are required by yield aware automatic sizing tools, where many iterative variability analyses are performed. Quasi Monte Carlo (QMC) is a popular approach, in which samples are generated more homogeneously, hence faster convergence is obtained compared to the conventional MC. However, since QMC is deterministic and has no natural variance, there is no convenient way to obtain estimation error bounds. To determine the confidence interval of the estimated yield, scrambled QMC, in which samples are randomly permuted, is run multiple times to obtain stochastic variance by sacrificing computational cost. To palliate this challenge, this paper proposes a hybrid method, where a single QMC is performed to determine infeasible solutions in terms of yield, which is followed by a few scrambled QMC analyses providing variance and confidence interval of the estimated yield. Yield optimization is performed considering the worst case of the current estimation, thus the optimizer guarantees that the solution will satisfy the confidence interval. Furthermore, a yield ranking mechanism is also developed to enforce the optimizer to search for more robust solutions. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-6 | FEATURE SELECTION FOR ALTERNATE TEST USING WRAPPERS: APPLICATION TO AN RF LNA CASE STUDY Speakers: Manuel Barragan1 and Gildas Leger2 1TIMA Laboratory, FR; 2Instituto de Microelectronica de Sevilla, IMSE-CNM, (CSIC - Universidad de Sevilla), ES Abstract Testing analog, mixed-signal and RF circuits represents the main cost component for testing complex SoCs. A promising solution to alleviate this cost is the Alternate Test strategy. Alternate test is an indirect test approach that replaces costly specification measurements by simpler signatures. Machine learning techniques are then used to map signatures and performances. One key point that still remains as an open problem is the conception of adequate simple measurement candidates. This work presents efficient algorithms for selecting information rich signatures. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-7 | IMPROVING SIMD CODE GENERATION IN QEMU Speakers: Sheng-Yu Fu1, Jan-Jan Wu2 and Wei-Chung Hsu1 1Department of Computer Science National Taiwan University, TW; 2Institute of Information Science Academia Sinica, TW Abstract Modern processors are often enhanced with SIMD instructions. For examples, the MMX, SSE, and AVX instruction set in the x86 architecture, and the Neon instruction set in the ARM architecture are SIMD instructions. Using these SIMD instructions could significantly increase the performance of applications, hence application binaries are likely to have a good fraction of instructions that are SIMD instructions. However, SIMD instruction translation has not attacked much attention in Dynamic Binary Translation (DBT). For example, in the popular QEMU system emulator, guest SIMD instructions are often emulated with a sequence of scalar instructions even when the host machines do have SIMD instructions to support such parallel computation, leaving a large potential for performance enhancement. In this paper, we propose two approaches, one to leverage the existing helper function implementation in QEMU, and the other to use a newly introduced vector IR (Intermediate Representation) to enhance the performance of SIMD instructions translation in DBT of QEMU. The approaches have been implemented in the QEMU to support ARM and IA32 frontend and x86-64 backend. Our preliminary experiments show that adding vector IR can significantly enhance the performance of guest applications containing SIMD instructions for both ARM and IA32 architectures when running with QEMU on the x86-64 platform. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-8 | REUSE DISTANCE ANALYSIS FOR LOCALITY OPTIMIZATION IN LOOP-DOMINATED APPLICATIONS Speakers: Christakis Lezos, Grigoris Dimitroulakos and Konstantinos Masselos, University of Peloponnese, GR Abstract This paper discusses MemAddIn, a compiler assisted dynamic code analysis tool that analyzes C code and exposes critical parts for memory related optimizations on embedded systems that can heavily affect systems performance, power and cost. The tool includes enhanced features for data reuse distance analysis and source code transformation recommendations for temporal locality optimization. Several of data reuse distance measurement algorithms have been implemented leading to different trade-offs between accuracy and profiling execution time. The proposed tool can be easily and seamlessly integrated into different software development environments offering a unified environment for application development and optimization. The novelties of our work over a similar optimization tool are also discussed. MemAddIn has been applied for the dynamic computation of data reuse distance for a number of different applications. Experimental results prove the effectiveness of the tool through the analysis and optimization of a realistic image processing application. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-9 | TAPP: TEMPERATURE-AWARE APPLICATION MAPPING FOR NOC-BASED MANY-CORE PROCESSORS Speakers: Di Zhu, Lizhong Chen, Timothy Pinkston and Massoud Pedram, University of Southern California, US Abstract Application mapping with its ability to spread out high-power components can potentially be a good approach to mitigate the looming issue of hotspots in many-core processors. However, very few works have explored effective ways of making tradeoff between temperature and network latency. Moreover, on-chip routers, which are of high power density and may lead to hotspots, are not considered in these works. In this paper, we propose TAPP (Temperature-Aware Partitioning and Placement), an efficient application mapping algorithm to reduce on-chip hotspots while sacrificing little network performance. This algorithm "spreads" high-power cores and routers across the chip by performing hierarchical bi-partitioning of the cores and concurrently conducting placement of the cores onto tiles, and achieves high efficiency and superior scalability. Simulation results show that the proposed algorithm reduces the temperature by up to 6.80°C with minimal latency increase compared to the latency-oriented mapping solution. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-10 | MALLEABLE NOC: DARK SILICON INSPIRED ADAPTABLE NETWORK ON CHIP Speakers: Haseeb Bokhari1, Haris Javaid2, Muhammad Shafique3, Joerg Henkel3 and Sri Parameswaran1 1University of New South Wales, AU; 2Google Inc., ; 3Karlsruhe Institute of Technology (KIT), DE Abstract Network on Chip (NoC) has been envisioned as a scalable fabric for many core chips. However, NoCs can consume a considerable share of chip power. Moreover, diverse applications are executed in these multicore, where each application imposes a unique load on the NoC. To realise a NoC which is Energy and Delay efficient, we propose combining multiple VF optimized routers for each node (in traditional NoCs, we have only a single router per node) for efficient NoC for Dark Silicon chips. We present a generic NoC with routers designed for different VF levels, which are distributed across the chip. At runtime, depending on application profile, we combine these VF optimized routers to form constantly changing energy efficient NoC fabric. We call our architecture Malleable NoC. In this paper, we describe the architectural details of the proposed architecture and the runtime algorithms required to dynamically adapt the NoC resources. We show that for a variety of multi program benchmarks executing on Malleable NoC, Energy Delay product (EDP) can be reduced by up to 46% for widely differing workloads. We further show the effect on EDP savings for differing amounts of dark silicon area budget. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-11 | TOPOLOGY IDENTIFICATION FOR SMART CELLS IN MODULAR BATTERIES Speakers: Sebastian Steinhorst and Martin Lukasiewycz, TUM CREATE, SG Abstract This paper proposes an approach to automatically identifying the topological order of smart cells in modular batteries. Emerging smart cell architectures enable battery management without centralized control by coordination of activities via communication. When connecting smart cells in series to form a battery pack, the topological order of the cells is not known and it cannot be automatically identified using the available communication bus. This order, however, is of particular importance for several battery management functions, including temperature control and active cell balancing which relate properties of the cells and their location. Therefore, this paper presents a methodology to automatically identify a topological order on the smart cells in a battery pack using a hybrid communication approach, involving both the communication and the balancing layer of the smart cell architecture. A prototypic implementation on a development platform shows the feasibility and scalability of the approach. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-12 | LVS CHECK FOR PHOTONIC INTEGRATED CIRCUIT - CURVILINEAR FEATURE EXTRACTION AND VALIDATION Speakers: Ruping Cao1, Julien Billoudet1, John Ferguson1, Lionel Couder2, John Cayo2, Alexandre Arriordaz1 and Ian O'Connor3 1Mentor Graphics Corp, FR; 2Mentor Graphics Corp, US; 3Lyon Institute of Nanotechnology, FR Abstract This work is motivated by the demand of an electronic design automation (EDA) approach for the emerging ecosystem of the photonic integrated circuit (PIC) technology. A reliable physical verification flow cannot be achieved without the adaption of the traditional EDA tools to the photonic design verification needs. We analyze how layout versus schematic (LVS) checking is performed differently for photonic designs, and propose an LVS flow that addresses the particular need of curvilinear feature validation (curved path length and bend curvature extraction). We show that it is possible to reuse and extend the current LVS tools to perform such critical but non-traditional checks, which ensures a more reliable photonic layout implementation in term of functionality and circuit yield. Going forward, we propose possible future studies that can further improve the flows. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-13 | FP-SCHEDULING FOR MODE-CONTROLLED DATAFLOW: A CASE STUDY Speakers: Alok Lele1, Orlando Moreira2 and Kees van Berkel2 1Eindhoven University of Technology, NL; 2Ericsson B.V., NL Abstract Dual-Radio Simultaneous Access (DRSA) is an emerging topic in Software Defined Radio (SDR) in which two SDRs are running simultaneously on a shared hardware, typically a heterogeneous Multi-Processor System-on-Chip (MPSoC). Each SDR has a independent hard latency and/or throughput requirement and needs rigorous timing analysis. Moreover, SDRs are often modeled in enriched variants of dataflow to accommodate the growing dynamic execution of SDRs, making it a challenge to perform timing analysis on them. This paper considers the preemptive Fixed Priority Scheduling (FPS) of SDRs modeled in emph{Mode-Controlled Dataflow}. To the best of our knowledge this is the first attempt on static timing analysis of FPS for a (semi-)dynamic variant of synchronous dataflow. We propose a two-phase algorithm to determine the worst-case response time of an actor. We demonstrate our analysis results for a DRSA case study of two 4G-LTE receivers. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-14 | AGEING SIMULATION OF ANALOGUE CIRCUITS AND SYSTEMS USING ADAPTIVE TRANSIENT EVALUATION Speakers: Felix Salfelder and Lars Hedrich, Goethe-Universitat Frankfurt a. M., DE Abstract Simulating ageing effects in analogue circuits requires both ageing models and a circuit simulator which is capable of a stress dependent, ageing and recovery aware model evaluation during long term transient simulation. Common approaches on reliability simulation often involve aged models, age precomputation, or lookup tables instead of integrated ageing simulation using memory aware ageing models. Long term transient ageing simulation enhances reliability simulation. This paper presents a framework to model and simulate ageing effects using an adaptive two-times evaluation scheme. This integrates full ageing effect models into behavioural device models. In addition, we introduce semantics for modelling stress levels and ageing parameters in hardware description languages. Our approach is a fully integrated simulation solution, enabling correct and efficient simulation of ageing systems over their lifetimes. We demonstrate how transistor level ageing effects critically affect the operation of a circuit. Our examples incorporate ageing monitors, redundant parts, and self-repair functionality into analogue systems. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-15 | A TOOL FOR THE ASSISTED DESIGN OF CHARGE REDISTRIBUTION SAR ADCS Speakers: Stefano Brenna1, Andrea Bonetti2, Andrea Bonfanti1 and Andrea L. Lacaita1 1Politecnico di Milano, IT; 2École Polytechnique Fédérale de Lausanne (EPFL), CH Abstract The optimal design of SAR ADCs requires the accurate estimate of nonlinearity and parasitic effects in the feedback charge-redistribution DAC. Since the effects of both mismatch and stray capacitances depend on the specific array topology, complex calculations, custom modeling and heavy simulations in common circuit design environments are often required. This paper presents a MATLAB-based numerical tool to assist the design of the charge redistribution DACs adopted in SAR ADCs. The tool performs both parametric and statistical simulations taking into account capacitive mismatch and parasitic capacitances thus computing both differential and integral nonlinearity (DNL, INL). SNDR and ENoB degradation due to static non-linear effects is also estimated. An excellent agreement is obtained with the results of circuit simulators (e.g. Cadence Spectre) featuring up to 104 shorter simulation time, allowing statistical simulations which would be otherwise impracticable. Measurements on two fabricated SAR ADCs confirm that the proposed tool can be used as avalid instrument to assist the design of a charge redistribution SAR ADC and predict its static and dynamic metrics. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-16 | DETECTION OF ASYMMETRIC AGING-CRITICAL VOLTAGE CONDITIONS IN ANALOG POWER-DOWN MODE Speakers: Michael Zwerger and Helmut Graeb, Technische Universitaet Muenchen, DE Abstract In this work, a new verification method for the power-down mode of analog circuit blocks is presented. In power-down mode, matched transistors can be stressed with asymmetric voltages. This will cause time-dependent mismatch due to transistor aging. In order to avoid reliability problems, a new method for automatic detection of asymmetric power-down stress conditions is presented. Therefore, power-down voltage-matching rules are formulated. The method combines structural analysis and voltage propagation. Experimental results demonstrate the efficiency and effectiveness of the approach. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-17 | (Best Paper Award Candidate) HIGH PERFORMANCE SINGLE SUPPLY CMOS INVERTER LEVEL UP SHIFTER FOR MULTI-SUPPLY VOLTAGES DOMAINS Speakers: José-C. García1, Juan A. Montiel-Nelson1, J. Sosa1 and Saeid Nooshabadi2 1Institute for Applied Microelectronics, ES; 2Department of Electrical and Computer Engineering of Michigan Technological University, US Abstract A single supply CMOS inverter level shifter (ssqc-ls) for upconverting signals from 0.4V-1V logic level range up to 1.1V power supply domain is introduced. For guaranteing a low energy consumption, the proposed shifter is based on topological modifications of the structure qc-level shifter reported in [1]. For 0.5V input square wave switching at 500MHz, the inverter level shifter ssqc-ls using 1.2V of power supply achieves a 60% of Figure of Merit improvement in comparison against jy-ls [8] with a dual power supply voltage of 0.6V and 1.2V. Post-layout simulation results shown that ssqc-ls reaches a propagation delay of 0.75ns, an energy consumption of only 2.3pJ, and an energy-delay product of 1.73pJns for a capacitive loading condition of 950fF. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-18 | EXPLORING THE IMPACT OF FUNCTIONAL TEST PROGRAMS RE-USED FOR POWER-AWARE TESTING Speakers: Aymen Touati1, Alberto Bosio2, Luigi Dilillo2, Patrick Girard2, Arnaud Virazel2, Paolo Bernardi3 and Mateo Sonza Reorda3 1LIRMM, FR; 2LIRMM-UM2/CNRS, FR; 3Politecnico di Torino, IT Abstract Abstract— High power consumption during at-speed delay fault testing may lead to yield loss and premature aging. On the other hand, reducing too much test power might lead to test escape and reliability problems. Thus, to avoid these issues, test power has to map the power consumed during functional mode. Existing works target the generation of functional test programs able to maximize the power consumption in functional mode of microprocessor cores. The obtained power consumption will be used as threshold to tune the power consumed during testing. This paper investigates the impact of re-using such functional test programs for testing purposes. We propose to apply them by exploiting existing DfT architecture to maximize the delay fault coverage. Then, we combine them with the classical at-speed LOC and LOS delay fault testing schemes to further increase the fault coverage. Results show that it is possible to achieve a global test solution able to maximize the delay fault coverage while respecting the functional power budget. Keywords—Power Aware Test; Functional and Structural test; microprocessor test; ATPG. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-19 | A BREAKPOINT-BASED SILICON DEBUG TECHNIQUE WITH CYCLE-GRANULARITY FOR HANDSHAKE-BASED SOC Speakers: Hsin-Chen Chen1, Chen-Rong Wu1, Katherine Shu-Min Li2 and Kuen-Jon Lee1 1National Cheng Kung University, TW; 2National Sun Yat-sen University, TW Abstract The breakpoint-based silicon debug approach allows users to stop the normal (system) operations of the circuits under debug (CUDs), extract the internal states of the CUDs for examination, and then resume the normal operations for further debugging. However, most previous work on this approach adopts the transaction-level or handshake-level of granularity, i.e., the CUDs can be stopped only when a transaction or a handshake operation is completed. The granulations at these levels are often too coarse when a transaction or a handshake operation requires a large number of cycles to complete. In this paper, we present a novel debug mechanism, called the Protocol Agency Mechanism (PAM), which allows the breakpoint-based debug technique to be applied at the cycle- level granularity. The PAM can deal with transaction invalidation as well as protocol violation that may occur when a system is stopped and resumed. Experimental results show that the area overhead of the PAM is quite small and the performance impact on the system is negligible. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-20 | FAULT DIAGNOSIS IN DESIGNS WITH EXTREME LOW PIN TEST DATA COMPRESSORS Speakers: Subhadip Kundu1, Parthajit Bhattacharya1 and Rohit Kapur2 1Synopsys India, IN; 2Synopsys Inc., US Abstract Diagnosis plays an important role to ramp up yield during IC manufacturing process. Limited observability due to test response compaction negatively affects the diagnosis procedure. With modern compressors - targeting very high test data compression, diagnosis becomes even more complicated. In this paper, a complete diagnosis methodology focussing on a novel mapping algorithm has been described. The mapping algorithm maps failures from compressor pins to scan cells with great accuracy (even in presence of don't cares in the responses), so that, normal scan diagnosis can be used to find out the actual defects. Experimental results on different industrial designs have proved that the proposed method almost match scan based diagnosis results. Download Paper (PDF; Only available from the DATE venue WiFi) |
IP4-21 | OPTIMIZING DYNAMIC TRACE SIGNAL SELECTION USING MACHINE LEARNING AND LINEAR PROGRAMMING Speakers: Charlie Shucheng Zhu and Sharad Malik, Princeton University, US Abstract The success of post-silicon validation is limited by the low observability of the signals on the chip under debug. Trace buffers are used to enhance visibility of a subset of the internal signals during the chip's operation. These trace signals can be selected statically, i.e. the same trace signals are used through an entire debugging run, or dynamically where a different set of signals can be used in different parts of a debugging run. The focus of this work is on dynamic trace signal selection. Our technique uses machine learning for classification of different groups of inputs that are likely to trigger different faults, and a linear programming based optimization method for selecting the different sets of trace signals for different combinations of inputs and states. In contrast to existing methods, this technique is applicable to both transient and permanent faults. Download Paper (PDF; Only available from the DATE venue WiFi) |