3.5 Breaking Simulation Boundaries

Date: Tuesday 10 March 2015
Time: 14:30 - 16:00
Location / Room: Meije

Chair:
Elena Ioana Vatajelu, Politecnico di Torino, IT

Co-Chair:
Florian Letombe, Synopsys, FR

Faster, faster, faster... that's all you expect when simulating your designs. This session takes you on a journey through super-fast simulation techniques at different abstraction levels.

Time	Label	Presentation Title / Authors
14:30	3.5.1	VARIATION-AWARE EVALUATION OF MPSOC TASK ALLOCATION AND SCHEDULING STRATEGIES USING STATISTICAL MODEL CHECKING Speakers: Mingsong Chen¹, Daian Yue¹, Xiaoke Qin², Xin Fu³ and Prabhat Mishra² ¹East China Normal University, CN; ²University of Florida, US; ³University of Houston, US Abstract: To maximize the overall performance yield, variation-aware analysis is becoming a key step in Multiprocessor System-on-Chip (MPSoC) Task Allocation and Scheduling (TAS). Although various approaches have been investigated to improve performance yield, most of them do not support quantitative comparison among existing TAS heuristics, which MPSoC designers need in order to make decisions. Based on the statistical model checker UPPAAL-SMC, we propose a framework that can automatically evaluate the performance yield of TAS strategies under time and power constraints with variations. Experimental results show that our approach can not only filter out inferior strategies efficiently, but also support the automated tuning of architecture and constraint parameters to achieve the required performance yield.
15:00	3.5.2	A FAST PARALLEL SPARSE SOLVER FOR SPICE-BASED CIRCUIT SIMULATORS Speaker: Hehe Li, Department of Electronic Engineering, Tsinghua University, CN Authors: Xiaoming Chen, Yu Wang and Huazhong Yang, Tsinghua University, CN Abstract: The sparse solver is a serious bottleneck in SPICE-based circuit simulators. Although existing research has proposed several parallel solvers oriented to circuit simulation, there is still room to improve the speed and scalability of these solvers. This paper proposes a fast parallel sparse solver based on a pivoting-reduction technique which takes full advantage of features of circuit simulation. Experimental results show that on average, the proposed solver is up to 50% faster than the state-of-the-art solver NICSLU, and up to 3.3X faster than KLU. Real DC simulation reveals that our solver is faster than NICSLU, PARDISO, and commercial solvers.
15:30	3.5.3	MRP: MIX REAL CORES AND PSEUDO CORES FOR FPGA-BASED CHIP-MULTIPROCESSOR SIMULATION Speakers: Xinke Chen¹, Guangfei Zhang², Huandong Wang³, Ruiyang Wu¹, Peng Wu¹ and Longbing Zhang¹ ¹Institute of Computing Technology, CAS, CN; ²Shannon Laboratory, Huawei Technologies Co., Ltd, CN; ³Loongson Technology Corporation Limited, CN Abstract: Given the speed bottleneck of software-based simulators, FPGA-based simulation is being explored more and more. This paper proposes a novel methodology to simulate a chip-multiprocessor (CMP) with limited FPGA resources. By mixing real cores and pseudo cores (MRP), we can simulate a multicore system with fewer FPGA resources and achieve a much higher simulation speed. We propose several methods to construct the pseudo cores. We implement our idea on a dual Virtex-6 FPGA board to simulate a general-purpose 4-core high-performance CMP processor. Comparison experiments against the corresponding tape-out chip demonstrate the effectiveness of MRP. We also evaluate the MRP prototype's performance by running SPEC CPU2006 benchmarks on an unmodified Linux operating system, achieving speedups of tens to hundreds of times compared to two other commonly used simulators.
15:45	3.5.4	SOURCE LEVEL PERFORMANCE SIMULATION OF GPU CORES Speakers: Christoph Gerum¹, Oliver Bringmann² and Wolfgang Rosenstiel¹ ¹University of Tuebingen, DE; ²University of Tuebingen / FZI, DE Abstract: Graphics processing units (GPUs) contain many complex architectural features, which make performance analysis and simulation of applications using them for general-purpose computation very difficult. Especially when performance simulation is attempted at a higher abstraction level than interpreted instruction set simulators, these features are not handled accurately by state-of-the-art simulation techniques. This paper proposes a method for source-level performance simulation of the microarchitecture of a GPU core that provides simulation speeds high enough to make testing of large application scenarios possible.
16:00	IP1-14, 595	FAST AND ACCURATE BRANCH PREDICTOR SIMULATION Speakers: Antoine Faravelon, Nicolas Fournel and Frédéric Pétrot, TIMA Laboratory, Université de Grenoble-Alpes/CNRS, FR Abstract: The complexity of embedded processors has risen dramatically, due to architectural add-ons which improve performance significantly. High-level models used in system simulation usually ignore these additions, as the major concern is functional correctness. However, accurate estimates of software execution are sometimes required, so in this paper we focus on one of these architectural features, the branch predictor. Unfortunately, advanced branch predictors use large tables, and a direct implementation of the scheme slows down simulation dramatically. To limit the simulation overhead, we define a modeling approach that we demonstrate on a state-of-the-art predictor. We implemented the model in a dynamic binary translation based instruction set simulator and measured a prediction accuracy of about 95% for a run-time overhead below 5%.
16:01	IP1-15, 605	COMPARATIVE STUDY OF TEST GENERATION METHODS FOR SIMULATION ACCELERATORS Speakers: Wisam Kadry¹, Dimtry Krestyashyn¹, Arkadiy Morgenshtein¹, Amir Nahir¹, Vitali Sokhin¹, Jae Cheol Son², Wookyeong Jeong², Sung-Boem Park² and Jin Sung Park² ¹IBM Research - Haifa, IL; ²Samsung, KR Abstract: Hardware-accelerated simulation platforms are quickly becoming a major vehicle for the functional verification of modern systems and processors. Accelerator platforms provide functional verification with valuable simulation cycles. Yet, the high cost and limited bandwidth of accelerator platforms dictate a requirement for continuous utilization improvement. In this work, we perform a comparative analysis of two approaches to test generation for accelerator platforms. An exerciser tool is used as the experimental vehicle for the study. An off-platform test generation methodology is implemented and compared to the on-platform test generation typically used in exercisers. We present experimental results from simulation of the latest IBM POWER8 processor on the Awan accelerator platform, as well as from simulation of an eight-core ARMv8-based design on the Veloce emulation platform. Our results indicate that the utilization of accelerator platforms can be improved by up to 7× when using off-platform test generation. In addition, an increase of up to 24% is observed in test coverage. Off-platform mode features a significantly bigger image size, but maintains tolerable build and load times.
16:02	IP1-16, 171	USING STRUCTURAL RELATIONS FOR CHECKING COMBINATIONALITY OF CYCLIC CIRCUITS Speakers: Wan-Chen Weng¹, Yung-Chih Chen², Jui-Hung Chen¹, Ching-Yi Huang¹ and Chun-Yao Wang¹ ¹National Tsing Hua University, TW; ²Yuan Ze University, TW Abstract: Functionality and combinationality are two main issues that have to be dealt with in cyclic combinational circuits, which are combinational circuits containing loops. Cyclic circuits are combinational if nodes within the circuits have definite values under all input assignments. For a cyclified circuit, we have to check whether it is combinational or not. This paper therefore proposes an efficient two-stage algorithm to verify the combinationality of cyclic circuits. Experiments on a set of cyclified IWLS 2005 benchmarks demonstrate the efficiency of the proposed algorithm. Compared to the state-of-the-art algorithm, our approach achieves a speedup of about 4000 times on average.
16:00		End of session

Coffee Break in Exhibition Area
On all conference days (Tuesday to Thursday), coffee and tea will be served in the exhibition area during the coffee breaks at the times listed below.

Lunch Break
On Tuesday and Wednesday, lunch boxes will be served in front of the session room Salle Oisans and in the exhibition area for fully registered delegates (a voucher will be given upon registration on-site). On Thursday, lunch will be served in Room Les Ecrins (for fully registered conference delegates only).

Tuesday, March 10, 2015
Coffee Break 10:30 - 11:30
Lunch Break 13:00 - 14:30; Keynote session from 13:20 - 14:20 (Room Oisans), sponsored by Mentor Graphics
Coffee Break 16:00 - 17:00

Wednesday, March 11, 2015
Coffee Break 10:00 - 11:00
Lunch Break 12:30 - 14:30; Keynote lectures from 12:50 - 14:20 (Room Oisans)
Coffee Break 16:00 - 17:00

Thursday, March 12, 2015
Coffee Break 10:00 - 11:00
Lunch Break 12:30 - 14:00; Keynote lecture from 13:20 - 13:50
Coffee Break 15:30 - 16:00