3.5 Breaking Simulation Boundaries


Date: Tuesday 10 March 2015
Time: 14:30 - 16:00
Location / Room: Meije

Chair:
Elena Ioana Vatajelu, Politecnico di Torino, IT

Co-Chair:
Florian Letombe, Synopsys, FR

Faster, faster, faster... that's all you expect when you are simulating your designs. This session takes you on a journey through super-fast simulation techniques at different abstraction levels.

Time  Label  Presentation Title
Authors
14:30  3.5.1  VARIATION-AWARE EVALUATION OF MPSOC TASK ALLOCATION AND SCHEDULING STRATEGIES USING STATISTICAL MODEL CHECKING
Speakers:
Mingsong Chen¹, Daian Yue¹, Xiaoke Qin², Xin Fu³ and Prabhat Mishra²
¹East China Normal University, CN; ²University of Florida, US; ³University of Houston, US
Abstract
To maximize the overall performance yield, variation-aware analysis is becoming a key step in Multiprocessor System-on-Chip (MPSoC) Task Allocation and Scheduling (TAS). Although various approaches have been investigated to improve performance yields, most of them cannot perform quantitative comparison among existing TAS heuristics, which is important for MPSoC designers to make decisions. Based on the statistical model checker UPPAAL-SMC, we propose a framework that can automatically evaluate the performance yield of TAS strategies under time and power constraints with variations. Experimental results show that our approach can not only filter inferior strategies efficiently, but also support the automated tuning of architecture and constraint parameters to achieve the required performance yield.

15:00  3.5.2  A FAST PARALLEL SPARSE SOLVER FOR SPICE-BASED CIRCUIT SIMULATORS
Speaker:
Hehe Li, Department of Electronic Engineering, Tsinghua University, CN
Authors:
Xiaoming Chen, Yu Wang and Huazhong Yang, Tsinghua University, CN
Abstract
The sparse solver is a serious bottleneck in SPICE-based circuit simulators. Although existing research has proposed several circuit-simulation-oriented parallel solvers, there is still room to improve the speed and scalability of these solvers. This paper proposes a fast parallel sparse solver based on a pivoting-reduction technique that takes full advantage of the features of circuit simulation. Experimental results show that on average, the proposed solver is up to 50% faster than the state-of-the-art solver NICSLU, and up to 3.3X faster than KLU. Real DC simulation reveals that our solver is faster than NICSLU, PARDISO, and commercial solvers.

15:30  3.5.3  MRP: MIX REAL CORES AND PSEUDO CORES FOR FPGA-BASED CHIP-MULTIPROCESSOR SIMULATION
Speakers:
Xinke Chen¹, Guangfei Zhang², Huandong Wang³, Ruiyang Wu¹, Peng Wu¹ and Longbing Zhang¹
¹Institute of Computing Technology, CAS, CN; ²Shannon Laboratory, Huawei Technologies Co., Ltd, CN; ³Loongson Technology Corporation Limited, CN
Abstract
Facing the speed bottleneck of software-based simulators, FPGA-based simulation has been increasingly explored. This paper proposes a novel methodology to simulate a chip-multiprocessor (CMP) with limited FPGA resources. By mixing real cores and pseudo cores together (MRP), we can simulate a multicore system with fewer FPGA resource requirements and achieve a much higher simulation speed. We propose several methods to construct the pseudo cores. We implement our idea on a dual Virtex-6 FPGA board to simulate a general-purpose 4-core high-performance CMP processor. Comparison experiments against the corresponding tape-out chip prove the effectiveness of MRP. We also evaluate the MRP prototype's performance by running SPEC CPU2006 benchmarks on an unmodified Linux operating system, achieving speedups of tens to hundreds of times compared to two other commonly used simulators.

15:45  3.5.4  SOURCE LEVEL PERFORMANCE SIMULATION OF GPU CORES
Speakers:
Christoph Gerum¹, Oliver Bringmann² and Wolfgang Rosenstiel¹
¹University of Tuebingen, DE; ²University of Tuebingen / FZI, DE
Abstract
Graphics processing units (GPUs) contain many complex architectural features, which make performance analysis and simulation of applications that use them for general-purpose computation very difficult. These features are not handled accurately by state-of-the-art simulation techniques, especially when performance simulation is attempted at a higher abstraction level than interpreted instruction set simulators. This paper proposes a method for source-level performance simulation of the microarchitecture of a GPU core that provides simulation speeds high enough to make testing of large application scenarios possible.

16:00  IP1-14, 595  FAST AND ACCURATE BRANCH PREDICTOR SIMULATION
Speakers:
Antoine Faravelon, Nicolas Fournel and Frédéric Pétrot, TIMA Laboratory, Université de Grenoble-Alpes/CNRS, FR
Abstract
Embedded processor complexity has risen dramatically, due to architectural add-ons that improve performance significantly. High-level models used in system simulation usually ignore these additions, as the major issue is functional correctness. However, accurate estimates of software execution are sometimes required, so in this paper we focus on one of these architectural features, the branch predictor. Unfortunately, advanced branch predictors use large tables, and a direct implementation of the scheme slows down simulation dramatically. To limit the simulation overhead, we define a modeling approach that we demonstrate on a state-of-the-art predictor. We implemented the model in a dynamic binary translation based instruction set simulator and measured a prediction accuracy of about 95% for a run-time overhead below 5%.

16:01  IP1-15, 605  COMPARATIVE STUDY OF TEST GENERATION METHODS FOR SIMULATION ACCELERATORS
Speakers:
Wisam Kadry¹, Dimtry Krestyashyn¹, Arkadiy Morgenshtein¹, Amir Nahir¹, Vitali Sokhin¹, Jae Cheol Son², Wookyeong Jeong², Sung-Boem Park² and Jin Sung Park²
¹IBM Research - Haifa, IL; ²Samsung, KR
Abstract
Hardware-accelerated simulation platforms are quickly becoming a major vehicle for the functional verification of modern systems and processors. Accelerator platforms provide functional verification with valuable simulation cycles. Yet, the high cost and limited bandwidth of accelerator platforms dictate a requirement for continuous utilization improvement. In this work, we perform a comparative analysis of two approaches to test generation for accelerator platforms. An exerciser tool is used as the experimental vehicle for the study. An off-platform test generation methodology is implemented and compared to the on-platform test generation typically used in exercisers. We present experimental results from simulation of the latest IBM POWER8 processor on the Awan accelerator platform, as well as from simulation of an eight-core ARMv8-based design on the Veloce emulation platform. Our results indicate that the utilization of accelerator platforms can be improved by up to 7× when using off-platform test generation. In addition, an increase of up to 24% is observed in test coverage. Off-platform mode features a significantly bigger image size, but maintains tolerable build and load times.

16:02  IP1-16, 171  USING STRUCTURAL RELATIONS FOR CHECKING COMBINATIONALITY OF CYCLIC CIRCUITS
Speakers:
Wan-Chen Weng¹, Yung-Chih Chen², Jui-Hung Chen¹, Ching-Yi Huang¹ and Chun-Yao Wang¹
¹National Tsing Hua University, TW; ²Yuan Ze University, TW
Abstract
Functionality and combinationality are two main issues that have to be dealt with in cyclic combinational circuits, which are combinational circuits containing loops. Cyclic circuits are combinational if the nodes within the circuits have definite values under all input assignments. For a cyclified circuit, we therefore have to check whether it is combinational. This paper proposes an efficient two-stage algorithm to verify the combinationality of cyclic circuits. Experiments on a set of cyclified IWLS 2005 benchmarks demonstrate the efficiency of the proposed algorithm. Compared to the state-of-the-art algorithm, our approach achieves a speedup of about 4000 times on average.

16:00  End of session
Coffee Break in Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served in the exhibition area during the coffee breaks at the times listed below.

Lunch Break

On Tuesday and Wednesday, lunch boxes will be served in front of the session room Salle Oisans and in the exhibition area for fully registered delegates (a voucher will be given upon registration on-site). On Thursday, lunch will be served in Room Les Ecrins (for fully registered conference delegates only).

Tuesday, March 10, 2015

Coffee Break 10:30 - 11:30

Lunch Break 13:00 - 14:30; Keynote session from 13:20 - 14:20 (Room Oisans) sponsored by Mentor Graphics

Coffee Break 16:00 - 17:00

Wednesday, March 11, 2015

Coffee Break 10:00 - 11:00

Lunch Break 12:30 - 14:30, Keynote lectures from 12:50 - 14:20 (Room Oisans)

Coffee Break 16:00 - 17:00

Thursday, March 12, 2015

Coffee Break 10:00 - 11:00

Lunch Break 12:30 - 14:00, Keynote lecture from 13:20 - 13:50

Coffee Break 15:30 - 16:00