DATE 2022

Exploiting Architecture Advances for Sparse Solvers in Circuit Simulation

Zhiyuan Yan^1,2,3,a, Biwei Xie^1,2,3,b, Xingquan Li^3,4,c and Yungang Bao^1,2,3,d
¹State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences
²University of Chinese Academy of Sciences
³Peng Cheng Laboratory
⁴Minnan Normal University
^ayanzhiyuan@ict.ac.cn
^bxiebiwei@ict.ac.cn
^cfzulxq@gmail.com
^dbaoyg@ict.ac.cn

ABSTRACT

Sparse direct solvers provide vital functionality for a wide variety of scientific applications. The dominant part of a sparse direct solver, LU factorization, suffers heavily from the irregularity of sparse matrices. Meanwhile, the specific characteristics of sparse solvers in circuit simulation and the unique sparsity patterns of circuit matrices open up a larger design space but also pose great challenges.

In this paper, we propose a sparse solver named FLU and re-examine the performance of LU factorization from the perspectives of vectorization, parallelization, and data locality. To improve vectorization efficiency and data locality, FLU introduces a register-level supernode computation method that carefully manages data movement. With alternating multi-column computation, FLU further reduces off-chip memory accesses substantially. Furthermore, we implement a fine-grained, elimination-tree-based parallelization scheme to fully exploit task-level parallelism. Experimental results on Intel Xeon show that FLU achieves speedups of up to 19.51× (3.86× on average) over PARDISO and up to 2.56× (1.66× on average) over NICSLU.
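To illustrate the general idea behind elimination-tree-based parallelization, the following is a minimal sketch of level scheduling, a standard technique and not FLU's actual scheme, which the abstract does not detail. It assumes the elimination tree is given as a hypothetical `parent` array in which `parent[j]` is the parent column of column `j` (or `-1` for a root) and `parent[j] > j`. Columns in the same level have no dependencies on one another and could in principle be factorized concurrently.

```python
from collections import defaultdict


def etree_levels(parent):
    """Group columns into dependency levels of an elimination tree.

    parent[j] is the parent of column j, or -1 if j is a root
    (hypothetical input format, assumed for this sketch).
    """
    n = len(parent)
    level = [0] * n
    # Because parent[j] > j, every child is visited before its parent,
    # so a single ascending pass yields the final level of each column.
    for j in range(n):
        p = parent[j]
        if p != -1:
            level[p] = max(level[p], level[j] + 1)

    # Collect the columns that share a level; each group is one parallel step.
    groups = defaultdict(list)
    for j, lvl in enumerate(level):
        groups[lvl].append(j)
    return [groups[lvl] for lvl in sorted(groups)]


if __name__ == "__main__":
    # Columns 0 and 1 feed column 2; columns 2 and 3 feed the root, column 4.
    print(etree_levels([2, 2, 4, 4, -1]))
```

A fine-grained scheme would typically go further than this, for example by starting a column as soon as its own children finish rather than waiting for a whole level to complete.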

Keywords: High Performance Computing, Circuit Simulation, Sparse LU Factorization.
