A Deterministic-Path Routing Algorithm for Tolerating Many Faults on Wafer-Level NoC

Zhongsheng Chen(1,a), Ying Zhang(1,b), Zebo Peng(2) and Jianhui Jiang(1,c)
(1) School of Software Engineering, Tongji University, China
(a) 13166219587@163.com
(b) yingzhang@tongji.edu.cn
(c) jhjiang@tongji.edu.cn
(2) Embedded Systems Lab, Linköping University, Sweden
zebo.peng@liu.se

ABSTRACT


Wafer-level NoC has emerged as a promising fabric for further improving supercomputer performance, but this new fabric may suffer from the many-fault problem. This paper presents a deterministic-path routing algorithm for tolerating many faults on wafer-level NoCs. The proposed algorithm generates routing tables using a breadth-first traversal strategy and stores one routing table in each NoC switch. Each switch then forwards packets online according to its routing table. We use the Tarjan algorithm to dynamically reconfigure the routes so that they avoid faulty nodes, and we develop deprecated link/node rules to ensure deadlock-free communication in the NoC. Experimental results demonstrate that the proposed algorithm not only tolerates the effects of many faults but also maximizes the number of available nodes in the reconfigured NoC. Its performance in terms of average latency, throughput, and energy consumption is also better than that of existing solutions.
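
The table-generation idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 2D mesh topology, a known set of faulty nodes, and hypothetical helper names (`neighbors`, `build_routing_tables`, `route`). It covers only the breadth-first construction of one next-hop table per switch, and omits the Tarjan-based reconfiguration and the deprecated link/node rules, so it makes no claim about deadlock freedom.

```python
from collections import deque


def neighbors(node, width, height):
    """Yield the mesh neighbors of node = (x, y) that lie inside the grid."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def build_routing_tables(width, height, faulty):
    """Return {switch: {destination: next_hop}} over the fault-free nodes.

    Assumption: links are bidirectional, so a breadth-first search that
    starts at the destination yields a shortest fault-free path back to it.
    """
    healthy = [(x, y) for y in range(height) for x in range(width)
               if (x, y) not in faulty]
    tables = {node: {} for node in healthy}
    for dest in healthy:
        seen = {dest}
        queue = deque([dest])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors(cur, width, height):
                if nxt in faulty or nxt in seen:
                    continue
                seen.add(nxt)
                # nxt was first reached from cur, so cur is its next hop to dest.
                tables[nxt][dest] = cur
                queue.append(nxt)
    return tables


def route(tables, src, dest):
    """Follow the per-switch tables hop by hop; return None if unreachable."""
    path = [src]
    while path[-1] != dest:
        nxt = tables[path[-1]].get(dest)
        if nxt is None:
            return None
        path.append(nxt)
    return path


if __name__ == "__main__":
    # 3x3 mesh with a faulty center node: the route must go around it.
    tables = build_routing_tables(3, 3, faulty={(1, 1)})
    print(route(tables, (0, 1), (2, 1)))
```

Because each table entry is fixed once the tables are built, every source/destination pair always uses the same path, which is the deterministic-path property in this simplified setting.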


