Reliability Assessment of Fault Tolerant Routing Algorithms in Networks-on-Chip: An Analytic Approach

Sadia Moriam1,2,a and Gerhard P. Fettweis1,2,b
1Dep. of Electrical Engineering and Information Technology / Vodafone Chair Mobile Communication Systems.
2Centre for Advancing Electronics Dresden (CFAED), Technische Universität Dresden, 01062 Dresden, Germany


Rapid scaling of transistor gate sizes has significantly increased the density of on-chip integrations and paved the way for many-core systems-on-chip with highly improved performances. The design of the interconnection network of these complex systems is a critical one and the network-on-chip is now the accepted efficient interconnect for such large core arrays. An unfortunate adverse effect of technology scaling is the increased susceptibility to failures resulting in failing links and routers in the network-on-chip. To keep the network connected, efficient fault adaptive routing algorithms are necessary to route around faults. To design and evaluate the fault resiliency of such adaptive routing algorithms, fast, accurate and flexible analytic models are required, especially in large networks for which simulations are extremely time costly. In this paper, we present an analytic approach to evaluate the reliability of adaptive routing algorithms based on algebraic manipulations of the channel dependency matrix. It allows also to evaluate the number of alternate paths between source-destination pairs, in the presence of any number of permanent faults in the network. The analytic model is general and can be adapted to evaluate network reliability for any network topology and with any adaptive routing algorithm based on the turn model. We present cycle-accurate simulations to compare the accuracy of the model for the 2-D mesh and the hexagonal networks. The model is able to estimate the network fault resilience with an accuracy of about 1% and more than 70 times faster than the cycle accurate simulation.

Full Text (PDF)