# Modelling Circuit Performance Variations due to Statistical Variability: Monte Carlo Static Timing Analysis

Michael Merrett\*, Plamen Asenov<sup>†</sup>, Yangang Wang\*, Mark Zwolinski\*, Dave Reid<sup>†</sup>, Campbell Millar<sup>†</sup>, Scott Roy<sup>†</sup>, Zhenyu Liu<sup>‡</sup>, Steve Furber<sup>‡</sup> and Asen Asenov<sup>†</sup>
\*Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK
<sup>†</sup>School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK
<sup>‡</sup>School of Computer Science, University of Manchester, Manchester M13 9PL, UK

Abstract—The scaling of MOSFETs has improved performance and lowered the cost per function of CMOS integrated circuits and systems over the last 40 years, but devices are subject to increasing amounts of statistical variability within the decanano domain. The causes of these statistical variations and their effects on device performance have been extensively studied, but there have been few systematic studies of their impact on circuit performance. This paper describes a method for modelling the impact of random intra-die statistical variations on digital circuit timing and power consumption. The method allows the variation modelled by large-scale statistical transistor simulations to be propagated up the design flow to the circuit level, by making use of commercial STA and standard cell characterisation tools. The method provides circuit designers with the information required to analyse power, performance and yield trade-offs when fabricating a design, while removing the large levels of pessimism generated by traditional Corner Based Analysis.

## I. INTRODUCTION

The effects of factors such as random discrete dopants (RDD) [1] and line edge roughness (LER) [2], [3] have been investigated and are understood at the device level, but these effects must also be modelled at the circuit level in order to improve design-for-manufacturability methods. The challenge is to provide circuit designers with a transparent method for modelling the impact of MOSFET variability, so that important trade-off decisions between power consumption, performance and manufacturing yield can be made.

Traditionally, process variations have been modelled at the circuit level by performing static timing analysis (STA), which makes use of calibrated lookup tables for standard cells at multiple technology-dependent corners. The increasing statistical variability of MOSFET behaviour is, however, too complex to be captured by traditional STA techniques.

The shortfalls of corner based analysis were explained by [4], where it was highlighted that STA can be both overly pessimistic and optimistic at the same time, and hence Statistical Static Timing Analysis (SSTA) was proposed. A great deal of effort has been invested in the development of practical and accurate SSTA tools, but modern SSTA algorithms are unable to address issues that are overcome by commonly used STA methods, and there remain large obstacles to the widespread use of SSTA in the industry [5], not least of which is the limited amount of statistical foundry data available in a standardised format.

978-3-9810801-7-9/DATE11/©2011 EDAA

This study compares the use of traditional corner based STA with a method for introducing statistical variability into a commercially available STA tool, Monte Carlo Static Timing Analysis (MCSTA). MCSTA reduces the significant levels of pessimism associated with traditional STA and provides a more accurate prediction of circuit power, performance and yield. Comprehensive SPICE simulations, based on industrial strength BSIM models [6], are used as a reference. Statistical information is introduced using the statistical circuit simulation and analysis software RandomSpice [7]. Grid technology is used to facilitate large-scale statistical circuit simulations based on ensembles of  $> 10^3$  circuits running in parallel on many cores of a large compute cluster.

## II. THE TESTBED CIRCUIT

A simple test circuit, a one-bit full adder, was developed to investigate the effects of MOSFET variability on propagation delay, power and yield. The choice of such a small circuit allows very large-scale statistical SPICE simulations of the transistor-level netlist, including parasitics and full test vector coverage. The ability to perform comprehensive statistical simulations helps to identify which transistors are critical to the operation of the circuit, and which contribute most to its sensitivity to variability. This wealth of statistical data has been vital in the testing of the proposed methodology. Performing similar statistical analysis on the full layout of larger circuits using transistor-level SPICE simulations is prohibitively expensive. The post place and route test circuit consists of 13 gates drawn from 4 standard cells (inverter, NAND, OR and buffer) and contains a total of 52 transistors. The adder was designed using a commercial 130 nm technology, as the project had access to the cell internals of this technology and has been able to compare the accuracy of the variation data with the fabrication process.

## III. SIMULATION METHODOLOGY

The effects of statistical variability on the performance of the test circuit were measured and compared using three methods: Monte Carlo transistor level SPICE simulations, traditional Corner Analysis, and MCSTA. Large scale statistical SPICE simulations were performed as a reference against which the accuracy of Corner Analysis and MCSTA was compared.

### A. Monte Carlo SPICE Simulations

Previous research into the statistical variation of individual transistors has found that RDD, Poly Silicon Granularity [8], [9] and LER [10] are the three main sources of fluctuations in threshold voltage  $(V_T)$ . Simulations of three dimensional atomistic transistor models have provided distributions of I-V curves for transistors which traditionally would have been represented by a single continuous charge model. These I-V curves have been converted to a library of BSIM models, allowing the range of effects of statistical variations on a transistor to be modelled using SPICE.

The effects of individual transistor variations can be combined and modelled at the circuit level by using RandomSpice, a statistical circuit simulation tool. RandomSpice replaces each MOSFET model instance within a SPICE netlist with BSIM models randomly selected from a process specific statistical library. Each individual transistor can therefore be modelled by a separate atomistic model. The values used to seed the randomisation of each transistor are recorded, creating an audit trail that allows each individual SPICE simulation to be reproduced at a later time.
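The seeded selection described above can be illustrated with a minimal sketch. It is not RandomSpice's actual interface: the function `randomise_netlist`, the model names and the simplified one-line-per-device netlist format are all hypothetical, chosen only to show how recording a seed makes a randomised netlist reproducible.

```python
import random

def randomise_netlist(lines, library, seed):
    """Replace the model name on each MOSFET line with a random library entry.

    Assumes a simplified SPICE card: name, four nodes, model, then parameters.
    `library` maps a nominal model name to its list of statistical variants.
    """
    rng = random.Random(seed)  # the recorded seed acts as the audit trail
    out = []
    for line in lines:
        fields = line.split()
        # MOSFET instance names start with 'M'; the sixth field is the model name.
        is_mosfet = len(fields) > 5 and fields[0].upper().startswith("M")
        if is_mosfet and fields[5] in library:
            fields[5] = rng.choice(library[fields[5]])
        out.append(" ".join(fields))
    return out

# Hypothetical two-transistor inverter and a tiny statistical model library.
netlist = [
    "M1 out in vdd vdd pch W=0.6u L=0.13u",
    "M2 out in gnd gnd nch W=0.3u L=0.13u",
]
library = {
    "nch": [f"nch_{i}" for i in range(200)],
    "pch": [f"pch_{i}" for i in range(200)],
}

first = randomise_netlist(netlist, library, seed=42)
replay = randomise_netlist(netlist, library, seed=42)  # same seed, same netlist
```

Because each transistor draws its own model, two devices of the same type in one netlist will generally receive different variants, while replaying the stored seed regenerates exactly the same assignment.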

Statistical variations on the adder netlist were investigated during this work by using a post place and route SPICE netlist, which included RC interconnects. Variability was artificially injected into the simulation process through the generation of transistor libraries used by RandomSpice, where the threshold voltages of individual MOSFETs were assigned randomly from a Gaussian distribution with appropriate values for the mean and standard deviation. The amount of variability injected was determined by the standard deviation of the Gaussian distribution, defined as a percentage of the threshold voltage of the N and P-type MOSFETs in the target technology. The values for $V_{TN}$ and $V_{TP}$ are approximately 305 mV and 380 mV respectively.

Seven levels of injected variation were investigated, where  $\sigma_{VT}$  was set at 10%, 15%, 20%, 25%, 30%, 40% and 50% of  $V_T$ . 10,000 randomised SPICE netlists were generated and simulated for each of these levels of variation. A single set of input stimuli was applied to each netlist.
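The injection scheme can be sketched in a few lines. This is an illustration under stated assumptions rather than the library generation flow used in the study: the helper `sample_vt` and the seed are hypothetical, and only the nominal threshold voltages and the seven $\sigma_{VT}$ levels are taken from the text.

```python
import random
import statistics

# Approximate nominal threshold voltages from the text, in volts.
VT_NOMINAL = {"n": 0.305, "p": 0.380}
# The seven injected levels, expressed as sigma_VT / V_T.
SIGMA_LEVELS = [0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.50]

def sample_vt(device_type, sigma_fraction, count, rng):
    """Draw `count` threshold voltages from a Gaussian centred on the nominal V_T."""
    mean = VT_NOMINAL[device_type]
    sigma = sigma_fraction * mean  # sigma is defined as a fraction of V_T
    return [rng.gauss(mean, sigma) for _ in range(count)]

rng = random.Random(2011)  # arbitrary seed, chosen for repeatability
for level in SIGMA_LEVELS:
    vts = sample_vt("n", level, 10_000, rng)
    print(
        f"sigma_VT = {level:.0%} of V_T: "
        f"mean = {statistics.mean(vts) * 1e3:.1f} mV, "
        f"stdev = {statistics.stdev(vts) * 1e3:.1f} mV"
    )
```

At the largest levels a Gaussian with $\sigma_{VT} = 0.5\,V_T$ places a noticeable fraction of samples at very low or even negative threshold voltages, which gives a sense of how extreme the upper end of the injected range is.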

### B. Corner Analysis

Commonly used cells are characterised using SPICE at multiple sets of process and environmental parameters (such as operating temperature and supply voltage) generating multiple Standard Cell Libraries (SCLs). The combinations of parameters are typically chosen to be at the extremes or corners of each parameter, hence the term Corner Analysis. It is assumed that the behaviour of the circuit is guaranteed at any point within the box created by performing STA at these corners [11].

Statistical variations of the adder circuit were modelled by generating SCLs at  $\pm 3\sigma_{VT}$ . This involved setting  $V_T$  for each of the transistor models within the SPICE netlists of four standard cells (buffer, inverter, NAND and OR) to  $V_T \pm 3\sigma_{VT}$ . Industry standard commercial tools were used to characterise the cells, generate the SCLs, and perform STA on the adder circuit.

Two SCLs  $(+3\sigma \text{ and } -3\sigma)$  were generated for each of the seven levels of injected variation, and STA was performed on the adder using each SCL. The input stimuli used within the Monte Carlo SPICE simulations were also used to perform power analysis.

While it is assumed that 99.7% of circuits would perform between the  $3\sigma$  corners, the probability of every transistor within a cell being simultaneously at  $\pm 3\sigma_{VT}$  is exceedingly small. This probability is further reduced by the assumption that every instance of a standard cell within a circuit is simultaneously set at the same  $\sigma_{VT}$  level. Corner Analysis is therefore likely to produce very pessimistic predictions of circuit yield in comparison with the Monte Carlo SPICE simulations.
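The scale of this pessimism can be made concrete with a short calculation. The sketch below assumes that the threshold voltages of different transistors are independent Gaussian variables, and it computes the probability of a device lying at or beyond the $+3\sigma$ point (the probability of sitting exactly at the corner is zero). The function names are illustrative.

```python
import math

def tail_probability(k_sigma):
    """P(X >= mean + k*sigma) for a Gaussian random variable X."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))

def log10_all_at_corner(n_transistors, k_sigma=3.0):
    """log10 of the probability that n independent devices all exceed +k*sigma."""
    return n_transistors * math.log10(tail_probability(k_sigma))

p_single = tail_probability(3.0)   # one device beyond +3 sigma
within = 1.0 - 2.0 * p_single      # one device inside the +/-3 sigma band

print(f"P(single device beyond +3 sigma) = {p_single:.5f}")
print(f"P(single device within +/-3 sigma) = {within:.4f}")
print(f"log10 P(all 52 adder transistors beyond +3 sigma) = {log10_all_at_corner(52):.1f}")
```

The 99.7% figure applies to a single Gaussian variable; requiring all 52 transistors of the adder to sit beyond the same corner at once drives the joint probability down by well over a hundred orders of magnitude under the independence assumption.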

### C. Monte Carlo Static Timing Analysis

The proposed method of MCSTA is a compromise between the accuracy of Monte Carlo SPICE simulations, and the speed and practicality of STA.

MCSTA first requires the one-off generation of a Variation Cell Library (VCL), where RandomSpice is used to generate multiple randomised SPICE netlists of each standard cell, rather than of full circuits. These randomised cell netlists are then passed into a commercial characterisation tool, generating a VCL in exactly the same fashion as an SCL, with the exception that every cell has multiple instances. These instances reflect the atomistic differences between transistors and provide a mechanism for modelling statistical variation within STA.

Multiple randomised copies of the gate level netlist are then generated, where references to cells within an SCL are replaced by references to the equivalent cell in a VCL, with a randomised suffix. Each individual cell can therefore be modelled by a separate set of atomistic transistor models.

Multiple randomised netlists can then be passed through existing STA tools to provide a distribution of the timing and power consumption of a design, measuring any slack within the critical paths and providing the probability of any paths through the design failing to meet timing requirements.

A VCL was generated for each of the seven levels of injected variation. Each VCL contained 500 randomised instances of each of the four standard cells. STA was then performed for 10,000 randomised gate level netlists for each VCL. The input stimuli used within the Monte Carlo SPICE simulations were also used to perform power analysis.
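The netlist randomisation step can be sketched as follows. This is a minimal illustration only: the cell names, the `_vNNN` suffix format, the Verilog-style instantiation syntax and the three-gate example are assumptions, not details taken from the study or from any particular STA tool. Only the figure of 500 instances per cell comes from the text.

```python
import random
import re

VCL_INSTANCES = 500  # randomised instances per cell in each VCL, as in the study
CELLS = ("INVX1", "NAND2X1", "OR2X1", "BUFX1")  # hypothetical cell names

# Match a known cell name at the start of an instantiation line,
# followed by an instance name and an opening parenthesis.
CELL_PATTERN = re.compile(r"^(\s*)(" + "|".join(CELLS) + r")(\s+\w+\s*\()")

def randomise_gate_netlist(lines, rng):
    """Return a copy of the netlist with each cell reference given a random VCL suffix."""
    def add_suffix(match):
        variant = rng.randrange(VCL_INSTANCES)
        return f"{match.group(1)}{match.group(2)}_v{variant:03d}{match.group(3)}"
    return [CELL_PATTERN.sub(add_suffix, line) for line in lines]

# Hypothetical fragment of a gate-level netlist (not the adder from the study).
netlist = [
    "module example (a, b, cin, cout);",
    "  NAND2X1 U1 (.A(a), .B(b), .Y(n1));",
    "  INVX1 U2 (.A(n1), .Y(n2));",
    "  OR2X1 U3 (.A(n2), .B(cin), .Y(cout));",
    "endmodule",
]

# One randomised copy per seed; each copy would be analysed by the STA tool separately.
copies = [randomise_gate_netlist(netlist, random.Random(seed)) for seed in range(3)]
for line in copies[0]:
    print(line)
```

Seeding each copy separately keeps every randomised netlist reproducible, in the same spirit as the audit trail described for the SPICE netlists.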



Fig. 1. A comparison of SPICE and Corner Analysis at  $3\sigma$ .

## IV. RESULTS AND ANALYSIS

The results from the use of Corner Analysis are shown in Figure 1. The Corner Analysis prediction for maximum path delay at the highest level of variability is nearly 5.5 times greater than the largest delay generated by Monte Carlo SPICE simulations.

The results from the use of MCSTA are shown in Figure 2. This chart includes scatter plots of power consumption and maximum delay through the adder circuit, for each MCSTA and SPICE run at each level of variation. The cumulative distributions of delays are compared in Figure 3, from which predictions of yield can be made for different values of the system clock period. The predictions made by the three methods for the power consumption and maximum delay required for a 99.7% yield are compared in Tables I and II respectively. The constraints predicted by MCSTA to meet 99.7% yield are found to be within 1.2% and 2.9% of SPICE for power and delay respectively. This is a huge improvement over Corner Analysis, where errors reach 782% and 548% for power and delay respectively.
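The way a yield constraint is read from a cumulative distribution, and the way the absolute errors are formed, can be sketched as below. The delay samples are synthetic and the function names are illustrative; only the two power values in the final line are taken from Table I.

```python
import math
import random

def delay_for_yield(delays, target_yield):
    """Smallest sampled delay d such that at least target_yield of samples are <= d."""
    ordered = sorted(delays)
    # Rounding guards against floating-point noise in target_yield * n.
    index = math.ceil(round(target_yield * len(ordered), 9)) - 1
    return ordered[max(index, 0)]

def yield_at_clock(delays, clock_period):
    """Fraction of sampled circuits whose maximum delay fits in the clock period."""
    return sum(d <= clock_period for d in delays) / len(delays)

def percent_error(estimate, reference):
    """Absolute percentage error of an estimate with respect to a reference."""
    return abs(estimate - reference) / reference * 100.0

rng = random.Random(3)
delays = [rng.gauss(0.50, 0.04) for _ in range(10_000)]  # synthetic delays in ns
clock = delay_for_yield(delays, 0.997)
print(f"clock period for 99.7% yield: {clock:.3f} ns")
print(f"yield at that clock period: {yield_at_clock(delays, clock):.4f}")
# Recompute the 10% Corner Analysis power error from the Table I values.
print(f"Corner Analysis power error at 10%: {percent_error(25.33, 24.0):.2f}%")
```

The same two helper functions apply unchanged to the power samples, since the power constraint for a given yield is read from its cumulative distribution in the same way.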

TABLE I. POWER MEASUREMENTS FROM SPICE, MCSTA AND CORNER ANALYSIS AT 99.7% YIELD. ERRORS ARE ABSOLUTE WITH RESPECT TO SPICE

| $\sigma_{VT}$ (% of $V_T$) | SPICE Power (µW) | Corner Analysis Power (µW) | Corner Analysis Error (%) | MCSTA Power (µW) | MCSTA Error (%) |
|--------------|-----------|-----------------|--------|-----------|--------|
| 10%          | 24        | 25.33           | 5.54   | 23.76     | 1      |
| 15%          | 24.04     | 26.82           | 11.56  | 23.79     | 1.04   |
| 20%          | 24.11     | 29.12           | 20.78  | 23.84     | 1.12   |
| 25%          | 24.16     | 33.25           | 37.62  | 23.9      | 1.08   |
| 30%          | 24.23     | 38.42           | 58.56  | 23.95     | 1.16   |
| 40%          | 24.41     | 78.18           | 220.28 | 24.12     | 1.19   |
| 50%          | 24.69     | 217.8           | 782.14 | 24.4      | 1.17   |

Statistical analysis of these distributions reveals that MCSTA and SPICE are in close agreement, with errors in the means of the distributions consistently below 1% (Table III). There is a consistent error in the standard deviation of the power distributions of around 30%, produced by interpolation and rounding errors within the STA tool when reading power information from the cell libraries.

Fig. 2. A scatter plot of the average power per input transition against the maximum path delay through the adder.

Fig. 3. Cumulative distribution of MCSTA and SPICE delays.

TABLE II. DELAY MEASUREMENTS FROM SPICE, MCSTA AND CORNER ANALYSIS AT 99.7% YIELD. ERRORS ARE ABSOLUTE WITH RESPECT TO SPICE

| $\sigma_{VT}$ (% of $V_T$) | SPICE Delay (ns) | Corner Analysis Delay (ns) | Corner Analysis Error (%) | MCSTA Delay (ns) | MCSTA Error (%) |
|--------------|------------|-----------------|-----------|------------|-----------|
| 10%          | 0.51       | 0.64            | 24.57     | 0.51       | 0.2       |
| 15%          | 0.52       | 0.73            | 41.59     | 0.52       | 0.39      |
| 20%          | 0.53       | 0.84            | 58.72     | 0.53       | 0.19      |
| 25%          | 0.55       | 0.98            | 79.7      | 0.54       | 1.28      |
| 30%          | 0.56       | 1.2             | 113.94    | 0.55       | 1.96      |
| 40%          | 0.59       | 1.97            | 232.79    | 0.58       | 2.36      |
| 50%          | 0.63       | 4.08            | 547.99    | 0.61       | 2.86      |

TABLE III. ERROR BETWEEN MCSTA AND SPICE WHEN GENERATING DISTRIBUTIONS OF DELAY AND POWER CONSUMPTION. VALUES ARE ABSOLUTE PERCENTAGE ERRORS

| $\sigma_{VT}$ (% of $V_T$) | Mean Delay Error (%) | Stdev Delay Error (%) | Mean Power Error (%) | Stdev Power Error (%) |
|--------------|------------|-------------|------------|-------------|
| 10%          | 0.58       | 1.40        | 0.81       | 31.88       |
| 15%          | 0.67       | 1.33        | 0.83       | 28.69       |
| 20%          | 0.71       | 1.38        | 0.83       | 29.35       |
| 25%          | 0.94       | 4.51        | 0.83       | 29.94       |
| 30%          | 0.88       | 9.04        | 0.82       | 31.04       |
| 40%          | 0.72       | 12.07       | 0.78       | 29.81       |
| 50%          | 0.52       | 14.83       | 0.89       | 27.79       |

MCSTA produces similar results to those of SPICE simulations in a fraction of the CPU time: it ran more than 40 times faster for sample sizes of over 100. Larger circuits with greater numbers of test vectors will require significantly longer SPICE simulations, while MCSTA is performed statically, without the need for test vectors or simulation. The benefits of using MCSTA therefore become even greater with larger, more complex circuits.

### A. Variation in Critical Paths

The number of paths identified as most critical for any given analysis increased from two to fifteen at the highest levels of variability tested. This demonstrates the importance of including statistical information within circuit timing analysis, as the 'criticality' of a path can change even within such a small, simple example as the 1-bit adder.
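The effect can be reproduced qualitatively on a toy example. The sketch below is not the adder from the study: the six-gate graph, the nominal gate delay and the Gaussian gate-delay model are all assumptions made for illustration, whereas in MCSTA the delays come from the characterised VCL entries.

```python
import random
from collections import Counter

# Hypothetical gate-level graph: each gate maps to its fan-in nodes.
FANIN = {
    "g1": ["a", "b"],
    "g2": ["b", "cin"],
    "g3": ["g1", "cin"],
    "g4": ["g1", "g2"],
    "sum": ["g3"],
    "cout": ["g4"],
}
ORDER = ["g1", "g2", "g3", "g4", "sum", "cout"]  # a topological order of the gates
NOMINAL_DELAY = 0.1  # ns per gate, illustrative only

def critical_path(gate_delay):
    """Return (arrival time, path) for the latest-arriving primary output."""
    arrival = {pin: (0.0, (pin,)) for pin in ("a", "b", "cin")}
    for gate in ORDER:
        # The latest fan-in determines when this gate's output settles.
        t, path = max(arrival[src] for src in FANIN[gate])
        arrival[gate] = (t + gate_delay[gate], path + (gate,))
    return max(arrival["sum"], arrival["cout"])

def count_critical_paths(sigma_fraction, samples, rng):
    """Count how often each path is the critical one across random samples."""
    counts = Counter()
    for _ in range(samples):
        delays = {
            g: max(rng.gauss(NOMINAL_DELAY, sigma_fraction * NOMINAL_DELAY), 0.0)
            for g in ORDER
        }
        counts[critical_path(delays)[1]] += 1
    return counts

rng = random.Random(0)
for level in (0.0, 0.1, 0.5):
    paths = count_critical_paths(level, 2_000, rng)
    print(f"sigma = {level:.0%} of nominal delay: {len(paths)} distinct critical paths")
```

With no variation the same path is reported every time; once the gate delays are randomised, several paths take turns at being the slowest, which is the behaviour the paragraph above describes.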

### B. Power, Performance and Yield

The cumulative distribution functions of the power and delay measurements can be combined to create three dimensional plots of power, performance and yield (Figure 4). This information can be used by designers to establish trade-offs between power consumption and maximum clock speed, in order to maximise the number of fabricated devices that perform within the required constraints.
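A yield surface of this kind can be sketched from paired power and delay samples. The version below counts, for each pair of constraints, the fraction of sampled circuits that satisfy both. This joint counting is a design choice made for the illustration, since it remains valid if power and delay are correlated; the sample values and function names are hypothetical.

```python
import random

def joint_yield(samples, max_power, max_delay):
    """Fraction of circuits meeting both the power and the delay constraint."""
    passing = sum(
        1 for power, delay in samples if power <= max_power and delay <= max_delay
    )
    return passing / len(samples)

def yield_grid(samples, power_limits, delay_limits):
    """Yield for every combination of power limit (rows) and delay limit (columns)."""
    return [
        [joint_yield(samples, p, d) for d in delay_limits] for p in power_limits
    ]

# Synthetic (power in uW, delay in ns) pairs standing in for MCSTA results.
rng = random.Random(4)
samples = [(rng.gauss(24.0, 0.3), rng.gauss(0.50, 0.04)) for _ in range(10_000)]

power_limits = [23.5, 24.0, 24.5, 25.0]
delay_limits = [0.45, 0.50, 0.55, 0.60]
for p_max, row in zip(power_limits, yield_grid(samples, power_limits, delay_limits)):
    print(f"P <= {p_max:4.1f} uW: " + "  ".join(f"{y:.3f}" for y in row))
```

Reading along a row shows how yield rises as the clock constraint is relaxed at a fixed power budget, and equi-yield contours are the curves joining grid points of equal value.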

## V. CONCLUSIONS

MCSTA provides an effective compromise between the accuracy of Monte Carlo SPICE simulations and the speed and practicality of Corner Based STA. Circuit performance predictions at 3-sigma are within 1.2% of SPICE for power consumption and 3% of SPICE for maximum path delay, at the highest levels of injected variability. This is a vast improvement over Corner Analysis which has errors of 782% and 548% for power and delay at the same level of variability.

MCSTA allows designers to obtain an accurate estimation of the Power, Performance and Yield trade-offs of a synthesised circuit. MCSTA can also highlight which paths through a design are likely to become critical, and help to guide the efforts of a designer when minimising path delays.

Large scale parallelisation is possible during MCSTA, as each randomised gate level netlist can be analysed independently. The development of Grid based technology makes this a practical approach to the statistical modelling of circuits.

## REFERENCES

- [1] X. Tang, V. De, and J. Meindl, "Intrinsic MOSFET parameter fluctuations due to random dopant placement," *Very Large Scale Integration (VLSI) Systems, IEEE Transactions on*, vol. 5, no. 4, pp. 369–376, 1997. [Online]. Available: 10.1109/92.645063
- [2] A. Asenov, S. Kaya, and A. Brown, "Intrinsic parameter fluctuations in decananometer MOSFETs introduced by gate line edge roughness," *Electron Devices, IEEE Transactions on*, vol. 50, no. 5, pp. 1254–1260, 2003. [Online]. Available: 10.1109/TED.2003.813457
- [3] J. Thiault, J. Foucher, J. H. Tortai, O. Joubert, S. Landis, and S. Pauliac, "Line edge roughness characterization with a three-dimensional atomic force microscope: Transfer during gate patterning processes," 2005. [Online]. Available: 10.1116/1.2101789



Fig. 4. Power Performance Yield plots for the Adder circuit at $\sigma_{VT} = 50\%$. (a) Predicted yield increases as clock and power constraints are relaxed. (b) A 2D view showing equi-yield contour lines at every 10% increase in yield.

- [4] M. Berkelaar, "Statistical delay calculation, a linear time method," Proc. TAU, pp. 15–24, 1997.
- [5] D. Blaauw, K. Chopra, A. Srivastava, and L. Scheffer, "Statistical timing analysis: From basic principles to state of the art," *Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on*, vol. 27, no. 4, pp. 589–607, 2008. [Online]. Available: 10.1109/TCAD.2007.907047
- [6] B. Sheu, D. Scharfetter, P. Ko, and M. Jeng, "BSIM: Berkeley short-channel IGFET model for MOS transistors," *Solid-State Circuits, IEEE Journal of*, vol. 22, no. 4, pp. 558–566, 1987.
- [7] "Gold Standard Simulations Ltd. : RandomSpice." [Online]. Available: http://www.goldstandardsimulations.com/services/circuitsimulation/random-spice/
- [8] G. Roy, F. Adamu-Lema, A. Brown, S. Roy, and A. Asenov, "Simulation of combined sources of intrinsic parameter fluctuations in a 'real' 35 nm MOSFET," in *Proceedings of the 35th European Solid-State Device Research Conference (ESSDERC 2005)*, 2005, pp. 337–340.
- [9] X. Wang, B. Cheng, S. Roy, and A. Asenov, "Simulation of strain enhanced variability in nMOSFETs," in *9th International Conference on Ultimate Integration of Silicon (ULIS 2008)*, 2008, pp. 89–92.
- [10] S. Xiong and J. Bokor, "A simulation study of gate line edge roughness effects on doping profiles of short-channel MOSFET devices," *Electron Devices, IEEE Transactions on*, vol. 51, no. 2, pp. 228–232, 2004.
- [11] N. H. E. Weste and D. Harris, *CMOS VLSI Design: A Circuits and Systems Perspective*, 3rd ed. Addison Wesley, 2005.