# Process-Variation-Aware Iddq Diagnosis for Nano-Scale CMOS Designs - The First Step

Chia-Ling (Lynn) Chang<sup>†</sup>, Charles H.-P. Wen<sup>†</sup>, and Jayanta Bhadra<sup>‡</sup>

†Dept. of Electrical & Computer Engineering, National Chiao Tung University, Hsinchu, Taiwan 300

‡Freescale Semiconductor Inc., Austin, TX 78729

E-mail: tinger.cm98g@nctu.edu.tw, opwen@g2.nctu.edu.tw, jayanta.bhadra@freescale.com

Abstract: Along with the shrinking CMOS process and rapid design scaling, both the Iddq values of chips and their variation increase. As a result, defect leakages become less significant when compared to the full-chip currents, making them harder to distinguish in traditional Iddq diagnosis. Therefore, in this paper, a new approach called $\sigma$-Iddq diagnosis is proposed for reinterpreting the original data and intelligently diagnosing failing chips. The overall flow consists of two key components, (1) $\sigma$-Iddq transformation and (2) defect-syndrome matching. $\sigma$-Iddq transformation first manifests defect leakages by excluding both the process-variation and design-scaling impacts. Later, defect-syndrome matching applies data mining with a pre-built library to identify the types and locations of defects on the fly. Experimental results show that an average accuracy of 93.68% with a resolution of 1.75 defect suspects can be achieved on ISCAS'89 and IWLS'05 benchmark circuits using a 45nm technology, demonstrating the effectiveness of $\sigma$-Iddq diagnosis.

#### I. INTRODUCTION

Iddq testing has been a critical component for screening unreliable circuits, especially for designs with high reliability demands, such as automotive and medical devices [1]. Iddq testing monitors the quiescent power supply current (Iddq) of CMOS circuits. Moreover, Iddq data can also be used to help diagnose failure mechanisms [1][7]-[10].

Open defects and short defects are two common failure mechanisms that have been extensively studied for diagnosis. While open defects can be handled well through voltage-based diagnosis [2][3], short defects are often diagnosed through current-based diagnosis [4][5]. For current-based testing, threshold current [7][8], $\Delta$-Iddq [10], and current signature [1][9] were proposed to analyze data, and different types of defects were successfully diagnosed. Although these previous works are effective and easy to implement, their defect-detection resolution often depends on the quality of the *Iddq measurement*, and determining a *threshold value* is no longer trivial. Moreover, for current-based diagnosis, the difference between defect-free leakage and defective leakage becomes less distinguishable as process variation and design scaling grow.

To illustrate this problem, Figure 1(a) shows an example Iddq distribution for a failing chip in a 180nm technology, where defects can be easily diagnosed by using a threshold current to indicate pass/fail for each pattern. However, with increasing leakage in nanometer designs, it is more difficult to determine a threshold current when process variation is considered. Figure 1(b) shows the Iddq distribution for the same design in a 45nm technology: process variation smooths the curve of Iddq currents, leaving no clear threshold value. More advanced test methods, such as $C\Delta$-Iddq [5] and $\sigma$-Iddq [6], were developed later to improve the quality of Iddq testing by restoring defect leakages. Similarly, Iddq diagnosis can be enhanced by removing the process-variation and design-scaling impacts from the total leakage of nanometer designs.

Fig. 1. Iddq distributions in different technologies

As a result, $\sigma$-**Iddq diagnosis** is proposed on the basis of this concept [6]. It pre-builds a defect dictionary to explain faulty behaviors and to ascertain defect suspects, just as voltage-based diagnosis uses a fault dictionary on failing chips. For dictionary-based analysis, a model is required to describe the faulty behavior of a defect. According to [8], three types of short defects are frequently used in Iddq diagnosis: *transistor short* [8], *wire-to-VDD/wire-to-GND short* [1], and *wire-to-wire short* [7]. Most previous works on diagnosis [1][7][8] target only one of these three defect types at a time. In this paper, however, $\sigma$-Iddq diagnosis targets all three types simultaneously and tries to identify candidates with their defect types and locations correctly. Because it shares the same faulty behavior, a transistor short behaves like either a wire-to-VDD or a wire-to-GND short and is therefore merged with them at this stage. As a result, the $\sigma$-Iddq diagnosis proposed in this paper classifies defects into wire-to-VDD, wire-to-GND, or wire-to-wire shorts as the first step. Separating transistor shorts from wire-to-VDD and wire-to-GND shorts is left as future work.

The proposed $\sigma$-Iddq diagnosis consists of two key components, (1) $\sigma$-Iddq transformation and (2) defect-syndrome matching, where a defect bitmap is pre-built to explain the underlying failure mechanisms: wire-to-VDD, wire-to-GND, and wire-to-wire shorts. $\sigma$-Iddq transformation first converts the original Iddq data to $\sigma$-Iddq data by removing the impacts of both process variation and design scaling. Later, each defect syndrome extracted by a data-mining algorithm is matched against a pre-built library called the *defect bitmap* to identify its defect type and location.
To demonstrate the effectiveness of our $\sigma$-Iddq diagnosis, both original Iddq data and $\sigma$-Iddq data are applied for comparison.

#### II. MODELING DEFECT MECHANISMS

In CMOS circuits, open and short defects are the two most common failure mechanisms. Since Iddq testing measures the quiescent leakage current, these mechanisms have no effect unless a leakage path conducts from VDD to GND. In other words, Iddq testing aims at detecting defects that cause significant extra leakage. Moreover, since open defects can usually be diagnosed by voltage-based approaches, only short defects are targeted in this paper. For a more detailed classification, short defects are divided into wire-to-VDD shorts (WV), wire-to-GND shorts (WG), and wire-to-wire shorts (WW).

## A. Behaviors of Short Defects

Figure 2 illustrates different types of short defects in different regions of an inverter. Figure 2(a) and Figure 2(b) show a wire-to-VDD (WV) short and a wire-to-GND (WG) short in the intra-cell region, respectively. Both induce leakage paths from VDD to GND, resulting in significant leakage in the inverter without dropping the output voltage. A bridge between the outputs of two inverters is considered a defect in the inter-cell region, where the two wires carry opposite logic values in a *wire-to-wire* short. Figure 2(c) shows an example of such a wire-to-wire short, where a significant leakage current runs between the outputs of the two inverters.

Fig. 2. Different types of short defects for inverters

## B. Defect Bitmap from Simulation

Like other dictionary-based diagnosis [8], a pre-built *defect bitmap* (DBM) is derived and compared with circuit behaviors (i.e. the current syndrome) to deduce defect suspects; the extraction of the current syndrome is described in Section III. With LBM, DBM can be deduced by Boolean operations to indicate the activations of each defect.
For a wire-to-GND short, since it conducts a leakage path from VDD to GND when the output is at logic 1, as shown in Figure 2(b), we use 1 to denote such defect activation. On the other hand, since a wire-to-GND short cannot induce extra leakage when the output is at logic 0, we use 0 to characterize the defect-free case in the DBM. As a result, the DBM entry of a wire-to-GND short on a wire is identical to the original logic value of that wire.

#### III. $\sigma$-IDDQ DIAGNOSIS

As introduced in the previous section, once the defect bitmap for failing chips is derived, it can be used to locate the defect suspect by comparison with the current syndrome measured from the failing chips. However, due to the increasing leakage from both process variation and design scaling, it is becoming more difficult to extract the current syndrome precisely. Therefore, a new approach, $\sigma$-Iddq diagnosis, is proposed to reinterpret Iddq data and apply data mining to facilitate failing-chip diagnosis. The overall flow of $\sigma$-Iddq diagnosis is illustrated in Figure 3 and consists of two key components: (1) $\sigma$-Iddq transformation and (2) defect-syndrome matching. Both components are elaborated in the following subsections.

Fig. 3. Overall flow of $\sigma$-Iddq diagnosis

## A. $\sigma$-Iddq Transformation

Conceptually, $\sigma$-Iddq applies a variation-aware full-chip leakage estimator and the average-case process parameters to deal with variation and to deduce circuit behavior (i.e. the current syndrome). Once the current syndrome for each pattern is derived for each failing chip, the location and type of a defect can be identified accordingly. There are three steps in $\sigma$-Iddq transformation.

1) Building Process-aware Leakage Models: Many previous works [11][12] were devoted to developing highly accurate leakage models.
In our work, we modify the models in [11], where the two most important leakage components, (1) the subthreshold leakage ($I_{sub}$) and (2) the gate-tunneling leakage ($I_{gate}$), are considered in full-chip leakage estimation. Both can be modeled as exponential functions of the variation in effective gate length ($\triangle L_{eff}$) and in gate oxide thickness ($\triangle T_{ox}$), and can be written as

$$I_{sub} = e^{u_0 + \alpha_1 \triangle L_{eff} + \alpha_2 \triangle T_{ox}} \tag{1}$$

$$I_{gate} = e^{v_0 + \beta_1 \triangle L_{eff} + \beta_2 \triangle T_{ox}} \tag{2}$$

In Equation (1) and Equation (2), $u_0$ and $v_0$ denote the nominal values of the overall parameter for effective gate length and gate oxide thickness, respectively. $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ are fitting parameters that depend on the process variation of the technology in use. Therefore, the process-aware leakage models for each logic cell can be built by fitting the parameters for $I_{sub}$ in Equation (1) and for $I_{gate}$ in Equation (2) through SPICE simulation.

2) Deducing Approximate Process Parameters: The process-aware leakage model for each cell is used to deduce the process parameters and thus estimate the Iddq for each chip.
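
Since the per-cell models of Equations (1) and (2) feed this step, it may help to see how such a model could be fitted. Taking the logarithm of Equation (1) gives a linear form, $\ln I_{sub} = u_0 + \alpha_1 \triangle L_{eff} + \alpha_2 \triangle T_{ox}$, so ordinary least squares is one plausible way to obtain the coefficients. The sketch below is only an illustration under that assumption: the paper does not state the fitting procedure, the function names `fit_log_linear` and `solve3` are hypothetical, and the sample currents are synthetic stand-ins for SPICE data with made-up coefficients.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [a[i][:] + [b[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def fit_log_linear(samples):
    """Least-squares fit of ln(I) = c0 + c1*dL + c2*dT.

    samples: iterable of (dL, dT, current) tuples.
    Returns (c0, c1, c2), which play the role of (u0, alpha1, alpha2) in Eq. (1).
    """
    ata = [[0.0] * 3 for _ in range(3)]  # normal-equation matrix A^T A
    aty = [0.0] * 3                      # right-hand side A^T y
    for dl, dt, current in samples:
        row = (1.0, dl, dt)
        y = math.log(current)
        for i in range(3):
            aty[i] += row[i] * y
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve3(ata, aty)

# Synthetic stand-in for SPICE sweeps; these coefficients are made up.
true_u0, true_a1, true_a2 = -18.0, -0.4, -0.2
points = [(dl, dt) for dl in (-2.0, -1.0, 0.0, 1.0, 2.0) for dt in (-1.0, 0.0, 1.0)]
samples = [(dl, dt, math.exp(true_u0 + true_a1 * dl + true_a2 * dt)) for dl, dt in points]
u0, a1, a2 = fit_log_linear(samples)
print(u0, a1, a2)
```

The same routine would apply to Equation (2) by fitting the gate-tunneling currents instead, yielding the counterparts of $v_0$, $\beta_1$, and $\beta_2$.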
Unlike [6], which estimates average-case process parameters from a single pattern, the objective of $\sigma$-Iddq diagnosis is to find a process-parameter combination that minimizes the variance of all residual Iddq values with respect to the test pattern set $\mathbf{p}$, which can be formulated as

$$\arg\min_{\triangle L_{eff},\triangle T_{ox}} \left\{ var\left(I_{cut}^{\mathbf{p}} - \hat{I}_{cut}^{\mathbf{p}}(\triangle L_{eff}, \triangle T_{ox})\right) \right\}$$

where

$$\hat{I}_{cut}^{\mathbf{p}}(\triangle L_{eff}, \triangle T_{ox}) = \sum_{k=1}^{n} \left( I_{sub,k}^{\mathbf{p}} + I_{gate,k}^{\mathbf{p}} \right)$$

$I_{cut}^{\mathbf{p}}$ is the measured Iddq over all Iddq patterns and $\hat{I}_{cut}^{\mathbf{p}}(\triangle L_{eff}, \triangle T_{ox})$ is the defect-free estimated Iddq. Symbol $n$ is the total number of cells in the circuit, and $I_{sub,k}^{\mathbf{p}}$ and $I_{gate,k}^{\mathbf{p}}$ denote the subthreshold leakage and gate-tunneling leakage for cell $k$, respectively. As in [6], however, the process parameters $\triangle L_{eff}$ and $\triangle T_{ox}$ are assumed to be the same for every cell.

3) **Transforming Iddq to** $\sigma$**-Iddq**: Once the proper process parameters are derived, the residual Iddq can be computed from the measured Iddq and the estimated Iddq. This residual Iddq is called $\sigma$-Iddq and can be formulated as

$$\sigma\text{-}Iddq_{cut}^{\mathbf{p}} = \underbrace{I_{cut}^{\mathbf{p}}}_{measured\ Iddq} - \underbrace{\hat{I}_{cut}^{\mathbf{p}}}_{expected\ Iddq}$$

## B. Defect-Syndrome Matching

With $\sigma$-Iddq, we can further compute the current syndrome without requiring a threshold value for defect diagnosis. To achieve this, data mining first comes into play and helps cluster the data and label the activation of defects in each Iddq measurement. Second, the defect type and location can be derived through comparison with the pre-built defect bitmap (DBM).
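
Before the two matching steps are detailed, the $\sigma$-Iddq transformation of Section III-A can be sketched in a few lines. The sketch evaluates Equations (1) and (2) per cell, sums them into the defect-free estimate, searches for the $(\triangle L_{eff}, \triangle T_{ox})$ pair that minimizes the variance of the residuals, and returns those residuals as $\sigma$-Iddq. Several things here are assumptions rather than the paper's method: the optimizer is not specified in the text, so an exhaustive grid search is swapped in; the names `cell_leakage`, `estimate_iddq`, and `sigma_iddq` are hypothetical; and the toy coefficients are made up, with a background leakage far smaller than it would be on a real 45nm chip.

```python
import math
from statistics import pvariance

def cell_leakage(coef, dl, dt):
    """I_sub + I_gate of one cell, following Eq. (1) and Eq. (2).

    coef = (u0, alpha1, alpha2, v0, beta1, beta2) for the cell's current input state.
    """
    u0, a1, a2, v0, b1, b2 = coef
    return math.exp(u0 + a1 * dl + a2 * dt) + math.exp(v0 + b1 * dl + b2 * dt)

def estimate_iddq(models, dl, dt):
    """Defect-free full-chip estimate for every pattern (sum over all cells).

    models[p][k] holds the coefficients of cell k under pattern p.
    """
    return [sum(cell_leakage(c, dl, dt) for c in cells) for cells in models]

def sigma_iddq(measured, models, grid):
    """Grid search for the (dL, dT) pair minimizing the residual variance.

    Returns (dL, dT, residuals); the residuals are the sigma-Iddq values.
    """
    best = None
    for dl in grid:
        for dt in grid:
            estimate = estimate_iddq(models, dl, dt)
            resid = [m - e for m, e in zip(measured, estimate)]
            score = pvariance(resid)
            if best is None or score < best[0]:
                best = (score, dl, dt, resid)
    return best[1], best[2], best[3]

# Toy chip: 6 patterns, 3 cells, made-up coefficients (assumption, not paper data).
models = [[(-18.0 + 0.1 * ((p * (k + 1)) % 5), -0.4, -0.2,
            -19.0 + 0.1 * ((p + 2 * k) % 4), -0.1, -0.5)
           for k in range(3)] for p in range(6)]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]

clean = estimate_iddq(models, 0.5, -0.5)   # defect-free chip at one process corner
activated = {1, 4}                         # patterns that activate the injected short
measured = [i + (10.1e-6 if p in activated else 0.0) for p, i in enumerate(clean)]

dl_hat, dt_hat, resid = sigma_iddq(measured, models, grid)
print(dl_hat, dt_hat, [round(r * 1e6, 2) for r in resid])  # residuals in microamps
```

In this toy setup the residuals of the two activated patterns stand far above the others regardless of which grid point wins, which is the separation the clustering step then relies on.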
The following subsections elaborate these two steps:

1) Syndrome Extraction from Currents: We apply data mining to help determine whether each $\sigma$-Iddq value is defect-activated or defect-free. K-means [13] is a well-known data-mining algorithm based on Euclidean distance that requires the number of clusters, k, as an input. In this paper, prior knowledge is assumed to be available, and the number of clusters is set to 2 (i.e. k=2). In other words, if any short defect is activated in the failing chip, the $\sigma$-Iddq data will be grouped into at least two clusters. The presence of multiple defects may induce more than two clusters of $\sigma$-Iddq data, but setting k to 2 is sufficient because K-means only needs to group small $\sigma$-Iddq values into the defect-free cluster and the rest (with large values) into the defect-activated cluster.

2) **Defect Matching using Syndromes**: Once the current syndrome for each failing chip is obtained through $\sigma$-Iddq transformation and K-means clustering, it is compared bit by bit with the pre-built defect bitmap (DBM). Defects are ranked by the number of matching bits: the more matching bits, the higher the score. Finally, all defects are sorted by their matching scores on this failing chip.

#### IV. EXPERIMENTAL RESULTS

The proposed $\sigma$-Iddq diagnosis is implemented in C++. Experiments were run on a Linux machine with a 2.8GHz CPU and 16GB RAM. ISCAS'89 and IWLS'05 [14] circuits are used as benchmarks for evaluation. The Iddq test patterns are generated by TetraMax and achieve $\sim 100\%$ pseudo-stuck-at fault coverage. Moreover, SOC Encounter from Cadence is used to extract layout information using the Nangate 45nm Open Cell library [15] for leakage simulation and wire-to-wire short extraction. For simplicity, only basic cells, e.g. NAND2, NOR2, AND2, OR2, INV, and BUF, from the Nangate cell library [15] are used in our experiments. The process-variation setting is the same as in [6].

For each benchmark circuit, 2000 samples are generated as failing chips, where each chip is randomly injected with one defect: a wire-to-VDD, wire-to-GND, or wire-to-wire short. To model realistic defect-induced currents in a nano-scale CMOS process, high-impedance resistors are used in our SPICE simulation to induce small currents for short defects. These resistors are $100K\Omega$, $200K\Omega$, and $500K\Omega$, inducing $10.1\mu A$, $5.5\mu A$, and $2.2\mu A$ leakage currents, respectively, at a 1.1V supply voltage. In other words, one of nine possible short defects (three defect types, each with three resistances) is injected into each failing chip. Note that all nine short defects cause only small voltage drops at circuit outputs, thus escaping voltage-based testing and diagnosis.

TABLE I
CIRCUIT INFORMATION

| Circuit Name | Gate Counts | Grid Number | # Iddq Patterns | WV/WG Shorts | WW Shorts |
|-----------|--------|--------|----------|--------|--------|
| s13207 | 3205 | 16x16 | 270 | 6410 | 6485 |
| s15850 | 3899 | 16x16 | 237 | 7798 | 7284 |
| s35932 | 6964 | 16x16 | 48 | 13928 | 14163 |
| s38417 | 10966 | 16x16 | 357 | 21932 | 22438 |
| s38584 | 11378 | 16x16 | 459 | 22756 | 24237 |
| mem_ctrl | 8364 | 16x16 | 332 | 16728 | 16624 |
| ac97_ctrl | 11577 | 16x16 | 215 | 23154 | 25426 |
| aes_core | 24057 | 16x16 | 329 | 48114 | 50135 |
| ethernet | 71639 | 32x32 | 1090 | 143278 | 157844 |
| vga_lcd | 113648 | 32x32 | 3575 | 227296 | 243673 |

Table I lists the information of the benchmark circuits, including the gate count, the number of grids used for intra-die variation, the number of Iddq patterns generated for pseudo-stuck-at faults, the number of wire-to-VDD (WV)/wire-to-GND (WG) defects, and the number of wire-to-wire (WW) defects. Wire-to-wire (WW) short candidates are extracted from the neighbors of each gate with large coupling capacitances. The fault coverage of the Iddq test patterns is $\sim\!100\%$ on all benchmark circuits.
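
For readers who want to relate this setup to the matching procedure of Section III-B, the two steps (syndrome extraction with k=2, then bit-wise matching against the DBM) can be sketched as below. This is a minimal illustration, not the authors' C++ implementation. The names `two_means`, `build_dbm`, and `first_hits` are hypothetical, and the logic values and $\sigma$-Iddq numbers are invented. The wire-to-GND rule (activated at logic 1) and the wire-to-wire rule (activated when the two wires differ) follow Section II; the wire-to-VDD rule (activated at logic 0) is assumed by symmetry.

```python
def two_means(values, iters=50):
    """1-D K-means with k=2; returns one bit per value (1 = defect-activated cluster)."""
    lo, hi = min(values), max(values)          # initial centroids
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        ones = [v for v, bit in zip(values, labels) if bit]
        zeros = [v for v, bit in zip(values, labels) if not bit]
        if not ones or not zeros:
            break                              # only one cluster present
        new_lo, new_hi = sum(zeros) / len(zeros), sum(ones) / len(ones)
        if (new_lo, new_hi) == (lo, hi):
            break                              # converged
        lo, hi = new_lo, new_hi
    return labels

def build_dbm(logic, ww_pairs):
    """Derive activation bits per defect from simulated logic values per pattern."""
    dbm = {}
    for wire, bits in logic.items():
        dbm[("WG", wire)] = list(bits)              # leaks when the wire is at logic 1
        dbm[("WV", wire)] = [1 - b for b in bits]   # assumed: leaks when at logic 0
    for a, b in ww_pairs:
        dbm[("WW", a, b)] = [x ^ y for x, y in zip(logic[a], logic[b])]
    return dbm

def first_hits(dbm, syndrome):
    """Score each defect by its number of matching bits; return (best score, tied defects)."""
    scores = {name: sum(s == d for s, d in zip(syndrome, bits))
              for name, bits in dbm.items()}
    best = max(scores.values())
    return best, [name for name, score in scores.items() if score == best]

# Invented example: three wires, six patterns, one candidate wire-to-wire pair.
logic = {"n1": [0, 1, 1, 0, 1, 0],
         "n2": [1, 1, 0, 0, 1, 0],
         "n3": [0, 0, 1, 1, 0, 1]}
dbm = build_dbm(logic, [("n1", "n2")])

sigma = [0.3, 10.4, 9.9, -0.2, 10.2, 0.1]      # sigma-Iddq per pattern, in microamps
syndrome = two_means(sigma)
best_score, hits = first_hits(dbm, syndrome)
print(syndrome, best_score, hits)
```

Returning every defect tied at the top score mirrors the first-hit group reported in the experiments, where several suspects can match the syndrome equally well.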

TABLE II
INJECTED SHORT DEFECTS AND DIAGNOSIS RESULTS

Columns prefixed "Orig." report original Iddq + defect-syndrome matching; columns prefixed "$\sigma$" report $\sigma$-Iddq + defect-syndrome matching.

| Circuit Name | Injected WV/WG shorts (%) | Injected WW shorts (%) | Orig. WV/WG shorts (%) | Orig. WW shorts (%) | Orig. 1st-hit accu. | Orig. 1st-hit size | Orig. time (s) | $\sigma$ WV/WG shorts (%) | $\sigma$ WW shorts (%) | $\sigma$ 1st-hit accu. | $\sigma$ 1st-hit size | $\sigma$ time (s) |
|-----------|-------|-------|-------|-------|-------|------|--------|-------|-------|-------|------|---------|
| s13207 | 65.90 | 34.10 | 64.25 | 32.65 | 96.90 | 1.88 | 19.48 | 64.90 | 32.80 | 97.70 | 1.87 | 56.51 |
| s15850 | 66.80 | 33.20 | 65.35 | 31.80 | 97.15 | 1.69 | 36.23 | 66.75 | 32.40 | 99.15 | 1.63 | 51.38 |
| s35932 | 65.95 | 34.05 | 43.65 | 23.35 | 67.00 | 2.05 | 48.76 | 57.25 | 30.90 | 88.15 | 1.66 | 56.60 |
| s38417 | 66.45 | 33.55 | 65.05 | 32.05 | 97.65 | 1.54 | 90.88 | 65.95 | 32.80 | 98.75 | 1.52 | 124.31 |
| s38584 | 65.75 | 34.25 | 46.75 | 24.80 | 71.55 | 1.72 | 107.19 | 60.95 | 32.10 | 93.05 | 1.45 | 156.28 |
| mem_ctrl | 65.25 | 34.75 | 61.90 | 33.35 | 95.25 | 1.63 | 70.90 | 64.50 | 34.30 | 98.80 | 1.57 | 108.35 |
| ac97_ctrl | 67.15 | 32.85 | 62.20 | 31.50 | 93.70 | 1.53 | 68.61 | 66.10 | 32.45 | 98.55 | 1.45 | 86.97 |
| aes_core | 66.15 | 33.85 | 45.40 | 24.30 | 69.70 | 1.35 | 115.73 | 65.75 | 33.10 | 98.85 | 1.35 | 162.98 |
| ethernet | 64.65 | 35.35 | 1.10 | 0.55 | 1.65 | 2.85 | 171.87 | 52.00 | 29.75 | 81.75 | 3.32 | 336.71 |
| vga_lcd | 69.05 | 30.95 | 0.25 | 0.15 | 0.40 | 2.34 | 903.82 | 54.93 | 27.13 | 82.06 | 1.77 | 1683.73 |
| Average | | | | | 69.09 | 1.86 | | | | 93.68 | 1.75 | |

Table II lists the ratios of the different injected defects and the diagnosis results. The first column denotes the benchmark circuits used in the experiments.
The second and third columns denote the ratios of failing chips injected with wire-to-VDD (WV)/wire-to-GND (WG) shorts and with wire-to-wire (WW) shorts among the 2000 failing chips, respectively. To demonstrate the effectiveness of $\sigma$-Iddq transformation, the proposed defect-syndrome matching is also applied directly to the original Iddq data. For each of the two wide column groups in Table II, representing original Iddq and $\sigma$-Iddq diagnosis, respectively, the percentage of correctly diagnosed WV/WG shorts, the percentage of correctly diagnosed WW shorts, the first-hit accuracy, the first-hit group size, and the runtime are listed. The first-hit accuracy (denoted by 1st-hit accu.) is the percentage of injected defects that appear among the first-hit candidates, and the first-hit group size (denoted by 1st-hit size) is the average number of first-hit candidates. Since the number of defect suspects is much larger than the number of Iddq test patterns, two or more defect suspects may receive the same score and are then ranked equally. As shown in Table II, the average first-hit accuracies using original Iddq and $\sigma$-Iddq data are 69.09% and 93.68%, respectively. Moreover, with $\sigma$-Iddq the average number of first-hit candidates is only 1.75, implying that in most cases only one or two defect suspects are reported.

#### V. CONCLUSIONS

Subject to process variation and design scaling, short defects are becoming more difficult for traditional diagnosis to identify. Therefore, a new approach called $\sigma$-Iddq diagnosis is proposed in this paper, consisting of two key components: (1) $\sigma$-Iddq transformation and (2) defect-syndrome matching. Experimental results show an average first-hit accuracy of 93.68% on ISCAS'89 and IWLS'05 benchmarks, with 1.75 first-hit candidates on average, in a 45nm technology.
Comparing diagnosis results on original Iddq and $\sigma$-Iddq, $\sigma$-Iddq diagnosis exhibits higher first-hit accuracy and fewer mismatches on defect syndromes, especially on large-scale circuits. Thus, $\sigma$-Iddq diagnosis not only successfully manifests the defect-induced leakage on failing chips, but also correctly identifies the defect type (wire-to-VDD, wire-to-GND, or wire-to-wire short) and its location. Differentiating transistor shorts from wire-to-VDD/wire-to-GND shorts is our next step to improve $\sigma$-Iddq diagnosis.

#### REFERENCES

- [1] A. Kun, R. Arnold, P. Heinrich, G. Maugard, H. Tang, and W.-T. Cheng, "Deterministic IDDQ diagnosis using a net activation based model," in *Proc. ITC*, pp. 1-10, 2011.
- [2] C.-M. Li and E. J. McCluskey, "Diagnosis of resistive-open and stuck-open defects in digital CMOS ICs," in *TCAD*, vol. 24, no. 11, pp. 1748-1759, 2005.
- [3] X. Yu and R. D. Blanton, "Estimating defect-type distributions through volume diagnosis and defect behavior attribution," in *Proc. ITC*, 2010.
- [4] D. B. Lavo, T. Larrabee, and J. E. Colburn, "Eliminating the Ouija board: automatic thresholds and probabilistic Iddq diagnosis," in *Proc. ITC*, pp. 1065-1072, 1999.
- [5] C. Thibeault and Y. Hariri, "CΔIddq: Improving current-based testing and diagnosis through modified test pattern generation," in *TVLSI*, vol. 19, no. 1, pp. 130-141, 2011.
- [6] C. L. Chang, C. C. Chang, H. L. Chan, and H. P. Wen, "An intelligent analysis of Iddq data for chip classification in very deep-submicron (VDSM) CMOS technology," in *Proc. ASPDAC*, 2012.
- [7] H. T. Vierhaus, W. Meyer, and U. Glaser, "CMOS bridges and resistive transistor faults: IDDQ versus delay effects," in *Proc. ITC*, pp. 83-91, 1993.
- [8] R. C. Aitken, "A comparison of defect models for fault location with Iddq measurements," in *Proc. ITC*, pp. 1051-1060, 1993.
- [9] P. Nigh and A. Gattiker, "Random and systematic defect analysis using Iddq signature analysis for understanding fails and guiding test decisions," in *Proc. ITC*, pp. 309-318, 2004.
- [10] C. Thibeault and L. Boisvert, "Diagnosis method based on ΔIddq probabilistic signatures: experimental results," in *Proc. ITC*, pp. 1019-1026, 1998.
- [11] H. Chang and S. S. Sapatnekar, "Prediction of leakage power under process uncertainties," *ACM Trans. Des. Autom. Electron. Syst.*, vol. 12, no. 2, p. 12, 2007.
- [12] H. F. Dadgour, L. Sheng-Chih, and K. Banerjee, "A statistical framework for estimation of full-chip leakage-power distribution under parameter variations," *Electron Devices, IEEE Transactions on*, vol. 54, no. 11, pp. 2930-2945, 2007.
- [13] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in *Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability*, University of California Press, pp. 281-297, 1967.
- [14] IWLS 2005, http://www.iwls.org/iwls2005/benchmarks.html.
- [15] FreePDK45, http://www.eda.ncsu.edu/.