# **RTL Test Pattern Generation for High Quality Loosely Deterministic BIST**

M. B. Santos, J.M. Fernandes, I.C. Teixeira and J. P. Teixeira IST / INESC-ID, R. Alves Redol, 9, 1000-029 Lisboa, Portugal jct@inesc-id.pt

#### Abstract

High quality Built-In Self Test (BIST) needs to efficiently tackle the coverage of random-pattern-resistant (r.p.r.) defects. Several techniques have been proposed to cover r.p.r. faults at logic level, namely weighted pseudo-random and mixed-mode. In mixed-mode test pattern generation (TPG) techniques, deterministic tests are added to pseudo-random vectors to detect r.p.r. faults. Recently, an RTL mixed-mode TPG technique has been proposed to cover r.p.r. defects: the mask-based BIST technique. The purpose of this paper is to present mask-based BIST TPG improvements in two areas: RTL estimation of the test length to be used for each mask, in order to reach high Defects Coverage (DC), and the identification of an optimum mask for each set of nested RTL conditions. Results are used to predict the number of customized vectors for each mask of one ITC'99 benchmark module.

## 1. Introduction

New product development, based on SOC (System on a Chip) and IP (Intellectual Property) cores, requires, as much as possible, design and test *reuse* [1]. High design productivity drives the need for test preparation to be carried out as early as possible in the design flow, thus at RTL (Register Transfer Level) [2]. However, RT-level test patterns are not routinely reused for production test, since high-quality structural tests require detailed knowledge of the final structural implementation, only available after logic synthesis.

Cost-effective product development makes use of TRP (Test Resource Partitioning) between the product under development and the target ATE (Automatic Test Equipment), leading to an increased use of BIST (Built-In Self Test) techniques in today's SOCs and cores, with built-in sources (TPG, or Test Pattern Generators), TAM (Test Access Mechanisms) and sinks (Signature Analyzers).

High quality BIST needs to efficiently tackle the coverage of random-pattern-resistant (r.p.r.) defects. This is crucial, as low-cost built-in TPGs usually generate PR (Pseudo Random) vectors, through LFSR (Linear Feedback Shift Registers) or CA (Cellular Automata) [3,4]. PR tests usually require extremely long BIST sessions. Several techniques have been proposed to cover r.p.r. *faults* at logic level [3-5]. These techniques either modify the CUT (Circuit Under Test) by test point insertion, or generate weighted pseudo-random vectors, or introduce mixed-mode test generation. In mixed-mode TPG, deterministic tests are added to PR vectors to detect the r.p.r. faults, e.g., by reseeding an LFSR [6].

Test quality is usually measured by the FC (Fault Coverage) metrics, or more accurately by the DC (Defects Coverage) metrics [7]. FC is usually computed as the percentage of gate-level, single Line Stuck-At (LSA) faults detected by the test pattern. DC is computed as the *weighted* percentage of listed defects which are detected by the test pattern. The weighting factor is the relative likelihood of occurrence of the defects, dependent on process line statistics and defects critical area. One key attribute of the quality of the BIST solution is test length, as it directly impacts IP core test application time and the energy required to perform the BIST session [8].
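The difference between the two metrics can be illustrated with a minimal sketch. The defect names and weights below are hypothetical; in practice the weights would be derived from process line statistics and critical area extraction, and the function names are not taken from any tool mentioned in this paper.

```python
def fault_coverage(detected, total):
    """Unweighted FC: fraction of listed faults that are detected."""
    return len(detected) / total


def defect_coverage(weights, detected):
    """Weighted DC: detected likelihood mass over total likelihood mass.

    weights  -- dict mapping defect id -> relative likelihood of occurrence
    detected -- set of defect ids detected by the test pattern
    """
    total = sum(weights.values())
    hit = sum(w for d, w in weights.items() if d in detected)
    return hit / total


# Hypothetical defect list: two likely bridges and two unlikely opens.
weights = {"bridge_a": 4.0, "bridge_b": 3.0, "open_c": 2.0, "open_d": 1.0}
detected = {"bridge_a", "bridge_b"}

fc = fault_coverage(detected, len(weights))  # 2 of 4 items detected
dc = defect_coverage(weights, detected)      # (4 + 3) / 10
```

With these illustrative numbers the same test detects half of the listed items but covers a larger share of the likelihood-weighted defect population, which is why the two metrics can rank tests differently.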

Recently, a novel RTL mixed-mode technique has been proposed to cover r.p.r. defects, referred to as mask-based BIST [9]. The technique, reviewed in section 2, relies on the mask concept. For combinational modules, a mask is one partially defined input vector which activates a specific functionality that is hard to activate by PR vectors, as the input sub-space for which it is triggered is very limited. Reseeding techniques allow the coverage of hard to detect logic faults at given locations of the CUT's structure. For each hard fault, one seed is applied once. In contrast to reseeding techniques, the mask-based technique considers the RTL description and targets the coverage of hard to activate functionality. For each hard functionality, one mask is applied, the don't care bits are filled with PR values (generated by the built-in TPG), and a set of vectors is applied to fully scrutinize the underlying physical structure which implements it, thus uncovering hard defects with few test vectors (see Fig. 1).

The mask-based BIST technique exhibits several degrees of freedom, associated with the trade-offs among area overhead, speed degradation, energy consumption and test length. How many masks (and which masks) need to be generated? How many vectors per mask need to be applied, to fully exploit DC enhancement? How can such test length per mask be estimated at RTL? Does test length per mask depend on the corresponding functional complexity? The purpose of this paper is to discuss and present *mask-based BIST TPG improvements* that exploit the trade-off between total test length and the number of masks.



Figure 1: Mask-based TPG

The paper is organized as follows. In section 2 the mask-based BIST approach is reviewed. Test length dependency on functional complexity is analyzed in section 3. In section 4, test length of masked vectors is estimated. In section 5, multiple mask generation trade-offs are identified. Results are presented in section 6 for an ITC'99 benchmark module, showing high correlation between the estimated number of vectors and the useful test length computed using fault simulation on different structural implementations and fault lists. In section 7, the conclusions and directions for further work are described.

## 2. Mask-based BIST

Low-cost RTL fault simulation, using PR vectors, easily identifies parts of the system's functionality with low controllability and/or observability (referred to as dark corners). Such hard functionality can correspond to independent parts, or dependent ones, namely nesting of RTL conditions (IF, CASE, etc.) (Fig. 2). For the activation of a dark corner, a set of bits in the input word must be specified. Masks are partially specified input vectors, for which typically only a small subset of $m_i$ positional bits are specified ($m_i \ll l$, where $l$ is the word length of the input vector). Mask-based BIST is defined by customizing a PR test with *m* masks (each one forcing $m_i$ positional bits in each PR vector). As seen in Fig. 1, the result is a loosely deterministic BIST. Test length compression is obtained by applying $k_i$ consecutive customized vectors with each mask. Test length estimation, at RTL, is one of the goals of this work.
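The customization of PR vectors by masks can be sketched in software as follows. This is only an illustration under stated assumptions: the LFSR structure, tap positions, seed, mask values and the function names are hypothetical and are not taken from the hardware TPG of Fig. 1.

```python
def lfsr_words(seed, taps, width):
    """Yield successive states of a Fibonacci LFSR (seed must be non-zero)."""
    limit = (1 << width) - 1
    state = seed & limit
    while True:
        yield state
        feedback = 0
        for tap in taps:
            feedback ^= (state >> tap) & 1
        state = ((state << 1) | feedback) & limit


def apply_mask(word, care, value):
    """Force the bits selected by `care` to the corresponding bits of `value`;
    all other bits keep their pseudo-random content."""
    return (word & ~care) | (value & care)


def masked_session(seed, taps, width, k0, masks):
    """k0 plain PR vectors, then k_i customized vectors per (care, value, k_i)."""
    generator = lfsr_words(seed, taps, width)
    vectors = [next(generator) for _ in range(k0)]
    for care, value, k_i in masks:
        vectors += [apply_mask(next(generator), care, value) for _ in range(k_i)]
    return vectors


# Hypothetical 8-bit example: 4 PR vectors, then 3 vectors with bits 7..6 forced to "10".
vectors = masked_session(seed=0x5A, taps=(7, 5, 4, 3), width=8, k0=4,
                         masks=[(0b11000000, 0b10000000, 3)])
```

Each mask is represented here by a pair of words (`care`, `value`), so the $m_i$ forced positions are the set bits of `care`, and the remaining bits are still driven by the PR source.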

RTL test reuse for structural testing is possible due to the high correlation between an RT-level testability metric, the IFMB metric, and the logic-level DC metric. IFMB stands for *Implicit Functionality and Multiple Branch* coverage. As shown in [9], if adequate RTL fault models are derived, there is a strong correlation between *multiple* detection of RTL faults and single detection of likely physical defects. In our environment, physical defects are extracted from the IC layout, using a proprietary tool, lobs. Mixed-level fault simulation and metrics computation (IFMB, FC or DC) are carried out by a proprietary tool, VeriDOS [11]. Catastrophic defects (modifying circuit topology) are considered, both bridging and open defects.



Figure 2: Hard functionality identification

In order to generate high-DC RTL test patterns, it is assumed that r.p.r. defects are associated with the physical structure implementing the hard, seldom exercised functionality (dark corners). VeriDOS directly identifies in the RTL code the lines associated with the hard functionality.

Mask generation was first performed manually. An automated process is being implemented using BDDs (Binary Decision Diagrams), through a customized version of an academic tool for the parsing [12], and the SIS environment [13] for the BDD library and utility functions. The RTL ATPG tool for mask generation will be presented elsewhere.

The difficulty of controlling and/or observing parts of the system's functionality can be associated with the requirement of (1) a special sequence of vectors, in the case of sequential modules [14], or of (2) a specification of a number of bits ($m_i$) of the module inputs. The case of sequential modules is beyond the scope of this paper. A dark corner can occur, for example, due to a chain of IF statements, requiring *n* specific values in order to enable some action. If a mask is generated for the *most* restrictive IF condition (requiring the specification of *n* bits), only the functionality associated with this condition will be exercised and observed. What about the less restrictive conditions? Less restrictive does not mean easy to satisfy. In a previous work [15], the authors have shown that, when using a single mask per dark corner, the test length required for exercising its functionality is minimized when one mask is generated leaving p=n/2 bits intentionally unspecified (i.e., forcing the first tested n/2 bits).

As will be shown in section 5, when considering *multiple* mask generation for each dark corner, a trade-off exists between the test length required for exercising the complete functionality and the number of masks that are used.

## 3. Test Length and Dark Corners Complexity

Test length estimation began with the computation of useful $k_i$ values for different masks on a module (AGU control) of an ITC'99 benchmark circuit (CMUDSP [16]). Useful test vectors are the ones that increase DC; after a given number $k_i$, no gain in DC can be obtained with a given mask. For each mask, the useful $k_i$ values (obtained via fault simulation) have been compared with different test length estimation approaches, based on RTL information such as (1) the number of assignments, (2) the number of bits assigned inside the functionality covered by the mask, and (3) the number of conditions and type of operators. No significant correlation was identified between the computed useful $k_i$ values and the number of bits assigned, the number of assignments or the number of conditions. In this example, for which the functionality covered by the masks includes only logical operators, the $k_i$ values did not exhibit a relevant dependence on the type or number of these operators. However, if arithmetic or relational operators were present, high DC values would require *implicit* functionality coverage [12]. In this work we address the estimation of useful test length for masks that exercise basic functionality, such as assignments, IF and CASE conditions, shifts and logical operators.

In order to evaluate the independence of test length from dark corner complexity (with the restrictions specified above), two small, comprehensive examples were designed and used as test vehicles. These examples consist of 8 and 16 bit data path versions of a circuit whose functionality executes a different number of shifts on a data path depending on the number of conditions satisfied. Seven conditions were serialized, requiring a specific value on 8, 10, 12, 14, 16, 18 and 20 control bits. Seven masks were generated, each one satisfying only one target condition. After 2000 random vectors, each mask was used to customize 100 vectors, sequencing the masks from the least restrictive to the most restrictive. Fault simulation results, using LSA faults on the synthesized structures, for the 8 and the 16 bit circuits, are depicted in Fig. 3. The number of useful vectors, $k_i$, is defined as the number of the last customized vector that detects a fault. For each mask, this number is presented in Table 1. In this table no correlation is clear between the number of useful vectors and the data path width. Furthermore, a hypothetical correlation should not take into account the first two masks, as it is probable that, through the first 2000 PR vectors, the corresponding dark corners were already partially exercised, given the reduced number of defined bits in the associated conditions.



Figure 3: Fault simulation results with masks that target 7 dark corners in two similar circuits (8 and 16 bit data path).

| mask | 8 bits | 16 bits |
|------|--------|---------|
| 0    | 2      | 7       |
| 1    | 6      | 8       |
| 2    | 23     | 17      |
| 3    | 11     | 10      |
| 4    | 9      | 11      |
| 5    | 9      | 9       |
| 6    | 3      | 8       |

Table 1: Number of useful vectors for each mask.
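The definition of $k_i$ used for Table 1 can be written down as a small helper. This is a sketch only: it assumes the fault simulator reports, for each customized vector, how many previously undetected faults it detects, and the trace shown is hypothetical rather than simulation data from this work.

```python
def useful_vectors(new_detections):
    """Return k_i: the 1-based index of the last customized vector that
    detects at least one previously undetected fault (0 if none does)."""
    k_i = 0
    for index, count in enumerate(new_detections, start=1):
        if count > 0:
            k_i = index
    return k_i


# Hypothetical per-vector counts of newly detected faults for one mask.
trace = [3, 0, 1, 0, 0, 2, 0, 0, 0, 0]
k_i = useful_vectors(trace)  # the sixth vector is the last one that adds coverage
```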

## 4. Test Length Estimation

Consider that the most restrictive part of a given nested dark corner requires one unique combination of *n* primary input bits in order to be exercised and observed. If *n*-*p* bits in the mask are specified, a masked vector will have a probability of  $\frac{1}{2^p}$  to satisfy the most restrictive condition. The probability of achieving control and observation with  $k_i$  vectors is given by:

$$CL = 1 - \left(1 - \frac{1}{2^p}\right)^{k_i} \tag{1}$$

This equation can be solved to compute the number of vectors, $k_i$, required to satisfy an *n*-bit condition while defining only *n*-*p* bits, with a certain *confidence level*, *CL*:

$$k_i = \frac{\log(1 - CL)}{\log\left(\frac{2^p - 1}{2^p}\right)} \tag{2}$$

This is a very useful equation as it allows the estimation of the number of PR vectors that must be customized by a given mask. Please note that  $k_i$  only depends on the number of unspecified bits, p, that are relevant for the functionality that is to be exercised by the mask. Equation (2) is represented graphically in Fig. 4 for CL = 0.8. As shown, the rapid increase of  $k_i$  leads to a limitation in the number of unspecified bits on each mask for acceptable test lengths.
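Equation (2) is straightforward to evaluate numerically. The sketch below rounds the result up to the next integer, which is an assumption about how a non-integer $k_i$ would be used in practice; the function name is illustrative.

```python
import math


def vectors_for_confidence(p, cl):
    """Equation (2): vectors needed so that a condition on p unspecified bits
    is satisfied at least once with confidence level cl, rounded up."""
    if p < 1 or not 0.0 < cl < 1.0:
        raise ValueError("expected p >= 1 and 0 < cl < 1")
    miss = 1.0 - 1.0 / 2 ** p  # probability that one masked vector misses
    return math.ceil(math.log(1.0 - cl) / math.log(miss))


# Growth of k_i with the number of unspecified bits, for CL = 0.8.
growth = {p: vectors_for_confidence(p, 0.8) for p in range(1, 9)}
```

With CL = 0.7 the same function gives 19 vectors for p = 4 and 38 for p = 5, in line with the r=1 predictions reported in Table 3.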

Previous research [10] indicates that, at RT-level, multiple exercise of functionality is rewarding. Therefore, test length estimation should additionally require a given confidence level CL, so that each path is exercised r times. The best correlation with actual useful test lengths is presented in section 6 for different values of r.



Figure 4: Number of PR vectors required for satisfying a *p* bits condition with CL=0.8.



Figure 5: A chain of IF conditions leading to an independent dark corner *n* bits deep.

First consider r=2. The probability of achieving control and observation at least twice with $k_i$ vectors is given by (1) less the probability of single detection:

$$CL = 1 - \left(1 - \frac{1}{2^{p}}\right)^{k_{i}} - \frac{k_{i}!}{(k_{i} - 1)!\,1!} \left(\frac{1}{2^{p}}\right) \left(1 - \frac{1}{2^{p}}\right)^{k_{i} - 1} \tag{3}$$

since the probability $P_r$ of satisfying a condition with p bits, with $k_i$ vectors, **precisely** r times, is:

$$P_{r} = \frac{k_{i}!}{(k_{i} - r)!\,r!} \left(\frac{1}{2^{p}}\right)^{r} \left(1 - \frac{1}{2^{p}}\right)^{k_{i} - r} \tag{4}$$

The probability $CL_r$ of satisfying a condition with p bits **at least** r times, with $k_i$ vectors, is:

$$CL_{r} = CL_{r-1} - \frac{k_{i}!}{(k_{i} - r + 1)!\,(r - 1)!} \left(\frac{1}{2^{p}}\right)^{r-1} \left(1 - \frac{1}{2^{p}}\right)^{k_{i} - r + 1} \tag{5}$$

where $CL_0 = 1$, so that $CL_1$ reduces to (1).

Expression (5) is recursive and can easily be expanded for the small values of r that are of interest. However, $k_i$ can only be estimated numerically. Table 2 presents the computed values of $k_i$ for $CL_r=0.7$.

| p | r=2 | r=3 | r=4 | r=5 |
|---|-----|-----|-----|-----|
| 1 | 4   | 6   | 8   | 10  |
| 2 | 9   | 13  | 18  | 22  |
| 3 | 19  | 28  | 37  | 46  |
| 4 | 38  | 57  | 75  | 93  |
| 5 | 77  | 115 | 151 | 187 |

Table 2: Number of vectors to achieve confidence level CL=0.7 that a condition with p bits is satisfied at least r times.
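The numerical estimation can be sketched with a direct search over $k_i$, using the closed form of (5) as one minus the sum of (4) over fewer than r occurrences. The search returns the smallest integer $k_i$ that reaches the requested confidence level. The rounding convention behind Table 2 is not stated, so this strict criterion is an assumption and can exceed the tabulated values by one: for p = 1 and r = 2 it returns 5 where Table 2 lists 4, since four vectors give 11/16, slightly below 0.7.

```python
from math import comb


def prob_at_least(r, k, p):
    """Probability that k masked vectors satisfy a p-bit condition at least
    r times: one minus the sum of equation (4) over 0..r-1 occurrences."""
    q = 1.0 / 2 ** p
    below = sum(comb(k, j) * q ** j * (1.0 - q) ** (k - j) for j in range(r))
    return 1.0 - below


def min_vectors(r, p, cl, k_max=100000):
    """Smallest k such that prob_at_least(r, k, p) >= cl (linear search)."""
    for k in range(r, k_max + 1):
        if prob_at_least(r, k, p) >= cl:
            return k
    raise ValueError("confidence level not reached within k_max vectors")


# Estimates for CL_r = 0.7 over the same range of p and r as Table 2.
estimates = {(p, r): min_vectors(r, p, 0.7)
             for p in range(1, 6) for r in range(2, 6)}
```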

## 5. Multiple Masks Optimization

For an *n* bits deep dark corner such as the one represented in Figure 5, a deterministic test able to exercise all the branches would require 2n vectors, i.e., two vectors for each condition, for the TRUE / FALSE branches. However, this deterministic test, prepared at RT-level, could exhibit poor FC of LSA faults and low DC values, as shown in [10]. Leaving some bits of the test vectors undefined and exercising them randomly leads to a less biased test and, thus, to higher coverage of non-target faults at the lower abstraction levels. A test using *p* unspecified bits per mask exercises *p* branches of binary conditions. In order to cover all the *n* branches, *m* masks are required:

$$m = \frac{n}{p} - 1 \tag{6}$$

Using (2), a global number of vectors  $k_m$ :

$$k_m = m \times \frac{\log(1 - CL)}{\log\left(\frac{2^p - 1}{2^p}\right)} \tag{7}$$

is required, corresponding to the application of the *m* masks with the confidence level *CL* and not taking into account the number of unmasked vectors that will be common to all dark corners. $k_m$ is represented in Figure 6 for *CL*=0.8 and n=10, 20 and 40 using equations (6) and (7), where the best choices are marked with circles. From Figure 6 it is clear that, for *n*=10, the best trade-off between *m* and the test length is achieved with *m*=1, *p*=5. For *n*=20 and *n*=40 the best choices are *m*=4, *p*=4 and *m*=9, *p*=4. Overall, the best trade-offs were achieved for *p*=4.



Figure 6: Number of vectors required for achieving CL=0.8 using *m* masks on a dark corner with n=10, 20 and 40.
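The trade-off plotted in Figure 6 can be tabulated directly from equations (6) and (7). The sketch below only enumerates the values of p that divide n and leaves the choice of the best compromise to the reader, as the selection criterion used for the circled points is not formalized in the text; the function names are illustrative.

```python
import math


def masks_needed(n, p):
    """Equation (6): number of masks for an n bits deep dark corner."""
    return n / p - 1


def total_vectors(n, p, cl):
    """Equation (7): customized vectors over all masks (not rounded)."""
    per_mask = math.log(1.0 - cl) / math.log(1.0 - 1.0 / 2 ** p)
    return masks_needed(n, p) * per_mask


def trade_off(n, cl):
    """List (p, m, k_m) for every p that divides n and needs at least one mask."""
    return [(p, masks_needed(n, p), total_vectors(n, p, cl))
            for p in range(1, n) if n % p == 0]


for p, m, k_m in trade_off(20, 0.8):
    print(f"p={p:2d}  m={m:4.0f}  k_m={k_m:8.1f}")
```

For n=20 the listing shows the two opposing trends: small p keeps the total test length short but needs many masks, while large p needs few masks at the cost of a rapidly growing $k_m$.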

## 6. Results

The AGU control (AGU\_ctr) module was synthesized in CMOS with two different optimizations: for minimal area and for maximum speed. Both structures were fault simulated for LSA faults and defects. In each simulation  $k_0=2000$  random vectors were applied, followed by  $k_i=2000$  customized vectors for each mask. The results are presented in Table 3. The columns of Table 3 represent:

- av\_bl: the average (for both optimizations and fault lists) number of useful vectors (that detect faults) for each mask, excluding the last useful vector.
- av\_l: the average number of useful vectors for each mask.
- *p*: the number of relevant unspecified bits (from RTL code analysis).
- r=1..4: the predicted number of vectors to ensure, with CL=0.7, that a condition with p bits is satisfied r times with random vectors.

The values of the predictions were limited to the interval $k_i \in [2, 200]$. In section 5, an optimum value of p=4 was identified. In Table 3, not all the masks use this value, since when a dark corner contains a unique condition of *n* bits, there is no advantage in defining fewer than *n* bits. The optimization of section 5 is valid for sequences of conditions. Table 3 shows that:

- when using p values near to 4, the number of useful vectors per mask is well estimated as in section 4;
- the av\_bl values are well estimated using r=1;
- the complete useful number of vectors (av\_l) is better estimated when requiring multiple occurrences of combinations of p bits.

| mask                   | av_bl | av_l | p | r=1  | r=2  | r=3  | r=4  |
|------------------------|-------|------|---|------|------|------|------|
| 0                      | 12    | 20   | 4 | 19   | 38   | 57   | 75   |
| 1                      | 7     | 11   | 4 | 19   | 38   | 57   | 75   |
| 2                      | 19    | 129  | 5 | 38   | 77   | 115  | 151  |
| 3                      | 4     | 18   | 2 | 5    | 9    | 13   | 18   |
| 4                      | 9     | 19   | 1 | 2    | 4    | 6    | 8    |
| 5                      | 8     | 32   | 3 | 10   | 19   | 28   | 37   |
| 6                      | 149   | 194  | 5 | 38   | 77   | 115  | 151  |
| 7                      | 320   | 330  | 8 | 200  | 200  | 200  | 200  |
| 8                      | 4     | 17   | 4 | 19   | 38   | 57   | 75   |
| 9                      | 14    | 23   | 3 | 10   | 19   | 28   | 37   |
| 10                     | 73    | 76   | 5 | 38   | 77   | 115  | 151  |
| correlation with av_bl |       |      |   | 0.94 | 0.93 | 0.86 | 0.76 |
| correlation with av_l  |       |      |   | 0.90 | 0.94 | 0.91 | 0.84 |

Table 3: Fault simulation results (av\_bl and av\_l), mask test length predictions (r=1..4) and correlations (AGU\_ctr).
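The correlation rows of Table 3 can be checked from the tabulated columns. The sketch below assumes that the reported values are Pearson correlation coefficients, which the text does not state explicitly; the data are the av\_bl and r=1 columns of Table 3.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Columns av_bl and r=1 of Table 3, masks 0 to 10.
av_bl = [12, 7, 19, 4, 9, 8, 149, 320, 4, 14, 73]
pred_r1 = [19, 19, 38, 5, 2, 10, 38, 200, 19, 10, 38]

corr = pearson(av_bl, pred_r1)
print(f"correlation of av_bl with r=1 predictions: {corr:.2f}")
```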

Figure 7 identifies the AGU\_ctr module bits that are tested by conditions when this module is exercised with vectors customized with masks 7, 9 and 10. The bus referred to in this figure as pdb2 is a 24-bit control input of the module. Figure 7 illustrates how the values of p are computed:

- when mask 9 is forced, three bits are left unspecified (pdb2[4:2]) and thus p=3;
- when mask 10 is forced, there are 5 unspecified bits, four of them indicated in the figure (pdb2[15] and pdb2[4:2]) and an additional one (to raise the probability of exercising one functionality not illustrated in the figure);
- when mask 7 is forced, the most restrictive condition (leading to the functionality exercised by mask 10) tests p=8 bits. This value of p was intentionally left significantly higher than the optimum (near 4), in order to be able to evaluate, using fault simulation, the associated costs.
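The counting rule illustrated above can be expressed as a short helper: p is the number of bits tested by the conditions that the mask does not force. The bit positions in the example are hypothetical and do not reproduce Figure 7; only the pattern of three free bits in positions 4..2 mirrors the mask 9 case.

```python
def count_relevant_unspecified(tested_bits, care_mask):
    """Count condition bits that the mask leaves to the PR generator.

    tested_bits -- iterable of input bit positions tested by the conditions
    care_mask   -- integer whose set bits are the positions forced by the mask
    """
    return sum(1 for bit in set(tested_bits) if not (care_mask >> bit) & 1)


# Hypothetical example on a 24-bit bus: the conditions test bits 23..20 and
# 4..2, and the mask forces bits 23..20 only, leaving bits 4..2 unspecified.
tested = [23, 22, 21, 20, 4, 3, 2]
care = 0b1111 << 20
p = count_relevant_unspecified(tested, care)
```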



Figure 7: Bits tested by conditions covered by masks 7, 9 and 10 (AGU\_ctr).

## 7. Conclusions

High-quality BIST requires the coverage of r.p.r. defects. In this paper, an optimization procedure for the RTL loosely deterministic mask-based BIST pattern generation technique has been proposed. The procedure drives the generation of the optimal number of masks for each dark corner, and leads to a good estimate of the length $k_i$ of each customized PR sequence obtained with each mask. It was shown that $k_i$ is not dependent on the functional complexity, but only on the number of relevant unspecified bits in the mask. Moreover, it was shown that, given a set of branches in the binary form of the CFG of the RTL system description, requiring the specification of n bits, the optimum number of masks is n/4-1, leaving p=4 unspecified bits per mask. The best estimation for the test length for each mask, associated with useful test vectors, $k_i$, estimated at RT-level, was obtained using equation (7), for CL of at least 70% and a multiple detection of RTL faults r=2. Results were used to predict the number of customized vectors for each mask of the AGU\_ctr ITC'99 benchmark module.

One of the main limitations of the *implemented* mask-based BIST approach is that it focuses on combinational modules. A present additional limitation is the effort required for manual mask generation. Automatic mask generation is required in order to insert this approach in the design flow. Future work will address these issues.

#### References

- [1] Y. Zorian, E. Marinissen, S. Dye, "Testing Embedded-Core Based System Chips", Proc. Int. Test Conf. (ITC), pp. 130-143, 1998.
- [2] Mahesh A. Iyer, moderator org., "High Time for High-Level ATPG", Panel 1 session, Proc. Int. Test Conf. (ITC), pp. 1113-1119, 1999.
- [3] Michael L. Bushnell, Vishwani D. Agrawal, "Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits", Kluwer Academic Publishers, 2000.
- [4] R. David, "Random Testing of Digital Circuits Theory and Application", Marcel Dekker, Inc., 1998.
- [5] N.A. Touba, E.J. McCluskey, "Altering a Pseudo-Random Bit Sequence for Scan-Based BIST", Proc. Int. Test Conf. (ITC), pp. 167-175, 1996.
- [6] J. Rajski, J. Tyszer, N. Zacharia, "Test Data Compression for Multiple Scan Designs with Boundary Scan", IEEE Trans. on Computers, vol. 47, no. 11, pp. 1188-1200.
- [7] J.J.T. Sousa, F.M. Gonçalves, J.P. Teixeira, C. Marzocca, F. Corsi, T.W. Williams, "Defect Level Evaluation in an IC Design Environment", IEEE Trans. on CAD, vol. 15, no. 10, pp. 1286-1293, 1996.
- [8] M.B. Santos, I.C. Teixeira, J.P. Teixeira, S. Manich, R. Rodriguez and J. Figueras, "RT-Level Preparation of High-Quality / Low Energy / Low-Power BIST", Proc. Int. Test Conf. (ITC), pp. 814-823, 2002.
- [9] M.B. Santos, F.M. Gonçalves, I.C. Teixeira and J.P. Teixeira, "RTL-Based Functional Test Generation for High Defects Coverage in Digital Systems", Journal of Electronic Testing, Theory and Application (JETTA), vol. 17, no. 3/4, pp. 311-319, Kluwer, June/August 2001.
- [10] M.B. Santos, F.M. Gonçalves, I.C. Teixeira and J.P. Teixeira, "Implicit Functionality and Multiple Branch Coverage (IFMB): a Testability Metric for RT-Level", Proc. Int. Test Conf. (ITC), pp. 377-385, 2001.
- [11] M.B. Santos, F.M. Gonçalves, I.C. Teixeira and J.P. Teixeira, "Defect-Oriented Verilog Fault Simulation of SoC Macros using a Stratified Fault Sampling Technique", Proc. IEEE VLSI Test Symp. (VTS), pp. 326-332, 1999.
- [12] S.-T. Cheng, R.K. Brayton, "Compiling Verilog into Automata", Dept. of Electr. Eng. and Computer Science, University of California, Berkeley, May 1994.
- [13] SIS: A System for Sequential Circuits Synthesis, Electronics Research Laboratory, Memorandum No. UCB/ERL M92/41, Dept. EECS, University of California, Berkeley, May 1992.
- [14] S. Tasiran, F. Fallah, D. Chinnery, S. Weber, K. Keutzer, "A Functional Validation Technique: Biased-Random Simulation Guided by Observability-Based Coverage", Proc. 2001 IEEE Int. Conference on Computer Design: VLSI in Computers & Processors (ICCD), pp. 82-88, 2001.
- [15] M.B. Santos, I.C. Teixeira and J.P. Teixeira, European Test Workshop (ETW) Informal Digest, 2002.
- [16] CMUDSP benchmark (I 99 5, ITC 99 5), http://www.ece.cmu.edu/~lowpower/benchmarks.html