# Feedback-Bus Oscillation Ring: A General Architecture for Delay Characterization and Test of Interconnects

Shi-Yu Huang Meng-Ting Tsai

Electrical Engineering Department, National Tsing Hua University, Taiwan

Kun-Han (Hans) Tsai Wu-Tung Cheng

Silicon Test Solutions, Mentor Graphics

*Abstract*—In this paper we propose a flexible delay characterization and test architecture, called Feedback-Bus Oscillation Ring (FB-OR), for die-to-die interconnects in a 3D IC. Compared to previous works, it is unique in its ability to streamline the characterization/test operations for a set of arbitrary interconnects with multiple pins spanning multiple dies. During the Design-for-Testability stage, one common feedback-bus (connected to all dies in the IC under characterization/test) is inserted. Through this feedback-bus, an oscillation ring can be formed dynamically and the Variable-Output-Threshold (VOT) technique can be applied to characterize the delay of one interconnect segment at a time. Experimental results indicate that this method is not only flexible and scalable, but also requires only a small area overhead.

## I. INTRODUCTION

In today's 3D-IC technology, several functional dies can be integrated through a combination of various mechanisms, ranging from Through Silicon Via (TSV) based die-stacking [11] and side-by-side placement on a silicon interposer [1] to Re-Distribution Layers (RDL) fabricated during Wafer-Level Post-Processing [19]. In some cases, these different types of interconnects could be mixed in a single 3D-IC, and they could be fabricated not only by wafer foundries but also by OSAT (Outsourced Semiconductor Assembly and Test) companies. As touted in [1][7][10][13], testing and/or characterization of these die-to-die interconnects is important to ensure the final overall quality of a 3D-IC, to keep track of process variation for yield improvement, and to support diagnosis for identifying the responsible party when a device fails.

Testing of TSVs or, more generally, of die-to-die interconnecting wires in interposers/RDL has been intensively studied [3][5][6][8][9][16][18][19]. The targeted faults are not confined to stuck-at faults but also include parametric faults (such as resistive open/bridging faults). Among these methods, the VOT-enhanced oscillation test [9][10] is a characterization-based test method, where VOT stands for Variable Output Threshold. In general, it approximates the delay of an interconnect in silicon using a Design-for-Testability circuit around the target interconnect, and the approximated delay is converted into a certain form of information (e.g., the *period* of a produced oscillation signal). Then, an ATE or an on-chip decision center post-processes the derived information using analytical techniques, such as test threshold checking or outlier analysis, to decide whether a parametric fault has occurred.
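The post-processing step mentioned above can be pictured with a minimal sketch. The paper does not specify which threshold or outlier technique is used, so the median-absolute-deviation rule, the function names, the factor `k`, and the measured values below are all hypothetical and serve only as an illustration.

```python
from statistics import median

def flag_over_threshold(periods_ps, limit_ps):
    """Test threshold checking: indices whose period exceeds a fixed limit."""
    return [i for i, p in enumerate(periods_ps) if p > limit_ps]

def flag_outliers(periods_ps, k=5.0):
    """Outlier analysis (assumed rule): indices deviating from the median
    by more than k times the median absolute deviation (MAD)."""
    med = median(periods_ps)
    mad = median(abs(p - med) for p in periods_ps)
    if mad == 0:
        # Degenerate case: every value that differs from the median is flagged.
        return [i for i, p in enumerate(periods_ps) if p != med]
    return [i for i, p in enumerate(periods_ps) if abs(p - med) > k * mad]

# Hypothetical oscillation periods (ps), one per interconnect under test.
measured = [1002.0, 998.0, 1001.0, 999.0, 1000.0, 1250.0]
suspects_by_limit = flag_over_threshold(measured, limit_ps=1100.0)
suspects_by_outlier = flag_outliers(measured)
```

With these made-up numbers, both checks single out the last interconnect, whose period is far larger than the rest.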

This VOT-enhanced oscillation test method [9][10] can be viewed as a *local-ring based method*. Each interconnect is assumed to have only two pins, the driver and the receiver. Two interconnects going in opposite directions (e.g., one from die 1 to die 2 and the other vice versa) form a local oscillation ring. When applied to a 3D-IC with a large number of heterogeneous interconnects, such an oscillation-ring forming strategy imposes two restrictions: (1) it is not suitable for multi-pin interconnects (i.e., interconnects having more than one receiver), and (2) it faces a "characterization results retrieval problem", as detailed below.

We also owe our gratitude to CIC, Taiwan, for their provision of EDA tools used in the experiments of this work.

A local ring produces an oscillation signal, denoted as RO\_clk. Since there might be a great number of local rings (each associated with a pair of interconnects) spread all over a 3D-IC, routing this great number of physically distributed RO\_clk signals into a centralized *Clock Period Measurement* (CPM) circuit is not trivial.

One way of retrieving the characterization results is to use a giant MUX tree that selects one RO\_clk signal as its output, which in turn drives the input of the CPM circuit. This type of congregation system is unfavorable because it is a global structure involving long, sprawling wires, as illustrated in Fig. 1(a), which complicates the physical design of the test insertion. Another option is to use a "shared RO\_clk channel" for all local rings. At any given time, the RO\_clk of one selected local ring drives the shared RO\_clk channel, while all other test wrappers only "relay" the incoming RO\_clk to the next test wrapper in the shared channel, as illustrated in Fig. 1(b). This type of retrieval system may seem preferable. However, it requires a sophisticated design and verification process for one subtle reason: the RO\_clk in the local-ring based test method has a very high frequency (e.g., 1 GHz), and its duty cycle is subject to change when it travels through a large number of relay elements. Our experience shows that, without paying attention to this issue, a 1 GHz clock signal could vanish altogether after passing through about 68 relay elements. As a result, this retrieval system may require rigorous timing verification if used to support a daisy-chain of a large number of test wrappers. Our goal in this work is to adopt a highly scalable retrieval system with a daisy-chain structure (for its simplicity in routing) that does not suffer from the pulse-vanishing problem.





(a) Type 1: MUX tree.

(b) Type 2: Shared RO\_clk channel.

Fig. 1: Two possible retrieving systems of oscillation signals, "RO\_clk", generated by local rings for delay characterization of interconnects.
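To give a feel for why a fast RO\_clk is fragile in a long relay chain, the following sketch uses a deliberately simplified model that is not taken from the paper: every relay element is assumed to shrink the high phase of the clock by the same fixed amount. The function name and the 7.4 ps per-stage shrink are hypothetical, chosen only so that a 1 GHz clock with 50% duty cycle vanishes after roughly the number of stages reported above.

```python
import math

def stages_until_vanish(freq_hz, shrink_per_stage_ps, duty=0.5):
    """Number of relay stages after which the high pulse width reaches zero,
    assuming (hypothetically) a constant pulse-width loss per stage."""
    period_ps = 1e12 / freq_hz
    width_ps = duty * period_ps
    return math.ceil(width_ps / shrink_per_stage_ps)

# Assumed per-stage duty-cycle distortion; not a measured value.
SHRINK_PS = 7.4
fast = stages_until_vanish(1e9, SHRINK_PS)   # 1 GHz clock, 500 ps pulse
slow = stages_until_vanish(1e8, SHRINK_PS)   # 100 MHz clock, 5000 ps pulse
```

Under the same assumed distortion, the slower clock tolerates about ten times as many relay stages, which is the intuition behind preferring a low-speed oscillation signal on a daisy-chained channel.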

In this work we introduce a new oscillation-ring forming strategy for the VOT-enhanced oscillation test method, called **Feedback-Bus Oscillation Ring (FB-OR)**. This new strategy overcomes the two restrictions mentioned previously (i.e., the multi-pin interconnect problem and the scalable characterization-result retrieval problem) without incurring any noticeable area overhead.

This work was supported in part by National Science Council (NSC) of Taiwan under grant NSC-103-2220-E-007-004.

The rest of this paper is organized as follows. In Section II, we provide some preliminary information. In Section III, we propose the Feedback-Bus Oscillation Ring architecture and its detailed DfT circuit, followed by a test access architecture and test flow. In Section IV, we present the experimental results, and in Section V we conclude.

## II. PRELIMINARIES

To make this paper self-contained, we briefly review the VOT-enhanced oscillation test method in the following. The ring oscillator (abbreviated as RO) has been used extensively as a vehicle for testing delay faults and for measuring the delay of a cell or a transmission line [5][8]. Based on this concept, an enhancement technique called the VOT scheme was proposed in [9] to increase its sensitivity and to facilitate small-delay testing of a TSV.

**Fig. 2** shows a basic test structure supporting the VOT analysis. A test unit is composed of two TSVs in opposite directions with the following features.



Fig. 2: The Ring Oscillator (RO) as a test structure around a pair of TSVs to support VOT analysis, proposed in [9].

(1) At the inputs of the two original drivers of the two TSVs, two multiplexers are added to allow for switching between the functional mode and the test mode.

(2) The circuit paths of the two TSVs are cascaded back-to-back to form a **Ring Oscillator (RO) in the test mode**.

(3) To make the test structure symmetric, two XOR gates are introduced. To cause oscillation in test mode, opposite values are applied to the side-inputs of the two XOR gates, i.e., "RO\_enable1" and "RO\_enable2".

(4) Finally, the output inverter of each of the two TSVs is converted into a so-called VOT inverter. **A VOT inverter can be switched from a normal inverter to a Schmitt-Trigger inverter**, depending on the value of its control signal, denoted as *Z* in the figure.

The main purpose of the VOT scheme is to **estimate the transition time** at the termination end of a TSV (since it is proportional to the TSV delay) **by measuring the oscillation periods under three configurations**. (1) Both VOT inverters are in the normal configuration, i.e., $\{Z_1, Z_2\} = \{0, 0\}$, producing an oscillation period denoted as $T_{REF}$; (2) TSV1's VOT inverter is switched to Schmitt-Trigger mode, i.e., $\{Z_1, Z_2\} = \{1, 0\}$, producing an oscillation period denoted as $T_{ST1}$; (3) TSV2's VOT inverter is switched to Schmitt-Trigger mode, i.e., $\{Z_1, Z_2\} = \{0, 1\}$, producing an oscillation period denoted as $T_{ST2}$. Analysis shows that $\Delta T_{ST1}$ (defined as $T_{ST1} - T_{REF}$) reflects the TSV delay across TSV1 in the presence of a delay fault, and $\Delta T_{ST2}$ (defined as $T_{ST2} - T_{REF}$) similarly reflects the TSV delay across TSV2. *This is due to a property that the hysteresis amount of a Schmitt-Trigger inverter (as compared to a normal inverter) is proportional to the transition time of its input signal.* In [9], it has been shown that there exists a strong linear relationship between the measurable $\Delta T_{ST}$ and its related TSV delay.
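The three-configuration measurement can be summarized in a short sketch. The period values below are hypothetical placeholders (in picoseconds), and the function name is ours; only the two subtractions come from the definitions above.

```python
def vot_deltas(periods):
    """Compute (delta_T_ST1, delta_T_ST2) from periods keyed by (Z1, Z2)."""
    t_ref = periods[(0, 0)]   # both VOT inverters in normal mode
    t_st1 = periods[(1, 0)]   # TSV1's inverter in Schmitt-Trigger mode
    t_st2 = periods[(0, 1)]   # TSV2's inverter in Schmitt-Trigger mode
    return t_st1 - t_ref, t_st2 - t_ref

# Hypothetical measured oscillation periods in ps, not taken from the paper.
measured = {(0, 0): 1000.0, (1, 0): 1080.0, (0, 1): 1045.0}
delta_st1, delta_st2 = vot_deltas(measured)
```

In this made-up case the larger difference for TSV1 would suggest a slower transition at the termination end of TSV1 than of TSV2.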

## III. PROPOSED METHOD

In this section, we discuss the architecture, the detailed DfT circuits, and the test flow.

#### A. Feedback-Bus Oscillation Ring

We use the following concepts and terminologies for the proposed feedback-bus oscillation ring.

(1) Each multi-pin interconnect is viewed as **multiple segments**, each representing the signal path from the driver to one receiver. Therefore, for an interconnect with one driver (X) and two receivers (R1 and R2), we characterize the delays of two segments, namely X-to-R1 and X-to-R2.

(2) Each segment under characterization is considered to run from a **pitching die** to a **catching die**. While that segment is being characterized, the common feedback-bus (FB) serves as a **return path** from the catching die back to the pitching die.



Fig. 3: Abstract Feedback-Bus Oscillation Ring (FB-OR).

(3) As shown in **Fig. 3**, there is only one **active global ring** in the entire 3D-IC at any given time, formed by four components: the target segment (from a pitcher to a catcher), the return path via the feedback-bus, and two relay paths, called the **pitching-side relay path (PS-relay)** and the **catching-side relay path (CS-relay)**. For a selected segment under characterization, the design-for-testability (DfT) circuits inserted at the drivers and the receivers of the interconnects jointly establish the required global ring. The details of these circuits will be discussed later. Unlike in the previous local-ring method, this global ring oscillates at a relatively low speed.

(4) There is a circuit called "Clock Period Measurement" (CPM), labeled as $\Omega$ in the figure. At any given time, the oscillation signal produced by the FB-OR can be observed at any die; thus, only one copy of this circuit is needed, and it can be allocated to any die (which is then designated as the master die).



Fig. 4: Illustration of feedback-bus oscillation ring for a segment.

(Example 1): Fig. 4 shows an example. There are four functional dies in the IC, numbered Die 1, 2, 3, and 4 in clockwise order. A 4-pin interconnect with one driver (X) and three receivers (Y1, Y2, Y3) is under consideration. At a particular moment, we wish to characterize the delay along the segment from X to Y1. In **Fig. 4**, Die 1 (containing pitcher X) is now the pitching die and Die 2 (containing catcher Y1) is now the catching die. Thus, in Die 1, a PS-relay path is formed from the output of the feedback bus to the current pitcher, while in Die 2, a CS-relay path is dynamically formed from the catcher to the driver of the feedback bus.
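The segment view and the four ring components of Example 1 can be sketched as a small data model. The class and function names are ours, and the die assignments of Y2 and Y3 (Die 3 and Die 4) are assumptions, since the text only states that Y1 resides in Die 2.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pin:
    name: str
    die: int

def segments(driver, receivers):
    """One segment per (driver, receiver) pair of a multi-pin interconnect."""
    return [(driver, r) for r in receivers]

def ring_components(pitcher, catcher):
    """The four parts of the active global ring for one selected segment."""
    return [
        f"PS-relay in Die {pitcher.die}: feedback-bus output -> {pitcher.name}",
        f"target segment: {pitcher.name} -> {catcher.name}",
        f"CS-relay in Die {catcher.die}: {catcher.name} -> feedback-bus driver",
        f"return path: feedback bus, Die {catcher.die} -> Die {pitcher.die}",
    ]

x = Pin("X", die=1)
# Die numbers for Y2 and Y3 are assumed for illustration only.
receivers = [Pin("Y1", die=2), Pin("Y2", die=3), Pin("Y3", die=4)]
all_segments = segments(x, receivers)
ring = ring_components(*all_segments[0])   # segment X -> Y1
```

Each of the three segments would be selected in turn, so only one such ring exists at any given time.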

#### B. Detailed Design-for-Testability Circuit

As a brief summary, we need to do the following in order to form the oscillation ring for a particular interconnect segment under characterization:

(1) We need to assign a pitching die and a catching die.

(2) We need to form the PS-relay path by routing the signal from the feedback bus to a designated pitcher (cell).

(3) We need to form the CS-relay path by routing the signal at the catcher (cell) to the driver of the feedback bus.

In this subsection, we discuss the detailed DfT circuit for achieving the above actions, adopting the following tactics:

(T1) For each die, we assume a status bit to indicate whether it is a designated catching die. A catching die is special as it will drive the feedback bus. For all the other dies (either the pitching die or not), their behaviors are identical (as they all listen to the bus) and thus we make no further distinction.

(T2) We use a special driver-cell to replace an original driver and a special receiver-cell to replace an original receiver, as shown in Fig. 5. The driver-cell is very simple and is obtained by adding a MUX before each original driver. In the test mode (i.e., when TM = '1'), the output signal from the feedback bus is broadcast to all driver-cells. In some sense, all driver-cells are active, as they are all driven. The inserted MUXes are connected in cascade to form the PS-relay path. On the other hand, the design of a receiver-cell is more involved. Our purpose is two-fold: first, we need to replace the original receiver with a VOT type of cell. Second, the signal received by the currently designated catcher cell is relayed through a sequence of cascaded MUXes to the input of the feedback bus. Since a catching die is the designated bus driver, this signal will ripple through the feedback bus and finally arrive at the pitching die to complete the oscillation ring. Except for the designated catcher (cell), the other receiver-cells do not play a role in the oscillation ring.



Fig. 5: Two types of test wrappers - driver-cell and receiver-cell.

(T3) To select one receiver-cell at a time as the designated catcher (cell), we incorporate a flip-flop (FF) at each receiver-cell to indicate whether it is the selected catcher (cell) or not. In the sequel, the value of this FF is called the **C-bit**. To control the values of the C-bits, we cascade them into a C-bit scan chain, as will be detailed later.

(T4) For simplicity, all VOT cells in the receiver-cells share the same global control signal, Z. When Z is '0', a VOT cell is in normal operation, producing a normal delay. On the other hand, when Z is '1', it is in Schmitt-Trigger operation (ST-operation), producing a longer delay. That is, if the oscillation period of the FB-OR is denoted as $T_{REF}$ when Z is '0' and as $T_{SLOW}$ when Z is '1', then $\Delta T = (T_{SLOW} - T_{REF})$ is a delay amount closely correlated to the delay across the target segment only.
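Tactics (T3) and (T4) can be illustrated with a small behavioral sketch. It is not the actual gate-level design: the exact MUX wiring, the scan direction, and all function names are assumptions made for illustration. Each receiver-cell is modeled as a 2-to-1 MUX that injects its own received value when its C-bit is 1 and otherwise forwards the value coming from the previous cell.

```python
def shift_c_bits(chain, scan_in):
    """One shift of the C-bit scan chain (new bit enters at position 0)."""
    return [scan_in] + chain[:-1]

def select_catcher(n_cells, index):
    """Shift in a one-hot pattern so that only cell `index` has C-bit = 1."""
    chain = [0] * n_cells
    for step in range(n_cells):
        # A bit shifted in at step s ends up at position n_cells - 1 - s.
        chain = shift_c_bits(chain, 1 if step == n_cells - 1 - index else 0)
    return chain

def cs_relay(received, c_bits, relay_in=0):
    """Cascaded MUXes: the designated catcher overrides the relayed signal."""
    signal = relay_in
    for value, c_bit in zip(received, c_bits):
        signal = value if c_bit else signal
    return signal

def delta_t(t_ref, t_slow):
    """Delay indicator of the selected segment: T_SLOW (Z=1) - T_REF (Z=0)."""
    return t_slow - t_ref

c_bits = select_catcher(n_cells=4, index=2)
relayed = cs_relay(received=[1, 1, 0, 1], c_bits=c_bits)
```

In this model, only the value seen by the cell whose C-bit is set reaches the feedback-bus driver, which matches the statement that the other receiver-cells play no role in the ring.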

## IV. EXPERIMENTAL RESULTS

In this section, we use SPICE simulation in a typical 90nm process technology to evaluate the proposed method.

#### A. Delay Characterization Ability

In **Fig. 6**, we show the correlation between $\Delta T$ and the interconnect delay, derived by sweeping the faulty resistance gradually from $1\,k\Omega$ to $100\,k\Omega$ in increments of $1\,k\Omega$. The interconnect wire length is assumed to be $1000 \ \mu m$, and the feedback-bus length is assumed to be 1 cm (or equivalently 10,000 $\mu m$).



Fig. 6: $\Delta T$ versus interconnect delay correlation (obtained by sweeping a faulty resistance from $1\,k\Omega$ to $100\,k\Omega$).

#### B. Area Overhead

| Type                             | Cell Name                            | Layout Area (µm²) |
|----------------------------------|--------------------------------------|-------------------|
| Basic Gates                      | INVERTER                             | 2.82              |
|                                  | BUFFER                               | 4.23              |
|                                  | MUX Cell                             | 8.47              |
|                                  | VOT Cell                             | 11.72             |
|                                  | FF Cell                              | 17.64             |
| Basic Macros                     | Driver-Cell                          | 12.7              |
|                                  | Receiver-Cell                        | 37.83             |
|                                  | Test Controller                      | 629.76            |
|                                  | CPM circuit                          | 1871.91           |
|                                  | IEEE-1149.1 Boundary Scan Cell (BSC) | 52.22             |
| Comparison to Boundary Scan Cell | Driver-cell is 24.3% (of BSC)        |                   |
|                                  | Receiver-cell is 72.4% (of BSC)      |                   |

Table 1: Area overhead breakdown.

**Table 1** shows the area overhead in a 90 nm CMOS process. The overall area overhead of our method (assuming that there are N interconnect segments) can be expressed as follows:

 $\text{Area} = (\text{Controller} + \text{CPM}) + N \times (\text{Driver-Cell} + \text{Receiver-Cell}) = 2501.67 + 50.53\,N \ \mu m^2$

The fixed part (i.e., the test controller plus the CPM circuit) becomes negligible when amortized over a large number of interconnects. For comparison, the area of an IEEE-1149.1 boundary scan cell is $52.22 \ \mu m^2$. The area overhead of one driver-cell is therefore 24.3% as normalized to a boundary scan cell, and that of a receiver-cell is 72.4%.
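The area expression and the normalized percentages can be reproduced directly from the Table 1 entries. The sketch below only restates that arithmetic; the function name and the sample value N = 1000 are ours.

```python
# Layout areas in square micrometers, copied from Table 1.
CONTROLLER = 629.76
CPM = 1871.91
DRIVER_CELL = 12.7
RECEIVER_CELL = 37.83
BSC = 52.22   # IEEE-1149.1 boundary scan cell

def total_area(n_segments):
    """Fixed part (controller + CPM) plus per-segment driver and receiver cells."""
    return (CONTROLLER + CPM) + n_segments * (DRIVER_CELL + RECEIVER_CELL)

driver_pct = round(100 * DRIVER_CELL / BSC, 1)
receiver_pct = round(100 * RECEIVER_CELL / BSC, 1)
area_1000 = total_area(1000)   # hypothetical design with 1000 segments
```

For the hypothetical N = 1000, the fixed part contributes 2501.67 of about 53,032 square micrometers, which illustrates how it is amortized as N grows.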

#### C. Test Time

The number of interconnect segments to be characterized and tested is denoted as N, and we assume a 10 MHz test clock, TCK, with a period of 100 ns. We also assume that each clock period measurement takes around 20 µs (or 200 TCK cycles). (Note that the CPM time depends on the CPM scheme adopted. The 200 TCK cycles used here are based on a CPM scheme we have developed to achieve a resolution of 10 ps; due to space limitations, the details are omitted.) The total test time is dominated by the measurement time for 2N oscillation periods (two per segment, namely $T_{REF}$ and $T_{SLOW}$), estimated as:

 $(2N) * (20) \ \mu s = 40N \ \mu s.$ 

For instance, the test time for an 8-die stack with 1024 TSVs passing through every die will be $40 \times (7 \times 1024) \ \mu s \approx 287 \ ms$, assuming that there are (8-1) = 7 receivers for each global TSV.

If a high-end ATE is used (which can achieve 10-ps resolution for period measurement), the oscillation signal can also be connected to an output pin for direct measurement. In that case, the test time could be further reduced.

## V. CONCLUSION

Delay characterization of die-to-die interconnects in a 3D-IC is valuable for detecting parametric faults that could jeopardize the speed of interconnects, and for monitoring the variation of interconnects fabricated through TSVs, interposers, or Re-Distribution Layers. For this problem, the VOT-enhanced oscillation test method has proven quite effective for two-pin interconnects. However, limitations exist when it is applied to a complex IC with a large number of heterogeneous multi-pin die-to-die interconnects. In this work, we have presented a more flexible architecture to which VOT-based delay characterization can be applied more easily. We demonstrated that a simple global feedback-bus, along with some driver-cells and receiver-cells, can assist the dynamic formation of an oscillation ring for any interconnect segment under consideration. Experimental results show that the area overhead is only 24.3% for our driver-cell and 72.4% for our receiver-cell, as normalized to the area of a boundary scan cell.

## REFERENCES

- [1] B. Banijamali, S. Ramalingam, K. Nagarajan, and R. Chaware, "Advanced Reliability Study of TSV Interposers and Interconnects for the 28nm Technology FPGA," *Proc. of IEEE Electronic Components and Technology Conf.*, pp. 285–290, 2011.
- [2] K. Chakrabarty, "TSV Defects and TSV-Induced Circuit Failures: The Third Dimension in Test and Design-for-Test", *Proc. of Int'l Reliability Physics Symp.* (IRPS), pp. 5F.1.1-5F.1.12, 2012.
- [3] C.-C. Chi, C.-W. Wu, M.-J. Wang, H.-C. Lin, "3D-IC Interconnect Test, Diagnosis, and Repair," *Proc. of IEEE VLSI Test Symp*, pp. 1-6, 2013.
- [4] M. Cho, C. Liu, D. H. Kim, S. K. Lim, and S. Mukhopadhyay, "Pre-Bond and Post-Bond Test and Signal Recovery Structure to Characterize and Repair TSV Defect Induced Signal Degradation in 3-D System," *IEEE Trans. on Components Packaging and Manufacturing Technology*, Vol. 1, No. 11, Nov. 2011.
- [5] S.-Y. Huang, J.-Y. Lee, K.-H. (Hans) Tsai, and W.-T. Cheng, "Pulse-Vanishing Test for Interposers Wires in 2.5-D IC", *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems* (TCAD), Vol. 33, No. 8, pp. 1258-1268, Aug. 2014.

- [6] Y. J. Huang, J.-F. Li, J.-J. Chen, D.-M. Kwai, Y.-F. Chou, and C.-W. Wu, "A Built-In Self-Test Scheme for the Post-Bond Test of TSVs in 3D ICs," *Proc. of IEEE VLSI Test Symp*, pp. 20-25, 2011.
- [7] H. Lee and K. Chakrabarty, "Test Challenges for 3-D Integrated Circuits," *IEEE Design and Test of Computers*, Vol. 25, No. 5, pp. 26-35, Sept.-Oct. 2009.
- [8] K. S.-M. Li, C. L. Lee, C. Su, and J. E. Chen, "Oscillation Ring Based Interconnect Test Scheme for SoC," *Proc. of IEEE Asia South Pacific Design Automation Conf. (ASP-DAC)*, pp. 184–187, 2005.
- [9] Y.-H. Lin, S.-Y. Huang, K.-H. Tsai, W.-T. Cheng, S. Sunter, Y.-F. Chou, and D.-M. Kwai, "Small Delay Testing for TSVs in 3D ICs," *IEEE Proc. of Design Automation Conf.*, June 2012.
- [10] L.-R. Huang, S.-Y. Huang, K.-H. Tsai, and W.-T. Cheng, "Parametric Fault Testing and Performance Characterization of Post-Bond Interposer Wires in 2.5-D ICs", *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems* (TCAD), Vol. 33, No. 3, pp. 476-488, March 2014.
- [11] U. Kang, H. Chung, S. Heo, D. Park, H. Lee, J. Kim, S. Ahn, S. Cha, J. Ahn, D. Kwon, et al., "8 GB 3-D DDR3 DRAM using Through-Silicon-Via Technology," *IEEE Journal of Solid-State Circuits*, Vol. 45, No. 1, pp. 111–119, 2010.
- [12] E. J. Marinissen, C.-C. Chi, J. Verbree, and M. Konijnenburg, "3D DfT Architecture for Pre-Bond and Post-Bond Testing," *Proc. of 3D Systems Integration Conf.*, pp. 1-8, 2010.
- [13] E. J. Marinissen, "Challenges and Emerging Solutions in Testing TSV-Based 2.5-D and 3D-Stacked ICs," *Proc. of IEEE Design, Automation, and Test in Europe Conf.*, pp. 1277-1282, 2012.
- [14] P. R. O'Brien and T. L. Savarino, "Modeling the Driving-Point Characteristic of Resistive Interconnect for Accurate Delay Estimation," *Proc. of Design Automation Conf.*, pp. 512–515, Nov. 1989.
- [15] M. Tsai, M. Klooz, A. Leonard, J. Appel, and P. Franzon, "Through Silicon Via (TSV) Defect/Pinhole Self Test Circuit for 3D-IC," *Proc. of Int'l Conf. on 3D System Integration*, pp. 28-30, Sept. 2009.
- [16] R. Wang, K. Chakrabarty, and S. Bhawmik, "At-Speed Interconnect Testing and Test-Path Optimization for 2.5D ICs," *Proc. of VLSI Test Symp.*, pp. 1-6, 2014.
- [17] S. H. Wu, D. Drmanac, and L.-C. Wang, "A Study of Outlier Analysis Techniques for Delay Testing," *Proc. of IEEE Int'l Test Conf.*, pp. 1-10, 2008.
- [18] F. Ye and K. Chakrabarty, "TSV Open Defects in 3D Integrated Circuits: Characterization, Test, and Optimal Spare Allocation", *Proc. of Design Automation Conf.*, pp. 1024-1030, June 2012.
- [19] S. W. Yoon, P. Tang, R. Emigh, Y. Lin, P. C. Marimuthu, and R. Pendse, "Fanout Flip chip eWLB (embedded Wafer Level Ball Grid Array) Technology as 2.5D Packaging Solutions", *Proc. of Electronic Components and Technology Conf.*, (ECTC), pp. 1855-1860, 2013.
- [20] J.-W. You, S.-Y. Huang, D.-M. Kwai, Y.-F. Chou, and C.-W. Wu, "Performance Characterization of TSV in 3D IC via Sensitivity Analysis," *Proc. of Asian Test Symposium* (ATS), Dec. 2010.
- [21] IEEE Computer Society, "IEEE Std 1149.1<sup>TM</sup>-2001, IEEE Standard Test Access Port and Boundary-Scan Architecture", IEEE, June, 2001.
- [22] "CIC Reference Flow for Cell-based IC Design", Chip Implementation Center, CIC, Taiwan, Document no. CIC-DSD-RD-08-01, 2008.