# On-Chip Multi-Channel Waveform Monitoring for Diagnostics of Mixed-Signal VLSI Circuits

Koichiro Noguchi and Makoto Nagata

Department of Computer and Systems Engineering, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan

{noguchi,nagata}@cs26.scitec.kobe-u.ac.jp

# Abstract

A multi-channel waveform monitoring technique enhances the built-in test and diagnostic capability of mixed-signal VLSI circuits. An 8-channel prototype system incorporates adaptive sample time generation with a 10-bit variable step delay generator and algorithmic digitization with a 10-bit incremental reference voltage generator. The prototype, in a 0.18-$\mu$m CMOS technology, demonstrated on-chip waveform acquisition at 40-ps and 200-$\mu$V resolutions. The waveforms were as accurate as those obtained by an off-chip measurement technique, while a reduction of more than 95% in the measurement time of waveform monitoring was achieved. A single waveform acquisition kernel occupied an area of $700\,\mu m \times 600\,\mu m$ and was shared by 8 front-end modules of $60\,\mu m \times 200\,\mu m$ each. The developed on-chip multi-channel waveform monitoring technique is waveform accurate, area efficient, and low cost, all of which are requisite for a diagnostic methodology addressing mixed analog and digital signal integrity in the systems-on-a-chip era.

# 1 Introduction

Capturing signal waveforms on-chip is the most straightforward means of in-depth diagnostics in a mixed-signal integrated circuit. As the recent growth of systems-on-a-chip (SoC) markets forces ever more functionality and performance onto a single die, most chips are becoming mixed-signal. However, various physical issues influence dynamic circuit behavior in advanced SoCs, degrading chip performance or even leading to malfunction. Examples include spurs in analog-to-digital or digital-to-analog conversion, jitter in phase-locked loops, and unstable skew variations in clock distribution networks or among critical paths. Embedding signal monitoring circuitry within SoC chips provides runtime testing of circuit performance as part of built-in self-test mechanisms, as well as diagnostics of invasive factors such as noise, which is highly useful for calibration, validation, and improvement of the physical design flow. This approach will become common as integration capacity increases, since the die area occupied by the embedded circuitry can be traded off directly against the cost of mixed-signal test facilities and also indirectly against the effective use of electronic design automation (EDA) tools.

This paper proposes an architecture for on-chip waveform monitoring that realizes multi-channel, waveform-accurate probing with low area consumption and minimal off-chip measurement equipment. Enabling waveform acquisition at various locations on-chip extends diagnostics and testability beyond previous works. On-chip analog test signal generation and analog signal capturing for enhancing the testability of analog/mixed-signal circuits were discussed in [1]. For more diagnostic purposes, on-chip oscilloscope macros targeting wide-bandwidth signal capturing in high-speed digital signaling have appeared in [2][3]. In addition, on-chip measurements of dynamic power-supply and ground noise [4]-[9], signals [10], and clock jitter [11][12] have also been reported.

Section 2 describes the proposed architecture of on-chip waveform monitoring and discusses the flow and cost of measurements, followed by details of the enabling circuit techniques in Section 3. Section 4 presents the design and measurement results of a 0.18-$\mu$m CMOS prototype chip, and conclusions are given in Section 5.

# 2 Architecture

## 2.1 Overview

Figure 1 shows a chip structure embedding a multi-channel waveform monitoring function. Tiny probing front-end modules are located adjacent to each circuit block to be monitored, probe wiring is drawn from each module to the point of interest within the block, and every module shares a single waveform acquisition kernel located at the top. This architecture offers the flexibility of placing the waveform monitor modules into open spaces after the completion of the SoC layout, and therefore minimizes the cost of embedding.

Figure 1. Chip embedding on-chip multichannel probing.

In order to realize this chip structure, we have partitioned the entire monitor circuitry as shown in Figure 2. A probing front-end module (PFE) consists of a source follower (SF) and a latch comparator (LC). Probe wiring from a target signal of interest is input to SF, and its DC-level-shifted output voltage $V_{sf}$ is compared with the reference voltage $V_{ref}$ by LC, immediately within PFE, when the sampling clock $T_{ck}$ is given. The output of PFE, $D_{out}$, is therefore a single-bit binary digital stream. Another input signal to PFE is the bias voltage $V_{bsf}$ given to SF. The waveform acquisition kernel, on the other hand, includes a reference level generator (VG) supplying $V_{ref}$ from a 10-bit R-2R ladder, a sampling timing generator (TG) providing $T_{ck}$ from a 10-bit variable step delay generator, and a data processing unit (DPU) producing a series of 10-bit digital words from $D_{out}$.

This partitioning allows multiple PFEs to be connected to the single kernel by multiplexing only the digital signals, $T_{ck}$ and $D_{out}$, while distributing the common analog signals, $V_{ref}$ and $V_{bsf}$. In the conventional architecture of Figure 3, which consists of comparable circuit components, S/H signals are multiplexed at the input to an analog-to-digital converter (ADC) with a successive approximation register (SAR). In contrast, the proposed architecture eliminates the need to multiplex the analog signals of interest and therefore achieves multi-channel probing without degrading waveform acquisition performance. While one of the PFEs is activated during waveform acquisition, all the others are cut off from the power supply through logical control according to a state register. Another key difference from the SAR-ADC is the monotonically incremental DAC operation of VG, which also eliminates the feedback processing needed by SA operation. Finally, since our waveform acquisition is based on the sampling principle, an explicit sampling capacitor is equivalently replaced with repetitive comparison of $V_{sf}$ against $V_{ref}$ at LC with statistical processing, as will be discussed in the next section.

Figure 2. Architecture of on-chip multichannel probing.

Figure 3. Conventional architecture multiplexing inputs to ADC.

## 2.2 Flow and cost of waveform acquisition

Figure 4 shows the flow of waveform monitoring in the proposed architecture, which is based on the sampling principle. The flow consists of two nested loops, with the reference voltage loop nested inside the sample timing loop, and presumes that VG and TG generate incrementally stepped voltage and sampling timing, respectively. TG generates $T_{ck}$ for every negative edge of Mck, which is initially located at a certain cycle of the system clock (Sck), $T_{sck}$, as shown in Figure 5(a). VG then generates $V_{ref}$, and LC repetitively compares it with $V_{sf}$, that is, the probed voltage at this timing. After $2^{10}$ iterations, the average of the LC decisions is output from DPU as a probability, $P_{cmp}$, and $V_{ref}$ is then increased by one step. Once VG has gone through all of its 10-bit steps, the digitized $V_{sf}$ is determined as the code of $V_{ref}$ at the steepest slope of $\Delta P_{cmp} / \Delta V_{ref}$ in the comparator transition region. TG then advances $T_{ck}$ as shown in Figure 5(b), and the $V_{ref}$ loop runs again. Once TG has gone through all of its 10-bit steps, TG is reset and Mck, and thus the first edge from TG, is placed relative to the next cycle of Sck as shown in Figure 5(c); VG is also reset and resumes the $V_{ref}$ loop. The operation continues up to a user-specified loop count, and the signal waveform at the probe is then reproduced from the time-series data of the digitized $V_{sf}$.

Figure 4. Flow of waveform acquisition.

Figure 5. Timing generation.

One can also acquire an on-chip signal waveform by embedding a PFE alone, as in previous works [5][8], if the necessary input signals to PFE are provided from off-chip measurement equipment that is automatically controlled by a PC. However, since the total loop count in the proposed waveform-acquisition flow reaches the order of $10^9$, the accumulated access time from the PC to the measurement equipment is not negligible. Figure 6 roughly compares the total time of acquiring 1024 measurement points. When only the PFE is embedded, the time for the same simple incremental algorithm as in Figure 4 explodes; a successive-approximation-like algorithm reduces it to about 1/10, but more than 5 hours is still required. On the other hand, only the native time of 4 minutes remains when the whole system of Figure 2 is embedded and runs at 4 MHz.
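The order-of-$10^9$ loop count and the 4-minute native time quoted above can be checked with a short calculation. The sketch below assumes that one LC comparison is performed per clock cycle at 4 MHz and that no other overhead is counted; the function name and its parameters are illustrative and do not come from the prototype.

```python
def native_acquisition_time(n_points=1024, vref_bits=10,
                            n_iter=1 << 10, f_clk_hz=4e6):
    """Return (total comparisons, seconds) for one waveform acquisition.

    Assumption: one latch-comparator decision per clock cycle,
    with no additional control or data-transfer overhead.
    """
    # sample timing loop x reference voltage loop x repeated comparisons
    total_loops = n_points * (1 << vref_bits) * n_iter
    return total_loops, total_loops / f_clk_hz


loops, seconds = native_acquisition_time()
print(f"{loops:.3e} comparisons, {seconds / 60:.1f} minutes")
```

Under these assumptions the count is $2^{30} \approx 1.07 \times 10^9$ comparisons, which takes roughly 4.5 minutes at 4 MHz, in line with the estimate above.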



Figure 6. Estimated cost of waveform acquisition.



Figure 7. Probing front-end module, (a) Nchannel SF, (b) P-channel SF, (c) latch comparator.

This comparison is the clear motivation for developing on-chip multi-channel waveform monitoring, where the cost of measurement time can be traded off against that of die area.
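The dual-loop flow of Figure 4 can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the comparator is modeled as an ideal threshold with additive Gaussian noise, the reference steps are ideal, and all function names, parameters, and the noise model are hypothetical rather than taken from the prototype.

```python
import random


def comparator_decision(v_sf, v_ref, noise_sigma, rng):
    """One LC decision: 1 if the noisy probed voltage exceeds V_ref."""
    return 1 if v_sf + rng.gauss(0.0, noise_sigma) > v_ref else 0


def digitize_sample(v_sf, v_refm, v_refp, bits=10, n_iter=1024,
                    noise_sigma=1e-3, seed=0):
    """Reference voltage loop: return the V_ref code at the steepest
    change of P_cmp across the comparator transition region."""
    rng = random.Random(seed)
    n_codes = 1 << bits
    step = (v_refp - v_refm) / n_codes
    p_cmp = []
    for code in range(n_codes):
        v_ref = v_refm + code * step
        hits = sum(comparator_decision(v_sf, v_ref, noise_sigma, rng)
                   for _ in range(n_iter))
        p_cmp.append(hits / n_iter)  # average of LC decisions
    # |delta P_cmp| between adjacent codes approximates the slope
    slopes = [abs(p_cmp[k + 1] - p_cmp[k]) for k in range(n_codes - 1)]
    return max(range(n_codes - 1), key=slopes.__getitem__)


def acquire_waveform(signal, t_step, n_points, **kwargs):
    """Sample timing loop: digitize signal(t) at each T_ck step."""
    return [digitize_sample(signal(k * t_step), **kwargs)
            for k in range(n_points)]


# Small example with 6-bit steps so that it runs quickly.
code = digitize_sample(0.325, 0.0, 0.64, bits=6, n_iter=256,
                       noise_sigma=0.005)
print("digitized code:", code)
```

The model reduces the bit widths in the example only to keep the run time short; the structure of the two loops is the same as in the 10-bit flow described above.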

# 3 Circuit description

## 3.1 Probing front-end module

One of the two types of PFE shown in Figure 7, using either a P-channel or an N-channel SF, is selected for the signal of interest depending on the DC voltage level of the signal. The P-channel SF works for voltages ranging from below 0 V to $Vdd - Vth_p$, where Vdd and $Vth_p$ are the supply voltage and the threshold voltage of the P-type MOSFET, respectively. The N-channel SF, on the other hand, works for voltages ranging from $Vth_n$ to above Vdd, where $Vth_n$ represents the threshold voltage of the N-type MOSFET. AC small signals with a DC voltage at half Vdd, which is often equal to the bias point of analog circuits, can be covered by either the P-channel or the N-channel SF, while the full voltage range in a low-voltage digital circuit with Vdd of around 1.2 V can be sensed by a single P-channel SF based PFE using high-voltage MOSFETs, for instance, 3.3-V devices. The simple topology of Figure 7(c) was chosen for LC in order to minimize the area of PFE. The entire PFE was designed to retain a gain of 0 dB from DC to 1.0 GHz. It should be noted that unexpected interaction with the circuit under monitoring must be minimized in the design of an on-chip waveform monitoring circuit. The SF not only buffers the signal of interest to LC but also isolates the waveform monitoring circuit from the circuit being monitored. Therefore, the proposed PFE can be applied to monitoring various kinds of signals, or even noise, in a mixed-signal VLSI circuit.

Figure 8. Reference voltage generator. (a) schematic and (b) voltage steps.

Figure 9. Variable step delay generator.

## 3.2 Reference voltage generator

The reference voltage generator, VG, shown in Figure 8(a) consists of an R-2R ladder dividing the external reference DC voltages $V_{refp}$ and $V_{refm}$ and a 10-bit binary digital counter that increments its output value at every positive edge of DACinc. The switch network of the R-2R ladder is designed to generate a $V_{ref}$ that is linear in the output digital code of the 10-bit counter. Therefore, VG works as a 10-bit incremental digital-to-analog converter (DAC), as shown in Figure 8(b).
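As an idealized illustration of this linear relation, and assuming that the ladder output spans the range from $V_{refm}$ to $V_{refp}$ with equal steps (an assumption; the counter code $k$ is a symbol introduced here), the reference voltage can be written as:

$$V_{ref}(k) = V_{refm} + \frac{k}{2^{10}}\,\left(V_{refp} - V_{refm}\right), \qquad k = 0, 1, \ldots, 2^{10}-1$$

Under this assumption, each positive edge of DACinc raises $V_{ref}$ by one step of $(V_{refp} - V_{refm})/2^{10}$.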

## 3.3 Sampling timing generator

The most straightforward implementation of TG uses a voltage-controlled delay line (VCDL) for timing generation. However, the circuit size can easily explode as the number of bits of the delay step increases, and moreover, it can create significant environmental noise since all delay cells are activated at every input edge. These drawbacks cannot be mitigated even if interpolators between delay cells or a Vernier topology are applied. Therefore, we have developed the variable step delay generator (VSDG) shown in Figure 9, where the bias current, $I_b$, is divided by an integer, n, corresponding to a digital code, and thus the delay time, $T_{delay}$, follows:

$$T_{delay}(n) = n \times T_{delay}(0), \tag{1}$$

where $T_{delay}$ is measured from the negative edge of the input clock, Mck, to the positive edge of the output clock, $T_{ck}$, and $T_{delay}(0)$ is the minimum delay. Here, $I_b$ is regulated by a replica DLL that equates the maximum delay of $T_{delay}(2^m)$ to the clock period of the system clock, Sck.

Figure 11. Replica delay locked loop.
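Combining Equation (1) with the replica DLL condition gives the delay step directly. Writing $T_{ref}$ for the clock period to which the DLL locks (a symbol introduced here for illustration), the condition $T_{delay}(2^m) = T_{ref}$ yields:

$$T_{delay}(0) = \frac{T_{ref}}{2^{m}}, \qquad T_{delay}(n) = n \times \frac{T_{ref}}{2^{m}}$$

This is only a restatement under the idealized linear model of Equation (1); it shows why the sampling time steps scale with the clock period rather than with an absolute delay.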

Figures 10 and 11 show the actual implementations of the 10-bit VSDG and the replica DLL, respectively. VSDG is a current-mode circuit in which the 6-bit MSB delay step is finely tuned by injecting 4-bit LSB currents. The net maximum delay of $T_{delay}(2^{10})$ is defined by subtracting the offset delay $(T_{delay}^{offset})$ from the maximum delay $(T_{delay}^{max})$ of the 6-bit MSB delay steps, as shown in Figure 11, and is equated to the clock period of Sck by the replica DLL. Therefore, TG generates sampling times adaptively to the Sck of a target circuit. The proposed VSDG-based TG is very quiet, since only a single transition occurs for every timing generation, along with two transitions in the replica DLL for every phase adjustment, and is therefore suitable for on-chip waveform monitoring.



Figure 12. Die photo.



Figure 13. Measurement setup.

# 4 Experiments

## 4.1 Prototype chip design

The prototype chip shown in Figure 12 was developed using a 0.18-$\mu$m CMOS technology. Eight PFEs sharing a single waveform acquisition kernel probe various wirings within a programmable 24-bit digital shift register file (SR). Another identical waveform acquisition kernel is also included for evaluation. The occupied areas are 700 $\mu$m × 600 $\mu$m for the kernel and 60 $\mu$m × 200 $\mu$m for each PFE in a 2.8 mm × 2.8 mm chip. The kernel and PFEs were designed with 2.5-V I/O MOSFETs and located within a deep N-type well that was capacitively isolated from the common P-type substrate, whereas all the other circuits used 1.8-V CMOS devices in a conventional twin-well structure.

## 4.2 Measurement results

The measurement setup is shown in Figure 13. It consists of a logic analyzer (LA) fully controlled by a PC through a TCP/IP interface, power-supply and analog voltage sources, and the prototype chip mounted on a device-under-test (DUT) board. LA provides a digital vector to SR, the DLLinc and DACinc signals that force TG and VG, respectively, to increment their steps, and the clock signals Sck and Mck. It also collects the output from the prototype chip.

Figure 14. Acquired waveforms on (a) VDD and (b) GND wirings.

Figure 15. Actual cost of waveform acquisition.

Figure 14 shows waveforms on the power-supply and ground wirings in SR running at an Sck of 100 MHz. Two of the eight PFEs were used, and waveform acquisition using the on-chip kernel was compared with acquisition relying entirely on external equipment in the conventional way. The waveforms obtained with on-chip and off-chip signal generation are consistent, which confirms that the proposed architecture of Figure 2 was successfully implemented. In TG, the replica DLL was locked to Sck through a programmable divider with a factor of 4, thus at 25 MHz, and the 10-bit VSDG generated a delay step of roughly 40 ps. In VG, the center between $V_{refp}$ and $V_{refm}$ was set roughly at the shift voltage of SF and their voltage difference was 200 mV, so the 10-bit R-2R ladder created a voltage step of roughly 200 $\mu$V.
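The quoted resolutions follow from dividing the locked period and the reference span by the $2^{10}$ steps. A minimal check, assuming ideal equal steps (the function names are illustrative only):

```python
def delay_step_s(f_sck_hz, divider, bits=10):
    """Delay step when the replica DLL locks to Sck divided by `divider`."""
    t_lock = divider / f_sck_hz          # locked period in seconds
    return t_lock / (1 << bits)


def voltage_step_v(v_span, bits=10):
    """Voltage step of an ideal `bits`-bit ladder across v_span volts."""
    return v_span / (1 << bits)


t_step = delay_step_s(100e6, 4)          # Sck = 100 MHz, divider = 4
v_step = voltage_step_v(0.2)             # V_refp - V_refm = 200 mV
print(f"{t_step * 1e12:.1f} ps, {v_step * 1e6:.1f} uV")
```

With these inputs the ideal steps come out at about 39 ps and 195 $\mu$V, consistent with the rounded values of 40 ps and 200 $\mu$V.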

The actual time cost of the on-chip and off-chip signal generation with the SA algorithm for 1024-point waveform acquisition is summarized in Figure 15. It is larger than the estimate in Figure 6, mainly because of the additional time for data transfer from LA to the PC in this measurement setup. Nevertheless, a reduction of more than 95% in measurement time was achieved with the on-chip waveform monitoring system, as expected.

Figure 16. Results of long-time waveform acquisition.

Figure 16 shows an example of long-time waveform monitoring on the same power-supply and ground wirings. Here, the Mck input to VSDG was shifted by 40 ns relative to Sck whenever the 10-bit step delay generation completed, as shown in Figure 5, until a total of 24 clock cycles of Sck had been acquired. Each waveform includes 6144 data points, which is feasible only with on-chip signal generation in terms of measurement time. Clear 4-clock-cycle patterns are seen in the waveforms, resulting from the 4-bit repetitive pattern of "0011" written in SR.
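The 6144-point figure can be reproduced from the numbers above: each 10-bit timing sweep covers 40 ns, that is, 4 cycles of the 100-MHz Sck. The sketch below assumes that sweeps are contiguous and non-overlapping; the function name and argument names are hypothetical.

```python
def long_acquisition(n_sck_cycles, f_sck_hz=100e6, divider=4, bits=10):
    """Return (timing sweeps, data points, window in seconds)."""
    sweeps = n_sck_cycles // divider     # one sweep spans `divider` Sck cycles
    points = sweeps * (1 << bits)        # 2**bits sample times per sweep
    window_s = n_sck_cycles / f_sck_hz
    return sweeps, points, window_s


sweeps, points, window_s = long_acquisition(24)
print(sweeps, points, f"{window_s * 1e9:.0f} ns")
```

For 24 clock cycles this gives 6 sweeps and 6144 points over a 240-ns window.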

# 5 Conclusion

A multi-channel waveform monitoring technique was proposed and implemented with a 10-bit incremental reference voltage generator using an R-2R ladder and a 10-bit incremental sample timing generator incorporating a variable step delay generator. The prototype, in a 0.18-$\mu$m CMOS technology, demonstrated on-chip multi-point waveform acquisition at 40-ps and 200-$\mu$V resolutions, synchronously to a digital test circuit running at 100 MHz. The waveforms were as accurate as those obtained by an off-chip measurement technique, while a reduction of more than 95% in the measurement time of waveform monitoring was achieved. The results indicate that the developed technique is waveform accurate, area efficient, low cost, and capable of monitoring analog and digital signals, although only limited experiments on a digital test circuit were described. These factors are all requisite for a diagnostic methodology addressing mixed analog and digital signal integrity in the systems-on-a-chip era.

# Acknowledgments

This work is supported by the Semiconductor Technology Academic Research Center (STARC). The authors would like to thank Masaki Hirata, Shin'ichiro Azuma, Toshihiko Mori, Shiro Doushoh, and Hiroaki Ohkubo for helpful discussions. Test chips were fabricated by Hitachi Ltd. in the chip fabrication program provided by the VLSI Design and Education Center (VDEC) of the University of Tokyo, in collaboration with Dai Nippon Printing Corporation.

# References

[1] G. W. Roberts, "Improving the Testability of Mixed-Signal Integrated Circuits," in *Proc. Custom IC Conf.*, May 1997, pp. 214-221.

[2] Y. Zheng and K. Shepard, "On-Chip Oscilloscopes for Noninvasive Time-Domain Measurement of Waveforms in Digital Integrated Circuits," *IEEE Trans. VLSI Systems*, June 2003, pp. 336-344.

[3] M. Takamiya, M. Mizuno, and K. Nakamura, "An On-chip 100GHz-Sampling Rate 8-channel Sampling Oscilloscope with Embedded Sampling Clock Generator," in *ISSCC Dig. Tech. Papers*, Feb. 2002, pp. 182-183.

[4] K. M. Fukuda, T. Anbo, T. Tsukada, T. Matsuura, and M. Hotta, "Voltage-Comparator-Based Measurement of Equivalently Sampled Substrate Noise Waveforms in Mixed-Signal Integrated Circuits," *IEEE J. Solid-State Circuits*, Vol. 31, No. 5, pp. 726-731, May 1996.

[5] M. Nagata, J. Nagai, T. Morie, and A. Iwata, "Measurements and Analyses of Substrate Noise Waveform in Mixed-Signal IC Environment," *IEEE Trans. CAD of Integrated Circuits and Systems*, June 2000, pp. 671-678.

[6] M. Heijningen, J. Compiet, P. Wambacq, S. Donnay, M. G. E. Engels, and I. Bolsens, "Analysis and Experimental Verification of Digital Substrate Noise Generation for Epi-Type Substrates," *IEEE J. Solid-State Circuits*, July 2000, pp. 1002-1008.

[7] A. Muhtaroglu, G. Taylor, and T. Rahal-Arabi, "On-Die Droop Detector for Analog Sensing of Power Supply Noise," *IEEE J. Solid-State Circuits*, April 2004, pp. 651-660.

[8] K. Shimazaki, M. Nagata, T. Okumoto, S. Hirano, and H. Tsujikawa, "Dynamic Power-Supply and Well Noise Measurement and Analysis for High Frequency Body-Biased Circuits," in *Symp. on VLSI Circuits Dig. Tech. Papers*, pp. 94-97, June 2004.

[9] T. Okumoto, M. Nagata, and K. Taki, "A Built-in Technique for Probing Power-Supply Noise Distribution within Large-Scale Digital Integrated Circuits," in *Symp. on VLSI Circuits Dig. Tech. Papers*, pp. 98-101, June 2004.

[10] R. Ho, B. Amrutur, K. Mai, B. Wilburn, T. Mori, and M. Horowitz, "Applications of On-Chip Samplers for Test and Measurement of Integrated Circuits," in *Symp. VLSI Circuits Dig. Tech. Papers*, June 1998, pp. 138-139.

[11] R. Kuppuswamy, K. Callahan, K. Wong, D. Ratchen, and G. Taylor, "On-die Clock Jitter Detector for High Speed Microprocessors," in *Symp. VLSI Circuits Dig. Tech. Papers*, June 2001, pp. 187-190.

[12] M. Takamiya, H. Inohara, and M. Mizuno, "On-Chip Jitter-Spectrum-Analyzer for High-Speed Digital Designs," in *ISSCC Dig. Tech. Papers*, Feb. 2004, pp. 350-351.