# Crosstalk Avoidance Codes for 3D VLSI 

Rajeev Kumar Sunil P. Khatri<br>Department of ECE, Texas A\&M University, College Station TX 77843


#### Abstract

In 3D VLSI, through-silicon vias (TSVs) are relatively large, and closely spaced. This results in a situation in which noise on one or more TSVs may deteriorate the delay and signal integrity of neighboring TSVs. In this paper, we first quantify the parasitics in contemporary TSVs, and then come up with a classification of crosstalk sequences as $0 \mathrm{C}, 1 \mathrm{C}, \ldots 8 \mathrm{C}$ sequences. Next, we present inductive approaches to quantify the exact overhead for 8C, 6C and 4C crosstalk avoidance codes (CACs) for a $3 \times n$ mesh arrangement of TSVs. These overheads for different CACs for a $3 \times n$ mesh arrangement of TSVs are used to calculate the lower bounds on the corresponding overheads for an $n \times n$ mesh arrangements of TSVs. We also discuss an efficient way to implement the coding and decoding (CODEC) circuitry for limiting the maximum crosstalk to 6C. Our experimental results show that for a TSV mesh arrangement driven by inverters implemented in a 22 nm technology, the coding based approaches yields improvements which are in line with the theoretical predictions.


## I. Introduction and Previous Work

Through Silicon Via (TSV) based 3D stacking technology has garnered a lot of interest from academia and industry over the past few years. It is expected to alleviate the problem of long interconnects in 2D VLSI ICs by providing another dimension for logic to communicate along. It also allows the possibility of stacking ICs implemented in different technologies, to create heterogeneous 3D structures. As a result, much research is being conducted in all aspects of the 3D IC technology. The works of [1] [2] [3] [4] are targeted towards developing TSV manufacturing and die-todie as well as wafer-to-die bonding technologies. In the physical design space, placement algorithms for TSV based 3D ICs are proposed in [5] [6]. Several research efforts have been devoted to the 3D routing problem as well [7][8]. Solutions for building 3D clock tress, and testing individual die before bonding are proposed in [9], [10].
The use of TSVs results in a significant capacitive coupling between adjacent TSVs owing to their large dimensions and tight spacing. This causes significant pattern-dependent noise artefacts. This crosstalk noise can affect chip timing by slowing down transitions on a switching signal when a large number of its neighbors perform opposite transitions. It can also cause functionality issues by causing large glitches on a static signal when its aggressing neighbors transition. Analytical expressions for parasitics, including TSV-to-TSV coupling capacitances for simple arrangements of TSVs are presented in [11] [12]. However, very little work has been done to provide systematic solutions for reducing crosstalk among TSVs. The study done in [13] shows that crosstalk between TSVs in a full 3D chip is significant, and the authors propose shielding and buffer insertion based solutions to mitigate the problem. The shielding based solution is employed in a post-placement step, to find the candidate TSVs for shielding. Buffer insertion is performed in a post route environment, when accurate timing information is available. They also show that spacing TSVs apart is an inefficient solution for reducing coupling noise. The solutions proposed in [13] are applied late in the design flow (postplacement, post-routing), and work on a case by case basis to reduce capacitive coupling. No theoretical bounds are provided in [13] for the area overhead of shielding or the maximum capacitive coupling on any TSV after applying these solutions. In contrast, in this paper, we discuss correct-by-construction crosstalk canceling codes for a regular 2D mesh of TSVs, to guarantee an upper bound on the amount of capacitive coupling observed by any victim TSV. This is particularly relevant in the scenario where a regular 2D mesh of TSVs connects one die to another die. One example of such a scenario is when the core logic and caches of a microprocessor are implemented on different dies, and a dense 2D mesh of TSVs is used to interconnect them.
The main contributions of this work are:

- We classify crosstalk patterns into different classes based on the maximum crosstalk they incur ( $8 \mathrm{C}, 7 \mathrm{C}, 6 \mathrm{C}, 5 \mathrm{C} \ldots 0 \mathrm{C}$ ).
- We present exact overhead of codes to reduce inter-TSV crosstalk for a $3 \times n$ mesh of TSVs, and derive lower bounds on the overheads associated with codes for an $n \times n$ mesh of TSVs.
- We show that by extending the 1 D crosstalk avoidance codes proposed in [14][15], we can come up with an efficient 2D code which reduces the maximum crosstalk by 25 percent, with an encoder and decoder of quadratic complexity in circuit area and an asymptotic overhead (in terms of number of additional bits) of $\sim 40 \%$. We also demonstrate the improvement in delay of our $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C codes over an uncoded scenario, by performing HSPICE [16] simulations using 22nm PTM [17] transistor models.


## II. Approach

In this section, we start with a discussion of our TSV parasitic modeling experiments. This is followed by a classification of data patterns on a 2 D mesh of TSVs into multiple classes, based on the maximum crosstalk that is experienced by any TSV in the mesh. Then, we derive mathematical bounds (exact values for a $3 \times n$ mesh, and lower bounds for an $n \times n$ mesh) on the Crosstalk Avoidance Code (CAC) overhead for each class. We end by presenting a technique to implement an efficient CODEC for a 6C CAC.

## A. Parasitic Modeling of TSVs

Several analytical models for resistance, capacitance and inductance of TSVs have been proposed in the literature. For our work, we performed our own parasitic extraction using the 3D parasitic extraction tool Raphael. Figure 1 shows the dimensions of each TSV in a $3 \times 3$ fragment of the $9 \times 9$ TSV mesh we used for parasitic extraction. Based on the TSV dimensions projected for the 2015-2018 timeframe in the 2011 edition of the ITRS [18], TSV diameter, TSV height and TSV-to-TSV spacing are taken as $1 \mu \mathrm{~m}, 50 \mu \mathrm{~m}$ and $1 \mu \mathrm{~m}$ respectively. Also, each TSV has an $\mathrm{SiO}_{2}$ insulating dielectric layer of width $0.1 \mu \mathrm{~m}$ around it. TSVs are assumed to be made of copper $(\mathrm{Cu})$ with a resistivity of $1.6 \times 10^{-8} \Omega-\mathrm{m}$. The dielectric constants of the silicon substrate in which TSVs are placed, and the $\mathrm{SiO}_{2}$ insulating layers are taken as 3.9 and 12 , respectively. We assume that the 3D IC comprises of three stacked dies, and only consider the coupling between TSVs.


Fig. 1. TSV Mesh Structure used for Parasitic Extraction
Table I shows the coupling capacitances for a mesh of TSVs. The value $C_{i j}$ in any entry $(i, j)$ in the table reports the coupling capacitance of a TSV to another TSV which is $i$ TSVs away horizontally, and $j$ TSVs away vertically. Thus, $C_{10}$ and $C_{01}$ are the coupling capacitances of a TSV to its adjacent TSV to its east (or west) and north (or south) direction respectively. The $(0,0)$ entry represents the capacitance to ground of any TSV in the mesh. We note that $C_{i j}=C_{j i}$. It can be also noted from the table that coupling capacitance for a value of $i$ or $j$ greater than 1 is more than two orders of magnitude less than $C_{10}$ and $C_{01}$, and can be neglected for crosstalk analysis. Also, $C_{11}$ is around $1 / 5$ of $C_{10}$ and
thus has a limited impact on crosstalk. We also extracted inductance, and found that for frequencies up to 10 GHz , the oscillation frequency $\frac{1}{\sqrt{L C}}$ of the system was much greater than the operational frequency. Hence, inductance values are not reported. However, inductance parasitics are included in our HSPICE [16] simulations.

|  | $j=0$ | $j=1$ | $j=2$ | $j=3$ | $j=4$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
| $i=0$ | $1.2 \times 10^{-19}$ | $1.2 \times 10^{-14}$ | $2.0 \times 10^{-16}$ | $4.8 \times 10^{-18}$ | $1.2 \times 10^{-19}$ |
| $i=1$ | $1.2 \times 10^{-14}$ | $3.0 \times 10^{-15}$ | $1.0 \times 10^{-16}$ | $2.8 \times 10^{-18}$ | $7.4 \times 10^{-20}$ |
| $i=2$ | $2.0 \times 10^{-16}$ | $1.0 \times 10^{-16}$ | $8.3 \times 10^{-18}$ | $3.8 \times 10^{-19}$ | $1.5 \times 10^{-20}$ |
| $i=3$ | $4.8 \times 10^{-18}$ | $2.8 \times 10^{-18}$ | $3.8 \times 10^{-19}$ | $2.8 \times 10^{-20}$ | $2.2 \times 10^{-21}$ |
| $i=4$ | $1.2 \times 10^{-19}$ | $7.4 \times 10^{-20}$ | $1.5 \times 10^{-20}$ | $2.2 \times 10^{-21}$ | $5.1 \times 10^{-22}$ |

TABLE I
CAPACITANCES FOR A 9X9 MESH OF TSVS (IN FARADS)

## B. Crosstalk Classification

In this section, we classify the data patterns in a 2D mesh of TSVs into different crosstalk classes. For this classification, we ignore the coupling capacitances $C_{i j}$ (where $i, j \geq 2$ ). The coupling capacitance $C_{11}$ of a TSV to any of its diagonally neighbouring TSV is roughly $1 / 5$ of its coupling capacitance ( $C_{01}$ or $C_{10}$ ) to any of its adjacent TSVs. Clearly, the sum of the coupling capacitances of a TSV to its four diagonally neighbouring TSVs, though not negligible, is less than its coupling capacitance to any of its adjacent TSVs. Thus, the overhead of including the four diagonally neighbouring TSVs in any coding scheme is not justified by the corresponding crosstalk improvement achieved. Hence, for our classification, we do not consider the diagonal neighbours. For any TSV in a data pattern, only its adjacent (East, West, North and South) TSVs are considered. In our results, all the diagonally neighbouring TSVs are assumed to be contributing the worst-case crosstalk by switching in a direction opposite to that of the TSV under consideration.

Consider a $n \times n$ mesh consisting of signals $b_{11}, b_{12} \cdots b_{n-1, n} \cdots$ $b_{n, n}$.

Definition 1: A pattern $P$ is an assignment of values $v_{i j}$ to $b_{i j}$ such that $v_{i j} \in[0,1]$
Let $P_{k}$ and $P_{k+1}$ be two successive patterns (sequence) applied on the TSV mesh. Let $v_{i j}^{k}$ and $v_{i j}^{k+1}$ be the values of signal $b_{i j}$ in $P_{k}$ and $P_{k+1}$ respectively. We now define the eight type of crosstalk conditions below:
A sequence is an 8 C sequence if $\exists i, j, k$ such that

$$
\begin{aligned}
& v_{i, j}^{k}=v_{i-1, j}^{k+1}=v_{i+1, j}^{k+1}=v_{i, j-1}^{k+1}=v_{i, j+1}^{k+1}=v \\
& v_{i, j}^{k+1}=v_{i-1, j}^{k}=v_{i+1, j}^{k}=v_{i, j-1}^{k}=v_{i, j+1}^{k}=\bar{v}
\end{aligned}
$$

Intuitively, it means that from $P_{k}$ to $P_{k+1}$, all four of the adjacent neighbors of $b_{i j}$ switch in a direction opposite to its direction of switching, which results in $b_{i j}$ charging a load of $8 C$ during its transition, where $C$ is $C_{01}$, the capacitance between a TSV and any one of its adjacent neighbours.
A sequence is a 7 C sequence if it is not a 8 C sequence and $\exists i, j, k$ such that from $P_{k}$ to $P_{k+1}$, three of the four adjacent neighbors of $b_{i j}$ switch in the direction opposite to its direction of switching and one of them remains static.

A sequence is a 6 C sequence if it is not an 8 C or 7 C sequence and $\exists i, j, k$ such that from $P_{k}$ to $P_{k+1}$, out of the four neighbors of $b_{i j}$ a) three switch in the direction opposite to its direction of switching and one switches in the same direction, or b) two switch in the direction opposite to its direction of switching and two remain static

A sequence is a 5 C sequence if it is not an 8 C or 7 C or 6 C sequence and $\exists i, j, k$ such that from $P_{k}$ to $P_{k+1}$, out of the four neighbors of $b_{i j}$ a) two switch in the direction opposite to its direction of switching, one switches in the same direction and one remains static, or b) one switches in the direction opposite to its direction of switching and three remain static.
A sequence is a 4 C sequence if it is not a 8 C or 7 C or 6 C or 5 C sequence and $\exists i, j, k$ such that from $P_{k}$ to $P_{k+1}$, out of the four neighbors of $b_{i j}$ a) two switch in the direction opposite to its direction of switching and two switch in the same direction, or b) one switches in the direction
opposite to its direction of switching, one switches in the same direction and two remain static, or c) all four remain static.
A sequence is a 3 C sequence if it is not a 8 C or 7 C or 6 C or 5 C or 4C sequence and $\exists i, j, k$ such that from $P_{k}$ to $P_{k+1}$, out of the four neighbors of $b_{i j}$ a) one switches in the direction opposite to its direction of switching, two switch in the same direction and one remains static, or b) one switches in the same direction and three remain static.

A sequence is a 2 C sequence if it is not a 8 C or 7 C or 6 C or 5 C or 4C or 3C sequence and $\exists i, j, k$ such that from $P_{k}$ to $P_{k+1}$, out of the four neighbors of $b_{i j}$ a) one switches in the direction opposite to its direction of switching and three switch in the same direction, or b) two switch in the same direction and two remain static.

A sequence is a 1 C sequence if it is not a 8 C or 7 C or 6 C or 5 C or 4 C or 3C or 2C sequence and $\exists i, j, k$ such that from $P_{k}$ to $P_{k+1}$, out of the four neighbors of $b_{i j}$, three switch in the same direction and one remains static.

A sequence is a 0 C sequence if it is not a 8 C or 7 C or 6 C or 5 C or 4 C or 3 C or 2 C or 1 C sequence.
Note that for $b_{i j}$ which lie on the the edges of the mesh and have less than four adjacent neighbors, the absent neighbor can be assumed to be switching in the same direction as that of $b_{i j}$ so that it does not contribute any load, and the definitions stated above hold.

## C. Crosstalk Avoidance Code (CAC)

A $k C(k=0,1 . .8)$ Crosstalk Avoidance Code (CAC) for a $n \times n$ mesh is a set of patterns such that no crosstalk sequence of $(k+1) C$ or above occurs on any TSV for a transition between any two patterns in the set. If $T^{k C}(n)$ is the number of valid patterns for a $k C$ CAC, then, number of bits $m$ which can be actually transmitted using this CAC is given by:-

$$
\begin{equation*}
m=\left\lfloor\log _{2}\left(T^{k C}(n)\right)\right\rfloor \tag{1}
\end{equation*}
$$

The overhead (i.e. the number of additional bits used per useful bit) of using this $k C$ CAC is defined as:-

$$
\begin{equation*}
O v h=\left(n^{2}-m\right) / m \tag{2}
\end{equation*}
$$

Before discussing CACs for different $k$, we define the following types of patterns:-
8C: A pattern $P$ is a 8 C pattern if $\exists i, j$ such that all the four adjacent neighbors of $b_{i j}$ have a value which is the complement of its value.
$\mathbf{6 C}$ : A pattern $P$ is a 6 C pattern if $\exists i, j$ such that exactly three out of the four adjacent neighbors of $b_{i j}$ have a value which is the complement of its value.
4C: A pattern $P$ is a 4C pattern if $\exists i, j$ such that exactly two out of the four adjacent neighbors of $b_{i j}$ have a value which is the complement of its value.

2C: A pattern $P$ is a 2C pattern if $\exists i, j$ such that exactly one out of the four adjacent neighbors of $b_{i j}$ have a value which is the complement of its value.
Lemma 2.1: If a CAC CC does not have any pattern of the type 8C, then it is a 6 C CAC.

Proof: By the definition of an 8C crosstalk sequence, both patterns $P_{k}$ and $P_{k+1}$ should be of the type $8 C$. Since $C C$ has no $8 C$ patterns, it cannot have an $8 C$ crosstalk sequence. Similarly, by definition, a 7 C crosstalk sequence must have one of the patterns $P_{k}$ or $P_{k+1}$ of the type 8 C . Hence, $C C$ can not have a 7C crosstalk sequence either. Thus, $C C$ is a 6C CAC.
By arguing along these lines, it can be shown that:-

1. If a CAC does not have any pattern of the type 8 C and 6 C , it is a 4 C CAC.
2. If a CAC does not have any pattern of the type $8 \mathrm{C}, 6 \mathrm{C}$ and 4 C , it is a 2C CAC.

Calculating the exact overhead of $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C CACs for a $n \times n$ mesh requires enumerating all the patterns and counting the valid ones, which is computationally infeasible. Hence, in the following subsections, we derive the exact overheads of $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C CACs for a $3 \times n$ mesh, and then extend these results to derive the lower bounds on the overheads of $6 \mathrm{C}, 4 \mathrm{C}$ and 2C CACs for a $n \times n$ mesh. A $3 \times n$ mesh is used as the building block because the inductive equations for counting the number of valid patterns are of lower complexity as compared to that of a $4 \times n$ or
$5 \times n$ mesh, while still allowing the calculation of a tighter lower bound as compared to using a $2 \times n$ mesh.

1) Number of 6C CAC Patterns for a $3 \times n$ Mesh (exact): In this section, an exact formulation for the number of valid 6 C bit patterns for a $3 \times n$ mesh is derived. A $3 \times n$ mesh refers to a mesh with 3 rows and $n$ columns.The formulation is inductive, and derives expressions for valid 6 C bit patterns for a $3 \times n$ mesh from the expressions for valid 6 C bit patterns for a $3 \times n-1$ mesh. To simplify notation, when we say that a bit in a pattern has 6 C crosstalk, it implies that exactly three out of it's four West, North, East and South neighboring bits have a value which is the complement of it's bit value. The terminology for the expressions is written below:-
$n=$ number of columns in a $3 \times n$ mesh
corner $=$ for a $3 \times n$ mesh, corner refers to the bits with the coordinates $(1, n)$ and ( $3, n$ )
$T_{6}(n)=$ total number of valid 6 C patterns for a $3 \times n$ mesh
$T^{0 C}(n)=$ total number of valid 6 C patterns in which bit $(2, n)$ has a crosstalk of 0C
$T^{2 C}(n)=$ total number of valid 6C patterns in which bit $(2, n)$ has a crosstalk of 2 C
$T^{4 C}(n)=$ total number of valid 6 C patterns in which bit $(2, n)$ has a crosstalk of 4C
$T^{6 C}(n)=$ total number of valid 6 C patterns in which bit $(2, n)$ has a crosstalk of 6C

Based on the definitions, we know that:

$$
\begin{equation*}
T_{6}(n)=T^{0 C}(n)+T^{2 C}(n)+T^{4 C}(n)+T^{6 C}(n) \tag{3}
\end{equation*}
$$

We now present the inductive step:

$$
\begin{gather*}
T^{0 C}(n)=T^{0 C}(n-1)+T^{2 C}(n-1)+T^{4 C}(n-1)+T^{6 C}(n-1)  \tag{4}\\
T^{2 C}(n)=3 T^{0 C}(n-1)+3 T^{2 C}(n-1)+3 T^{4 C}(n-1)+2 T^{6 C}(n-1)  \tag{5}\\
T^{4 C}(n)=3 T^{0 C}(n-1)+3 T^{2 C}(n-1)+3 T^{4 C}(n-1)+T^{6 C}(n-1)  \tag{6}\\
T^{6 C}(n)=T^{0 C}(n-1)+T^{2 C}(n-1)+T^{4 C}(n-1) \tag{7}
\end{gather*}
$$

Figure 2 illustrates the procedure for deriving the inductive equation for $T^{6 C}(n)$. Figure 2(a) depicts how a $T^{0 C}(n-1)$ pattern is extended to create a single $T^{6 C}(n)$ pattern. By definition of $T^{6 C}(n)$, bit $(2, n)$ is 6 C . Thus, if the value of bit $(2, n-1)$ is $b$, then the values of bits $(1, n),(2, n)$ and (3, $n$ ) should be $b, \bar{b}$ and $b$ respectively. Clearly, every $T^{0 C}(n-1)$ pattern can be extended to create a single $T^{6 C}(n)$ pattern. A similar analysis can be applied to create $T^{6 C}(n)$ patterns by extending $T^{2 C}(n-1)$ and $T^{4 C}(n-1)$ patterns as well.
Figure 2(b) shows how a $T^{6 C}(n-1)$ pattern is extended to create a $T^{6 C}(n)$ pattern. By definition of $T^{6 C}(n-1)$, bit $(2, n-1)$ is 6 C , and by definition of $T^{6 C}(n)$, bit $(2, n)$ is 6 C as well. Clearly, this is an invalid condition, since as a crosstalk of 6 C on bit $(2, n)$ will result in a crosstalk of 8 C on bit ( $2, n-1$ ) as shown in Figure 2(b). Thus, no $T^{6 C}(n)$ pattern can be created by extending a $T^{6 C}(n-1)$ pattern. A similar analysis can be done to derive the inductive equations for $T^{0 C}(n), T^{2 C}(n)$ and $T^{4 C}(n)$ as well.


Fig. 2. Illustration of Inductive Derivation for $T^{6 C}(n)$ for a 6C CAC
2) Number of $4 C$ CAC Patterns for a $3 \times n$ Mesh (exact): In this section, we derive an exact formulation for the number of valid 4 C bit patterns for a $3 \times n$ mesh. The formulation is inductive, and derives expressions for valid 4C bit patterns for a $3 \times n$ mesh from the expressions for valid 4C bit patterns for a $3 \times n-1$ mesh. To simplify notation, when we say that
a bit in a pattern has 4C crosstalk, it implies that exactly two out of it's four West, North, East and South neighboring bits have a value which is the complement of it's bit value. The terminology for the expressions is written below:-

$$
n=\text { number of columns in a } 3 \times n \text { mesh }
$$

corner $=$ for a $3 \times n$ mesh, corner refers to the bits with the coordinates $(1, n)$ and $(3, n)$
$T_{4}(n)=$ total number of valid 4C patterns for a $3 \times n$ mesh
$T^{0 C}(n)=$ total number of valid 4 C patterns in which bit $(2, n)$ has a crosstalk of 0C
$T_{v_{2, n} \neq v_{2, n-1}}^{2 C}(n)=$ total number of valid 4C patterns in which bit (2, n) has a crosstalk of 2 C and the value of bit $(2, n)$ is the complement of the value of bit $(2, n-1)$
$T_{v_{2, n}=v_{2, n-1}, \text { Ocorner }}^{2 C}(n)=$ total number of valid 4C patterns in which bit $(2, n)$ has a crosstalk of 2C, the value of bit $(2, n)$ is the same as that of bit $(2, n-1)$ and none of the corner bits has a crosstalk of 4C
$T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{2 C}(n)=$ total number of valid 4C patterns in which bit $(2, n)$ has a crosstalk of 2C, the value of bit $(2, n)$ is the same as that of bit $(2, n-1)$ and exactly one of the corner bits has a crosstalk of 4 C
$T_{v_{2, n}=v_{2, n-1}, \text { Ocorner }}^{4 C}(n)=$ total number of valid 4C patterns in which bit $(2, n)$ has a crosstalk of 4 C , the value of bit $(2, n)$ is the same as that of bit ( $2, n-1$ ) and none of the corner bits has a crosstalk of 4C
$T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{4 C}(n)=$ total number of valid 4C patterns in which bit $(2, n)$ has a crosstalk of 4 C , the value of bit $(2, n)$ is the same as that of bit $(2, n-1)$ and exactly one of the corner bits has a crosstalk of 4 C
$T_{v_{2, n}=v_{2, n-1}, 2 \text { corner }}^{4 C}(n)=$ total number of valid 4C patterns in which bit $(2, n)$ has a crosstalk of 4 C , the value of bit $(2, n)$ is the same as that of bit $(2, n-1)$ and both of the corner bits have a crosstalk of 4 C
$T_{v_{2, n} \neq v_{2, n-1}, \text { Ocorner }}^{4 C}(n)=$ total number of valid 4C patterns in which bit $(2, n)$ has a crosstalk of 4 C , the value of bit $(2, n)$ is the complement of the value of bit $(2, n-1)$ and none of the corner bits has a crosstalk of 4C
$T_{v_{2, n} \neq v_{2, n-1}, 1 \text { corner }}^{4 C}(n)=$ total number of valid 4C patterns in which bit $(2, n)$ has a crosstalk of 4 C , the value of bit $(2, n)$ is the complement of the value of bit $(2, n-1)$ and exactly one of the corner bits has a crosstalk of 4C

## Based on the definitions, we know that:

$$
\begin{align*}
T_{4}(n)= & T^{0 C}(n)+T_{v_{2, n} \neq v_{2, n-1}}^{2 C}(n)+T_{v_{2, n}=v_{2, n-1}, \text { Ocorner }}^{2 C}(n) \\
& +T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{2 C}(n)+T_{v_{2, n}=v_{2, n-1}, \text { Ocorner }}^{4 C}(n) \\
& +T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{4 C}(n)+T_{v_{2, n}=v_{2, n-1}, 2 \text { corner }}^{4 C}(n) \\
& +T_{v_{2, n} \neq v_{2, n-1}, \text { Ocorner }}^{4 C}(n)+T_{v_{2, n} \neq v_{2, n-1}, 1 \text { corner }}^{4 C}(n) \tag{8}
\end{align*}
$$

We now present the inductive step:

$$
\begin{aligned}
T^{0 C}(n)= & T^{0 C}(n-1)+T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1)+T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1) \\
& +T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)+T_{v_{2, n-1} \neq v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)(9)
\end{aligned}
$$

$$
\begin{align*}
T_{v_{2, n} \neq v_{2, n-1}}^{2 C}(n)= & T^{0 C}(n-1)+T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1) \\
& +T_{v_{2, n-1}=v_{2, n-}, \text { Ocorner }}^{2 C}(n-1)+ \\
& T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1) \tag{10}
\end{align*}
$$

$$
T_{v_{2, n}=v_{2, n-1}, \operatorname{Ocorner}}^{2 C}(n)=T_{v_{2, n-1}=v_{2, n-2}, \operatorname{Ocorner}}^{2 C}(n-1)
$$

$$
+T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1)
$$

$$
+2 T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)
$$

$$
+T_{v_{2, n-1}=v_{2, n-2}, \text { lcorner }}^{4 C}(n-1)
$$

$$
+T_{v_{2, n-1} \neq v_{2, n-2}, O c o r n e r}^{4 C}(n-1)
$$

$$
\begin{equation*}
+T_{v_{2, n-1} \neq v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1) \tag{11}
\end{equation*}
$$

$$
\begin{align*}
T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{2 C}(n)= & 2 T^{0 C}(n-1)+2 T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1) \\
& +T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1) \\
& +T_{v_{2, n-1} \neq v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1) \tag{12}
\end{align*}
$$

$$
\begin{align*}
& T_{v_{2, n}=v_{2, n-1}, \text { Ocorner }}^{4 C}(n)=T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1) \\
& +T_{v_{2, n-1} \neq v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1) \\
& +T_{v_{2, n-1}=v_{2, n-2}, 2 \text { corner }}^{4 C}(n-1)  \tag{13}\\
& T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{4 C}(n)=T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1) \\
& +T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1) \\
& +T_{v_{2, n-1} \neq v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1) \\
& +T_{v_{2, n-1} \neq v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1)  \tag{14}\\
& T_{v_{2, n}=v_{2, n-1}, 2 \text { corner }}^{4 C}(n)=T^{0 C}(n-1)+T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1)  \tag{15}\\
& T_{v_{2, n}=v_{2, n-1}, \text { Ocorner }}^{4 C}(n)=2 T^{0 C}(n-1)+2 T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1) \\
& +T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1) \\
& +T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1)  \tag{16}\\
& T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{4 C}(n)=T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1) \tag{17}
\end{align*}
$$

We will now discuss the procedure for the inductive derivation of $T^{0 C}(n)$. By definition, in a $T^{0 C}(n)$ pattern, the bit $(2, n)$ is $0 C$. This implies that all three of its neighboring bits $(1, n),(3, n)$ and $(2, n-1)$ have the same value $b$. We go through each pattern for $3 \times n-1$ mesh one by one, and discuss how they can be extended to generate $T^{0 C}(n)$ patterns for a $3 \times n$ mesh.
a) $T^{0 C}(n-1)$ :- By definition, in a $T^{0 C}(n-1)$ pattern, all the bits in the column $n-1$ have the same value. As illustrated in Figure 3(a), each $T^{0 C}(n-1)$ pattern can be extended to generate a single $T^{0 C}(n)$ pattern.
b) $T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1)$ :- By definition, in a $T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1)$ pattern, bit $(2, n-1)$ is 2 C and the value of bit $(2, n-2)(\bar{b})$ is the complement of the value of bit $(2, n-1)(b)$. This implies that bits $(1, n-1)$ and $(3, n-1)$ have the value $b$. Hence, as illustrated in Figure 3(b), each $T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1)$ pattern can be extended to create a single $T^{0 C}(n)$ pattern.
c) $\quad T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1)$ :- By definition, in a $T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1)$ pattern, bit $(2, n-1)$ is $2 C$, the value of bit $(2, n-2)(b)$ is same as that of bit $(2, n-1)(b)$ and none of the bits $(1, n-1)$ and $(3, n-1)$ is 4C. Clearly, as illustrated in Figure 3(c), each $T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1)$ pattern results in a single $T^{0 C}(n)$ pattern.
d) $\quad T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1)$ :- $\quad$ The difference among $T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1) \quad$ and $\quad T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{2 C}(n-1) \quad$ patterns discussed in c) is as follows. In $T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1)$, one of the bits $(1, n-1)$ and $(3, n-1)$ is 4 C . This implies that no $T^{0 C}(n)$ patterns can be generated from any of the $T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{2 C}(n-1)$ patterns, as illustrated in Figure 3(d), since this would result in 6C crosstalk on $(1, n-1)$ or $(3, n-1)$. Hence, this term does not appear in Equation 9.
e) $\quad T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)$ :- By definition, in a $T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)$ pattern, bit $(2, n-1)$ is 4C, the value of bit $(2, n-2)(b)$ is same as that of bit $(2, n-1)(b)$, and none of the bits $(1, n-1)$ and $(3, n-1)$ is $4 C$. Thus, similar to the $T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1)$ case, each $T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)$ pattern results in a single $T^{0 C}(n)$ pattern, as shown in Figure 3(e).
f) $T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1)$ :- A $T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1)$ pattern is same as a $T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C^{2}}(n-1)$ pattern except for the fact that one of the bits $(1, n-1)$ and $(3, n-1)$ in a $T_{v_{2, n-1}=v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1)$ pattern is 4C. Clearly, this results in an invalid pattern, as illustrated in Figure 3(f), since $(1, n-1)$ or $(3, n-1)$ becomes 6 C . Hence, this term does not appear in Equation 9.
g) $T_{v_{2, n-1}=v_{2, n-2}, 2 \text { corner }}^{4 C}(n-1)$ :- A $T_{v_{2, n-1}=v_{2, n-2}, 2 \text { corner }}^{4 C}(n-1)$ pattern is same as a $T_{v_{2, n-1}=v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)$ pattern except that both of the bits $(1$, $n-1)$ and $(3, n-1)$ in a $T_{v_{2, n-1}=v_{2, n-2}, 2 \text { corner }}^{4 C}(n-1)$ pattern are 4C. Thus, no $T^{0 C}(n)$ pattern can be generated from it, as shown in Figure 3(g).


Fig. 3. Illustration of Inductive Derivation for $T_{v_{2, n}=v_{2, n-1}, 1 \text { corner }}^{2 C}$ for a 4C CAC
h) $\quad T_{v_{2, n-1} \neq v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)$ :- By definition, in a $T_{v_{2, n-1} \neq v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)$ pattern, $\operatorname{bit}(2, n-1)$ is 4C, the value of $\operatorname{bit}(2, n-2)(\bar{b})$ is the complement of the value of bit $(2, n-1)(b)$ and none of the bits $(1, n-1)$ and $(3, n-1)$ is 4C. Clearly, each $T_{v_{2, n-1} \neq v_{2, n-2}, \text { Ocorner }}^{4 C}(n-1)$ pattern can be extended to generate a single $T^{0 C}(n)$ pattern, as illustrated in Figure 3(h).
i) $T_{v_{2, n-1} \neq v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1)$ :- In a $T_{v_{2, n-1} \neq v_{2, n-2}, 1 \text { corner }}^{4 C}(n-1)$ pattern, one of the bits $(1, n-1)$ and $(3, n-1)$ is 4 C . This implies that no $T^{0 C}(n)$ patterns can be generated from it, since one corner would become 6 C after the induction step. This is shown in Figure 3(i).

Based on the discussions in paragraph a) through i) above, Equation 9 is derived by adding the terms in paragraphs a), b), c), e) and h). The same approach as discussed for $T^{0 C}(n)$ patterns can be used to derive expressions for Equations 10 through 17 .
3) Number of $2 C$ CAC Patterns for a $3 \times n$ Mesh (exact): In this section, we present an exact formulation for the number of valid 2 C patterns for a $3 \times n$ mesh. The formulation is inductive, and derives expressions for valid 2C patterns for a $3 \times n$ mesh from the expressions for valid 2 C patterns for a $3 \times n-1$ mesh. A bit in a bit pattern has 2 C crosstalk, if exactly one of it's four West, North, East and South neighboring bits has a value which is the complement of it's bit value. The terminology we use is given below:
$n=$ number of columns in a $3 \times n$ mesh
corner $=$ for a $3 \times n$ mesh, corner refers to the bits with the coordinates $(1, n)$ and $(3, n)$
$T_{2}(n)=$ total number of valid 2 C patterns for a $3 \times n$ mesh
$T_{0 \text { corner }}^{0 C}(n)=$ total number of valid 2C patterns in which bit $(2, n)$ has a crosstalk of 0 C and none of the corner bits has a crosstalk of 2C
$T_{1 \text { corner }}^{0 C}(n)=$ total number of valid 2C patterns in which bit $(2, n)$ has a crosstalk of 0 C and only one of the two corner bits has a crosstalk of 2C
$T_{2 \text { corner }}^{0 C}(n)=$ total number of valid 2C patterns in which bit $(2, n)$ has a crosstalk of 0 C and both of the two corner bits have a crosstalk of 2C
$T_{v_{2, n} \neq v_{2, n-1}}^{2 C}(n)=$ total number of valid 2C patterns in which bit $(2, n)$ has a crosstalk of 2 C and the value of bit $(2, n)$ is the complement of the value of bit ( $2, n-1$ )
$T_{v_{2, n}=v_{2, n-1}}^{2 C}(n)=$ total number of valid 2C patterns in which bit $(2, n)$ has a crosstalk of 2 C and the value of bit $(2, n)$ is same as that of bit ( 2 , $n-1$ )
Based on the definitions, we know that:

$$
\begin{equation*}
T_{2}(n)=T_{0 \text { corner }}^{0 C}(n)+T_{1 \text { corner }}^{0 C}(n)+T_{2 \text { corner }}^{0 C}(n)+T_{v_{2, n} \neq v_{2, n-1}}^{2 C}(n)+T_{v_{2, n}=v_{2, n-1}(1)}^{2 C}(n) \tag{18}
\end{equation*}
$$

We now present the inductive step:

$$
\begin{gather*}
T_{0 \text { corner }}^{0 C}(n)=T_{0 \text { corner }}^{0 C}(n-1)+T_{v_{2, n-1} \neq v_{2, n-2}}^{2 C}(n-1)+T_{v_{2, n-1}=v_{2, n-2}}^{2 C}(n-1)  \tag{19}\\
T_{1 \text { corner }}^{0 C}(n)=0  \tag{20}\\
T_{2 \text { corner }}^{0 C}(n)=0  \tag{21}\\
T_{v_{2, n}!=v_{2, n-1}}^{2 C}(n)=T_{0 \text { corner }}^{0 C}(n-1)  \tag{22}\\
T_{v_{2, n}=v_{2, n-1}}^{2 C}(n)=T_{v_{2, n-1}=v_{2, n-2}}^{2 C}(n-1) \tag{23}
\end{gather*}
$$

Due to the space limitations, we do not discuss the derivation of inductive Equations 19 through 23. A procedure similar to that used in Sections II-C. 1 and II-C. 2 for deriving the inductive equations is used in this case as well.
Having presented exact induction based formulae for $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C crosstalk for a $3 \times n$ mesh of TSVs, we now present bounds for a $n \times n$ mesh of TSVs, for $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C CACs.
4) Lower Bound on Overhead of a 6C CAC for a $n \times n$ Mesh: A lower bound on the overhead of a 6C CAC for an $n \times n$ mesh can be calculated from an upper bound on the number of 6C CAC patterns for the same mesh, by using Equation 2. To calculate an upper bound on the number of 6C CAC patterns for an $n \times n$ mesh, we note that an $n \times n$ mesh can be generated by arranging $(\mathrm{n} / 3) 3 \times n$ meshes one of the top of other. Clearly, $(\mathrm{n} / 3)$ times the number of 6C CAC patterns for a $3 \times n$ mesh upper bounds the number of 6C CAC patterns for an $n \times n$ mesh as some patterns will become invalid at the boundaries between any two adjacent $3 \times n$ meshes. Thus, the exact overhead for a $3 \times n$ mesh is a lower bound the overhead for an $n \times n$ mesh.
5) Lower Bound on Overhead of a 4C and 2C CAC for a $n \times n$ Mesh: The argument presented above can applied in the case of 4C and 2C CAC patterns for an $n \times n$ mesh, to deduce that the overhead of 4C and 2C CAC patterns for a $3 \times n$ mesh are lower bounds on the overhead of 4 C and 2C CAC patterns for an $n \times n$ mesh respectively.

## D. A Simple 6C CAC Coding Scheme for an $n \times n$ Mesh

The implementation of a CAC for an $n \times n$ mesh requires an encoder at the transmitting end and a decoder at the receiving end. A simpleminded way to build an encoder and decoder would be to enumerate all the valid 6C patterns for an $n \times n$ mesh, and map every valid 6C pattern to an input binary word. The mappings can be chosen in a manner so as to minimize the circuit size. However, in general, the size of such an encoder
and decoder circuitry increases exponentially with increasing $n$ [15]. Also, enumerating all valid 6 C patterns is feasible only for small values of $n$. To the best of the author's knowledge, there is no encoding/decoding scheme for a 4C or 2C CAC for an $n \times n$ mesh, which results in an encoder/decoder circuit size complexity that grows polynomially with $n$. This remains an open problem. However, a polynomial complexity encoding/decoding scheme for a 6 C code can be implemented by observing that if every column in an $n \times n$ mesh is encoded using a one dimensional 2C CAC, then the resulting CAC for the $n \times n$ mesh is a 6C CAC. A coding scheme for a 2C CAC for a one dimensional bus, based on the Fibonacci Number System (FNS), was proposed in [15]. It results in an encoder and decoder with a circuit size complexity which grows as $n^{2}$. Figure 4 shows an example of how the FNS can be used to encode a one dimensional bus to ensure a maximum crosstalk of 2 C . In the FNS based representation, the weight of any bit is equal to the sum of the weights of two preceeding bits. The binary represenation of 27 results in the bit with the weight of 4 experiencing a crosstalk of 4 C , whereas the FNS based representation results in a maximum crosstalk of 2C on any bit. Figure 4 presents the binary (top) and FNS based representation (middle and bottom) of the number 26 . In the binary representation, the bit with weight 2 experiences a crosstalk of 4C. The FNS based representation (Figure 4-middle) has a crosstalk of 4 C occuring on the bit with the weight of 5 . However, this FNS based representation can be transformed (Figure 4-bottom) into an equivalent representation, which has a maximum crosstalk of 2C occurring on any bit, by setting the two highlighted bits to 0 and the next higher bit to 1 .

By using the FNS based one dimensional 2C encoding along each of the $n$ columns of an $n \times n$ mesh, a coding scheme with the circuit size complexity of $n^{3}$ can be implemented. The asymptotic overhead for such a coding scheme is 0.44 [15], but for $n$ up to 20, the maximum overhead for any $n$ was found to be around 0.35 [15]. The encoder/decoder logic area overhead for such a coding scheme, extrapolated to 22 nm technology from the 90 nm technology area numbers published in [15], is $386 \mu^{2}$ for a $10 \times 10$ mesh.
$\frac{1}{2^{4}}$
$\frac{1}{2^{3}}$
$\frac{1}{2^{2}}$$\frac{0}{2^{1}} \frac{1}{2^{0}}$

Fig. 4. Fibonacci Number System Based Coding for a 2C CAC on a one Dimensional Bus

## III. Experimental Results

We implemented the inductive equations for the number of $6 \mathrm{C}, 4 \mathrm{C}$ and 2C CAC patterns for a $3 \times n$ mesh, using $n=2$ as the base case. The base case is derived by enumerating the patterns. Figure 5 a), b) and c) show the plots of overhead versus $n$ for $6 \mathrm{C}, 4 \mathrm{C}$ and 2C CAC for a $3 \times n$ mesh respectively. The asymptotic overheads for $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C CAC are $3 \%$, $33 \%$ and $335 \%$ respectively. It should be noted that these overheads are lower bounds on the overhead of corresponding CACs on an $n \times n$ mesh. The extremely high overhead of $335 \%$ for 2C CAC makes them unsuitable for any practical use.

To validate our theoretical predictions that the worst delay through an inverter (buffer) driving a TSV is proportional to the maximum coupling capacitance, we extracted the parasitics for a $9 \times 9$ mesh of TSVs by using Raphael [19], and simulated the $8 \mathrm{C}, 6 \mathrm{C}, 4 \mathrm{C}, 2 \mathrm{C}$ and 0 C crosstalk sequences for the center TSV (victim) in the mesh by applying different logic transitions to its adjacent East, West, North and South TSVs. All the other TSVs are made to cause worst crosstalk on the victim by always making them switch in a direction opposite to that of the victim. All circuit simulations are done using HSPICE [16]. The coupling capacitance values used are as reported in Table I. Each TSV is driven by a chain of 3 cascaded inverters. The input slew used is 10 ps , and sizes of the first, second and third inverter are $3 \times, 10 \times$ and $30 \times$ of the minimum size respectively. These sizes are selected so as to get a maximum slewrate of


Fig. 5. Overhead for $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C CACs for a $3 \times \mathrm{n}$ mesh

100 ps at the output of third inverter which drives the TSV. This allows us to operate the 3D stack at $\sim 2 G H z$. 22 nm Predictive Technology Model (PTM) [17] is used for simulations. Table II presents the delay from the input of third inverter to the far end of the TSV for different crosstalk sequences. The 0 C sequence corresponds to the case of no capacitive coupling between the center TSV and its four adjacent neighbors. The incremental delay ( $d_{x t a l k}$ ) from the 0 C to the 8 C crosstalk sequence is $42.1 p s$. $d_{\text {xtalk }}$ represents the maximum increase in delay due to crosstalk from the adjacent TSVs. Table III shows the incremental delays from 0C to $2 \mathrm{C}, 2 \mathrm{C}$ to $4 \mathrm{C}, 4 \mathrm{C}$ to 6 C and 6 C to 8 C crosstalk sequences, and their contribution to $d_{x t a l k}$. Clearly, the results show that the increase in delay due to crosstalk is proportional to the increase in coupling capacitance. 2C CAC patterns achieve the highest improvement in delay (the 0C CAC is a degenerate code), albeit at a much higher overhead.

| Crosstalk <br> Sequence | Delay (ps) |
| :---: | :---: |
| 8C | 61.8 |
| 6 C | 51.9 |
| 4 C | 41.6 |
| 2C | 30.2 |
| 0C | 19.7 |

TABLE II
DELAYS FOR DIFFERENT CROSSTALK SEQUENCES

| Transition | Incremental <br> Delay $(\mathrm{ps})$ | \% of $d_{\text {xtalk }}$ |
| :---: | :---: | :---: |
| 6C CAC $\rightarrow 8 \mathrm{C} \mathrm{CAC}$ | 13.0 | 23.4 |
| 4C CAC $\rightarrow$ 6C CAC | 14.5 | 24.4 |
| 2C CAC $\rightarrow 4 \mathrm{C}$ CAC | 15.8 | 27.0 |
| 0C CAC $\rightarrow$ 2C CAC | 9.5 | 24.8 |

TABLE III
Incremental delays between CACs

In practice, the complex design of today are heavily pipelined. In case of such systems, the maximum TSV mesh data transfer rate is significantly improved by using coding. In such pipelined systems, the encoding/decoding delays are hidden in the additional pipelined stages, provided the delays are smaller than the clock period.

Owing to the small resistance of a TSV and the fact that it mainly has a capacitive nature, the inverters can be sized up further to reduce $d_{x t a l k}$ at the cost of power. However, this will require a longer, more power-hungry chain of inverters, which may offset the delay improvement achieved. Also, with decreasing TSV diameter and TSV-to-TSV spacing in future technologies, the TSV resistance and coupling capacitance will increase, resulting in a more distributed RC parasitic behavior of a TSV, making it even more important to use coding on the 2D mesh of TSVs.

## IV. Conclusions

In this paper, we present techniques to alleviate crosstalk in 3D VLSI. The through-silicon vias (TSVs) used in 3D VLSI are significantly capacitive, and therefore the switching of neighboring TSVs can degrade the speed as well as signal integrity of a victim TSV. After quantifying 3D parasitics in this context, we classify crosstalk patterns as $8 \mathrm{C}, 7 \mathrm{C}, 6 \mathrm{C}$, ... 0 C patterns. We present inductive approaches to quantify the exact overhead for $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C crosstalk avoidance codes (CAC) for a
$3 \times n$ mesh arrangement of TSVs. We also provide lower bounds of the overheads for an $n \times n$ mesh arrangement of TSVs. An efficient method to realize the 6 C CODECs is presented as well. Our results show that the asymptotic overheads for $6 \mathrm{C}, 4 \mathrm{C}$ and 2 C codes for a $3 \times n$ bus are $3 \%$, $33 \%$ and $335 \%$ respectively. By means of HSPICE experiments conducted on the 22 nm technology node, we demonstrate that our coding based improvements match with simulation results.

## References

[1] E. Beyne, P. De Moor, W. Ruythooren, R. Labie, A. Jourdain, H. Tilmans, D. Tezcan, P. Soussan, B. Swinnen, and R. Cartuyvels, "Through-silicon via and die stacking technologies for microsystems-integration," in Electron Devices Meeting, 2008. IEDM 2008. IEEE International, pp. 1-4, Dec. 2008.
[2] F. Liu, R. Yu, A. Young, J. Doyle, X. Wang, L. Shi, K.-N. Chen, X. Li, D. Dipaola, D. Brown, C. Ryan, J. Hagan, K. Wong, M. Lu, X. Gu, N. Klymko, E. Perfecto, A. Merryman, K. Kelly, S. Purushothaman, S. Koester, R. Wisnieff, and W. Haensch, "A $300-\mathrm{mm}$ wafer-level three-dimensional integration scheme using tungsten throughsilicon via and hybrid Cu-adhesive bonding," in Electron Devices Meeting, 2008. IEDM 2008. IEEE International, pp. 1 -4, Dec. 2008.
[3] J. Van Olmen, A. Mercha, G. Katti, C. Huyghebaert, J. Van Aelst, E. Seppala, Z. Chao, S. Armini, J. Vaes, R. Teixeira, M. Van Cauwenberghe, P. Verdonck, K. Verhemeldonck, A. Jourdain, W. Ruythooren, M. de Potter de ten Broeck, A. Opdebeeck, T. Chiarella, B. Parvais, I. Debusschere, T. Hoffmann, B. De Wachter, W. Dehaene, M. Stucchi, M. Rakowski, P. Soussan, R. Cartuyvels, E. Beyne, S. Biesemans, and B. Swinnen, "3D stacked IC demonstration using a through silicon via first approach," in Electron Devices Meeting, 2008. IEDM 2008. IEEE International, pp. 1-4, Dec. 2008.
[4] D. Chen, W. Chiou, M. Chen, T. Wang, K. Ching, H. Tu, W. Wu, C. Yu, K. Yang, H. Chang, M. Tseng, C. Hsiao, Y. Lu, H. Hu, Y. Lin, C. Hsu, W. Shue, and C. Yu, "Enabling 3D-IC foundry technologies for 28 nm node and beyond: through-siliconvia integration with high throughput die-to-wafer stacking," in Electron Devices Meeting (IEDM), 2009 IEEE International, pp. 1-4, Dec. 2009.
[5] B. Goplen and S. Sapatnekar, "Efficient thermal placement of standard cells in 3D ICs using a force directed approach," in Computer Aided Design, 2003. ICCAD-2003. International Conference on, pp. 86 - 89, Nov. 2003.
[6] K. Athikulwongse, A. Chakraborty, J.-S. Yang, D. Pan, and S. K. Lim, "Stress-driven 3DIC placement with TSV keep-out zone and regularity study," in Computer-Aided Design (ICCAD), 2010 IEEE/ACM International Conference on, pp. 669 -674, Nov. 2010.
[7] T. Zhang, Y. Zhan, and S. S. Sapatnekar, "Temperature-aware routing in 3D ICs," in in Proceedings of the Asia-South Pacific Design Automation Conference, pp. 309-314, 2006.
[8] C.-J. Chang, P.-J. Huang, T.-C. Chen, and C.-N. J. Liu, "ILP-based inter-die routing for 3D ICs," in Proceedings of the 16th Asia and South Pacific Design Automation Conference, ASPDAC '11, (Piscataway, NJ, USA), pp. 330-335, IEEE Press, 2011.
[9] X. Zhao, D. Lewis, H.-H. Lee, and S. K. Lim, "Pre-bond testable low-power clock tree design for 3D stacked ICs," in Computer-Aided Design - Digest of Technical Papers, 2009. ICCAD 2009. IEEE/ACM International Conference on, pp. 184 -190, Nov. 2009.
[10] X. Zhao, J. Minz, and S. K. Lim, "Low-power and reliable clock network design for through-silicon via (TSV) based 3D ICs," Components, Packaging and Manufacturing Technology, IEEE Transactions on, vol. 1, pp. 247 -259, Feb. 2011.
[11] I. Savidis and E. Friedman, "Closed-form expressions of 3-D via resistance, inductance, and capacitance," Electron Devices, IEEE Transactions on, vol. 56, pp. 1873 -1881, Sept. 2009.
[12] D. H. Kim, S. Mukhopadhyay, and S. K. Lim, "Fast and accurate analytical modeling of through-silicon-via capacitive coupling," Components, Packaging and Manufacturing Technology, IEEE Transactions on, vol. 1, pp. 168-180, Feb. 2011.
[13] C. Liu, T. Song, J. Cho, J. Kim, J. Kim, and S. K. Lim, "Full-chip TSV-to-TSV coupling analysis and optimization in 3D IC," in Design Automation Conference (DAC), 2011 48th ACM/EDAC/IEEE, pp. $783-788$, June 2011.
[14] C. Duan, C. Zhu, and S. Khatri, "Forbidden transition free crosstalk avoidance CODEC design," in Design Automation Conference, 2008. DAC 2008. 45th ACM/IEEE, pp. 986 -991, June 2008.
[15] C. Duan, V. Calle, and S. Khatri, "Efficient on-chip crosstalk avoidance CODEC design," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 17, pp. 551 -560, April 2009.
[16] HSPICE http://www.synopsys.com/products/ mixedsignal/hspice/hspice.html.
[17] PTM http://www.eas.asu.edu/~ptm.
[18] "The International Technology Roadmap for Semiconductors." http://public.itrs.net/, 2003.
[19] "Raphael Interconnect Analysis Tool: User's Guide,"

