# TRAM: A Tool for Temperature and Reliability Aware Memory Design

Amin Khajeh<sup>†</sup>, Aseem Gupta<sup>†</sup>, Nikil Dutt<sup>†</sup>, Fadi Kurdahi<sup>†</sup>, Ahmed Eltawil<sup>†</sup>, Kamal Khouri<sup>§</sup>, and Magdy Abadir<sup>§</sup>

<sup>†</sup>University of California, Irvine, Irvine, CA 92697 USA, {akhajehd, aseemg, dutt, kurdahi, aeltawil}@uci.edu

<sup>§</sup>Freescale Semiconductor Inc., Austin, TX 78729 USA, {kamal.khouri, m.abadir}@freescale.com

Abstract: Memories increasingly dominate System-on-Chip (SoC) designs and thus account for a large share of the total system's power dissipation and area, and strongly influence its reliability. In this paper, we present a tool which captures the effects of supply voltage $V_{dd}$ and temperature on memory performance and their interrelationships. We propose a Temperature- and Reliability-Aware Memory Design (TRAM) approach which allows designers to examine the effects of frequency, supply voltage, power dissipation, and temperature on reliability in a mutually interrelated manner. Our experimental results indicate that thermally unaware estimation of the probability of error can be off by at least two orders of magnitude and up to five orders of magnitude from the realistic, temperature-aware cases. We also observed that thermally aware $V_{dd}$ selection using TRAM can reduce the total power dissipation by up to 2.5X while attaining an identical predefined limit on errors.

## I. INTRODUCTION

Process scaling has enabled systems to offer much higher computational power and performance at the expense of rising power consumption and operating temperatures. Higher operating temperatures have many adverse effects on the design such as: need for expensive cooling mechanisms, increased leakage power, reduced interconnect lifetime, accelerated electromigration, increased cell delay, and increased probability of errors in memories and logic [1]. In this paper, we focus on the effect of temperature on reliable operation of memories.

Memories are not error free and are designed to have a certain probability of error during operation. Typically, this probability of error is controlled to be significantly smaller than the error tolerance capabilities of the units utilizing the memory. In fact, there exist many mechanisms in the literature, such as redundancy and error correction techniques, that are used to reduce or eliminate memory errors [2]. SoC designers aim to achieve the best possible performance (both speed and probability of error) while minimizing the total power consumption. This is an especially challenging problem since there is an intrinsic trade-off between the power consumption and the error resiliency of a design. Conventionally, designers tend to increase the supply voltage $V_{dd}$ on critical embedded memories to boost reliability<sup>1</sup> (reduce the probability of errors), albeit at a cost of increased power dissipation and increased temperature. In this paper, we show that for highly scaled technologies, when combining the effects of shrinking geometries, higher power densities and a larger fraction of the overall chip area occupied by memories, the overall heating impact attributed to increasing the supply voltage will lead to a *degradation* in the probability of error for a memory access rather than improving it. For a given access time, a rise in temperature increases the cell delay, which causes the probability of errors in a memory to increase [3]. An increase in $V_{dd}$ also increases the dynamic power dissipation of the memory cell, which raises the temperature of the memory. Thus there are two conflicting phenomena: an increase in $V_{dd}$, which reduces memory errors, and a resultant increase in temperature, which increases memory errors. Another factor which influences the dynamic power dissipation and reliability is frequency. An increase in frequency increases both dynamic power dissipation and the probability of errors in memory. Finally, temperature has a very significant impact on the leakage power dissipation. In fact, there exists a positive feedback loop between temperature and leakage power [4].

<sup>1</sup>Note: In the scope of this paper, a lower probability of errors means higher reliability and the two terms will be used interchangeably.

The main contribution of this paper is a tool (TRAM: Temperature and Reliability Aware Memory Design) which allows designers to comprehensively examine the aggregate effect of $V_{dd}$, frequency, dynamic and leakage power, and temperature on the reliability of memories. Using TRAM we observed that when we consider the thermal aspect during memory system design, (1) **reducing $V_{dd}$ can help improve both the reliability and the power dissipation**, which is contrary to the conventional practice, and (2) the predicted reliability is more realistic than that obtained with traditional corner-based design (the latter provides an overly optimistic reliability prediction).

## II. MEMORY ERRORS

### A. Sources of errors

Classically, failures in embedded memory cells are categorized as either of a transient nature (because of operating conditions) or of a fixed nature (due to manufacturing errors). Failures are manifested as: 1) an increase in cell access time, or 2) unstable read/write operations. In process technologies larger than 100nm, fixed errors are predominant, with a minority of the errors introduced by transient effects. This model cannot be sustained as scaling progresses due to the random nature of the fluctuation of dopant atom distributions [12]. In fact, in sub-100nm designs, Random Dopant Fluctuation (RDF) has the dominant impact on the transistors' strength mismatch and is the most noticeable type of intra-die variation that can lead to cell instability and failure in embedded memories [9]. RDF has a detrimental effect on transistors that are colocated within one cell by creating a mismatch in their intrinsic threshold voltage, $V_t$. Furthermore, these effects are a strong function of the operating conditions such as voltage, frequency, and temperature.

Fig. 1. Sensitivity of Memory Errors to Various Parameters

### B. Sensitivity of Errors

Fig. 1 shows how errors in memory are affected by different parameters. As the operating frequency is increased, the probability of memory errors increases because a higher frequency enforces tighter bounds on the time allowed for memory accesses. An increase in $V_{dd}$ reduces the cell delay and thus causes the errors to decrease. The errors in memory increase with a rise in temperature because of the increase in cell delay. These are not the only relationships that affect memory errors. From Fig. 1 we also examine other interrelationships at work: the dynamic power dissipation in memory cells increases with both frequency ($\propto f$) and $V_{dd}$ ($\propto V_{dd}^2$). The leakage power, on the other hand, increases with $V_{dd}$ ($\propto e^{\beta V_{dd}}, \beta > 1$). Both dynamic power and leakage power determine the operating temperature. Leakage power dissipation of a cell is known to increase super-linearly with temperature. As temperature increases, the leakage power dissipation increases, which further elevates the temperature. This 'positive feedback loop' between temperature and leakage power stabilizes when the steady-state operating temperature has been reached, at which point all the dynamic and leakage power dissipation is transferred to the environment by the package [4]. This discussion implies that the probability of error is not a monotonically decreasing function of supply voltage but rather exhibits a convex behavior, as shown in Fig. 1. A comprehensive approach to memory/logic design must consider these mutually interdependent relationships.

## III. MEMORY MODELS

In order to study memory errors, we use a standard six-transistor (6T) 65nm SRAM bit cell. The SPICE models used for the simulation were obtained from the Predictive Technology Model (PTM) [10]. To model the effects of RDF on the probability of bit failures within a memory array, our simulation setup lumps the RDF effects into an independent Gaussian distribution for each transistor's threshold voltage. For transistor X in the SRAM cell:

$$V_{t_X} \sim N(\mu_{V_{t_X}}, \sigma_{V_{t_X}}) \tag{1}$$

where $\mu_{V_{t_X}}$ is the nominal $V_t$ and $\sigma_{V_{t_X}}$ is the standard deviation of the threshold voltage of transistor X. The standard deviation depends on the manufacturing process, doping profile, and transistor sizing, and can be calculated as [9]:

$$\sigma_{V_{t_X}} = \frac{qT_{ox}}{\epsilon_0} \sqrt{\frac{N_a W_d}{3LW}} \tag{2}$$

where  $N_a$  is the effective channel doping,  $W_d$  is the depletion region width,  $T_{ox}$  is the oxide thickness, L and W are the channel length and width respectively, q is the electron charge and  $\epsilon_0$  is the dielectric constant.
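As a rough numerical illustration of Equation 2, the sketch below evaluates $\sigma_{V_{t_X}}$ in SI units. The device values are hypothetical and chosen only for illustration, and the permittivity is assumed here to be that of an SiO2 gate oxide (3.9 times the vacuum permittivity); the text itself only refers to it as the dielectric constant.

```python
import math

Q_ELECTRON = 1.602176634e-19    # electron charge, C
EPS_VACUUM = 8.8541878128e-12   # vacuum permittivity, F/m
EPS_OXIDE = 3.9 * EPS_VACUUM    # ASSUMPTION: SiO2 gate dielectric


def sigma_vt(n_a, w_d, t_ox, length, width, permittivity=EPS_OXIDE):
    """Standard deviation of V_t per Equation 2 (SI inputs, result in volts)."""
    return (Q_ELECTRON * t_ox / permittivity) * math.sqrt(
        n_a * w_d / (3.0 * length * width)
    )


# Hypothetical device parameters, not taken from the paper or from PTM.
example = sigma_vt(
    n_a=3e24,       # effective channel doping, m^-3
    w_d=20e-9,      # depletion region width, m
    t_ox=1.2e-9,    # oxide thickness, m
    length=65e-9,   # channel length, m
    width=100e-9,   # channel width, m
)
print(f"sigma_Vt = {example * 1e3:.2f} mV")
```

Because $\sigma_{V_{t_X}}$ scales as $1/\sqrt{LW}$, doubling the transistor width in this sketch reduces the spread by a factor of $\sqrt{2}$, which is why the smallest transistors in the cell tend to be the most sensitive to RDF.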

Fig. 2 shows the typical six-transistor cell used for CMOS Static Random-Access Memories (SRAM). During the read operation, the read time, $T_{Read}$, is very sensitive to variations in the threshold voltages of the access transistors $(S_{R/L})$ ($S_R$ or $S_L$) and the pull-down transistors $(N_{R/L})$. During the write operation, variations in the threshold voltages of the access transistors and the pull-up transistors $(P_{R/L})$ have the strongest effect on the write time, $T_{Write}$ [6]. In order to calculate the probability of failure, we considered a $\pm 6\sigma_{V_{t_X}}$ variation for the threshold voltages of $S_{R/L}$, $N_{R/L}$, and $P_{R/L}$ based on the Gaussian distribution in Equation 1. The read time, write time, and the voltage at the storage node $(V_{R/L})$ for **each** $V_{dd}$ and **each** temperature were simulated and recorded. In the next step, the cell failure probability was calculated from the following failure events:

- (i) Read Failure (RF): An increase in the access time of the cell during read operation such that  $T_{Read} > T_{Max}$ , where  $T_{Max}$  is the maximum allowed time.
- (ii) Write Failure (WF): An increase in write time such that  $T_{Write} > T_{Max}$ .
- (iii) Destructive Read Failure (DRF): An increase in the storage node voltage such that  $V_{R/L} > V_{Trip}$ , where  $V_{Trip}$  is the trip voltage of the inverter in the SRAM. In this case the value stored in the cell will flip.

It is important to note that since we are not changing the hold voltage $(V_{Hold})$, we do not consider the hold failure in our calculations. The total probability of failure is then calculated as:

$$P_{failure} = P[RF \cup WF \cup DRF]$$
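The union in the expression above matters because the three failure events share the same threshold-voltage variations and are therefore not disjoint. The following Monte Carlo sketch illustrates the counting; the linear sensitivities, limits, and the 30 mV spread are hypothetical stand-ins for the SPICE simulations, with signs and magnitudes that are illustrative only.

```python
import random


def estimate_failure_probabilities(n_samples=20000, t_max=1.0, v_trip=0.5, seed=0):
    """Estimate P[RF], P[WF], P[DRF] and P[RF u WF u DRF] with toy cell models."""
    rng = random.Random(seed)
    counts = {"RF": 0, "WF": 0, "DRF": 0, "any": 0}
    for _ in range(n_samples):
        # Independent Gaussian V_t deviations (volts), in the spirit of Equation 1.
        dvt_access = rng.gauss(0.0, 0.03)
        dvt_pull_down = rng.gauss(0.0, 0.03)
        dvt_pull_up = rng.gauss(0.0, 0.03)

        # HYPOTHETICAL linear responses (normalized units), not fitted to any data.
        t_read = 0.8 + 3.0 * dvt_access + 2.0 * dvt_pull_down
        t_write = 0.8 + 3.0 * dvt_access + 2.0 * dvt_pull_up
        v_node = 0.3 - 2.0 * dvt_access + 3.0 * dvt_pull_down

        rf = t_read > t_max    # read failure
        wf = t_write > t_max   # write failure
        drf = v_node > v_trip  # destructive read failure

        counts["RF"] += rf
        counts["WF"] += wf
        counts["DRF"] += drf
        counts["any"] += rf or wf or drf  # a cell fails if any event occurs
    return {name: count / n_samples for name, count in counts.items()}


probabilities = estimate_failure_probabilities()
print(probabilities)
```

In this sketch the union probability is never larger than the sum of the three individual probabilities, and it is strictly smaller whenever a sample triggers more than one failure mode. Plain sampling of this kind cannot resolve very small failure probabilities with a practical number of samples, which is one motivation for the analytical treatment in Section IV.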



Fig. 2. 6T SRAM cell

## IV. MATHEMATICAL MODELING

As mentioned in Section III, the threshold voltages of the transistors in the SRAM cell can be modeled as independent Gaussian random variables. This variation in the threshold voltages results in Gaussian distributions for the read time, write time, and storage node voltage. In this section, we find the first and second moments of the distributions of these three parameters, $T_{Read}$, $T_{Write}$, and $V_{R/L}$.

### A. Read time distribution

Gaussian distributions of  $V_{t_{S_{R/L}}}$  and  $V_{t_{N_{R/L}}}$  result in a gaussian distribution for the read time [6]:

$$T_{Read} \sim N\left(\mu(V_{dd}, T), \sigma(V_{dd}, T)\right) \tag{3}$$

where T is the temperature,  $\mu(V_{dd}, T)$  is the mean of the read time and  $\sigma(V_{dd}, T)$  is the standard deviation of the read time. For a given frequency, F, one can find the maximum allowed time for the read operation,  $T_{Max}$ . Therefore, we can define the probability of error as:

$$P_e\left(V_{dd}, T, T_{Max}\right) = P[T_{Read} > T_{Max}] = Q(\tau)|_{\tau=t} \tag{4}$$

where

$$t = \frac{T_{Max} - \mu(V_{dd}, T)}{\sigma(V_{dd}, T)} \tag{5}$$

and Q(.) is the Gaussian Error Integral or Q-function and is given by:

$$Q(\tau) = \frac{1}{\sqrt{2\pi}} \int_{\tau}^{+\infty} e^{\left(-\frac{x^2}{2}\right)} dx \tag{6}$$
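Equations 4 to 6 can be evaluated numerically without explicit integration, since $Q(\tau) = \frac{1}{2}\,\mathrm{erfc}(\tau/\sqrt{2})$. The sketch below uses only the Python standard library; the function names are illustrative, and the inverse is included because it is needed whenever a target error probability is specified rather than a timing margin.

```python
import math
from statistics import NormalDist


def q_function(tau):
    """Gaussian tail probability Q(tau) of Equation 6."""
    return 0.5 * math.erfc(tau / math.sqrt(2.0))


def q_inverse(probability):
    """Return t such that Q(t) = probability, for 0 < probability < 1.

    Uses Q(t) = Phi(-t), where Phi is the standard normal CDF.
    """
    return -NormalDist().inv_cdf(probability)


def error_probability(t_max, mu, sigma):
    """Equations 4 and 5: P[T_Read > T_Max] for a Gaussian read time."""
    return q_function((t_max - mu) / sigma)


# Example with made-up numbers: mean 60 ps, sigma 1 ps, limit 65 ps.
print(error_probability(t_max=65.0, mu=60.0, sigma=1.0))
print(q_inverse(1e-8))
```

Working with the normalized margin $t$ rather than with the probability directly is numerically convenient, because the probabilities of interest span many orders of magnitude while $t$ stays within a small range.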

For a given probability of error of  $P_{E0}$ ,  $t_0$  can be calculated such that  $Q(t_0) = P_{E0}$ . Thus for  $P_{E0}$  and F, we have:

$$t_0 = \frac{T_{Max} - \mu(V_{dd}, T)}{\sigma(V_{dd}, T)} \tag{7}$$

where  $t_0$  and  $T_{Max}$  are constant. Solving this equation for  $V_{dd}$  results in:

$$V_{dd} = G(T) \tag{8}$$

In other words, from Equation 8 one can calculate the required $V_{dd}$ that results in probability of error $P_{E0}$ at temperature T. On the other hand, the choice of floorplan, the frequency of operation, and the activity of the neighboring blocks make the temperature dependent on $V_{dd}$, which can be expressed as:

$$T = H(V_{dd}) \tag{9}$$

The function H is fixed for a block (in this case embedded memory) in a given floorplan and can be calculated by using simulation (HotSpot [13]) or can be found at run time by using temperature sensors. Therefore, for a given probability of error and frequency, one can find the appropriate  $V_{dd}$ , which results in the probability of error  $P_{E0}$  for a given frequency while factoring in the effect of temperature by solving:

$$V_{dd} - G(H(V_{dd})) = 0 \tag{10}$$
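A minimal sketch of how Equation 10 might be solved numerically is given below. Rather than forming $G$ explicitly, it solves the equivalent condition $P_e(V_{dd}, H(V_{dd}), T_{Max}) = P_{E0}$ by scanning a voltage grid for sign changes and refining each one by bisection. The models for $H$, $\mu$ and $\sigma$ are toy functions invented for illustration; in practice they would come from thermal simulation and from fitted circuit data.

```python
import math


def q_function(tau):
    return 0.5 * math.erfc(tau / math.sqrt(2.0))


def temperature(vdd):
    """TOY H(V_dd) in deg C; hypothetical stand-in for a thermal simulation."""
    return 30.0 + 20.0 * math.exp(4.0 * (vdd - 0.8))


def read_time_mean(vdd, temp):
    """TOY mu(V_dd, T) in ps: faster at higher V_dd, slower when hot."""
    return 50.0 - 30.0 * (vdd - 1.0) + 0.2 * (temp - 25.0)


def read_time_sigma(vdd, temp):
    """TOY sigma(V_dd, T) in ps (no direct V_dd dependence in this sketch)."""
    return 1.0 + 0.005 * (temp - 25.0)


def log_error(vdd, t_max):
    """log10 of the error probability along the V_dd-temperature profile."""
    temp = temperature(vdd)
    t = (t_max - read_time_mean(vdd, temp)) / read_time_sigma(vdd, temp)
    return math.log10(q_function(t))


def solve_vdd(target, t_max, lo=0.70, hi=1.20, steps=50):
    """Return every V_dd in [lo, hi] at which the error probability equals target."""
    goal = math.log10(target)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    roots = []
    for a, b in zip(grid, grid[1:]):
        fa, fb = log_error(a, t_max) - goal, log_error(b, t_max) - goal
        if fa == 0.0:
            roots.append(a)
            continue
        if fa * fb >= 0.0:
            continue  # no sign change inside this interval
        for _ in range(60):  # bisection refinement
            mid = 0.5 * (a + b)
            fm = log_error(mid, t_max) - goal
            if fa * fm <= 0.0:
                b = mid
            else:
                a, fa = mid, fm
        roots.append(0.5 * (a + b))
    return roots


solutions = solve_vdd(target=1e-4, t_max=65.0)
print(solutions, "lowest-power choice:", min(solutions))
```

With these toy models the search returns two supply voltages that meet the same error target, one on each side of the minimum of the error curve. Scanning for all sign changes, instead of running a single root finder from one starting point, is what allows both solutions to be found.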

### B. Read time parameters

Using Monte Carlo HSPICE simulation of the read time and solving nonlinear data-fitting problems in the least-squares sense over $V_{dd}$ and T results in the following formulation for the mean and standard deviation of the random variable $T_{Read}$:

$$\begin{aligned}
\mu(V_{dd}, T) &= a_1 V_{dd}^3 + a_2 T^3 + a_3 V_{dd}^2 T + a_4 V_{dd} T^2 + a_5 V_{dd}^2 + a_6 T^2 + a_7 V_{dd} T + a_8 V_{dd} + a_9 T + a_{10} \\
\sigma(V_{dd}, T) &= b_1 V_{dd}^3 + b_2 T^3 + b_3 V_{dd}^2 T + b_4 V_{dd} T^2 + b_5 V_{dd}^2 + b_6 T^2 + b_7 V_{dd} T + b_8 V_{dd} + b_9 T + b_{10}
\end{aligned} \tag{11}$$
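Because the surfaces in Equation 11 are linear in their coefficients, one simple way to obtain them is an ordinary linear least-squares fit over the ten monomials. The sketch below uses NumPy and is not necessarily the fitting procedure used for the results in this paper. The data are synthetic and made up for illustration, and the temperature is divided by 100 before fitting purely to improve numerical conditioning (an implementation choice that changes the coefficient values but not the form of Equation 11).

```python
import numpy as np

TEMP_SCALE = 100.0  # ASSUMPTION: rescale T for better conditioning


def cubic_design_matrix(vdd, temp):
    """Columns follow the term order of Equation 11 (a_1 ... a_10)."""
    v = np.asarray(vdd, dtype=float)
    s = np.asarray(temp, dtype=float) / TEMP_SCALE
    return np.column_stack([
        v**3, s**3, v**2 * s, v * s**2,
        v**2, s**2, v * s, v, s, np.ones_like(v),
    ])


def fit_cubic_surface(vdd, temp, values):
    """Least-squares coefficients of the bivariate cubic for sampled values."""
    coeffs, *_ = np.linalg.lstsq(
        cubic_design_matrix(vdd, temp), np.asarray(values, dtype=float), rcond=None
    )
    return coeffs


def evaluate_cubic_surface(coeffs, vdd, temp):
    return cubic_design_matrix(vdd, temp) @ coeffs


# Synthetic stand-in for simulated mean read times (ps); NOT data from the paper.
vdd_grid, temp_grid = np.meshgrid(np.linspace(0.7, 1.3, 7), np.arange(25.0, 160.0, 5.0))
v, t = vdd_grid.ravel(), temp_grid.ravel()
mean_read_time = 60.0 - 30.0 * (v - 1.0) + 0.1 * t + 0.05 * v**2 * t

a = fit_cubic_surface(v, t, mean_read_time)
print(np.max(np.abs(evaluate_cubic_surface(a, v, t) - mean_read_time)))
```

The same routine can be applied a second time to the sampled standard deviations to obtain the $b_i$ coefficients.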

Substituting Equation 11 into Equation 7 yields a cubic equation in $V_{dd}$ whose real root expresses $V_{dd}$ as a function of temperature. Every cubic equation with real coefficients has at least one real root; this is a consequence of the intermediate value theorem [14]. Equation 7 can now be expressed as:

$$t_0(b_1V_{dd}^3 + \dots + b_{10}) - \left(T_{Max} - (a_1V_{dd}^3 + \dots + a_{10})\right) = 0 \tag{12}$$

where  $t_0$ ,  $T_{Max}$  are constants. It is required to solve the above equation for  $V_{dd}$  and find the three possible roots,  $V_{1,2,3}$  as a function of T, i.e.  $G_k(T)$ , k = 1, 2, 3.

Over the entire acceptable range of temperature, $T_{Max}$, and $t_0$, Equation 12 yields one real root, $G(T)$, and two complex roots, $G_{i_{1,2}}(T)$ [17].
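To make the cubic structure of Equation 12 explicit, its terms can be grouped by powers of $V_{dd}$ using the expansions in Equation 11. The shorthand $c_i = a_i + t_0 b_i$ is introduced here only for compactness:

$$c_1 V_{dd}^3 + (c_3 T + c_5)\,V_{dd}^2 + (c_4 T^2 + c_7 T + c_8)\,V_{dd} + \left(c_2 T^3 + c_6 T^2 + c_9 T + c_{10} - T_{Max}\right) = 0, \qquad c_i = a_i + t_0 b_i$$

For a fixed temperature T, the four grouped coefficients are constants, so any standard cubic solver applied to them returns the three roots $G_k(T)$.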

Fig. 3 illustrates $G(T)$, the real solution to Equation 12, for a fixed frequency and different probabilities of error (solid lines). Each solid line corresponds to one target probability of error. The figure also shows $H(V_{dd})$, the temperature profile for a fixed floorplan (dashed line). The intersection(s) of the dashed line with each solid line are the solutions to Equation 10 for the corresponding $t_0$ (or $P_{E0}$). This figure shows that for high reliability (low probability of failure) there can be more than one $V_{dd}$ which results in the same probability of error. This is due to the dependency of the probability of error on both temperature and $V_{dd}$. In these cases the optimum $V_{dd}$ is the lower one, which results in lower power consumption. A similar analysis can be done for the write time, as shown in Fig. 4. For the circuit under test (the choice of cell and periphery circuit sizing), the destructive read failure is negligible; however, a similar analysis can be done for destructive read failure as well.

Fig. 3. G(T), H(T) and the optimum $V_{dd}$ choices for read operation in 65nm 6-T SRAM for different probability of error and a fixed frequency.

Fig. 4. G(T), H(T) and the optimum $V_{dd}$ choices for write operation in 65nm 6-T SRAM for different probability of error and a fixed frequency.

## V. TRAM: TEMPERATURE AND RELIABILITY AWARE MEMORY DESIGN

Fig. 5 illustrates the overview of TRAM, which generates reliability values for embedded memory blocks as a function of the supply voltage and frequency while taking temperature into account. The operating frequency for memory accesses determines the memory's dynamic power dissipation for a given $V_{dd}$. As mentioned earlier, there is a positive feedback loop between leakage and temperature. Unlike leakage power, the dynamic power is not directly affected by temperature. As the temperature increases, the total power dissipation also increases. Moreover, the heat dissipated to the environment by the package increases as its temperature increases. The package dissipates power from the substrate to the environment in the form of heat, and in order to do so, the package temperature must be higher than that of the surrounding environment. A steady-state operating temperature is reached when the power generated by the blocks is balanced by the power dissipated by the cooling mechanisms and package. If the power generated becomes greater than the power dissipation capacity of the package, the temperature will rise beyond the thermal runaway temperature and there will be a thermal meltdown. This phenomenon has been validated by data from both industrial test facilities and other published works [15], [16].

Fig. 5. Overview of TRAM.
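The balance between generated and dissipated power described above can be sketched as a fixed-point iteration on temperature. The model below is a deliberately simplified, single-block illustration: the lumped thermal resistance, the power coefficients, and the runaway limit are all hypothetical, and it does not replace a floorplan-aware thermal simulation.

```python
import math


def dynamic_power(vdd, freq_ghz, c_eff=1.0):
    """Dynamic power in W, proportional to f * V_dd^2 (c_eff is hypothetical)."""
    return c_eff * freq_ghz * vdd**2


def leakage_power(vdd, temp_c, p_ref=0.2, beta=2.0, k_temp=0.02):
    """Leakage power in W, exponential in V_dd and in temperature.

    ASSUMPTION: p_ref is the leakage at 1.0 V and 25 deg C; beta and k_temp
    are illustrative constants, not fitted values.
    """
    return p_ref * math.exp(beta * (vdd - 1.0)) * math.exp(k_temp * (temp_c - 25.0))


def steady_state_temperature(vdd, freq_ghz, t_amb=25.0, r_th=20.0,
                             t_runaway=200.0, tol=1e-6, max_iter=1000):
    """Iterate T = T_amb + R_th * P_total(T) until it settles.

    Returns the steady-state temperature in deg C, or None when the loop
    exceeds t_runaway (thermal runaway) or fails to converge.
    """
    temp = t_amb
    for _ in range(max_iter):
        power = dynamic_power(vdd, freq_ghz) + leakage_power(vdd, temp)
        new_temp = t_amb + r_th * power  # heat removed through the package
        if new_temp > t_runaway:
            return None
        if abs(new_temp - temp) < tol:
            return new_temp
        temp = new_temp
    return None


for vdd in (0.9, 1.0, 1.1, 1.2, 1.3):
    print(vdd, steady_state_temperature(vdd, freq_ghz=1.0))
print("poor package:", steady_state_temperature(1.3, freq_ghz=1.0, r_th=60.0))
```

In this sketch a higher $V_{dd}$ settles at a higher steady-state temperature, while a sufficiently large thermal resistance (a package that removes heat poorly) makes the iteration diverge, which corresponds to the thermal runaway case.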

The thermal floorplan of the SoC is used to obtain the $V_{dd}$-temperature dependency. The memory temperature rises as $V_{dd}$ increases because of the dependence of dynamic and leakage power on $V_{dd}$. As $V_{dd}$ increases, the probability of errors decreases for the memory. However, the accompanying increase in temperature increases the probability of errors. The proposed tool (TRAM) uses temperature estimates of the memory under design to generate a better estimate of the expected probability of error in the memory. TRAM includes the effect of RDF on the threshold voltages as a Gaussian distribution, as discussed in Section III. TRAM allows designers to explore memory $V_{dd}$ selection for the tradeoff between power and reliability in a thermally aware environment and quantifies the effect of temperature on the probability of failures in memories. We have observed that in some design spaces, the effect of temperature becomes more dominant than that of $V_{dd}$. As a result, an increase in $V_{dd}$ does not necessarily decrease the probability of failure. The following section describes our simulation setup and three sets of results: (i) a comparison of reliability estimates at the estimated temperature profile versus two corner-case temperatures, (ii) the effect of temperature on the reliability estimates for memory at four different speeds, and (iii) the expected total power savings from thermally aware $V_{dd}$ selection.

## VI. EXPERIMENTAL RESULTS

### A. Simulation setup

The temperature of a memory block in an SoC does not depend only on its own power density (W/mm<sup>2</sup>). Due to thermal diffusion among neighboring blocks, the power densities of the neighboring logic blocks (or IP blocks) affect the temperature of the memory. Hence, the floorplan of the SoC must be considered during temperature estimation of the memory. For our experiments, we used HotSpot 3.2 [13] to determine the temperature of memory blocks at different supply voltages. Furthermore, the platform models the positive feedback relation between temperature and leakage power. For each $V_{dd}$ setting and estimated temperature, a SPICE simulation is performed to calculate the probability of failure using the memory error models described in Section III.

TABLE I. TRAM SIMULATION TIME

| Component                     | Tool    | Simulation time (sec) |
|-------------------------------|---------|-----------------------|
| Temperature Estimator         | HotSpot | 150                   |
| Memory reliability calculator | HSPICE  | 420                   |
|                               | Matlab  | 120                   |

TRAM utilizes HSPICE (version Z-2007.03-SP1) for circuit simulations. All TRAM simulations were performed on a Dell PowerEdge 2900 machine (Red Hat Enterprise Linux 4 Update 4, RHEL 4.4) with 4 GB of RAM and a 3.0 GHz Intel Xeon CPU. The simulated voltages are $V_{dd} = 1.0V \pm 0.3V$, where $V_{dd_{Nominal}} = 1.1V$. In circuit simulations, for each $V_{dd}$ the temperature has been varied from $25^{\circ}C$ to $155^{\circ}C$ in steps of $5^{\circ}C$. Table I includes the HotSpot simulation time to calculate the temperature profile for a given floorplan of an SoC with 18 blocks. This simulation needs to be executed only once per floorplan. For each $V_{dd}$, the table includes the probability of failure simulation time, consisting of the HSPICE circuit simulation time and the Matlab (Version 7.2.0.283, R2006a) simulation time for the probability of error calculation. The reported simulation time is for each $V_{dd}$ and the whole range of temperatures $(25^{\circ}C \longrightarrow 150^{\circ}C)$.

### B. Reliability estimates for different thermal profiles

Fig. 6 shows the relationship between the probability of error and $V_{dd}$ for a cell with a maximum allowed time of 65ps. The curves show the probability of error for the estimated temperature profile and at two corner-case temperatures of $25^{\circ}C$ and $105^{\circ}C$. Since $\pm 6\sigma_{V_{t_X}}$ is considered for each transistor, the smallest probability of error that can be recorded is $10^{-18}$. From the figure, we observe that the probability of error is significantly higher at higher temperatures. For example, at $V_{dd} = 0.9V$ the probability of error is on the order of $10^{-8}$ at $25^{\circ}C$ versus $10^{-1}$ at $105^{\circ}C$<sup>1</sup>. However, by starting at room temperature and considering the interrelationships between $V_{dd}$ and temperature, we observe that the probability of error is on the order of $10^{-6}$ at $V_{dd} = 0.9V$. It is important to observe that for the $V_{dd}$-dependent curve, an inversion in the trend of the probability of error occurs at $V_{dd} = 1.1V$ (marked $\gamma$). This occurs due to the dominance of the temperature effect (which increases the probability of error). Beyond this point, an increase in $V_{dd}$ fails to reduce the probability of error. This observation demonstrates that an increase in $V_{dd}$ does not guarantee a reduction in the probability of error because of the temperature effect. Furthermore, we observe that using a fixed temperature model, even when assuming a high temperature corner, provides an overly optimistic estimate of memory reliability at higher $V_{dd}$ values, and deviates significantly from the realistic cases at all $V_{dd}$ values.

Fig. 6. Probability of Error for Different Temperature Profiles

Fig. 7. Probability of Error for Different Frequencies

### C. Effect of $V_{dd}$ and temperature on memory reliability

The effect of the interplay between $V_{dd}$, probability of error, and temperature for different cell speeds is shown in Fig. 7. Curves (1) through (4) represent cells with decreasing maximum allowed times of 70*ps*, 67*ps*, 65*ps*, and 60*ps* respectively. We observe that as the frequency of the cell increases, the probability of error also increases. We also observe that for curves (2), (3), and (4), an increase in $V_{dd}$ reduces the probability of error but only up to a certain point (marked **X**). After **X**, the rise in temperature due to $V_{dd}$ increases the memory errors. However, for Curve (1) we do not observe this behavior because the speed of the cell is low and the probability of failure is not detectable by our simulation setup. In the absence of thermal considerations, these curves would have continued to exhibit decreasing probabilities of error with increasing $V_{dd}$.

### D. Power saving

Fig. 8 shows the normalized total power dissipation and the probability of error for a cell with a maximum allowed time of 65*ps*. Initially, the effect of the increase in $V_{dd}$ is dominant and the probability of error decreases as $V_{dd}$ increases. However, at higher $V_{dd}$ the effect of the resulting temperature becomes dominant and the probability of error increases with $V_{dd}$. The figure illustrates that for a given probability of failure target, two voltage levels can be chosen that achieve the desired target. For example, at a target probability of error of $10^{-8}$, one can select either 1.0V (Point A) or 1.16V (Point B). However, because dynamic power $\propto V_{dd}^2$, the dynamic power at 1.16V is about 34.5% higher than that at 1.0V ($1.16^2 \approx 1.3456$); equivalently, operating at 1.0V lowers the dynamic power by about 25.7%. Leakage power grows even faster with $V_{dd}$ ($\propto e^{\beta V_{dd}}$, $\beta > 1$), even before its thermal dependence is taken into account. In total, the power at 1.0V is 2.5X lower than that at 1.16V. Designers can save significant power by operating at the lower $V_{dd}$ while maintaining performance levels.

Fig. 8. Probability of Error and Total Power

Fig. 9. Comparing analytical and simulation results.

<sup>1</sup>$25^{\circ}C$ and $105^{\circ}C$ are industry standards for cell characterization.

### E. Comparing mathematical and simulation results

Fig. 9 compares the analytical and simulation results. For simulations, a 6T SRAM cell (as shown in Fig. 2) and the Predictive Technology Model (PTM) [10] are used. In the HSPICE simulation, the load on the BLT and BLC was assumed to be 64 cells, half of them storing '1' and half of them storing '0'. Note that the choice of the control circuitry and device dimensions does not affect the overall trends, and the conclusions can be generalized to any other memory circuit.

In the simulations $V_{dd}$ has been varied from 0.75V to 1.2V and the corresponding temperature has been calculated using HotSpot [13]. The analytical results are obtained based on the discussion in Section IV. For both simulation and analytical results, $T_{Max}$ is assumed to be 75ps.

## VII. CONCLUSION

In this paper we proposed a Temperature- and Reliability-Aware Memory Design (TRAM) approach which allows designers to explore embedded memory design while considering both thermal and error (reliability) issues. Designers can examine the effects of frequency,  $V_{dd}$ , power dissipation, and temperature on reliability in a mutually interrelated manner. Our experimental results indicate that thermal *un*aware estimation of probability of error can be off by at least two orders of magnitude and up to five orders of magnitude from the realistic, temperature-aware cases. We also observed that thermal aware  $V_{dd}$  selection using TRAM can reduce the total power dissipation by up to 2.5X for a predefined limit on errors. Finally, we also present a mathematical model which captures the effects of  $V_{dd}$  and temperature on memory performance and their interrelationships.

## REFERENCES

- [1] V. De et al., "Technology and design challenges for low power and high performance," *Proc. of Intnl. Symp. on Low Power Electronics and Design*, 1999.
- [2] A. Agarwal, "A process-tolerant cache architecture for improved yield in nanoscale technologies," *TVLSI*, 2005.
- [3] K. Banerjee et al., "Analysis of Non-Uniform Temperature-Dependent Interconnect Performance in High Performance ICs," Proc. of Design Automation Conference, 2001.
- [4] A. Gupta et al., "A System Level Leakage-Aware Floorplanner for SoCs," Proc. of ASP-DAC, 2007.
- [5] A. Viterbi et al., "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," *IEEE Transactions on Information Theory*, April 1967.
- [6] A. Khajeh et al., "Cross Layer Error Exploitation for Aggressive Voltage Scaling," International Symposium on Quality Electronic Design, 2007.
- [7] T. Gupta and A. H. Jayatissa. "Recent advances in nanotechnology: key issues & potential problem areas". *In Proceedings of IEEE Conference* on Nanotechnology, Vol. 2, 2003.
- [8] Y. Taur and T. H. Ning. "Fundamentals of Modern VLSI Devices". New York: Cambridge Univ. Press, 1998.
- [9] S. Mukhopadhyay, H. Mahmoodi and K. Roy. "Estimation of delay variations due to random-dopant fluctuations in nanoscale cmos circuits". *IEEE Journal of Solid-State Circuits*, September 2005.
- [10] Predictive Technology Model (PTM). "http://www.eas.asu.edu/~ptm".
- [11] A. Bhavnagarwala, X. Tang and J. D. Meindl. "The impact of intrinsic device fluctuations on cmos sram cell stability". *IEEE Journal of Solid-State Circuits*, April 2001.
- [12] H. Mahmoodi, S. Mukhopadhyay and K. Roy. "Modeling of failure probability and statistical design of sram array for yield enhancement in nano-scaled cmos". *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 2003.
- [13] K. Skadron et al., "An Improved Block-Based Thermal Model in HotSpot-4.0 with Granularity Considerations," *Proceedings of the Workshop on Duplicating, Deconstructing, and Debunking*, 2007.
- [14] J. Renze et al.,"Intermediate Value Theorem." From MathWorld-A Wolfram Web Resource. http://mathworld.wolfram.com/IntermediateValueTheorem.html
- [15] L. He et al., "Considering the Interdependence of Temperature and Leakage," DAC, 2004.
- [16] S. Lin et al., "A Thermally-Aware Methodology for Design-Specific Optimization of Supply and Threshold Voltages in Nanometer Scale ICs," ICCD, 2005.
- [17] http://en.wikipedia.org/wiki/Cubic\_equation.