# Dynamic Write Limited Minimum Operating Voltage for Nanoscale SRAMs

Satyanand Nalam<sup>†</sup>, Vikas Chandra<sup>‡</sup>, Robert C. Aitken<sup>‡</sup>, Benton H. Calhoun<sup>†</sup> <sup>†</sup>Dept. of ECE, University of Virginia, Charlottesville; <sup>‡</sup>ARM R&D, San Jose {svn2u,bcalhoun}@virginia.edu {vikas.chandra,raitken}@arm.com

Abstract—Dynamic stability analysis for SRAM has been growing in importance with technology scaling. This paper analyzes dynamic writability for designing low voltage SRAM in nanoscale technologies. We propose a definition for dynamic write limited  $V_{\rm MIN}$ . To the best of our knowledge, this is the first definition of a  $V_{\rm MIN}$  based on dynamic stability. We show how this  $V_{\rm MIN}$  is affected by the array capacity, the voltage scaling of the word-line pulse, the bitcell parasitics, and the number of cycles prior to the first read access. We observe that the array can be either dynamically or statically write limited depending on the aforementioned factors. Finally, we look at how voltage-bias based write assist techniques affect the dynamic write limited  $V_{\rm MIN}$ .

## I. INTRODUCTION

Static Random Access Memory (SRAM) is a critical component in most VLSI systems. As the need for low power systems grows, the supply voltage ( $V_{DD}$ ) has been scaled down to reduce both dynamic and leakage power. The 6T SRAM bitcell is composed of close to minimum sized devices to meet stringent area requirements. This increases the impact of local mismatch between the bitcell transistors and leads to reduced stability [1], typically measured in terms of static noise margin (SNM). Further, as SRAM capacity continues to increase, variability degrades the stability of the worst case cell in an array. Thus, scaling SRAM voltage to design a low power system becomes challenging due to the trade-off with cell stability.

 $V_{MIN}$  is defined as the minimum voltage that ensures correct operation. An accurate prediction of  $V_{MIN}$  is necessary for designing a low power SRAM that meets retention, read, and write functional yield requirements. Existing attempts at  $V_{MIN}$  prediction for standby, active operation, and yield estimation are based on SNM during hold, read, and write operations [2][3]. However, static metrics tend to be optimistic for write and pessimistic for read, since by definition, the cell disturbance is considered to be of infinite duration. Thus, to be able to estimate  $V_{MIN}$  more accurately, it is imperative to consider dynamic margins (DMs) for cell read and write stability [4].

There have been several recent works that investigate SRAM read and write operations from a dynamic perspective. The authors in [4] investigate dynamic read and write stability in terms of the separatrix, which divides the SRAM state space into two stability regions. The read and write DMs are defined as the margin between $T_{WL}$, the width of the word-line (WL) pulse, and $T_{ACROSS}$, the time taken to cross the separatrix. Dynamic read stability is investigated in [5], which takes into account the impact of repeated read accesses on the dynamic stability of the bitcell. In [6], the authors show that the minimum T<sub>WL</sub>, or T<sub>WL-CRIT</sub>, at which the cell flips state is a strong indicator of cell dynamic writability. In [5] and [6], the authors also show that the cell recovery time (i.e. the duration for which the word-line is off) affects read and write dynamic stability respectively. In this work, we focus on dynamic writability analysis alone since static write margin is optimistic and tends to underestimate V<sub>MIN</sub>. Also, the PMOS pull-ups that influence the write operation are typically the smallest devices in the cell, and hence more impacted by variability. This makes write failure more likely, especially in newer technologies [7]. Although dynamic stability has been researched extensively in recent times, most of the work has focused on defining DMs [4][6], devising ways to calculate them analytically [8] or on-chip [9], or calculating failure probability based on dynamic stability [10]. In this work, we focus on how dynamic stability affects V<sub>MIN</sub>. To enable lower power through lower V<sub>MIN</sub>, writability can be improved by voltage-bias based assist techniques such as [11][12][13]. The impact of different assist techniques on write SNM [14] and on dynamic writability [15] has been investigated earlier. In particular, [15] investigates the efficacy of various assist techniques in reducing T<sub>WL-CRIT</sub>.

In this work, our first contribution is to develop the concept of a dynamic write-limited  $V_{MIN}$  (DWV<sub>MIN</sub>), as opposed to a static limited  $V_{MIN}$ . As our second contribution, we investigate how dynamic write-limited  $V_{MIN}$  is affected by various write assist methods.

The rest of the paper is organized as follows. In section II, we overview the  $T_{WL-CRIT}$  metric and describe how we determine worst case  $T_{WL-CRIT}$  for a large array. In section III, we define DWV<sub>MIN</sub>, compare dynamic and static V<sub>MIN</sub>, and analyze factors that affect DWV<sub>MIN</sub>. In section IV, we explore the impact of write assist techniques on the DWV<sub>MIN</sub>. Section V summarizes this work and concludes.

## II. DYNAMIC WRITABILITY METRIC

The WL pulse clearly plays an important role in the dynamics of the write operation. The $T_{WL-CRIT}$ metric, referred to as $T_{CRIT}$ in [6], can be easily estimated using simple circuit simulations that account for the WL pulse duration and shape. We therefore choose this metric for our analysis. We use a commercial SPICE simulator and a binary search to determine the $T_{WL-CRIT}$ required for the cell to flip.
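The binary search over the WL pulse width can be sketched as follows. This is a minimal illustration, not the flow used in this work: `find_t_wl_crit`, `toy_cell_flips`, and the constants `VDD`, `V_TRIP`, and `TAU` are hypothetical names and values, and the toy predicate stands in for a transient SPICE run followed by a check of the storage nodes. The sketch assumes that writability is monotonic in the pulse width, so that any pulse longer than a successful one also flips the cell.

```python
import math
from typing import Callable, Optional


def find_t_wl_crit(cell_flips: Callable[[float], bool],
                   t_lo: float, t_hi: float,
                   rel_tol: float = 1e-3) -> Optional[float]:
    """Binary-search the smallest pulse width in [t_lo, t_hi] that flips the cell.

    Assumes cell_flips is monotonic in the pulse width. Returns None when even
    the longest pulse fails, which is how a statically limited cell shows up.
    """
    if not cell_flips(t_hi):
        return None
    if cell_flips(t_lo):
        return t_lo
    while t_hi - t_lo > rel_tol * t_hi:
        t_mid = 0.5 * (t_lo + t_hi)
        if cell_flips(t_mid):
            t_hi = t_mid  # still flips: the critical width is at or below t_mid
        else:
            t_lo = t_mid  # does not flip: the critical width is above t_mid
    return t_hi


# Hypothetical stand-in for the circuit simulation: the node holding '1'
# discharges exponentially and the write succeeds once it drops below a
# trip point. All three values are made up for illustration.
VDD, V_TRIP, TAU = 1.0, 0.3, 20e-12


def toy_cell_flips(t_wl: float) -> bool:
    return VDD * math.exp(-t_wl / TAU) < V_TRIP


t_crit = find_t_wl_crit(toy_cell_flips, t_lo=1e-12, t_hi=1e-9)
print(f"toy T_WL-CRIT = {t_crit * 1e12:.2f} ps")
```

In a real flow, each call to the predicate would be one transient simulation, so the number of simulations per cell grows only logarithmically with the required resolution.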

## A. Defining Dynamic Writability

T<sub>WL-CRIT</sub> is defined as the minimum T<sub>WL</sub> required for the cell to flip, or more specifically, for the voltage difference between the storage nodes to be larger than a threshold value (we use 90% of the  $V_{DD}$ ). Two variations of  $T_{WL-CRIT}$ , based on different criteria for write failure, are described in [6]. In the first case, the cell nodes are sampled a long time after the end of the WL pulse. This definition is optimistic as it does not consider the possibility of the cell being read in the next cycle. If the cell has not fully flipped before the start of the read, it is possible that the correct value is not read. The second definition of T<sub>WL-CRIT</sub>, T<sub>WL-CRIT-RAW</sub>, as defined in [6], takes into account such a Read-after-write (RAW) scenario. For this metric, the bitline differential that develops at the end of a read immediately following a write, is checked. If it is less than a certain threshold, the preceding write operation is considered unsuccessful.

These definitions form bounds on the "real"  $T_{WL-CRIT}$  of the cell, which would depend on when the cell nodes are checked to see if they have flipped. This again depends on what the earliest time is that the cell needs to be read. The earlier the cell needs to be read, the stricter the failure criterion, and the larger the  $T_{WL-CRIT}$  for a successful write.

In this work, we mainly use the first definition of $T_{WL-CRIT}$ above. We check the cell nodes after a time that is three orders of magnitude larger than $T_{WL}$. However, in the next section, we do consider how $T_{WL-CRIT}$ and DWV<sub>MIN</sub> are affected by the write failure criterion. For this purpose, instead of using $T_{WL-CRIT-RAW}$, we consider $T_{WL-CRIT-CYC}$, a variant of $T_{WL-CRIT}$ that is easier to estimate. This metric requires the cell nodes to flip (i.e. the voltage difference between them to be larger than the threshold) by the end of the write cycle. We assume a 50% duty cycle. Table I summarizes the metrics discussed.

TABLE I

Summary of dynamic writability metrics

| Metric                   | Criterion                                                |
|--------------------------|----------------------------------------------------------|
| T<sub>WL-CRIT</sub>     | Nodes flip eventually                                    |
| T<sub>WL-CRIT-RAW</sub> | Appropriate bitline differential at the end of RAW cycle |
| T<sub>WL-CRIT-CYC</sub> | Nodes flip by the end of the write cycle                 |

## B. Estimating Worst Case T<sub>WL-CRIT</sub>

The T<sub>WL-CRIT</sub> distribution is clearly not Gaussian, but follows extreme order statistics, and is characterized by a long tail [6][15]. To estimate T<sub>WL-CRIT</sub> values further out the tail, we attempted to fit it to known long tail distributions. For a large class of distributions that satisfy the Balkema-de Haan-Pickands (BdP) theorem, it is possible to fit a Generalized Pareto Distribution (GPD) to the data and make predictions further out the tail [16]. However, through hypothesis testing, we verified that none of the long-tail distributions that satisfy the BdP theorem fit the T<sub>WL-CRIT</sub> data. Since we were unable to fit a known distribution to T<sub>WL-CRIT</sub> and standard Monte Carlo (MC) is too expensive, we use Recursive Statistical Blockade (SB) [17] to estimate worst case $T_{WL-CRIT}$ values for specific array sizes (e.g. 100kb and 10Mb).

The methodology is as follows. We first run a thousand-sample MC to generate a training set for SB. The mismatch parameters in the SRAM cell that we vary are the threshold voltages ($V_T$) of the six transistors. The training samples are then used to build a classifier that identifies potential tail candidates. We use a tail threshold of 99% and a classifier threshold of 97% to minimize false negatives (i.e. tail points classified as non-tail points). We then run the T<sub>WL-CRIT</sub> simulation for the filtered tail candidates alone to identify the worst case for a 100kb array. The whole procedure is then repeated recursively according to the algorithm in [17] to determine the worst case T<sub>WL-CRIT</sub> value for a 10Mb array.
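A single, non-recursive stage of this filtering flow can be sketched as below. The sketch makes several loud assumptions: `toy_t_wl_crit` is a hypothetical closed-form surrogate for the SPICE-based search, the sensitivities in `SENS` and the 30 mV $V_T$ sigma are invented for illustration, and a least-squares fit on the logarithm of the metric is swapped in for the classifier of [17]. It only shows how the 99% tail threshold and the relaxed 97% classifier threshold are used to avoid simulating most of the samples.

```python
import numpy as np

SIGMA_VT = 0.03  # assumed per-device V_T sigma in volts (illustrative only)
# Hypothetical sensitivities of log(T_WL-CRIT) to the six V_T offsets,
# ordered PD0, PD1, PU0, PU1, PG0, PG1.
SENS = np.array([2.0, -2.0, 3.0, -12.0, -3.0, 15.0])


def toy_t_wl_crit(dvt):
    """Stand-in for the simulated T_WL-CRIT (picoseconds) of each sample row."""
    return 30.0 * np.exp(dvt @ SENS)


def fit_filter(x, y):
    """Least-squares fit of log(y) against the V_T offsets plus an intercept."""
    design = np.column_stack([x, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    return coef


def predict(coef, x):
    """Predicted metric for each row of x using the fitted filter."""
    return np.exp(np.column_stack([x, np.ones(len(x))]) @ coef)


def blockade_worst_case(n_cells, n_train=1000, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: small, fully simulated Monte Carlo run used as the training set.
    x_train = rng.normal(0.0, SIGMA_VT, size=(n_train, 6))
    y_train = toy_t_wl_crit(x_train)
    tail_thr = np.percentile(y_train, 99)   # tail threshold
    class_thr = np.percentile(y_train, 97)  # relaxed classifier threshold
    # Step 2: train the filter on the training samples.
    coef = fit_filter(x_train, y_train)
    # Step 3: draw one sample per cell, but simulate only predicted tail candidates.
    x = rng.normal(0.0, SIGMA_VT, size=(n_cells, 6))
    candidates = x[predict(coef, x) >= class_thr]
    y_tail = toy_t_wl_crit(candidates)
    return y_tail.max(), len(candidates), tail_thr


worst, n_sim, tail_thr = blockade_worst_case(100_000)
print(f"simulated {n_sim} of 100000 samples, worst case = {worst:.1f} ps")
```

Because the classifier threshold is looser than the tail threshold, some non-tail points are simulated unnecessarily, which is the price paid for reducing the chance of blocking a true tail point.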

## III. DYNAMIC WRITE STABILITY LIMITED $V_{MIN}$

The T<sub>WL-CRIT</sub> described in the previous section is a measure of the dynamic writability of the bitcell. However, it is not a true "margin" and does not reflect how close to failure the cell is. We propose a new definition of dynamic write margin as  $T_{WL} - T_{WL-CRIT}$ , the difference between the WL pulse width and the critical pulse width required for the cell to flip. The larger the T<sub>WL</sub> compared to T<sub>WL-CRIT</sub>, the larger the margin, meaning that the cell is less susceptible to dynamic write failure.

## A. Definition

We now define the DWV<sub>MIN</sub> of a bitcell as the supply voltage at which its dynamic write margin is less than a certain threshold (we use zero). It determines the extent to which  $V_{DD}$  can be lowered before the bitcell becomes dynamic write limited. The DWV<sub>MIN</sub> of an array is determined by the bitcell that has the maximum  $T_{WL-CRIT}$  and minimum  $T_{WL}$ .

We make two assumptions in our analysis of $DWV_{MIN}$. First, we assume that the variation of $T_{WL}$ across the array is negligible when compared to that of $T_{WL-CRIT}$. This is justified because the WL pulse is driven by inverters that are usually made up of fairly large devices. Moreover, there are far fewer WL drivers than bitcells, which means that the spread in $T_{WL}$ encountered on a chip will be much lower than that of $T_{WL-CRIT}$. Thus, we assume a constant $T_{WL}$ for a given voltage. Second, we note that the $T_{WL-CRIT}$ for a write '0' would be different from that for a write '1' due to local mismatch. Thus, the DWV<sub>MIN</sub> is actually the maximum of the DWV<sub>MIN</sub> of the write '0' and write '1' cases. In this work, we only look at the DWV<sub>MIN</sub> corresponding to the write '1' case. Since the write mechanism is the same for both, we expect the same analysis to apply for the write '0' case as well.
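The definition and the two assumptions can be written compactly as follows. The cell index $i$ and the symbol $\mathrm{DM}_i$ are illustrative notation that is not used elsewhere in the text, and the second line assumes a zero margin threshold and a single $T_{WL}$ per voltage.

$$
\mathrm{DM}_i(V_{DD}) = T_{WL}(V_{DD}) - T_{WL-CRIT,i}(V_{DD})
$$

$$
T_{WL}(DWV_{MIN}) = \max_i \, T_{WL-CRIT,i}(DWV_{MIN})
$$

$$
DWV_{MIN} = \max\left(DWV_{MIN}^{\text{write '0'}},\; DWV_{MIN}^{\text{write '1'}}\right)
$$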

If a cell is statically limited (i.e. static write margin is zero), the cell cannot be written even if $T_{WL}$ is infinite. This happens when the variability within the bitcell is such that the passgate is severely weakened when compared to the pull-up on the side storing a '1'. We consider $T_{WL-CRIT}$ to be undefined for such statically limited cells as there does not exist a value of $T_{WL}$ that would allow the cell to flip. Fig. 1 shows an example using an early bitcell from a 32nm low power bulk CMOS technology. The pass gate threshold voltage ($V_T$) is 88 mV higher than nominal and the magnitude of the pull up $V_T$ is 120 mV lower than the nominal value. As a result, the storage nodes (Q and QB) cannot be flipped when $V_{DD}$ is below 0.686 V. This cell has a negative write SNM below 0.686 V, as measured using Seevinck's method [18]. Such cells determine the static write limited $V_{MIN}$ of the array.

Fig. 1. A dynamically write limited but statically non-limited cell (a) becomes statically limited (b) as the voltage is lowered from 0.686V to 0.55V.

Fig. 2 depicts the  $T_{WL}$  (for a given WL driver) and the worst case  $T_{WL-CRIT}$  of a 1kb array. As the voltage is lowered, both  $T_{WL}$  and worst case  $T_{WL-CRIT}$  increase. The latter does so more rapidly until the dynamic write margin becomes zero at the intersection of the two curves, 624 mV, which is the DWV<sub>MIN</sub>.
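Locating the intersection of the two curves can be sketched as below. The function name `dwv_min` and the sweep values are hypothetical and are not the data plotted in Fig. 2. The sketch assumes both quantities are sampled on the same ascending voltage grid and uses linear interpolation between the two grid points that bracket the zero crossing of the margin, which is only a rough estimate on a coarse grid.

```python
import numpy as np


def dwv_min(vdd, t_wl, t_wl_crit):
    """Voltage at which the margin T_WL - T_WL-CRIT first reaches zero,
    scanning downward from the highest voltage. vdd must be ascending.

    Returns None if the margin stays positive over the whole sweep.
    """
    vdd = np.asarray(vdd, dtype=float)
    margin = np.asarray(t_wl, dtype=float) - np.asarray(t_wl_crit, dtype=float)
    if margin[-1] <= 0:
        raise ValueError("no positive margin at the highest voltage in the sweep")
    for i in range(len(vdd) - 1, 0, -1):
        if margin[i] > 0 and margin[i - 1] <= 0:
            # Linear interpolation of the zero crossing between two grid points.
            frac = margin[i] / (margin[i] - margin[i - 1])
            return float(vdd[i] - frac * (vdd[i] - vdd[i - 1]))
    return None


# Made-up sweep (volts, picoseconds) for illustration only.
vdd = [0.55, 0.60, 0.65, 0.70, 0.80, 0.90, 1.00]
t_wl = [900, 600, 420, 300, 180, 120, 90]
t_crit = [4000, 1100, 380, 150, 50, 25, 15]

v = dwv_min(vdd, t_wl, t_crit)
print(f"toy DWV_MIN = {v * 1e3:.0f} mV")
```

Since both curves change steeply near the crossing, a denser voltage grid around the bracketing points would give a tighter estimate than interpolation alone.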

The first static write failure appears at 670 mV (Fig. 2). If the array hits the static limit before it becomes dynamically limited as in this case, the DWV<sub>MIN</sub> is irrelevant. However, to understand and characterize the dynamic writability phenomenon, we continue looking at the  $T_{WL-CRIT}$  and DWV<sub>MIN</sub> even after the array is statically limited.

We observe that there is a kink in the $T_{WL-CRIT}$ curve once the array becomes static write limited. This is because as $V_{DD}$ is lowered, the weakest cell in a dynamic writability sense (i.e. the one with the largest $T_{WL-CRIT}$) becomes static limited instead (i.e. its $T_{WL-CRIT}$ is no longer defined). This is confirmed by the fact that the $V_T$ offsets are the same at both voltages (Table II). As a result, the worst case $T_{WL-CRIT}$ now corresponds to a relatively stronger cell than before (i.e. one not as far out the tail), causing the kink in the curve. We also note that the pull-up and access transistor on the side storing a '1' are the worst affected by variability, being significantly strengthened and weakened respectively. Thus, the same devices influence both static and dynamic writability.

TABLE II

$V_T$ offsets for static and dynamic write fails

| $V_{DD}$ | PD0    | PD1    | PU0     | PU1     | PG0     | PG1    |
|----------|--------|--------|---------|---------|---------|--------|
| 1 V      | 0.0286 | 0.0332 | -0.0245 | -0.1294 | -0.0258 | 0.1263 |
| 0.6865 V | 0.0286 | 0.0332 | -0.0245 | -0.1294 | -0.0258 | 0.1263 |

## B. Factors affecting DWV<sub>MIN</sub>

$DWV_{MIN}$ for an array depends on four factors: the nature of the generated WL pulse (i.e. how it scales with voltage), the memory capacity, the number of cycles prior to first read, and the bitcell parasitic capacitances. We now discuss these aspects in detail.

Fig. 2. Using worst case $T_{WL-CRIT}$ and $T_{WL}$ to determine the dynamic writability limited $V_{MIN}$. The intersection determines $DWV_{MIN}$, 624 mV in this case.

1) WL pulse characteristics: Typically, the final WL pulse that is driven to the bitcell is generated by combining an enable pulse with the address decoder output to activate one row. This enable signal, along with other control signals such as the sense amplifier enable and precharge signals, is generated by a timing block, for instance, using a self-timed replica path to track process variations or simply through combinational logic that depends on a clock input [19].

The DWV<sub>MIN</sub> depends on how the generated WL pulse scales with voltage. Fig. 3 shows how the DWV<sub>MIN</sub> changes for two different T<sub>WL</sub> scaling approaches. In Fig. 3a, the WL pulse is generated using a self-timed path that traverses the height of the array. The T<sub>WL</sub> values are derived from simulations of an extracted model of a heavily margined, compiler generated array. In Fig. 3b, the $T_{WL}$ at each voltage is set to the value that is required to ensure that a specific bitline differential is developed by the end of the pulse during a read. For this example, we arbitrarily choose a differential of 150 mV. This approach results in a much smaller $T_{WL}$ across voltage when compared to the former approach. As a result, the array becomes dynamic write limited at a higher voltage. We observe that the DWV<sub>MIN</sub> for a 1kb array is 624 mV with the former approach, while it increases to 741 mV with the latter.

In general, there are several factors that determine the  $T_{WL}$ . For example, it has to be wide enough to generate a sufficient bitline differential to overcome bitline leakage and sense amplifier offset during a read. On the other hand, it has to be narrow enough to meet performance requirements and to avoid read upsets. Scaling  $T_{WL}$  so that it is larger than the worst case  $T_{WL-CRIT}$  of the array will ensure that the array is not dynamically write limited at a particular voltage.

2) Memory size: As the memory size increases, so does the worst case $T_{WL-CRIT}$ as it moves further out the tail. Fig. 4 shows the variability of $T_{WL-CRIT}$ in terms of the ratio of its worst case and nominal values for various array sizes. The values for the 100kb and 10Mb arrays are obtained using SB as described in section II-B, while those for the 1kb and 5kb arrays are obtained from full MC simulation. The worst case $T_{WL-CRIT}$ for a 10Mb array is nearly 120 times the nominal value at 0.8V, while it is only about 20 times the nominal value at 1V. So, for smaller arrays in the order of hundreds of kilobits, the variability impact at lower voltages is not so severe. However, it is quite significant for megabit-sized arrays.

Fig. 3. DWV<sub>MIN</sub> dependence on voltage scaling of $T_{WL}$. DWV<sub>MIN</sub> increases from 624 mV (a) to 741 mV (b) for two different $T_{WL}$ scaling approaches.

Fig. 4. Impact of variability on T<sub>WL-CRIT</sub> for different array sizes.

Thus, as memory size increases, the array becomes dynamic write limited at a higher voltage. For instance, Fig. 5 shows worst case $T_{WL-CRIT}$ values across voltage for the 1kb and 5kb arrays. We can observe that the variability is higher at lower voltages and the difference between the worst case $T_{WL-CRIT}$ for the two arrays becomes much larger. As a result, the DWV<sub>MIN</sub> for the 5kb array is 714 mV, when compared to 624 mV for the 1kb array. Using SB, we determined the DWV<sub>MIN</sub> for the 100kb and 10Mb arrays as well (Fig. 6). The DWV<sub>MIN</sub> values for the 5kb and 100kb arrays are almost the same as the latter is static write limited. Due to this, the worst case cell in the 100kb array is relatively stronger than that in the 5kb one from a dynamic writability perspective, as explained in section III-A.
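To give a feel for how far out the tail the worst case cell sits for each array size, the sketch below converts the array size into the tail probability of its single worst cell and expresses that probability as an equivalent one-sided Gaussian sigma. It assumes 1kb = 1000 bits and uses `worst_case_tail` and `ARRAY_BITS` as hypothetical names. The sigma value is only a convenient way to express a rare-event probability. It does not imply that $T_{WL-CRIT}$ itself is Gaussian, which it is not.

```python
from statistics import NormalDist

# Array sizes discussed in the text, assuming 1kb = 1000 bits.
ARRAY_BITS = {"1kb": 1_000, "5kb": 5_000, "100kb": 100_000, "10Mb": 10_000_000}


def worst_case_tail(n_cells):
    """Tail probability of the single worst cell among n_cells, and the
    one-sided standard-normal quantile with the same tail probability."""
    p_tail = 1.0 / n_cells
    sigma_eq = NormalDist().inv_cdf(1.0 - p_tail)
    return p_tail, sigma_eq


tails = {name: worst_case_tail(n) for name, n in ARRAY_BITS.items()}
for name, (p_tail, sigma_eq) in tails.items():
    print(f"{name:>6}: tail probability {p_tail:.0e}, about {sigma_eq:.2f} sigma")
```

Going from kilobit to megabit arrays pushes the worst case cell several orders of magnitude further into the tail, which is why full MC becomes impractical and SB is used for the larger sizes.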

3) Number of cycles prior to first read: As mentioned earlier, the value of $T_{WL-CRIT}$ depends on the write failure criterion. The stricter the criterion, the larger the $T_{WL-CRIT}$. The value obtained when we check the cell nodes after a long period of time (the default definition that we use) forms a lower bound. $T_{WL-CRIT-CYC}$, which we defined in section II, forms an upper bound as it requires the cell to flip by the end of the write cycle.

Fig. 7 shows the nominal $T_{WL-CRIT}$ at two voltages. In either case, the nominal $T_{WL-CRIT}$ initially falls drastically. For instance, if the failure criterion is relaxed by just two cycles, the nominal $T_{WL-CRIT}$ is nearly halved. It eventually settles to the lower bound indicated by the dashed line, where the cell nodes are checked three orders of magnitude after the end of the write cycle. This happens once the failure criterion is relaxed to the point where the cell needs to be first read about 30 cycles after the write cycle.

Fig. 5. $DWV_{MIN}$ dependence on array capacity. $DWV_{MIN}$ increases from 624 mV for a 1kb array to 714 mV for a 5kb array.

Fig. 6. DWV<sub>MIN</sub> for various array sizes using SB.

This dependence of  $T_{WL-CRIT}$  on the number of cycles prior to the first read operation implies that if a stricter failure criterion is imposed, the array will be dynamically write limited at a higher voltage. Fig. 8 shows the worst case  $T_{WL-CRIT}$  across voltage for the two bounds on the write failure criterion — a read immediately following a write and a read after a "long" time. We observe that the DWV<sub>MIN</sub> of the array lies between 624 mV and 744 mV for the two extreme cases of the write failure criterion.

4) Bitcell parasitics: Since $T_{WL-CRIT}$ is a dynamic measure of writability, it is affected by the parasitic capacitances in the bitcell, specifically, the capacitances between the storage nodes and various terminals of the bitcell. From the extracted netlist of the bitcell, we note that the inter-storage node parasitic capacitance ($C_{Q-QB}$) dominates over other components, being at least 2x larger than the others (Fig. 9). Since the storage nodes need to move in opposite directions for the cell to flip, a larger value of $C_{Q-QB}$ would make this harder. Thus, $T_{WL-CRIT}$ is most affected by this component of the bitcell parasitics. As Fig. 10 shows, $T_{WL-CRIT}$ increases by more than 6x if the inter-storage parasitic capacitance increases 10x, with the other components remaining the same.

Fig. 7. Effect of the no. of cycles elapsed before the first read.

Fig. 8. $DWV_{MIN}$ dependence on the number of cycles prior to first read. $DWV_{MIN}$ for a 1kb array lies between 624 mV and 744 mV.

Fig. 9. Dominant bitcell parasitics.

The dependence of $T_{WL-CRIT}$ on bitcell parasitics implies that the DWV<sub>MIN</sub> also depends on them. Fig. 11 depicts the worst case $T_{WL-CRIT}$ across voltage for a 1kb array with the $T_{WL-CRIT}$ estimated using extracted ("real") and non-extracted ("ideal") versions of the bitcells. We note that while the DWV<sub>MIN</sub> of the 1kb array with the "real" bitcells is 624 mV, the array with the "ideal" bitcells is dynamically write stable above 600 mV. Again, the number of static write failures is the same in either case. Thus, a layout that reduces the bitcell parasitic capacitances, in particular $C_{Q-QB}$, can improve the DWV<sub>MIN</sub> of the SRAM, although the cell becomes more susceptible to read upsets if $C_{Q-QB}$ is too low. The impact of the parasitic capacitance on the DWV<sub>MIN</sub> in Fig. 11 is small, possibly because the cell layout was done carefully to minimize the dominant parasitic capacitances.

Fig. 10. Impact of inter-storage node parasitic on $T_{WL-CRIT}$ for each of the three most dominant capacitances, with the others kept constant.

Fig. 11. DWV<sub>MIN</sub> dependence on the bitcell parasitics.

## C. Comparison with Static $V_{MIN}$

In addition to parametric variation, which impacts both dynamic and static limited $V_{MIN}$, dynamic writability is affected by additional factors, as discussed in the previous section. In particular, factors such as the voltage scaling of $T_{WL}$ and the number of cycles prior to first read can influence whether an SRAM array hits the dynamic writability limit before or after the variability-influenced static write limit is encountered.

Fig. 12 shows the static and dynamic write V<sub>MIN</sub> for various memory sizes. The static V<sub>MIN</sub> is determined by the voltage at which the first static write fail occurs (i.e. the cell does not flip for any value of $T_{WL}$). The dynamic $V_{MIN}$ is calculated for the two scenarios of T<sub>WL</sub> scaling shown in Fig. 3. We observe that the dynamic and static write $V_{MIN}$ are comparable when the T<sub>WL</sub> scaling is heavily margined as in the first case. However, if the T<sub>WL</sub> scaling is more aggressive, the array hits the dynamic write limitation before static fails start to appear. This is particularly true for large memories in the order of megabits. For instance, when using a more aggressively scaled $T_{WL}$, the 10Mb memory is dynamic write limited as high as 0.95 V, while static write fails appear only from 0.8 V. Thus, we conclude that for large memories with aggressive performance requirements, dynamic write limitations imposed by the mode of access and T<sub>WL</sub> scaling will become more dominant than purely variability affected static write limitations.
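The comparison reduces to taking the higher of the two limits, as the short sketch below illustrates with the values quoted in the text for the 1kb array with the self-timed $T_{WL}$ (dynamic 624 mV, first static fail at 670 mV) and the 10Mb array with the aggressively scaled $T_{WL}$ (dynamic 0.95 V, static 0.8 V). The helper name `write_limited_vmin` is hypothetical, and treating a tie as static limited is an arbitrary choice made for this illustration.

```python
def write_limited_vmin(static_vmin_mv, dynamic_vmin_mv):
    """Return the write-limited V_MIN (mV) and which mechanism sets it.

    The array stops being writable at the higher of the two limits, since
    that limit is reached first as V_DD is lowered.
    """
    if dynamic_vmin_mv > static_vmin_mv:
        return dynamic_vmin_mv, "dynamic"
    return static_vmin_mv, "static"


# Values quoted in the text, in millivolts.
small_array = write_limited_vmin(static_vmin_mv=670, dynamic_vmin_mv=624)
large_array = write_limited_vmin(static_vmin_mv=800, dynamic_vmin_mv=950)
print("1kb, self-timed T_WL:", small_array)
print("10Mb, aggressive T_WL:", large_array)
```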

## IV. IMPACT OF ASSISTS ON DYNAMIC $V_{\rm MIN}$

Several implementations of write assists exist in the literature. Voltage bias-based write assists fall broadly into two categories: ones that alter the "noise source" amplitude or duration through the access transistor, and ones that modify the strength or voltage transfer characteristics of the cross-coupled inverters [14]. We choose the WL boost method (WLB) from the former category and the $V_{DD}$ lowering method (VDDL) from the latter, as these appear to be the most popularly used write assist methods in recent literature [11][20][21]. We use a WL boost and $V_{DD}$ droop value of 100mV.

Fig. 12. Dynamic vs. Static $V_{MIN}$ for self-timing path generated, heavily margined $T_{WL}$ (a) and aggressive bitline differential dependent $T_{WL}$ (b).

Fig. 13. Impact of write assists on worst case $T_{WL-CRIT}$. Static write failure occurs without assist at 670 mV. No static write failures are observed above 500 mV with either assist.

Fig. 13 shows the worst case $T_{WL-CRIT}$ in a 1kb array, with no write assists, and with WLB and VDDL. The $T_{WL}$ in these examples corresponds to a WL pulse generated from a self-timed path. We observe that WLB is more effective than VDDL in reducing the worst case $T_{WL-CRIT}$. This is because the gate-to-source voltage of the access transistor is higher in the case of WLB and consequently, the write time is lower, leading to a lower worst case $T_{WL-CRIT}$ and DWV<sub>MIN</sub>. However, both categories of assist techniques eliminate the static failures that appear when no write assist is applied, down to 500 mV. In this case, WLB is a better write assist purely from a $T_{WL-CRIT}$ point of view, although it worsens the stability of the half-selected cells during read. Fig. 14 confirms this for larger memories as well. This agrees well with the conclusions drawn in [14] and [15] about the best write assist techniques, based on static and dynamic writability considerations respectively. In terms of half-select stability however, VDDL is a better write assist, particularly if the $V_{DD}$ is shared column-wise, since the boosted WL voltage reduces the read stability of the half-selected cells along the same row. We note that other considerations such as power overhead and complexity of implementation would also need to be considered before choosing a write assist implementation.

Fig. 14. $DWV_{MIN}$ for various array sizes with and without write assists. Static $V_{MIN} < DWV_{MIN}$ for both assist methods.

## V. CONCLUSIONS

We have developed the concept of a dynamic write limited  $V_{MIN}$  for SRAM based on the  $T_{WL-CRIT}$  metric. While variability affects both static and dynamic write limited  $V_{MIN}$ , the latter is also affected by the voltage scaling of  $T_{WL}$ , the number of cycles before the data is first read, and the bitcell parasitics. We have observed that an SRAM array can be statically or dynamically write limited depending on these factors. Finally, we have analyzed the impact of  $V_{DD}$  lowering and WL boosting write assist methods on the DWV<sub>MIN</sub>. While both methods are effective at lowering static write  $V_{MIN}$ , WL

## REFERENCES

- [1] A. Bhavnagarwala *et al.*, "The impact of intrinsic device fluctuations on CMOS SRAM cell stability," *JSSC*, vol. 36, pp. 658–665, Apr. 2001.
- [2] J. Wang *et al.*, "Statistical modeling for the minimum standby supply voltage of a full SRAM array," in *ESSCIRC*, pp. 400–403, Sep. 2007.
- [3] K. Agarwal and S. Nassif, "Statistical analysis of SRAM cell stability," in *DAC*, pp. 57–62, 2006.
- [4] W. Dong *et al.*, "SRAM dynamic stability: theory, variability and analysis," in *ICCAD*, pp. 378–385, 2008.
- [5] M. Sharifkhani and M. Sachdev, "SRAM cell stability: A dynamic perspective," *JSSC*, vol. 44, pp. 609–619, Feb. 2009.
- [6] J. Wang *et al.*, "Analyzing static and dynamic write margin for nanometer SRAMs," in *ISLPED*, pp. 129–134, 2008.
- [7] A. Bhavnagarwala *et al.*, "Fluctuation limits and scaling opportunities for CMOS SRAM cells," in *IEDM*, pp. 659–662, Dec. 2005.
- [8] B. Zhang *et al.*, "Analytical modeling of SRAM dynamic stability," in *ICCAD*, pp. 315–322, 2006.
- [9] S. O. Toh *et al.*, "Dynamic SRAM stability characterization in 45nm CMOS," in *VLSI Circuits Symposium*, pp. 35–36, 2010.
- [10] D. Khalil *et al.*, "Accurate estimation of SRAM dynamic stability," *TVLSI*, vol. 16, pp. 1639–1647, Dec. 2008.
- [11] O. Hirabayashi *et al.*, "A process-variation-tolerant dual-power-supply SRAM with 0.179 μm<sup>2</sup> cell in 40nm CMOS using level-programmable wordline driver," in *ISSCC*, pp. 458–459,459a, 2009.
- [12] H. Yang *et al.*, "Scaling of 32nm low power SRAM with high-k metal gate," in *IEDM*, pp. 1–4, 2008.
- [13] N. Shibata *et al.*, "A 0.5-V 25-MHz 1-mW 256-kb MTCMOS/SOI SRAM for solar-power-operated portable personal digital equipment - sure write operation by using step-down negatively overdriven bitline scheme," *JSSC*, vol. 41, pp. 728–742, Mar. 2006.
- [14] R. W. Mann *et al.*, "Impact of circuit assist methods on margin and performance in 6T SRAM," *Journal of Solid-State Electronics*, vol. 54, no. 11, pp. 1398–1407, 2010.
- [15] V. Chandra *et al.*, "On the efficacy of write-assist techniques in low voltage nanoscale SRAMs," in *DATE*, pp. 345–350, 2010.
- [16] A. Singhee and R. A. Rutenbar, "Statistical blockade: a novel method for very fast Monte Carlo simulation of rare circuit events, and its application," in *DATE*, pp. 1379–1384, 2007.
- [17] A. Singhee *et al.*, "Recursive statistical blockade: An enhanced technique for rare event simulation with application to SRAM circuit design," in *VLSID*, pp. 131–136, 2008.
- [18] E. Seevinck *et al.*, "Static-noise margin analysis of MOS SRAM cells," *JSSC*, vol. 22, no. 5, pp. 748–754, Oct. 1987.
- [19] B. Amrutur and M. Horowitz, "A replica technique for wordline and sense control in low-power SRAM's," *JSSC*, vol. 33, pp. 1208–1219, Aug. 1998.
- [20] K. Takeda *et al.*, "Multi-step word-line control technology in hierarchical cell architecture for scaled-down high-density SRAMs," in *VLSI Circuits Symposium*, pp. 101–102, Jun. 2010.
- [21] Y. Chung and S.-H. Song, "Implementation of low-voltage static RAM with enhanced data stability and circuit speed," *Microelectronics Journal*, vol. 40, pp. 944–951, 2009.