# Word-Line Power Supply Selector for Stability Improvement of Embedded SRAMs in High Reliability Applications

B. Alorda, C. Carmona, S. Bota. Electronic Systems Group, Physics Dept., Illes Balears University, Palma de Mallorca, Spain. tomeu.alorda@uib.es

*Abstract*— Embedded SRAM yield dominates overall ASIC yield; therefore, methodologies centered on improving SRAM cell stability must be introduced into the design flow. Word-line voltage modulation has been shown to improve cell stability during access operations. The high variability of physical and performance parameters introduces the need for adaptable solutions that adequately improve SRAM cell stability. In this work, we present a word-line voltage selector circuit designed to modulate the word-line power-supply voltage at each individual embedded SRAM block. The final area overhead is minimal, and several strategies can be implemented with the embedded SRAM, allowing the word-line voltage to be adjusted during the life of the ASIC to take into account different operation, aging and degradation effects.

Keywords—SRAM stability, Word-line modulation, High Reliability applications.

## I. INTRODUCTION

CMOS IC technologies have been scaled down aggressively, and current system-on-chip trends result in a significant percentage of the total die area being dedicated to memory blocks, thus making embedded SRAM yield dominate the overall SoC yield.

The extremely small-sized SRAM bit-cell is very sensitive to process variations, especially to MOS threshold voltage ($V_{th}$) variations. The main causes of $V_{th}$ variation are random dopant fluctuation (RDF), dopant channelling through the gate into the channel, poly and diffusion CD variation, and dopant loss through the isolation oxide. While RDF represents a fundamental limit for SRAM mismatch, the other issues, which are related to lithography constraints, can be mitigated with both design and process changes. An example of a design mitigation strategy is the use of a wide-layout cell topology [1]. The wide cell design improves CD control by aligning the poly in the same direction, eliminating diffusion corners, and relaxing some patterning constraints in other critical layers (when the cell is arrayed, all transistors see the same poly pattern). By removing the bends in the diffusion layer, cell sensitivity to misalignment is also reduced.

SRAM cell stability is especially sensitive to increased variation of physical parameters combined with other effects (lower power supply voltage or higher transistor density). Therefore, deep knowledge and analysis of the stability of embedded SRAM cells, together with new approaches to improve cell stability in post-production steps or during the circuit's active life, are becoming a must in modern embedded SRAM CMOS designs.

The stability of a 6T-based cell in a given SRAM is usually evaluated by analyzing the Static Noise Margin (SNM) during hold, read and write operations. The Read-SNM has been identified as a critical memory design parameter, and great efforts have been made to increase stability during read operations [2]. Traditionally, the $\alpha$-ratio of an SRAM cell must lie within a certain range, usually between 1.8 and 2.5, to ensure high cell stability during read operations. The difference between the pass-transistor width and the pull-down transistor width produces a slight diffusion bend in the bit-cell layout. As a result, the cell presents some sensitivity to size changes due to misalignment.

Some approaches improve cell stability during read operations by modifying the read circuitry [3] or by reducing the time required to sense the stored cell values [4]. Other approaches introduce new cell architectures, including those that increase the SRAM cell transistor count. Among several proposals, eight-transistor (8T) cell configurations are being adopted as an alternative to the traditional six-transistor (6T) cells in industry designs [5].

An alternative way to reduce cell parameter variability was proposed in [1], based on designing the cell layout with straight diffusion edges (all NMOS transistors with equal width). To achieve the required cell drive and guarantee stability during the read operation, the increase in pass-transistor width is compensated by a corresponding increase in transistor length. This alternative cell layout results in a diffusion strip without any bends, which reduces parameter variability but, at the same time, reduces the capability of using the $\alpha$-ratio to increase cell stability during read operations [6].

In [2,7,8] the benefits and drawbacks of modulating the maximum word-line voltage ($V_{WL}$) were discussed as a way to improve cell stability during read operations when the wide cell-layout approach is used. The technique has been proposed as a valid alternative due to its positive impact on static and dynamic stability behaviour in all kinds of 6T-based SRAM cells, including the 8T cell configuration reported in [2].

978-3-9815370-2-4/DATE14/©2014 EDAA

This paper proposes a low-area-impact and non-intrusive solution to incorporate the $V_{WL}$ modulation technique, based on changing the power supply voltage of the last buffer stage of the row decoder circuit. A controller based on a low-leakage voltage divider is proposed to modulate the word-line power supply voltage level through a digital interface, controlling the maximum $V_{WL}$ applied to the pass-transistor gate. This solution has a reduced impact on SRAM block area and allows optimum static and dynamic stability to be achieved during hold, read and write operations.

The rest of the paper is organized as follows: the next section summarizes the main benefits and drawbacks of word-line voltage modulation. Section 3 describes the architecture of the $V_{WL}$ modulator and its integration with a typical 6T-based SRAM scheme. Section 4 presents the word-line voltage modulation strategy during any SRAM operation period. Section 5 discusses simulation results for the area, delay and power consumption overhead, using a full-custom SRAM block implemented in a commercial 65nm CMOS technology. Finally, Section 6 points out the main conclusions of this work.

## II. WORD-LINE VOLTAGE REDUCTION

Given that the Static Noise Margin during read operations (RSNM) is significantly degraded, and that traditional transistor size adjustment produces only slight improvements in wide cell-layout topologies [6, 8], a $V_{WL}$ reduction strategy was proposed to maintain optimum levels of cell stability. Figure 1 shows the small improvement in RSNM when transistor width adjustment is used in a wide-layout cell (both pull-down transistors equal to both pass transistors). The 2D graph explores SNM values considering an NMOS transistor width of $W_N = \delta_N \cdot W_{min}$ and a PMOS transistor width of $W_P = \delta_P \cdot W_{min}$. During hold operation the improvement is negligible, but the RSNM improves as the $\delta_P$ factor increases. In Figure 1, the RSNM increases by 10.9% with respect to the minimum-sized SRAM cell when the p-sized case is considered. In contrast, increasing the $\delta_N$ factor reduces the RSNM by 13.2% for the n-sized case with respect to the minimum-sized case. In this scenario, the $V_{WL}$ reduction technique demonstrated good results in 65nm CMOS technology [7,8]. The graph shown in Figure 2 summarizes the main benefits of the $V_{WL}$ reduction technique in terms of SNM. The maximum SNM value shown in Figure 2 is obtained when $V_{WL}$ is equal to zero, that is, when the SRAM cells are holding their internal values. In the same way, the minimum SNM occurs when the memory cell has its pass-transistors saturated, because $V_{WL}$ is equal to the power supply voltage.

As $V_{WL}$ is reduced from VDD towards 0, the SNM increases rapidly, as Figure 2 shows. In fact, a 17% reduction of $V_{WL}$ produces a gain of more than 50% in SNM [8], while the SRAM cell maintains its read performance unaltered [6, 8]. These benefits during read operations contrast with the loss of cell writability when $V_{WL}$ is reduced. However, this drawback can be minimized if the $V_{WL}$ modulation is kept at low levels [8]. There is a maximum reduction threshold below which the write operation is only slightly impacted. Beyond this threshold, the SRAM cell rapidly loses its capability to be written from the external bit-lines. In [7,8] this threshold was found around 26% of VDD, where the derivative of the write noise margin with respect to $V_{WL}$ changes drastically, increasing the impact on write noise margin reduction [7].

Fig. 1. Static Noise Margin variation during HOLD and READ operations when transistor width adjustment is used in a wide-layout topology cell [9].



Fig. 2. Static Noise Margin improvement due to Word Line voltage modulation.

A dual $V_{WL}$ reduction methodology was proposed in [8] to decrease the maximum word-line value during read operations while keeping it at its nominal value during write operations, leaving write performance unaffected. On closer observation, however, during a write operation there are cells, apart from those being written, that behave as in a read operation. These cells share the same word-line signal with the cells being written, but they are not selected for any effective operation. Such memory cells are known as half-selected cells and are very common in embedded SRAMs organized in blocks [2].

This paper proposes to implement $V_{WL}$ modulation by reducing the power supply voltage of the last gate of the row decoder. The proposed implementation takes advantage of the dependence of a logic gate's maximum output voltage on its power supply voltage. If the power supply voltage of a logic gate is reduced, the maximum output voltage is reduced in the same proportion. Therefore, the power supply node of every last-stage gate of the row decoder is disconnected from the main power supply and driven by the proposed WL\_selector circuit. Figure 3 shows the changes to the row decoder, considering the logic decoder block as formed by the logic decoder gates without the last inverter stage. In the scheme of Figure 3, the last row decoder gate is an inverter with an isolated power supply node called VDD\_WL. The inverter is sized taking into account the word-line signal load.

The WL\_adapter modulates the voltage present at the VDD\_WL node using the general power supply node (VDD) as the maximum reference. The desired voltage value is digitally controlled using a 7-bit digital word (Config\_WL[6:0]), although the number of configuration bits can easily be reduced by adding a 3-input to 8-output decoder. The WL\_EN signal acts like a pre-charge signal and serves to reduce power consumption.

In the next section, a digitally selectable $V_{WL}$ circuit is proposed to implement both methodologies: the application of two different $V_{WL}$ values, one for each memory operation, or just one $V_{WL}$ for all operations.

## III. WORD-LINE POWER SUPPLY SELECTOR

The proposed word-line power supply selector design is based on the simplest voltage divider using saturated transistors. The basic part of the selector circuit consists of two transistors in saturation, connected in series between VDD and ground and acting as a voltage divider. The voltage at the node connecting the transistors is determined by the ratio between the internal transistor resistances and by the power supply voltage. Note that the VDD\_WL node will be connected to the power supply node of the last decoder inverter stage, so it must provide a stable supply voltage and sufficient current.
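As a first-order illustration of the divider described above, the two saturated transistors can be modelled by equivalent resistances. The symbols $R_P$ (pull-up branch to VDD) and $R_N$ (pull-down branch to ground) are assumptions introduced here for illustration, not quantities reported in the paper, and the expression ignores the current drawn by the decoder inverters:

$$V_{DD\_WL} \approx V_{DD} \cdot \frac{R_N}{R_P + R_N}$$

Under this simplified model, lowering $R_N$ relative to $R_P$ lowers the voltage at the VDD\_WL node, while $R_N \gg R_P$ keeps it close to VDD.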

The expected current consumption of the decoder inverter stage depends on the possible inverter states:

- During hold mode, all outputs of the row decoder circuit are pulled up (DEC[i]=1 in Figure 3), so the NMOS transistor of each inverter pulls down the corresponding word-line of the memory array. In this situation, the PMOS transistor is cut off and the expected leakage is very low (identified in Figure 3 as IDD\_WL(0)).
- During active operation mode (read or write operations), the row decoder activates one DEC[i] output according to the address used in the current operation. The expected current consumption is IDD\_WL(1) (see Figure 3), equivalent to the dynamic consumption needed to pull up the corresponding word-line.

Fig. 3. Schematic implementation of Word Line voltage reduction strategy.

Considering the consumption of the decoder inverters and the number of rows $n$, the maximum current consumption is equal to $I_{DD\_WL}(1) + (n-1) \cdot I_{DD\_WL}(0)$. Because only one row inverter pulls up its word-line in each operation while the rest of the row inverters keep pulling down their outputs, the expected current is low, similar to the dynamic current of a single inverter. Therefore, the output current requirements of the WL\_selector are low.
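The worst-case expression above can be evaluated with a short sketch. The current values below are hypothetical placeholders chosen only to illustrate the calculation; the paper does not report IDD\_WL(0) or IDD\_WL(1), and the function name is an assumption.

```python
def max_selector_current(n_rows: int, i_active: float, i_leak: float) -> float:
    """Upper bound on the current drawn from VDD_WL.

    One row inverter switches (i_active, i.e. IDD_WL(1)) while the other
    n_rows - 1 inverters only leak (i_leak each, i.e. IDD_WL(0)).
    """
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    return i_active + (n_rows - 1) * i_leak


# Hypothetical values for illustration only (not taken from the paper).
N_ROWS = 256        # number of word-lines driven by the decoder
I_ACTIVE = 50e-6    # assumed IDD_WL(1), in amperes
I_LEAK = 1e-9       # assumed IDD_WL(0), in amperes

total = max_selector_current(N_ROWS, I_ACTIVE, I_LEAK)
leak_share = (N_ROWS - 1) * I_LEAK / total
print(f"worst-case current: {total:.3e} A, leakage share: {leak_share:.2%}")
```

With leakage much smaller than the switching current, the total stays close to the single-inverter dynamic current, which is the argument made in the text.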

Figure 4 shows the proposed WL\_selector circuit, formed by 8 transistors. The voltage divider consists of one PMOS transistor, P1, and a combination of NMOS transistors from N2 to N8. All transistor lengths are minimum sized, while the transistor widths are designed to produce the required VDD\_WL voltage.



Fig. 4. Schematic implementation of Word Line voltage selector.

Figure 5 represents the output voltage (VDD\_WL) of the WL selector as a function of the digital value applied to the Config\_WL input signals. The minimum VDD\_WL value has been defined taking into account the threshold limit derived from the write noise margin degradation with respect to $V_{WL}$ [7, 8]. The number of possible $V_{WL}$ steps depends on the number of digital values considered. In the implementation reported in this paper, only 8 digital values are considered, allowing a direct digital interface with several alternatives, from a dedicated 3-input to 8-output digital decoder circuit to a shift-register-based circuit. Note that the analogue distance between two consecutive VDD\_WL values may be different at each digital value. The WL selector circuit is designed to produce $V_{WL}$ modulation from VDD down to the minimum value considered in regular steps.
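As an idealized illustration of regular steps, the sketch below lists 8 evenly spaced VDD\_WL targets between VDD and a minimum set by a 26% reduction, the threshold quoted in Section II. The nominal VDD of 1.2 V, the function name, and the ordering of steps are assumptions made for this example; the actual levels of the circuit are those of Figure 5.

```python
from typing import List


def vdd_wl_levels(vdd: float, max_reduction: float = 0.26, n_levels: int = 8) -> List[float]:
    """Return n_levels evenly spaced targets from vdd down to vdd * (1 - max_reduction)."""
    if n_levels < 2:
        raise ValueError("need at least two levels")
    if not 0.0 <= max_reduction < 1.0:
        raise ValueError("max_reduction must be in [0, 1)")
    step = vdd * max_reduction / (n_levels - 1)
    return [vdd - k * step for k in range(n_levels)]


# Assumed nominal supply for illustration; the paper does not state VDD here.
VDD = 1.2

for code, level in enumerate(vdd_wl_levels(VDD)):
    reduction = 100.0 * (VDD - level) / VDD
    print(f"step {code}: VDD_WL = {level:.3f} V ({reduction:.1f}% below VDD)")
```

A real divider will deviate from this ideal grid, which is why the text notes that the distance between consecutive values may differ.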



Fig. 5. Values of the VDD\_WL node depending on the digital values applied to the Config\_WL[6:0] signals.

The transistors P2 and N1 have been introduced in the design to reduce the power consumption of the WL selector. As the circuit is based on a voltage divider topology, it consumes a high amount of energy due to the continuous saturation current flowing from the power supply to ground. The function of P2 and N1 is to enable this current path when there is activity in the inverter stage (an active memory operation is in progress and a word-line will be set) and to cut it off when the VDD\_WL node is not discharged by any inverter (all inverters with DEC[i] = 1). By reducing the period of continuous consumption in the WL selector, the impact on the total memory consumption is reduced. A detailed description of the WL selector operation is given in the next section.

## IV. V<sub>WL</sub> REDUCTION STRATEGY

The $V_{WL}$ reduction strategy is integrated with normal memory operation considering both reduction alternatives:

- The same V<sub>WL</sub> value for all operations. This option operates the memory array at a fixed V<sub>WL</sub> while maintaining an acceptable write noise margin. This strategy is considered static, or operation independent.
- A different V<sub>WL</sub> value for each operation. Two different voltage reductions are used: one for read operations and another for write operations. This alternative allows the V<sub>WL</sub> reduction to be optimized for each operation, obtaining the best benefit from the technique in each case. This strategy is considered dynamic, or operation dependent.

Figure 6 shows the waveforms related to WL selector activation, using the signal names defined in Figure 3. While the memory array is in hold mode, the WL\_EN signal is zero, so the WL selector is not activated and the VDD\_WL voltage remains at the last used value, unaltered for a long period of time. When an operation is in progress, the memory controller activity can be divided into two main stages: the pre-charge and the operation itself, as Figure 6 represents at the top. During the pre-charge stage, the controller must set the Config\_WL values depending on which $V_{WL}$ reduction will be applied. The WL selector is activated by setting WL\_EN a period of time (identified as Ten in Figure 6) before the memory operation is performed.



Fig. 6. Temporal evolution of WL selector with respect to memory operation states.

When the memory operation starts, the corresponding row decoder output is pulled up to open all cells of that row. This activation produces dynamic consumption in the last-stage inverter, and the current is supplied through transistor P1 of the WL selector. When the dynamic consumption ends, the power consumption returns to extremely low values, and the WL selector is deactivated to stop its power consumption. This dynamic period is identified in Figure 6 as Ton. Note that the WL selector is active, at full power consumption, for a period Tmin $\geq$ Ten + Ton. Figure 6 highlights that the memory operation period will be longer than Tmin, because the memory operation delay is longer than the inverter delay plus the Config\_WL selection delay. The following algorithm describes the controller operations that manage the static and dynamic $V_{WL}$ reduction strategies.

- While (any operation) do
  - If the dynamic strategy is selected then
    - If current operation is READ then
      - Config\_WL is set to Read\_ConfigWL
    - Else
      - Config\_WL is set to Write\_ConfigWL
  - Else (the static strategy is selected)
    - Config\_WL is set to Write\_ConfigWL
  - Wait for Tconfig
  - Set WL\_EN
  - Wait for Tmin
  - Clear WL\_EN
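The controller steps above can be sketched in software as follows. This is a behavioural illustration only, not the authors' controller: the `Op` enum, the function names, the action tuples, and the example configuration words and timings are all hypothetical.

```python
from enum import Enum
from typing import List, Tuple, Union


class Op(Enum):
    READ = "read"
    WRITE = "write"


def select_config(op: Op, dynamic: bool, read_cfg: int, write_cfg: int) -> int:
    """Choose Config_WL: per-operation value if dynamic, otherwise the write value."""
    if dynamic and op is Op.READ:
        return read_cfg
    return write_cfg


def access_sequence(op: Op, dynamic: bool, read_cfg: int, write_cfg: int,
                    t_config: float, t_min: float) -> List[Tuple[str, Union[int, float]]]:
    """Return the ordered controller actions for one memory operation."""
    return [
        ("set_config_wl", select_config(op, dynamic, read_cfg, write_cfg)),
        ("wait", t_config),   # let Config_WL settle (Tconfig)
        ("set_wl_en", 1),     # activate the WL selector
        ("wait", t_min),      # keep it active for Tmin >= Ten + Ton
        ("set_wl_en", 0),     # deactivate to stop the divider current
    ]


# Hypothetical configuration words and timings, for illustration only.
READ_CFG, WRITE_CFG = 0b0000111, 0b0000001
for step in access_sequence(Op.READ, True, READ_CFG, WRITE_CFG, t_config=0.1e-9, t_min=0.3e-9):
    print(step)
```

Note that the static strategy reuses the write configuration for every operation, which mirrors the algorithm's final Else branch.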

The Read\_ConfigWL and Write\_ConfigWL values can be fixed during post-production test, using the WL selector circuit to explore memory performance at different $V_{WL}$ reductions and determine the optimum values.

## V. RESULTS

The WL selector has been implemented using a commercial 65nm CMOS technology and included in a 256Kb SRAM memory layout. The memory circuit has been organized as a cell array of 256 rows and 64 columns divided into 8 sub-columns. It is accessed synchronously using an 8-bit data input port, an 8-bit data output port and 12 address bits. Figure 7 shows the layout view of the WL selector circuit, where the transistor sizes can be compared. To reduce fabrication parameter variation, the layout has been implemented by aligning the poly in the same direction and eliminating diffusion corners as much as possible.



The result is a circuit with an area of 44 µm². This area must be compared with the total area used by the 256Kb SRAM memory design using the wide-layout cell topology. In that case, the total memory area is 31,450 µm², so the area impact of the WL selector circuit on the final design is negligible. In fact, Figure 8 shows the complete memory layout view with a detail of the WL selector position and the area proportion between them. At this point, it is important to note that the WL selector circuit has been implemented using a minimal number of metal layers, allowing the design to be integrated under the previously existing metal layers in the upper-left corner of the SRAM circuit. As a result, the area increment needed to incorporate the WL selector is minimal, consisting only of the additional functions and control signals that must be incorporated into the memory controller circuit. Note that using a separate power supply for the last row decoder stage does not necessarily imply more area, because the power supply topology is implemented using upper metal layers.
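For scale, the ratio between the two areas quoted above can be worked out directly. The symbols $A_{sel}$ and $A_{mem}$ are introduced here only for this illustration:

$$\frac{A_{sel}}{A_{mem}} = \frac{44\ \mu m^2}{31{,}450\ \mu m^2} \approx 0.0014 = 0.14\%$$

This is the relative size of the selector, not an added area, since the circuit is placed under existing metal layers.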



Fig. 8. Layout view of 256Kb memory with WL selector implemented without impact on total memory area.

The impact on memory operation performance is evaluated in terms of the delay increment and the power consumption of the WL selector. All simulation results have been obtained performing memory operations at high frequency, that is, considering an SRAM READ / WRITE operation period equal to 1 ns.

Figure 9 quantifies the delay increment caused by reducing the power supply voltage of the last inverter stage in the row decoder. As the maximum power supply value is reduced, the inverter delay increases.



Fig. 9. Increment of row decoder delay due to power supply reduction at the last inverter stage.

The row decoder inverter delay is kept below 0.1 ns, that is, 10 times lower than the operation time. Therefore, the impact on memory operation delay may be considered negligible.

Figure 10 evaluates the impact on the WL selector circuit power consumption when different Config\_WL values are applied. As expected, the voltage divider topology tends to increase the power consumption as more transistors are activated. To minimize the power consumption, WL\_EN controls the period during which the WL selector is actively maintaining the VDD\_WL voltage. The waveforms shown in Figure 10 allow the power saving benefits of operating the WL selector within a controlled period to be compared. If the WL selector circuit is active during the whole memory operation period, the power consumption rapidly increases to very high values (waveform identified as without power saver in Figure 10). Therefore, the power saver strategy drastically reduces the power consumption of the WL selector while keeping the circuit design simple.



Fig. 10. Power consumption of WL selector during SRAM operation with and without activation of power saving strategy.

Considering the results obtained for the WL selector power consumption, the total power consumption overhead can be evaluated for the designed SRAM block. The SRAM without the WL selector has a maximum operation power consumption of 7.95 pW. Adding the WL selector raises the maximum power consumption per operation to 8.32 pW. This value falls to 8.05 pW if the power saver strategy is used. Therefore, the impact on memory operation power consumption is approximately 1.2% with the saver strategy.
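The overhead figures can be checked with a short calculation. The sketch assumes that 8.32 pW and 8.05 pW are total consumption values with the WL selector included, which is an interpretation that matches the quoted percentage; the function name is hypothetical.

```python
def overhead_percent(baseline: float, with_selector: float) -> float:
    """Relative increase of with_selector over baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (with_selector - baseline) / baseline


# Values quoted in the text, in the units used there (pW).
P_BASE = 7.95       # SRAM without WL selector
P_NO_SAVER = 8.32   # with WL selector, always active (assumed to be a total)
P_SAVER = 8.05      # with WL selector and power saver (assumed to be a total)

print(f"without power saver: {overhead_percent(P_BASE, P_NO_SAVER):.2f}%")
print(f"with power saver:    {overhead_percent(P_BASE, P_SAVER):.2f}%")
```

From these rounded inputs the power-saver case works out to roughly 1.26%, in line with the approximately 1.2% stated above once rounding of the quoted values is taken into account.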

## VI. CONCLUSIONS

Reducing the voltage of the word-line signal is considered a way to increase the RSNM in 6T cells. A digitally controlled $V_{WL}$ reduction circuit has been implemented using 65nm CMOS technology. The proposed circuit supports both $V_{WL}$ reduction strategies and allows the optimum values to be defined for each memory. The final $V_{WL}$ reduction value used can be selected according to post-process calibration.

A simple voltage divider circuit is combined with a power saving strategy, keeping the selector circuit to a minimum number of transistors while overcoming its high power consumption during the active stage.

The circuit can be added to a block-based 6T-SRAM without major changes to the memory architecture; only the power supply of the last row decoder inverter must be isolated so that it can work at a lower supply voltage.

The use of this technique in embedded SRAM designs could contribute to achieving more stable operation, thanks to the ease with which the $V_{WL}$ reduction strategies can be applied.

## ACKNOWLEDGMENT

This work has been supported by the Spanish Ministry of Science and Innovation under project CICYT-TEC2011-25017. The authors also wish to thank all reviewers for their interesting comments and remarks.

## REFERENCES

- [1] K. Zhang, et al., "SRAM design on 65nm CMOS technology with integrated leakage reduction scheme", Symp. VLSI Circuits, pp. 294-295, 2004.
- [2] B. Alorda, et al., "8T vs 6T SRAM cell radiation robustness: A comparative analysis", Microelectronics Reliability, vol. 51, 2, 2011, pp. 350-359.
- [3] H. Pilo, et al., "An SRAM Design in 65-nm Technology Node Featuring Read and Write-Assist Circuits to Expand Operating Voltage", IEEE Journal of Solid-State Circuits, vol. 42, 2007, pp. 813-819.
- [4] K. Zhang, et al., "SRAM Design on 65-nm CMOS Technology With Dynamic Sleep Transistor for Leakage Reduction", IEEE Journal of Solid-State Circuits, 2005, pp. 895-901.
- [5] H. Akamatsu, et al., "A 45nm 2-port 8T-SRAM Using Hierarchical Replica Bitline Technique With Immunity From Simultaneous R/W Access Issues", IEEE Journal of Solid-State Circuits, vol. 43, 4, 2008.
- [6] G. Torrens, et al., "Design Hardening of Nanometer SRAMs Through Transistor Width Modulation and Multi-Vt Combination", IEEE Transactions on Circuits and Systems II: Express Briefs, pp. 280-284, 2010.
- [7] B. Alorda, et al., "Stability optimization of embedded 8T SRAMs using Word-Line Voltage Modulation", Design, Automation & Test in Europe Conference, ISSN: 1530-1590, pp 1 - 6. 2011.
- [8] B. Alorda, et al., "Static and Dynamic Stability improvement strategies for 6T CMOS low-power SRAMs", Design, Automation & Test in Europe Conference, ISSN: 1530-1591, pp. 429-434, 2010.