# Smart Hammering: A practical method of pinhole detection in MRAM memories

2023 Design, Automation & Test in Europe Conference (DATE 2023)

Sina Bakhtavari Mamaghani<sup>1</sup>, Christopher Münch<sup>1</sup>, Jongsin Yun<sup>2</sup>, Martin Keim<sup>2</sup>, Mehdi Baradaran Tahoori<sup>1</sup>

<sup>1</sup>Department of Computer Science, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

<sup>2</sup>Siemens Digital Industries Software, Wilsonville, USA

sina.mamaghani@kit.edu, christopher.muench@kit.edu, jongsin.yun@siemens.com, martin.keim@siemens.com, mehdi.tahoori@kit.edu

Abstract- As we move toward the commercialization of Spin-Transfer Torque Magnetic Random Access Memories (STT-MRAM), cost-effective testing and in-field reliability have become more prominent. Among STT-MRAM manufacturing defects, pinholes are one of the most important. Pinholes are defects on the surface of the oxide layer which degrade the resistive values and, in some cases, cause an oxide breakdown. Moderate pinhole defects can remain undetected during standard functional tests and may cause a field failure. A stress test of the whole memory, including multiple cycles of long writes, has been suggested to detect candidate pinhole defects. However, this test not only causes extra costs but also degrades the reliability of the entire MRAM array. In this paper, we statistically study the behavior of pinholes and propose a cost-effective testing scheme to capture pinhole defects and increase the reliability of the end product. Our method limits the number of test candidate cells that need to be hammered, reducing test time by up to 96.42% for our case studies compared to existing methods, while preserving all the advantages of standard tests. The proposed approach is compatible with memory built-in self-test (MBIST) schemes.

# I. INTRODUCTION

Many integrated chip manufacturers announced Spin-Transfer Torque Magnetic Random Access Memories (STT-MRAM) production as a good candidate to replace embedded flash and eDRAM [1]–[3]. Despite all the attractive features of STT-MRAM, its full commercialization requires further steps to guarantee sufficient product quality [4], [5]. An important step towards this goal would be providing practical guidelines to ensure the lifetime functionality of the memory chips and avoid test escapes [6]. This issue has become more serious since recent studies have shown that there is a possibility of manufacturing defects that may not be easily detectable by simple testing schemes based on March C- [7].

The Magnetic Tunnel Junction (MTJ), the core element of STT-MRAM, is manufactured during the back-end-of-line manufacturing phase. During this process, pinholes can be formed in the MgO layer of MTJs due to the diffusion of conducting materials (e.g., CoFeB) into the oxide layer [4]–[6]. The impact of these defects depends on the pinhole size. In severe cases (pinhole area > 0.62% of the MTJ area), faulty cells can be easily detected by standard tests [4], [5]. In marginal cases (pinhole area < 0.62% of the MTJ area), by contrast, the influence on performance may not be significant enough, and the defect remains undetected by existing test schemes [4], [5]. These pinholes can grow in size during their life cycle and cause early-life failure in the memory. Increasing the error correction code (ECC) to cover these extra potential faults would be too costly, and ignoring them could cause field failures and leakage problems. Therefore, a practical method of evaluating potential pinhole defects that takes the ECC budget into account is of considerable interest.

Recent research has tried to model these defects based on physical characteristics rather than electrical shorts/opens. They have shown that modeling pinholes based on resistive defects may not be accurate enough to capture all aspects of pinhole defects [4], [5]. In addition to modeling the pinholes, it is necessary to devise proper testing schemes with high coverage and low test time and costs. In [5], it is suggested to use the hammering method to further aggravate the effect of the present pinholes, making them more detectable during testing. The hammering method comprises multiple cycles of long back-to-back write pulses, ideally with higher voltages. Despite its effectiveness, the testing-time overhead of hammering every individual cell is too much to be practical for production tests.

This paper proposes a *Memory Built-in Self-Test* (MBIST) compatible, low-cost testing scheme for pinhole detection. It provides a guideline considering multiple factors, including process controllability, ECC budget, and defect coverage. The contributions of this paper are as follows.

- We have statistically studied the pinhole defects, their influence on the characteristics of MTJ, and how they change their resistive distributions.
- We have provided a recommended spec to set a target defect coverage based on process maturity and product configuration. If the expected pinhole defect level is higher, which could cause an unrepairable row by existing ECC, the hammer test is needed to guarantee the product's functionality.
- To reduce the test burden, we limit the number of cells required to perform hammer tests to guarantee target pinhole defect coverage. A number of test candidates were selected based on bit performance and variation. We calculate the expected performance level to mark enough pinhole defect population to meet the target defect coverage for correct chip functioning. The test candidate cells in the chip are selected among low-performing cells during the read test with modified reference resistance values.

The rest of this paper is organized as follows. Section II provides background on STT-MRAM structure and pinhole defects. Section III explains how we modeled pinholes based on literature and offers a guideline for calculating different test requirements. Section IV presents the simulation setups, results, and related discussion. We conclude the paper in section V.

# II. BACKGROUND

### A. Magnetic Tunnel Junction

The MTJ is the core element of STT-MRAM memories. An MTJ comprises two magnetic layers separated by a thin oxide layer. One of the layers has a configurable magnetic orientation, while the other has a fixed one. The resistance of MTJ could be modulated depending on the orientation of its magnetic layers. If the layers have the parallel spin direction ( $R_P$ ), then the resistance of the MTJ will be lower compared to the case in which the layers are in the opposite orientation ( $R_{AP}$ ) [8]. The ratio between these two values is referred to as *tunnel magnetoresistance* (TMR) and is calculated by (1) [8], [9].

$$TMR = \frac{R_{AP} - R_P}{R_P} \tag{1}$$

Higher TMR provides better readability and read-stability [10]. Reading the value stored in MTJ can be done by comparing the current passing through it to a reference current generated by the reference source [11]. For reconfiguring an MTJ, we can pass a current higher than the critical current through the MTJ, and based on the current direction, the MTJ can be configured to two states. If the current is passed through the fixed layer toward the free layer, the MTJ is changed to the P state. On the contrary, if the current is passed from the free to the fixed layer, the configuration will be set to AP.
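As a quick numerical check of (1), the short sketch below computes the TMR from a pair of resistances. The function name `tmr_ratio` is ours, and the example values (about 2.3 kΩ and 5.5 kΩ) are the nominal MTJ resistances listed in Table II, which we assume correspond to $R_P$ and $R_{AP}$.

```python
def tmr_ratio(r_p: float, r_ap: float) -> float:
    """Tunnel magnetoresistance from Eq. (1): (R_AP - R_P) / R_P."""
    if r_p <= 0:
        raise ValueError("R_P must be positive")
    return (r_ap - r_p) / r_p


# Nominal resistances from Table II, assumed to be R_P and R_AP (in ohms).
tmr = tmr_ratio(2300.0, 5500.0)
print(f"TMR = {tmr:.1%}")
```

With these inputs the ratio comes out at roughly 139%, in line with the TMR of about 140% quoted in Table II.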

### B. Memory architecture and trim circuitry

Fig. 1 shows the structure of an STT-MRAM memory consisting of an MTJ array, read circuitry, address decoder, and write circuitry. The read circuit consists of a sense amplifier and a trimming circuit to set the reference resistance. The reference resistance value will be fine-tuned to the corresponding trim level by a configurable digital input setting [12], [13]. The trim circuitry can be composed of a series of resistors that can be individually bypassed. This provides tunability for reference resistance that is used to improve the read margin, increase the yield, and mitigate the process and temperature variation effects [14], [15].

The trimming process, which is performed at the initialization phase, can be completed through different approaches. A naive solution would be performing a functional test and collecting fail-bit information for every trim step. However, more efficient ways of searching for optimum trim settings using existing MBIST circuitry were proposed in the literature. In [15], the authors suggested an MBIST-based automated trim search method that minimizes the test time by utilizing a binary search approach. This method computes the optimized read reference trim value within the chip through a built-in analysis circuit rather than manual test and extraction for engineering analysis.

### C. Pinhole defects and Hammering method

Manufacturing MTJ in the back-end-of-line phase is a process that requires the deposition of more than ten layers for performance purposes [16]. Unique defects can be introduced during this process that could cause MRAM operation failure [4]. One of the important MTJ failure mechanisms is the pinhole defect shown in Fig. 2.a. This defect can be caused by the diffusion of Boron or other metallic impurities into the MgO layer during the deposition process [17], [18]. Other potential causes are the filling of pinholes in the MgO layer with CoFeB material and the diffusion of Oxygen atoms out of the MgO layer due to over-annealing [19], [20].

The pinholes in the MgO layer form a leakage path between two ferromagnetic layers and reduce the TMR value [17]. In some cases, the presence of pinholes could cause a field breakdown in the MgO layer due to the joule heating generated by the leakage current passing through the MTJ [5].



Fig. 1. Structure of a standard STT-MRAM along with trim circuitry.

Many attempts have been made to model the pinhole effects [4], [5], [21], [22]. Initial efforts modeled pinholes using electrical shorts and opens [21], [22]. However, Wu et al. demonstrated that resistive-based defect models are too pessimistic to precisely capture the physical behavior of pinholes [4], [5]. Instead, they suggested a model based on the degradation of the *resistance-area product* (RA) and TMR, as shown in equations (2) and (3), respectively. In these equations, $A_{ph} \in [0,1]$ is the pinhole area normalized to the cross-sectional area of the MTJ, denoted by $A$. $RA_{gc}$ and $TMR_{gc}$ are, respectively, the RA and TMR of a good MTJ (meaning that $A_{ph} = 0$). $RA_{fc}$ is the RA of a faulty device, which can be obtained by extrapolating the data of RA degradation for a device under stress test [5].

$$RA_{eff\_ph}(A_{ph}) = \frac{A}{\frac{A(1-A_{ph})}{RA_{gc}} + \frac{A\,A_{ph}}{RA_{fc}}} \tag{2}$$

$$TMR_{eff\_ph}(A_{ph}) = TMR_{gc} \cdot \frac{RA_{eff\_ph}(A_{ph}) - RA_{fc}}{RA_{gc} - RA_{fc}} \tag{3}$$
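To make equations (2) and (3) concrete, the following sketch evaluates them for a few pinhole areas. It is only an illustration: $RA_{gc}$ and $TMR_{gc}$ are taken from Table II, whereas the value used for $RA_{fc}$ is a hypothetical placeholder (the text obtains it by extrapolating stress-test data), and the function names are ours.

```python
def ra_eff_ph(a_ph: float, ra_gc: float, ra_fc: float) -> float:
    """Eq. (2): effective RA for a normalized pinhole area a_ph in [0, 1].

    The MTJ area A cancels, leaving two parallel conduction paths
    weighted by their area fractions.
    """
    if not 0.0 <= a_ph <= 1.0:
        raise ValueError("a_ph must lie in [0, 1]")
    return 1.0 / ((1.0 - a_ph) / ra_gc + a_ph / ra_fc)


def tmr_eff_ph(a_ph: float, tmr_gc: float, ra_gc: float, ra_fc: float) -> float:
    """Eq. (3): effective TMR, scaled linearly with the RA degradation."""
    ra_eff = ra_eff_ph(a_ph, ra_gc, ra_fc)
    return tmr_gc * (ra_eff - ra_fc) / (ra_gc - ra_fc)


RA_GC = 4.52   # ohm*um^2, Table II
TMR_GC = 1.40  # about 140 %, Table II
RA_FC = 0.05   # ohm*um^2, HYPOTHETICAL value chosen only for illustration

for a_ph in (0.0, 0.0015, 0.0062):
    ra = ra_eff_ph(a_ph, RA_GC, RA_FC)
    tmr = tmr_eff_ph(a_ph, TMR_GC, RA_GC, RA_FC)
    print(f"A_ph={a_ph:.4f}  RA_eff={ra:.3f} ohm*um^2  TMR_eff={tmr:.1%}")
```

At $A_{ph} = 0$ the functions return the good-cell values, and at $A_{ph} = 1$ they collapse to $RA_{fc}$ and zero TMR, which are the two limits implied by the model.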

Based on the relative area of pinholes, we could classify them into three different categories, as shown in Fig. 2. The first category is the cells with a relative pinhole area below 0.0015. There is no data showing the failure of these cells within their expected lifetime [4]–[6]. For devices with a relative pinhole area between 0.0015 and 0.0062, the pinhole defect will cause a TMR degradation. However, the behavior of devices is still in the range of normal operation at the manufacturing test time. We call these devices marginal since their pinholes could potentially grow in size and cause a complete oxide breakdown earlier than good cells [6]. At last, if the pinhole defect area is larger than 0.0062, the effective resistance of the cell is below the reference resistance value, and the cell will fail during the standard March test.
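The three categories can be expressed as a small lookup, sketched below. The thresholds are the ones given in the text; how the exact boundary values are assigned (0.0015 counted as marginal, 0.0062 counted as marginal) is our assumption, and the function name is illustrative.

```python
GOOD_LIMIT = 0.0015    # below this relative area, no lifetime failure is reported
FAULTY_LIMIT = 0.0062  # above this relative area, the standard March test fails


def classify_cell(a_ph: float) -> str:
    """Map a relative pinhole area to 'good', 'marginal' or 'faulty'."""
    if not 0.0 <= a_ph <= 1.0:
        raise ValueError("a_ph must lie in [0, 1]")
    if a_ph < GOOD_LIMIT:
        return "good"
    if a_ph <= FAULTY_LIMIT:
        return "marginal"  # passes the March test but may break down early
    return "faulty"


for area in (0.0, 0.001, 0.003, 0.01):
    print(area, classify_cell(area))
```

Only the "marginal" class is of interest for hammering: "good" cells need no action, and "faulty" cells are already caught by the standard test.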

Detecting potential breakdowns is not only essential to guarantee the functionality of the chips but also important for maintaining their power consumption level during their lifetime. Faulty MTJs act as electrical shorts, so a high current passes through them when activated. If they are not detected and treated in the test stage, they may cause leakage issues even if the ECC budget covers them.

A testing algorithm that could be used to trigger marginal cells to fail is a repeated back-to-back write operation known as a hammer test [5], [6]. In this test, the accessed MTJ will be continuously stressed to cause joule heating and voltage stress. It has been shown that the marginal cells that are not detectable during routine March tests could become detectable as their pinhole size grows during the hammering test [4]–[6].

Using high voltage or extended write cycles makes MTJs reach temperature saturation faster and accelerates the exposure of marginal cells during the functional test. However, such conditions are difficult to apply in production tests [23].

# III. PROPOSED APPROACH

### A. Modeling

First, a model of pinhole defects is needed to investigate their behavior and provide a testing scheme to detect them. For this purpose, we have used the MTJ model of [9] as our framework and embedded formulas (2) and (3) in order to capture the RA and TMR degradation effects. In addition, we have utilized equations (4) and (5) to capture the drop of resistance and TMR as a function of voltage [5]. In these formulas, $R_0$ and $TMR_0$ are, respectively, the $R_P$ and TMR values at zero voltage, $V_h$ is the voltage at which the TMR drops to half of its $TMR_0$ value, and $\delta$ and $\rho$ are fitting parameters.

$$R_P(V) = \frac{R_0}{1+\delta \cdot |V|} \tag{4}$$

$$TMR(V) = \frac{TMR_0}{1+\frac{V^2}{V_h^2} + \rho \cdot V^{\frac{4}{3}}} \tag{5}$$
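The voltage dependence in (4) and (5) can be sketched as follows. The fitting parameters $\delta$ and $\rho$ and the value of $V_h$ are not given in the text, so the numbers below are hypothetical. Two further assumptions are ours: $|V|$ is used in the fractional power so that the expression stays real for negative bias, and $R_{AP}(V)$ is derived from (1) as $R_P(V)\,(1 + TMR(V))$.

```python
def r_p_at(v: float, r_0: float, delta: float) -> float:
    """Eq. (4): parallel-state resistance at bias voltage v."""
    return r_0 / (1.0 + delta * abs(v))


def tmr_at(v: float, tmr_0: float, v_h: float, rho: float) -> float:
    """Eq. (5): TMR at bias voltage v (abs(v) keeps the 4/3 power real)."""
    return tmr_0 / (1.0 + (v / v_h) ** 2 + rho * abs(v) ** (4.0 / 3.0))


def r_ap_at(v, r_0, tmr_0, v_h, delta, rho):
    """Anti-parallel resistance implied by Eq. (1): R_AP = R_P * (1 + TMR)."""
    return r_p_at(v, r_0, delta) * (1.0 + tmr_at(v, tmr_0, v_h, rho))


R_0 = 2300.0   # ohms, nominal R_P from Table II
TMR_0 = 1.40   # about 140 %, Table II
V_H = 0.5      # volts, HYPOTHETICAL
DELTA = 0.1    # 1/V, HYPOTHETICAL fitting parameter
RHO = 0.0      # HYPOTHETICAL fitting parameter (rho term switched off)

for v in (0.0, 0.25, 0.5):
    print(f"V={v:.2f} V  R_P={r_p_at(v, R_0, DELTA):.0f} ohm  "
          f"R_AP={r_ap_at(v, R_0, TMR_0, V_H, DELTA, RHO):.0f} ohm")
```

With the $\rho$ term switched off, the TMR at $V = V_h$ is exactly half of $TMR_0$, matching the definition of $V_h$ in the text.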

Our model is calibrated and verified according to the experimental measurements presented in [4]–[6]. Fig. 3 shows the simulation results for R-V hysteresis loops of a good MTJ, two marginal ones (blue and cyan), and three faulty ones (purple, orange, and red). As expected, the loops for marginal and faulty devices are smaller than that of the good one, indicating lower $R_P$, TMR, and switching voltage ($V_c$). The decreased switching voltage is associated with lower resistance values, which increase the amount of current passing through the MTJ for a specific voltage. It is notable that if the hysteresis loop of an MTJ shrinks so far that $R_{AP}$ falls below the reference level, the resistive states $R_P$ and $R_{AP}$ become indistinguishable, leading to a stuck-at fault.

### B. Resistance distribution in MRAM

In Fig. 4, we show the distribution of resistance values for two types of MTJ cells. The green lines show the distribution of good cells, and the red lines show how the green distribution would shift if all green cells had a pinhole with a relative area of 0.0015. The solid lines represent the $R_{AP}$ state, and the dashed lines represent the $R_P$ state. The different references adjustable by the trim circuit are also shown on the x-axis. A reference resistance setting below Trim\_a (2.7k$\Omega$) will increase Read '0' fails, while a reference resistance setting above Trim\_b (4.7k$\Omega$) will increase Read '1' fails. Considering the distributions, we can conclude that the optimal trim step for this example array would be Trim\_c at 3.7k$\Omega$, which provides a balanced read margin for data '0' and '1'.



Fig. 2. Structure of pinhole defect. a) TEM image of a pinhole in the MgO layer (Source: [17]). b) good MTJ cell. c) marginal MTJ cell. d) faulty MTJ cell.



Fig. 3. R-V hysteresis loop of a good MTJ (green), two marginal (blue and cyan), and three faulty ones (purple, orange, and red). The solid lines show the results calibrated according to experimental measurements of [4]–[6]. The dashed lines show additional results predicted by our model.



Fig. 4. Resistance distribution for good and marginal ( $A_{ph} = 0.0015$ ) cells.

### C. The proposed methodology of smart hammering

As shown in the previous section, the presence of pinholes in marginal cells degrades their resistance. However, in some cases, even the decreased resistance falls within the range of good cells and, therefore, cannot be detected by standard tests. The test escape pinhole defects can degrade further and cause early bit failures. In case these bit failures exceed the ECC budget of the memory, they will cause reliability issues. In addition, the leakage caused by their low resistance values can become an early performance degradation issue. However, a massive test on the entire array for pinholes is not a feasible solution since the memory sizes are too large to conduct such a thorough analysis in a cost-efficient manner. Therefore, our goal here is to calculate the limited test access number based on the target defect coverage so that we can achieve good coverage value under the limited test budget. Once we decide on the target defect coverage, we can calculate the required resistance range to limit the number of candidate cells for the hammer test. This test candidate cell selection can be made by reading out cell values with a modified trim step to capture cells with low resistance in the pool of all memory cells. Fig. 5 shows an overview of our proposed methodology.

As shown in Fig. 5, the hammer test for pinhole detection is not always required to guarantee field failure coverage, as it can be covered by ECC and other repair solutions (including pinholes and other hard errors). To calculate the required defect coverage, we need to consider multiple factors, namely, memory architecture, the probability of pinhole defects, baseline defect rate, and their coincidence on the same row to exceed the repair budget.

### D. Calculating required pinhole defect coverage

We first calculate the fail bit count (FBC) to compute the required pinhole coverage ($R_{coverage}$), which guarantees fault-free functionality. We define multiple memory parameters, including the number of bits per word ($W_{length}$), the number of words per chip ($Memory_{size}$), and the number of manufactured chips ($Product_{volume}$). It is assumed that pinhole failures of individual cells are independent, uncorrelated, and identically distributed over the entire memory array. Let $P_{pd}$ be the pinhole defect probability, $N_{pd}$ the pinhole defect count per word, $P_{hr}$ the hard fail probability, which covers all functional fails, including write failures, read decision failures, read disturb faults, etc., and $N_{hr}$ the number of hard fail bits in a word. We can then calculate FBC using equation (6). Subsequently, the expected number of occurrences (ENO) for a given combination of defects (e.g., one pinhole and one hard fail, or two pinholes and no hard fail, etc.) can be calculated using (7).

$$FBC = {\binom{W_{length}}{N_{pd}}} \times P_{pd}^{(N_{pd})} \times {\binom{W_{length} - N_{pd}}{N_{hr}}} \times P_{hr}^{(N_{hr})} \times (1 - P_{pd} - P_{hr})^{(W_{length} - N_{pd} - N_{hr})} \tag{6}$$

$$ENO = FBC \times Memory_{size} \times Product_{volume} \tag{7}$$
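Equations (6) and (7) translate directly into code. The sketch below is a minimal illustration, not the authors' implementation: the word length and capacity follow Case-1 of Table I, the defect probabilities follow the simulation setup (0.2 ppm and 1 ppm), and the production volume is a hypothetical number chosen for the example.

```python
from math import comb


def fbc(w_length: int, n_pd: int, n_hr: int, p_pd: float, p_hr: float) -> float:
    """Eq. (6): probability that one word contains exactly n_pd pinhole
    bits and n_hr hard-fail bits (multinomial term)."""
    n_ok = w_length - n_pd - n_hr
    if n_ok < 0:
        raise ValueError("n_pd + n_hr exceeds the word length")
    return (comb(w_length, n_pd) * p_pd ** n_pd
            * comb(w_length - n_pd, n_hr) * p_hr ** n_hr
            * (1.0 - p_pd - p_hr) ** n_ok)


def eno(w_length, n_pd, n_hr, p_pd, p_hr, memory_size, product_volume):
    """Eq. (7): expected number of such words over the whole production."""
    return fbc(w_length, n_pd, n_hr, p_pd, p_hr) * memory_size * product_volume


W_LENGTH = 128                          # bits per word, Case-1 in Table I
MEMORY_SIZE = 8 * 2**20 // W_LENGTH     # words in a 1 MB array
PRODUCT_VOLUME = 100_000                # HYPOTHETICAL number of chips
P_PD, P_HR = 0.2e-6, 1e-6               # 0.2 ppm pinholes, 1 ppm hard fails

for n_pd, n_hr in ((1, 1), (2, 0)):
    value = eno(W_LENGTH, n_pd, n_hr, P_PD, P_HR, MEMORY_SIZE, PRODUCT_VOLUME)
    print(f"N_pd={n_pd}, N_hr={n_hr}: ENO = {value:.3g}")
```

An ENO above one for a combination that exceeds the ECC budget indicates that at least one unrepairable word is expected somewhere in the production volume.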

This number shows how many words over the whole production volume will contain $N_{pd}$ pinhole defects and $N_{hr}$ hard defects. If $N_{pd} + N_{hr}$ exceeds the ECC budget, we will have a faulty chip in the field. Therefore, it is necessary to test such chips at manufacturing time in order to prevent them from failing in the field. Assuming all hard errors need to be covered by ECC, we need to successfully detect a specific amount of pinhole defects so that the breakdown of the remaining ones does not cause any future chip failures in the field. To achieve this, the probability of undetected pinholes after the manufacturing test should be low enough that they cannot cause a field failure. The required pinhole probability after the test ($D_{pd}$) can be calculated by replacing $P_{pd}$ with $D_{pd}$ in equation (6) and solving equation (7) for an ENO value lower than one. This needs to be solved for all possible combinations of faults in which $N_{pd} + N_{hr}$ exceeds the ECC budget. For instance, if we have a 2-bit ECC budget and one of the bits is reserved for other types of field failures, we need to solve the equation for $(N_{pd} = 1, N_{hr} = 1)$ and $(N_{pd} = 2, N_{hr} = 0)$. The case of $(N_{pd} = 0, N_{hr} = 2)$ exceeds the repair budget but does not need to be considered, as we assumed all hard errors are detectable with a standard test [24]. It is noteworthy that we do not need to consider a higher number of faults (e.g., $N_{pd} = 2, N_{hr} = 1$), as they are a subset of the aforementioned conditions ($N_{pd} = 1, N_{hr} = 1$). After calculating $D_{pd}$ for all different cases, the required $R_{coverage}$ can be calculated by (8), where $D_{pd,max}$ is the maximum value of $D_{pd}$ for all the cases.

$$R_{coverage} = \frac{P_{pd} - D_{pd,max}}{P_{pd}} \times 100(\%) \tag{8}$$
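One possible way to carry out this calculation numerically is sketched below. It solves (7) for a single fault combination by bisection and then applies (8). The bisection approach, the function names, and the production volume are our assumptions; the text does not state how the equation was solved. The sketch also assumes $N_{pd} \geq 1$ and that ENO grows monotonically with $D_{pd}$ over $[0, P_{pd}]$, which holds for the very small probabilities considered here.

```python
from math import comb


def fbc(w_length, n_pd, n_hr, p_pd, p_hr):
    """Eq. (6), repeated here so that the sketch is self-contained."""
    n_ok = w_length - n_pd - n_hr
    return (comb(w_length, n_pd) * p_pd ** n_pd
            * comb(w_length - n_pd, n_hr) * p_hr ** n_hr
            * (1.0 - p_pd - p_hr) ** n_ok)


def max_residual_probability(w_length, n_pd, n_hr, p_pd, p_hr,
                             memory_size, product_volume, iterations=200):
    """Largest D_pd in [0, p_pd] for which ENO (Eq. 7) stays below one."""
    def eno(d_pd):
        return fbc(w_length, n_pd, n_hr, d_pd, p_hr) * memory_size * product_volume

    if eno(p_pd) < 1.0:
        return p_pd            # no pinhole screening needed for this case
    lo, hi = 0.0, p_pd         # eno(lo) < 1 <= eno(hi) throughout the loop
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if eno(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return lo


def required_coverage(p_pd, d_pd_max):
    """Eq. (8): required pinhole defect coverage in percent."""
    return (p_pd - d_pd_max) / p_pd * 100.0


# Case-1 style parameters; the production volume is HYPOTHETICAL.
d_pd = max_residual_probability(128, 1, 1, 0.2e-6, 1e-6, 65536, 100_000)
print(f"D_pd = {d_pd:.3e}, coverage if used as D_pd,max = "
      f"{required_coverage(0.2e-6, d_pd):.2f} %")
```

When the unscreened probability $P_{pd}$ already keeps ENO below one, the function returns $P_{pd}$ itself and the resulting coverage from (8) is zero, which corresponds to the branch in which no hammer test is needed.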

After calculating the required pinhole defect coverage, one can use the hammering method to detect the marginal cells in a pool of suspect cells to achieve the desired defect coverage. By doing this, the initially undetectable pinholes will become large enough to be detected by the standard test procedure.

In order to determine which cells and what percentage of memory need to be hammered, we used a MATLAB code that calculates the achievable coverage ( $A_{coverage}$ ) per trim step (see Fig. 6). In this code, the  $A_{coverage}$  will be calculated for every trim step starting from the first trim step (Trim\_a in Fig. 4). If the  $A_{coverage}$  of a stage is equal to or higher than our target  $R_{coverage}$ , the code will return this step number as an output. The results provided by this code also determine which trim step needs to be used for the hammering procedure and what percentage of memory will go through hammering to achieve the target  $R_{coverage}$ .

After the target trim step is obtained, it can be used to perform the hammering test in the following manner. First, the trim setup is set to the target resistance level. Then all the MTJs are written to the AP state, and the entire memory is tested with this specific predetermined trim. In case a cell read fails with this trim step, the cell will be hammered for a specific number of long write pulses. The hammering method can be done in two directions. In [25], it is suggested that hammering in the AP to P direction could be more effective due to the self-heating effect, while it is argued in [26] that the hammering would be more effective in P to AP state due to higher voltage stress on MTJs. Notably, since the pinhole effect on resistance is more detectable in the AP state, we try to capture them in this configuration. After marching through the entire memory this way, we proceed with a standard test procedure to capture the previously marginal cells that have been broken down due to the Joule heating and voltage stress effects.
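The test flow described above can be mimicked with a toy model, sketched below. Everything about the cell model is a simplifying assumption made for illustration: each cell is reduced to its AP-state resistance, hammering is assumed to collapse only cells that contain a pinhole to a hypothetical breakdown resistance, and the class and function names are ours. The reference values are borrowed from Table III (4275 Ω target) and Fig. 4 (3.7 kΩ nominal).

```python
from dataclasses import dataclass

BREAKDOWN_R = 500.0  # ohms, HYPOTHETICAL resistance after oxide breakdown


@dataclass
class ToyCell:
    r_ap: float        # resistance in the AP state, in ohms
    has_pinhole: bool  # ground truth, unknown to the tester

    def hammer(self, pulses: int) -> None:
        # Toy assumption: stress only breaks down cells that contain a pinhole.
        if self.has_pinhole and pulses > 0:
            self.r_ap = BREAKDOWN_R


def reads_as_ap(cell: ToyCell, r_ref: float) -> bool:
    """A cell written to AP reads correctly if its resistance exceeds r_ref."""
    return cell.r_ap > r_ref


def smart_hammer(cells, r_ref_target, r_ref_nominal, pulses=100):
    # Steps 1-3: with the trim set to the target reference and all cells in
    # the AP state, every cell that fails the read becomes a candidate.
    candidates = [i for i, c in enumerate(cells) if not reads_as_ap(c, r_ref_target)]
    # Step 4: hammer only the candidate cells.
    for i in candidates:
        cells[i].hammer(pulses)
    # Step 5: standard test at the nominal reference catches broken-down cells.
    detected = [i for i, c in enumerate(cells) if not reads_as_ap(c, r_ref_nominal)]
    return candidates, detected


cells = [ToyCell(5500.0, False), ToyCell(4100.0, False),
         ToyCell(4000.0, True), ToyCell(5000.0, True)]
candidates, detected = smart_hammer(cells, r_ref_target=4275.0, r_ref_nominal=3700.0)
print("hammered:", candidates, "detected:", detected)
```

In this toy run, the low-resistance good cell is hammered but survives, the low-resistance pinhole cell is hammered and then caught by the standard test, and the pinhole cell whose resistance lies above the target reference is never selected, so it escapes.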



Fig. 5. Flowchart of our proposed method for pinhole detection.

```
input: Target pinhole defect coverage (TPDC)
output: Trim step number (TSN), Memory hammer percentage (MHP)
for each trim_step, starting from the first one (Trim_a)
    calculate the achievable pinhole defect coverage (PDC) at trim_step
    if PDC ≥ TPDC
        TSN ← trim_step
        MHP ← calculate the memory hammer percentage at trim_step
        break
    end
end
```

Fig. 6. Pseudo code for determining the trim step needed for the hammering procedure.
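A runnable version of the search in Fig. 6 could look like the sketch below. It is not the authors' MATLAB code. The statistical model is a deliberate simplification of ours: the AP-state resistances of good cells and of pinhole cells are each described by a single normal distribution with hypothetical parameters, the trim references are assumed to be linearly spaced, the achievable coverage at a trim step is the fraction of pinhole cells below the reference, and the memory hammer percentage is the fraction of all cells below it.

```python
from statistics import NormalDist


def trim_levels(r_min: float, r_max: float, steps: int) -> list:
    """Linearly spaced reference resistances (linear spacing is an assumption)."""
    width = (r_max - r_min) / (steps - 1)
    return [r_min + i * width for i in range(steps)]


def find_trim_step(levels, good_ap, pinhole_ap, p_pd, target_coverage):
    """Return (TSN, MHP) for the first trim step whose achievable pinhole
    defect coverage (in %) reaches target_coverage, or None if none does."""
    for tsn, r_ref in enumerate(levels):
        pdc = 100.0 * pinhole_ap.cdf(r_ref)        # pinhole cells flagged
        if pdc >= target_coverage:
            flagged = (1.0 - p_pd) * good_ap.cdf(r_ref) + p_pd * pinhole_ap.cdf(r_ref)
            return tsn, 100.0 * flagged            # share of memory to hammer
    return None


levels = trim_levels(2700.0, 4700.0, 64)           # Trim_a to Trim_b, 64 steps
good_ap = NormalDist(mu=5500.0, sigma=300.0)       # HYPOTHETICAL distribution
pinhole_ap = NormalDist(mu=4300.0, sigma=300.0)    # HYPOTHETICAL distribution

result = find_trim_step(levels, good_ap, pinhole_ap, p_pd=0.2e-6,
                        target_coverage=62.45)     # Case-1 target, Table III
if result is None:
    print("target coverage not reachable within the trim range")
else:
    print(f"TSN = {result[0]}, MHP = {result[1]:.3g} %")
```

A return value of `None` corresponds to the situation in which the target coverage cannot be reached with the available trim steps.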

It is notable that in some cases, the target PDC will not be achievable if the trim steps are limited. For these cases, we can either increase the trim step counts (e.g., from 32 to 64) or use a manufacturing process with higher variation control. Since the number of trim steps cannot be modified after manufacturing, designers need to consider the possible requirement of trim extension beforehand.

# IV. SIMULATION RESULTS AND DISCUSSION

### A. Simulation setup

To demonstrate the importance of pinhole detection, two case studies are considered, as shown in Table I. In all our simulations, we consider room temperature and the MTJ characteristics presented in Table II [9]. The $R_P$ and TMR variations are taken to be 6.95% and 4.1%, respectively [27]. The effective pinhole defectivity rate and hard error rate are assumed to be 0.2 ppm and 1 ppm, respectively [24]. We also assume that a relative pinhole area needs to be equal to or larger than 0.0015 in order to cause an early field failure, since there is no data showing that smaller pinholes could cause failures within the expected lifetime of the device [4]–[6]. We also consider a trim circuit ranging from $R_P + 3\sigma_P$ to $R_{AP} - 3\sigma_{AP}$ with 64 steps [12], [14].

### B. Simulation results

Fig. 7 shows the simulation result of the MTJ distribution based on their pinhole size. We have considered the effective pinhole ratio of 0.2 ppm; therefore, the accumulative area below the cell distribution line needs to be equal to  $0.2 \times 10^{-6}$  for pinhole with a relative area larger than 0.0015. In this graph, the red area shows the probability density of the pinhole defects that standard test procedures can always detect, while the green area shows the probability density of the pinhole defects that are detectable by smart hammering test based on candidate cell selection with a trim circuitry [12], [14]. The blue area includes the entire population of pinhole defects where the overlap with the green and red colored areas can be detected, but the remaining blue area is undetectable even with our proposed approach. The cells in this area have resistance values within the normal cell range, even with the presence of a pinhole. They are excluded from the candidate pool since their resistance value exceeds the maximum reference level set in our trim range.

TABLE I. CASE STUDIES

| Parameter           | Case-1 | Case-2 |
|---------------------|--------|--------|
| #Manufactured chips | 105    | 105    |
| Chip capacity       | 1MB    | 8MB    |
| Word length         | 128    | 256    |
| ECC budget          | 2*     | 2*     |

\* One bit reserved for field failures caused by sources other than the pinhole

TABLE II. MTJ PARAMETERS

| Parameter                          | Value                                   |
|------------------------------------|-----------------------------------------|
| Oxide barrier thickness            | 0.9nm                                   |
| Free layer thickness               | 0.85nm                                  |
| MTJ surface                        | 60nm×60nm                               |
| Resistance-area product            | $4.52\,\Omega\cdot\mu m^2$              |
| TMR ratio at $V_{bias} = 0$        | ~140%                                   |
| MTJ resistance                     | $\sim 2.3 k\Omega$ , $\sim 5.5 k\Omega$ |

To show the importance of pinhole defect detection and the advantage of smart hammering, we have performed two case studies, as shown in Table I. We have calculated the required defect coverage for keeping all remaining pinhole defects within the ECC budget and computed what percentage of memory needs to be hammered to capture rows that would be unrepairable with ECC. Table III shows the required pinhole defect coverage, the target reference resistance, whether an extended trim step is needed to reach the target reference resistance, and the percentage of memory that needs to be hammered. With an increase in word length, number of words, or manufactured chip count, the pinhole defect escape rate will increase; therefore, the need for pinhole detection also increases.

As mentioned before, one of the factors that makes pinhole defect detection more difficult is process variation. Smaller process variation makes defect properties more dominant and easier to differentiate, whereas wider process variation makes the differentiation more difficult due to the overlap of the properties of good and marginal cells. To further investigate this issue, we performed the same steps considering a process with half of the variation used before. As we can see in Table III, the percentage of memory cells requiring hammering drops by at least five orders of magnitude as the overlap between good and marginal cells decreases. In addition, higher pinhole detection is achievable with smaller trim steps.

To further demonstrate the advantage of our method, we have plotted our accuracy and memory hammer percentage against the standard hammering test, in which all the cells are hammered (Fig. 8). As we can see, our method can guarantee the chip functionality with a far lower percentage of memory cell hammering. In Case-2, our proposed method achieves nearly the same accuracy while hammering only 3.58% of the memory. This gives a test-time advantage of up to 96.42% while providing the same quality as the standard test. The only change required was the extension of the trim steps.



Fig. 7. Distribution of cells for pinhole defectivity rate of 0.2ppm.

TABLE III. COMPARATIVE ANALYSIS FOR DIFFERENT MEMORY MANUFACTURING PROCESSES

| Manufacturing<br>process            | Case-1   |                      | Case-2   |                      |
|-------------------------------------|----------|----------------------|----------|----------------------|
|                                     | Standard | Highly<br>controlled | Standard | Highly<br>controlled |
| Required pinhole<br>defect coverage | 62.45%   |                      | 97.66%   |                      |
| Target Reference resistance         | 4275Ω    | 4200Ω                | 4750Ω    | 4400Ω                |
| Is an extended trim step needed?    | No       | No                   | Yes      | No                   |
| Memory hammer<br>percentage         | 0.11     | 2.4×10 <sup>-8</sup> | 3.58     | 3.3×10 <sup>-5</sup> |

[Figure: test cost and coverage, showing coverage percentage and relative memory access number (test time).]

Fig. 8. Comparative analysis of smart vs. standard hammering.

# V. CONCLUSION

Pinholes in the MgO layer are among the most important MTJ defects: they are crucial to detect because they can cause test escapes and subsequent field failures. In this paper, we have investigated their statistical effect on the resistance distribution of STT-MRAMs to find a methodology that can be used to capture a suspect pool of marginal cells. The smart incremental hammering procedure guarantees reliability and high yield for the final product with a test time reduction of up to 96.42% for our case studies compared to standard hammering solutions. Our approach is compatible with existing MBIST, requiring only minor modifications to its circuit.

# References

[1] D. Edelstein *et al.*, "A 14 nm Embedded STT-MRAM CMOS Technology," in *2020 IEEE International Electron Devices Meeting (IEDM)*, Dec. 2020, p. 11.5.1-11.5.4. doi: 10.1109/IEDM13553.2020.9371922.

[2] G. Hu *et al.*, "STT-MRAM with double magnetic tunnel junctions," in *2015 IEEE International Electron Devices Meeting (IEDM)*, Dec. 2015, p. 26.3.1-26.3.4. doi: 10.1109/IEDM.2015.7409772.

[3] L. Wei *et al.*, "13.3 A 7Mb STT-MRAM in 22FFL FinFET Technology with 4ns Read Sensing Time at 0.9V Using Write-Verify-Write Scheme and Offset-Cancellation Sensing Technique," in *2019 IEEE International Solid-State Circuits Conference - (ISSCC)*, Feb. 2019, pp. 214–216. doi: 10.1109/ISSCC.2019.8662444.

[4] L. Wu, M. Taouil, S. Rao, E. J. Marinissen, and S. Hamdioui, "Electrical Modeling of STT-MRAM Defects," in *2018 IEEE International Test Conference (ITC)*, Oct. 2018, pp. 1–10. doi: 10.1109/TEST.2018.8624749.

[5] L. Wu *et al.*, "Pinhole Defect Characterization and Fault Modeling for STT-MRAM Testing," in *2019 IEEE European Test Symposium (ETS)*, May 2019, pp. 1–6. doi: 10.1109/ETS.2019.8791518.

[6] L. Wu *et al.*, "Defect and Fault Modeling Framework for STT-MRAM Testing," *IEEE Trans. Emerg. Top. Comput.*, vol. 9, no. 2, pp. 707–723, Apr. 2021, doi: 10.1109/TETC.2019.2960375.

[7] K. Komagaki *et al.*, "Influence of Diffused Boron Into MgO Barrier on Pinhole Creation in CoFeB/MgO/CoFeB Magnetic Tunnel Junctions," *IEEE Trans. Magn.*, vol. 45, no. 10, pp. 3453–3456, Oct. 2009, doi: 10.1109/TMAG.2009.2022189.

[8] Z. Wang, W. Zhao, E. Deng, J.-O. Klein, and C. Chappert, "Perpendicular-anisotropy magnetic tunnel junction switched by spin-Hall-assisted spin-transfer torque," *J. Phys. D: Appl. Phys.*, vol. 48, no. 6, p. 065001, Jan. 2015, doi: 10.1088/0022-3727/48/6/065001.

[9] Y. Wang *et al.*, "Compact Model of Dielectric Breakdown in Spin-Transfer Torque Magnetic Tunnel Junction," *IEEE Trans. Electron Devices*, vol. 63, no. 4, pp. 1762–1767, Apr. 2016, doi: 10.1109/TED.2016.2533438.

[10] S. Bakhtavari Mamaghani, M. H. Moaiyeri, and G. Jaberipur, "Design of an efficient fully nonvolatile and radiation-hardened majority-based magnetic full adder using FinFET/MTJ," *Microelectron. J.*, vol. 103, p. 104864, Sep. 2020, doi: 10.1016/j.mejo.2020.104864.

[11] R. Rajaei and S. Bakhtavari Mamaghani, "A Nonvolatile, Low-Power, and Highly Reliable MRAM Block for Advanced Microarchitectures," *IEEE Trans. Device Mater. Reliab.*, vol. 17, no. 2, pp. 472–474, Jun. 2017, doi: 10.1109/TDMR.2017.2694228.

[12] C. Münch, J. Yun, M. Keim, and M. B. Tahoori, "MBIST-based Trim-Search Test Time Reduction for STT-MRAM," in 2022 IEEE 40th VLSI Test Symposium (VTS), Apr. 2022, pp. 1–7. doi: 10.1109/VTS52500.2021.9794178.

[13] A. Antonyan, S. Pyo, H. Jung, and T. Song, "Embedded MRAM Macro for eFlash Replacement," in 2018 IEEE International Symposium on Circuits and Systems (ISCAS), May 2018, pp. 1–4. doi: 10.1109/ISCAS.2018.8351201.

[14] C. Münch, J. Yun, M. Keim, and M. B. Tahoori, "MBIST-supported Trim Adjustment to Compensate Thermal Behavior of MRAM," in 2021 IEEE European Test Symposium (ETS), May 2021, pp. 1–6. doi: 10.1109/ETS50041.2021.9465383.

[15] J. Yun, B. Nadeau-Dostie, M. Keim, C. Dray, and M. Boujamaa, "MBIST Support for Reliable eMRAM Sensing," in 2020 IEEE European Test Symposium (ETS), May 2020, pp. 1–6. doi: 10.1109/ETS48528.2020.9131564.

[16] M. Komalan *et al.*, "Cross-layer design and analysis of a low power, high density STT-MRAM for embedded systems," in *2017 IEEE International Symposium on Circuits and Systems (ISCAS)*, May 2017, pp. 1–4. doi: 10.1109/ISCAS.2017.8050923.

[17] W. Zhao *et al.*, "Failure Analysis in Magnetic Tunnel Junction Nanopillar with Interfacial Perpendicular Magnetic Anisotropy," *Materials*, vol. 9, no. 1, p. 41, Jan. 2016, doi: 10.3390/ma9010041.

[18] S. Mukherjee *et al.*, "Role of boron diffusion in CoFeB/MgO magnetic tunnel junctions," *Phys. Rev. B*, vol. 91, no. 8, p. 085311, Feb. 2015, doi: 10.1103/PhysRevB.91.085311.

[19] B. Oliver, G. Tuttle, Q. He, X. Tang, and J. Nowak, "Two breakdown mechanisms in ultrathin alumina barrier magnetic tunnel junctions," *J. Appl. Phys.*, vol. 95, no. 3, pp. 1315–1322, Feb. 2004, doi: 10.1063/1.1636255.

[20] S. Van Beek *et al.*, "Impact of processing and stack optimization on the reliability of perpendicular STT-MRAM," in *2017 IEEE International Reliability Physics Symposium (IRPS)*, Apr. 2017, pp. 5A-1.1-5A-1.5. doi: 10.1109/IRPS.2017.7936318.

[21] I. Yoon, A. Chintaluri, and A. Raychowdhury, "EMACS: Efficient MBIST architecture for test and characterization of STT-MRAM arrays," in *2016 IEEE International Test Conference (ITC)*, Nov. 2016, pp. 1–10. doi: 10.1109/TEST.2016.7805834.

[22] A. Chintaluri, H. Naeimi, S. Natarajan, and A. Raychowdhury, "Analysis of Defects and Variations in Embedded Spin Transfer Torque (STT) MRAM Arrays," *IEEE J. Emerg. Sel. Top. Circuits Syst.*, vol. 6, no. 3, pp. 319–329, Sep. 2016, doi: 10.1109/JETCAS.2016.2547779.

[23] J. H. Lim *et al.*, "Area and pulsewidth dependence of bipolar TDDB in MgO magnetic tunnel junction," in *2018 IEEE International Reliability Physics Symposium (IRPS)*, Mar. 2018, p. 6D.6-1-6D.6-6. doi: 10.1109/IRPS.2018.8353637.

[24] Y.-C. Shih *et al.*, "Logic Process Compatible 40NM 16MB, Embedded Perpendicular-MRAM with Hybrid-Resistance Reference, Sub-μA Sensing Resolution, and 17.5NS Read Access Time," in *2018 IEEE Symposium on VLSI Circuits*, Jun. 2018, pp. 79–80. doi: 10.1109/VLSIC.2018.8502260.

[25] Y. Ji *et al.*, "Reliability of Industrial grade Embedded-STT-MRAM," in 2020 IEEE International Reliability Physics Symposium (IRPS), Apr. 2020, pp. 1–3. doi: 10.1109/IRPS45951.2020.9129178.

[26] V. B. Naik *et al.*, "Extended MTJ TDDB Model, and Improved STT-MRAM Reliability With Reduced Circuit and Process Variabilities," in *2022 IEEE International Reliability Physics Symposium (IRPS)*, Mar. 2022, p. 6B.3-1-6B.3-6. doi: 10.1109/IRPS48227.2022.9764563.

[27] V. B. Naik *et al.*, "Manufacturable 22nm FD-SOI Embedded MRAM Technology for Industrial-grade MCU and IOT Applications," in *2019 IEEE International Electron Devices Meeting (IEDM)*, Dec. 2019, p. 2.3.1-2.3.4. doi: 10.1109/IEDM19573.2019.8993454.