UNIVERSITY OF CALIFORNIA

Los Angeles

# A Comparative Analysis of SRAM Sense Amplifiers

A thesis submitted in partial satisfaction of the requirements for the degree Master of Science in Electrical Engineering

by

Kejian Shi

2017

© Copyright by Kejian Shi 2017

#### Abstract of the Thesis

# A Comparative Analysis of SRAM Sense Amplifiers

by

Kejian Shi

Master of Science in Electrical Engineering

University of California, Los Angeles, 2017

Professor Asad A. Abidi, Chair

The operation of the voltage-mode and current-mode sense amplifiers (VSA and CSA) is explained in detail. Analytical expressions for the offset arising from random FET mismatch are derived. The mechanisms of two offset reduction methods for the VSA are explained. The supply-dependent offset of the CSA is shown. The offset, combined with signal attenuation, defines a figure-of-merit. The current-mode sense amplifier is shown to be superior at $V_{DD} < 0.6 \sim 0.7$ V.

The thesis of Kejian Shi is approved.

Subramanian S. Iyer

C.-K. Ken Yang

Asad A. Abidi, Committee Chair

University of California, Los Angeles 2017

To my parents ...

# TABLE OF CONTENTS

- 1 Introduction
  - 1.1 Background
  - 1.2 Contribution
  - 1.3 Organization of the Thesis
- 2 Offset Voltage Comparison for Sense Amplifiers
  - 2.1 A Generalized Mathematical Proof for Optimal Offset Design
  - 2.2 Cross-Coupled Inverter
- 3 Voltage-Mode Sense Amplifier Analysis
  - 3.1 Phase Plane and Separatrix
  - 3.2 VSA Operations
    - 3.2.1 Pre-Amplification Gain
    - 3.2.2 Sensing Speed
  - 3.3 Offset Voltage
    - 3.3.1 Threshold Mismatch
    - 3.3.2 $\beta$ Mismatch
    - 3.3.3 Total Offset Voltage
- 4 VSA Offset Reduction Methods
  - 4.1 VSA Offset Reduction: SAE Transition
    - 4.1.1 Operations
    - 4.1.2 Offset Reduction
    - 4.1.3 Sensing Speed
  - 4.2 VSA Offset Reduction: Skewing PGB and SAE
    - 4.2.1 Operations
    - 4.2.2 Offset Reduction
    - 4.2.3 Sensing Speed
- 5 Current-Mode Sense Amplifier Analysis
  - 5.1 Operations
    - 5.1.1 Sampling Phase
    - 5.1.2 Propagation Phase
    - 5.1.3 Regeneration Phase
    - 5.1.4 Sensing Speed
  - 5.2 Offset Voltage
    - 5.2.1 N1 Pair
    - 5.2.2 N2 Pair
    - 5.2.3 Total Offset Voltage
- 6 VSA and CSA Comparison
  - 6.1 Offset Voltage Comparison
  - 6.2 Sensing Speed Comparison
  - 6.3 FoM Comparison and Supply Dependency
- 7 Conclusion
- References

# LIST OF FIGURES

- 1.1 (a) Voltage-Mode Sense Amplifier (VSA); (b) Current-Mode Sense Amplifier (CSA).
- 1.2 (a) SRAM read circuit used in this thesis; (b) 6T SRAM cell; (c) Equivalent circuit of 6T SRAM cell when $WL$ is high; (d) Differential equivalent during small signal development; (e) Typical timing diagram.
- 2.1 (a) Cross-coupled inverter; (b) Differential-mode equivalent circuit; (c) Common-mode equivalent circuit.
- 2.2 Monte-Carlo simulation of cross-coupled inverter minimum offset voltage on different initial common-mode voltage $V_{CM}(0)$.
- 3.1 Voltage-Mode Sense Amplifier (VSA). Transistor sizing: $M_P = 560n/60n$, $M_{ST} = M_{AT} = 2M_P$, $M_N = 4M_P$.
- 3.2 (a) Equivalent circuit of VSA pre-amplification phase; (b) Common-mode, (c) Differential-mode equivalent circuit.
- 3.3 Phase plane plot ($V_{CM}$ vs. $V_{DM}$ taking time $t$ as a parameter) for 10% $\beta$ mismatch.
- 4.1 (a) Common-mode equivalent circuit, (b) Differential-mode equivalent circuit during SAE transition; (c) $I$ decomposition.
- 4.2 Offset reduction vs. SAE transition time plot.
- 4.3 (a) Common-mode equivalent circuit, (b) Differential-mode equivalent circuit during pre-amplification phase; (c) $I$ decomposition.
- 4.4 Offset reduction by skewing PGB and SAE signals by $10ps$.
- 5.1 Current-Mode Sense Amplifier (CSA). Transistor sizing: $M_{ST} = 560n/60n$, $M_{N1} = M_{N2} = M_P = 2M_{ST}$.
- 5.2 Differential-mode equivalent circuit of CSA: (a) Sampling Phase; (b) Propagation Phase; (c) Regeneration Phase.
- 5.3 Differential equivalent circuit when mismatches are considered: (a) Sampling phase; (b) Propagation phase.
- 5.4 CSA offset vs. $C_L/C_C$. (28nm CMOS: $A_{V_{tN}} = 2.5mV\cdot\mu m$, $A_{\beta_N} = 0.5\% \cdot \mu m$ [1].)
- 6.1 FoM comparison of VSA and CSA on different supply voltage.

# LIST OF TABLES

- 3.1 Measured RMS offset vs. calculated offset (28nm FDSOI: $A_{V_{tN}} = 1.7mV \cdot \mu m$, $A_{\beta_N} = 0.5\% \cdot \mu m$ [2]).

### LIST OF SYMBOLS

- $A_{vt}$  Pelgrom coefficient for threshold voltage.
- $I_{cell}$  Read current from SRAM bit-cell.
- N Number of FET pairs.
- $S_i$  Area of one FET of *i*-th FET pair.
- $S_{tot}$  Half of total area of all FET pairs.
- T SRAM Sensing window.
- $V_{DD}$  Supply voltage.
- $V_{os}$  Input-referred offset voltage.
- $V_{t0}$  Threshold voltage of a NFET.
- $\Delta V_{ti}$  Threshold voltage mismatch of *i*-th FET pair.
- $\beta\,$  FET transconductance coefficient.
- $\sigma_x$  Standard deviation of random variation x.
- $a_i$  Coefficient of *i*-th FET pair in  $V_{os}$  expression.
- $A_{SP}$  CSA sampling phase gain.
- $A_{V_{tN}}$  Pelgrom coefficient for NFET threshold.
- $A_{V_{tP}}$  Pelgrom coefficient for PFET threshold.
- $A_{\beta_N}$  Pelgrom coefficient of  $\beta_N$ .
- A Constant.
- B Constant.

- $C'_L$  Common-mode load capacitance.
- $C_C$  CSA node C differential capacitance.
- $C_L$  Differential-mode load capacitance.
- $C_{BL}$  Differential bitline capacitance.
- $C_{in}$  SA input capacitance.
- $C_{wire}$  Wiring capacitance.
- C Constant.
- EIC Equivalent initial condition.
- $G_m$  Inverter transconductance.
- $G_{AT}$  Conductance of AT.
- Gain VSA preamplification gain.
- I(t) Differential current injected due to mismatch and AT imbalance.
- $I_D$  Drain current of a FET.
- $I_{PP}$  CSA propagation phase common-mode current through N2.
- $I_{SP}$  CSA sampling phase common-mode current through N1.
- $I_{mis,N1}$  Differential current injected by N1 mismatches.
- $I_{mis,N2}$  Differential current injected by N2 mismatches.
- $I_{mis}$  Differential current induced by mismatch.
- $R_{AT}$  AT resistance.
- $S_N$  Area of one NFET.

- $S_P$  Area of one PFET.
- $S_{AT}$  Area of one access transistor.
- $S_{N1}$  Area of one N1 transistor.
- $S_{N2}$  Area of one N2 transistor.
- $T_{PP}$  CSA propagation phase duration.
- $T_{SP}$  CSA sampling phase duration.
- $V'_{CM}$  $V_{CM} - (V_{DD} - |V_{tP}| + V_{tN})/2$.
- $V'_{DD}$  $V_{DD} - |V_{tP}| - V_{tN}$.
- $V_C(t)$  CSA node C differential voltage.
- $V_G$  Gate voltage of a FET.
- $V_O(t)$  CSA output differential voltage.

- $V_{CM}(t)$  Common-mode voltage as a function of time.
- $V_{DM}(t)$  Differential voltage as a function of time.
- $V_{O,CM}$  CSA output common-mode voltage.

- $V_{TH}$  Inverter trip point.
- $V_{id}$  CSA differential input voltage.

- $V_{mis,N1}$  $\Delta V_{tN1} + \frac{V_{ov,N1}}{2} \frac{\Delta \beta_{N1}}{\beta_{N1}}$.
- $V_{mis,N2}$  $\Delta V_{tN2} + \frac{V_{ov,N2}}{2} \frac{\Delta \beta_{N2}}{\beta_{N2}}$.

- $V_{os,N1}$  Offset induced by N1.
- $V_{os,N2}$  Offset induced by N2.

- $V_{os,\beta}$  Offset induced by  $\beta$  mismatch.
- $V_{os,i}$  Offset voltage with *i*th FET pair contribution only.
- $V_{os,min}$  Minimum offset voltage.
- $V_{os,t}\,$  Offset induced by NFET threshold voltage mismatch.
- $V_{ov,N1}$  N1 overdrive voltage.
- $V_{ov,N2}$  CSA N2 overdrive voltage.
- $V_{t,AT}$  AT threshold voltage.
- $V_{tN}$  NFET threshold voltage.
- $V_{tP}$  PFET threshold voltage.
- $W(t)^{-1}$  Time evolution function.
- $\Delta V_{CM}(t_1)$  Common-mode voltage drop at time  $t_1$ .
- $\Delta x$  x mismatch.
- $\beta_N$  NFET transconductance coefficient.
- $\beta_P$  PFET transconductance coefficient.
- $\beta_{AT}$  AT transconductance coefficient.
- $\beta_{N1}$  CSA N1 transconductance coefficient.
- $\beta_{N2}$  CSA N2 transconductance coefficient.
- $\eta\,$  RC delay attenuation ratio.
- **A**  $N \times N$  matrix.
- **F** Nonlinear operator defined in state space.

- $\mathbf{x_n}$  Fixed point in state space.
- $\mathbf{x}(t)$  State variable vector.
- FoM Figure of Merit.
- $\tau_1$  Time constant defined by ST and AT.
- $a_N$  Coefficient of NFET pair in  $V_{os}$  expression.
- $a_P$  Coefficient of PFET pair in  $V_{os}$  expression.
- $b_i$  Coefficient.
- $g_{mN2}$  N2 transconductance.
- $g_{mN}$  NFET transconductance.
- $g_{mP}$  PFET transconductance.
- $g_{md1}$  CSA N1:  $\partial I_D / \partial V_D$ .
- $k_{N2}\,$  Coefficient of N2 pair in  $V_{os}$  expression.
- $k_{\beta}$  Slope of separatrix.
- $k\,$  Constant.
- n Body effect coefficient in EKV model.
- $s_+$  Positive (real part) pole.
- $s_{-}$  Negative (real part) pole.
- $s_r$  VSA regeneration phase pole.
- $s_{RP}$  CSA regeneration phase pole.
- $s_{max}$  Maximum of  $s_{re}(t)$ .

- $s_{pre}\,$  VSA preamplification pole.
- $s_{re}$  Regeneration pole.
- $t^\prime_{tran}\,$  SAE rise time for AT and N both on.
- t' The moment VSA preamplification phase ends.
- $t_0$  PGB delay.
- $t_1\,$  The moment regeneration begins during SAE transition.
- $t_{delay}$  Extra sensing delay.
- $t_{tran}$  SAE rise time rail-to-rail.

#### Acknowledgments

I would like to express my gratitude to Professor Asad Ali Abidi, who has guided me through the research topics. His intuitive and insightful understanding of circuits has continuously impressed and enlightened me ever since I took his EE215A. It is he who taught me to think as an engineer while pursuing the underlying physics of every circuit, and it is he who led me to the realm of circuits.

I would like to thank Professor Subramanian S. Iyer and Professor C.-K. Ken Yang for being on my thesis committee and offering advice to me.

I would like to thank PhD student Hao Xu, who showed great patience and wisdom in discussing problems. His previous work paved the way for this thesis. I am very grateful for his help.

## CHAPTER 1

### Introduction

### 1.1 Background

SRAMs are among the most important building blocks in modern systems since they usually account for a large portion of the energy consumption, area and access time. Traditional 6T bit-cell SRAMs are widely used in nominal-supply systems, while 8T bit-cell SRAMs have been proposed and used as an alternative for low-supply applications because of the stability issues of 6T bit-cells at low supply [3,4].

Sense amplifiers are essential parts of the SRAM read circuitry, which employs a small-signal sensing scheme to reduce read access time and power consumption. Two types of sense amplifiers (SAs) are widely used in today's SRAMs: the voltage-mode sense amplifier (VSA) and the current-mode sense amplifier (CSA). Fig. 1.1 shows both circuits. Both are latch-based voltage-type sense amplifiers, in contrast to static amplifiers and to current-type sense amplifiers, which sense a small current signal instead of a voltage signal. The VSA and CSA consume no stand-by power, unlike static amplifiers, and have a simpler structure than current-type sense amplifiers. The recent literature on these SRAMs shows that the VSA is often employed in 6T SRAMs while the CSA is often employed in 8T SRAMs [5–9].

Figure 1.1: (a) Voltage-Mode Sense Amplifier (VSA); (b) Current-Mode Sense Amplifier (CSA).

The read path is usually part of the critical path in an SRAM, as shown in Fig. 1.2. The operation begins when a selected row is activated. Due to the read current from the bit-cell, a differential voltage signal begins to develop between the pre-charged ($V_{DD}$) bitlines (BL/BLB). The differential signal is applied to the sense amplifier at the same time. After sufficient signal has developed (the sensing window), the sense amplifier is enabled by a control signal SAE and the small signal is amplified rail-to-rail. There are three main sources of variation that can cause read failure [10]:

- 1. Read current  $(I_{cell})$  variation from bit-cell which affects sensing signal for sense amplifier;
- 2. Sensing window (T) variation which affects sensing signal for sense amplifier;
- 3. Sense amplifier variation.

Although 1 and 2 are not part of this thesis, they help to conclude that the yield of the SRAM read operation is limited by:

- 1. Sensing signal applied to the sense amplifier at the moment the sense amplifier is enabled;
- 2. Sense amplifier input-referred offset voltage (offset or offset voltage for simplicity).

Figure 1.2: (a) SRAM read circuit used in this thesis; (b) 6T SRAM cell; (c) Equivalent circuit of 6T SRAM cell when WL is high; (d) Differential equivalent during small signal development; (e) Typical timing diagram.

### 1.2 Contribution

The thesis mainly focuses on two types of sense amplifier: VSA and CSA.

Previous works have compared the two sense amplifiers. [11–13] compare them by simulating pre-fixed designs and do not treat sizing and supply voltage as variables, which is not how design takes place. Woo's analysis [14] is more mathematical and considers secondary effects, but the primary effects are not analyzed carefully; moreover, it cannot explain the supply dependency. [15] is a good experimental comparison that shows the different $V_{DD}$ dependency of the CSA and VSA and also uses the concept of "area efficiency" for sense amplifiers. However, it does not provide an analysis that explains why the different $V_{DD}$ dependency occurs.

In this thesis, using the methods of analysis first used in [16], we explain why each type of sense amplifier is associated with a particular SRAM.

The main results are:

- 1. We obtain explicit expressions for the net input-referred offset voltage, both static and dynamic, in each type of sense amplifier, based on the EKV model [17];
- 2. We derive generalized expressions for the best achievable offset voltage of any sense amplifier, which depends only on circuit topology and operation and provides a design strategy;
- 3. We explain how proper timing schemes of control signals affect VSA offset, and to what degree;
- 4. We point out two counteracting effects in SA yield consideration, which are offset and signal attenuation due to SA input capacitance and RC delay.

We conclude by assembling these results into a figure-of-merit which shows that, although the VSA is superior at high $V_{DD}$, the CSA will prevail below 0.6V. Predictions from analysis match simulations very well.

### 1.3 Organization of the Thesis

This thesis is organized as follows: Chapter 2 provides a generalized method for comparing the offset voltage of arbitrary voltage-type sense amplifiers; Chapter 3 analyzes the traditional VSA; Chapter 4 analyzes methods of reducing the VSA offset; Chapter 5 analyzes the traditional CSA; Chapter 6 closes the thesis by comparing the two sense amplifiers. The offset voltage analysis strategy of this thesis is worth mentioning: when the purpose of the analysis is comparison, we focus only on the largest mismatch contribution, whereas when the purpose is accuracy, we consider all mismatch contributions.

Simulations are performed in a 28nm TSMC technology under typical PVT conditions.

## CHAPTER 2

### Offset Voltage Comparison for Sense Amplifiers

In principle, we can always calculate the offset of any given sense amplifier circuit using a proper model. However, the result is less intuitive when comparing different types of sense amplifier. The expression for the offset usually does not show explicitly how well one sense amplifier compares to another under a specific constraint and, most importantly, it does not contain the sense amplifier area, which is a crucial constraint in SRAM design.

Pelgrom gave a good explanation of why area is important [18]. According to his theory, the mismatch of any physical property (e.g. threshold voltage) between an adjacent FET pair follows a distribution whose variance is inversely proportional to the active area of the FET pair. This gives a clear tradeoff between area and offset.

Here are the questions we want to answer:

- 1. If the total area is constrained, how do we design the sense amplifier to achieve the best offset? Or in other words, if the target offset is given, how do we design a sense amplifier with minimum area?
- 2. What is the intrinsic constraint for the best offset of a given sense amplifier?

In the following, we derive an inequality showing that, once the circuit topology and thus the offset formula are given, the best achievable offset voltage under a total area constraint can be found. This result is used later to compare different sense amplifier topologies.

The inequality is completely general and can be applied not only to SRAM sense amplifiers, but also to other types of amplifiers (either dynamic or static).

### 2.1 A Generalized Mathematical Proof for Optimal Offset Design

From the EKV model of an NFET in saturation, with its source and body connected to ground and its body effect neglected:

$$I_D = \frac{1}{2}\beta (V_G - V_{t0})^2 \tag{2.1}$$

we know that the FET drain current variation can be lumped into a $\beta$ variation and a $V_{t0}$ variation. These variations induce an offset voltage in a sense amplifier.

For both the CSA and the VSA, the threshold mismatch is the main contributor to the offset voltage (as will be shown in the following chapters). Here we therefore assume that the input-referred offset voltage, whether derived from circuit analysis or from simulations, is determined only by threshold mismatch. This assumption is essential to obtaining simple, clear and intuitive results.

$$V_{os} = \sum_{i=1}^{N} a_i \Delta V_{ti} \tag{2.2}$$

where N is the number of FET pairs that contribute to the offset voltage, $a_i$ denotes the contribution of each FET pair to the input-referred offset, and $\Delta V_{ti}$ is the threshold voltage mismatch of each FET pair. The threshold mismatch is a Gaussian random variable with zero mean and a standard deviation given by Pelgrom's theory [18]:

$$\sigma_{\Delta V_{ti}} = \frac{A_{vt}}{\sqrt{S_i}} \tag{2.3}$$

where $S_i$ is the matching area of the FET pair. Here we assume that the matching coefficient $A_{vt}$ is the same for both PFET and NFET, and that the threshold mismatch dominates the offset. The area constraint is given by:

$$\sum_{i=1}^{N} S_i = S_{tot} \tag{2.4}$$

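where $S_{tot}$ is the total active area. Here we assume that the active area is a good representative of the total area.

Assuming the mismatches of different FET pairs are statistically independent, their variances add, and we have:

$$\sigma_{V_{os}}^2 = A_{vt}^2 \sum_{i=1}^N \frac{a_i^2}{S_i} \tag{2.5}$$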
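As a minimal numerical sketch of Equ. 2.3 and Equ. 2.5, the snippet below evaluates the threshold mismatch of one FET pair and the resulting RMS offset. The function names, the two coefficients $a_i$ and the equal area split are illustrative assumptions, not values from this thesis; the matching coefficient (2.5 mV·µm) and the 560n/60n geometry are borrowed from the captions in the List of Figures only to give realistic orders of magnitude.

```python
import math

def sigma_delta_vt(a_vt_mv_um, area_um2):
    """Pelgrom relation (Equ. 2.3): sigma of the threshold mismatch in mV."""
    return a_vt_mv_um / math.sqrt(area_um2)

def sigma_vos(a_vt_mv_um, coeffs, areas_um2):
    """RMS input-referred offset in mV (square root of Equ. 2.5)."""
    variance = sum((a * sigma_delta_vt(a_vt_mv_um, s)) ** 2
                   for a, s in zip(coeffs, areas_um2))
    return math.sqrt(variance)

A_VT = 2.5                        # mV*um, assumed matching coefficient
S_UNIT = 0.56 * 0.06              # um^2, area of a 560n/60n device
coeffs = [0.75, 0.25]             # hypothetical contribution coefficients a_i
areas = [2 * S_UNIT, 2 * S_UNIT]  # hypothetical equal split of the area

print(f"sigma(dVt) of one unit pair: {sigma_delta_vt(A_VT, S_UNIT):.2f} mV")
print(f"sigma(Vos) for the example:  {sigma_vos(A_VT, coeffs, areas):.2f} mV")
```

The equal split used here is generally not the best choice, which is the question the following derivation addresses.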

Differentiate Equ. 2.5 and we get:

$$d\sigma_{V_{os}}^2 = -A_{vt}^2 \sum_{i=1}^N \frac{a_i^2}{S_i^2} dS_i \equiv \sum_{i=1}^N b_i dS_i \tag{2.6}$$

The constraint is given by differentiating Equ. 2.4:

$$\sum_{i=1}^{N} dS_i = 0 \tag{2.7}$$

From Equ. 2.6 and Equ. 2.7, one extremum condition for $\sigma_{V_{os}}^2$ is readily found. If:

$$b_i = B, \quad \text{for} \quad i \in \{1, 2, ..., N\} \tag{2.8}$$

where B is a constant. We then have:

$$d\sigma_{V_{os}}^2 = 0 \tag{2.9}$$

Now, the rest is to prove there is only one local extremum (uniqueness), which satisfies Equ. 2.9, and it is the minimum.

To prove uniqueness, we use proof by contradiction. Now assume there exists:

$$b_j \neq b_k, \quad \text{for} \quad j,k \in \{1,2,...,N\} \tag{2.10}$$

then if we choose:

$$\sum_{i=1}^{N} dS_i = dS_j + dS_k = 0 \tag{2.11}$$

we will have:

$$d\sigma_{V_{os}}^{2} = \sum_{i=1}^{N} b_{i} dS_{i} = b_{j} dS_{j} + b_{k} dS_{k} \neq 0 \tag{2.12}$$

which means it is not an extremum. This gives the uniqueness of the extremum given by Equ. 2.8.

To prove that the extremum is a minimum, it suffices to show that some set of $\{S_i\}$ gives a larger $\sigma_{V_{os}}^2$. This is easily seen by looking at the extremes:

$$\lim_{S_j \to 0} \sigma_{V_{os}}^2 = A_{vt}^2 \sum_{i=1}^N \frac{a_i^2}{S_i} = +\infty, \quad \text{for} \quad a_j \neq 0, \quad j \in \{1, 2, ..., N\} \tag{2.13}$$

The solution given by Equ. 2.8 indicates:

$$S_i = \alpha a_i \quad \text{for} \quad i \in \{1, 2, ..., N\} \tag{2.14}$$

where $\alpha$ is a constant. This relation gives the area arrangement that achieves the minimum offset voltage for a given total area: the area of each FET pair is proportional to its contribution coefficient $a_i$. It should be mentioned that Equ. 2.14 implies that a pair with a larger $a_i$ makes a larger offset contribution:

$$\sigma_{V_{os,i}} = a_i \frac{A_{vt}}{\sqrt{S_i}} \propto \sqrt{a_i} \tag{2.15}$$

which is not equivalent to the concept that all offset contributions should be equal, which would instead require $S_i \propto a_i^2$.

The minimum RMS offset voltage is thus given by:

$$\sigma_{V_{os}} \ge \left(\sum_{i=1}^{N} a_i\right) \frac{A_{vt}}{\sqrt{S_{tot}}} \tag{2.16}$$
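The bound in Equ. 2.16 can also be cross-checked in one step with the Cauchy-Schwarz inequality. The sketch below assumes all $a_i \ge 0$ (otherwise $a_i$ is replaced by $|a_i|$) and is offered as an independent check, not as the derivation used above:

$$\left(\sum_{i=1}^{N} a_i\right)^2 = \left(\sum_{i=1}^{N} \frac{a_i}{\sqrt{S_i}} \sqrt{S_i}\right)^2 \le \left(\sum_{i=1}^{N} \frac{a_i^2}{S_i}\right) \left(\sum_{i=1}^{N} S_i\right) = \frac{\sigma_{V_{os}}^2}{A_{vt}^2} S_{tot}$$

Equality holds only when $a_i/\sqrt{S_i} \propto \sqrt{S_i}$, that is $S_i \propto a_i$, which agrees with Equ. 2.14.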

Equ. 2.16 is very useful for comparing the RMS offset voltage of different types of sense amplifier and essentially shows the area efficiency of a sense amplifier. When the total area and the technology are given, the only limit comes from the coefficient term $A \equiv \sum_{i=1}^{N} a_i$: the larger the A, the worse the choice of sense amplifier. It shows an intrinsic tradeoff between the sense amplifier design area $S_{tot}$ and the minimum achievable offset, which is determined by circuit topology and operation.
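The bound can also be checked numerically. The sketch below, in normalized units and with hypothetical coefficients $a_i$ (both are assumptions for illustration), compares the sizing $S_i \propto a_i$ of Equ. 2.14 with the equal-contribution sizing $S_i \propto a_i^2$ and with randomly drawn area splits.

```python
import math
import random

def sigma_vos(a_vt, coeffs, areas):
    """RMS offset from Equ. 2.5 for a given area allocation."""
    return a_vt * math.sqrt(sum(a * a / s for a, s in zip(coeffs, areas)))

def allocate(weights, s_tot):
    """Split s_tot among the FET pairs in proportion to the given weights."""
    total = sum(weights)
    return [s_tot * w / total for w in weights]

A_VT, S_TOT = 1.0, 1.0            # normalized units (assumption)
coeffs = [0.75, 0.25]             # hypothetical coefficients a_i

bound = sum(coeffs) * A_VT / math.sqrt(S_TOT)                 # Equ. 2.16
optimal = sigma_vos(A_VT, coeffs, allocate(coeffs, S_TOT))    # S_i ~ a_i
equal_contrib = sigma_vos(
    A_VT, coeffs, allocate([a * a for a in coeffs], S_TOT))   # S_i ~ a_i^2

rng = random.Random(0)            # fixed seed so the run is repeatable
worst_margin = min(
    sigma_vos(A_VT, coeffs,
              allocate([rng.uniform(0.01, 1.0) for _ in coeffs], S_TOT)) - bound
    for _ in range(10_000)
)

print(f"bound (Equ. 2.16): {bound:.4f}")
print(f"S_i ~ a_i        : {optimal:.4f}")
print(f"S_i ~ a_i^2      : {equal_contrib:.4f}")
print(f"min(random - bound) over 10000 trials: {worst_margin:.2e}")
```

For these example coefficients, the equal-contribution sizing gives an RMS offset roughly 12% above the bound, while no random split falls below it.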

This provides a useful design strategy. Even if the exact expression for the offset is hard to obtain, designers can still size the circuit for optimum offset with Monte-Carlo simulations using the following rule, derived from Equ. 2.14: the square root of the matching-area ratio equals the offset contribution ratio:

$$\sqrt{\frac{S_i}{S_j}} = \frac{a_i \frac{A_{vt}}{\sqrt{S_i}}}{a_j \frac{A_{vt}}{\sqrt{S_j}}} = \frac{\sigma_{V_{os,i}}}{\sigma_{V_{os,j}}} \tag{2.17}$$

which does not require knowledge of the coefficients $a_i$. A design that satisfies the above equation is the optimum design for offset.

The optimization procedure works in practice as follows.

- 1. When running the Monte-Carlo simulation, enable mismatch for only one FET pair of interest. The simulation then gives the standard deviation of the input-referred offset induced only by the mismatches of this FET pair, which corresponds to $\sigma_{V_{os,i}}$ for that pair in the above equation.
- 2. Repeat with each of the other FET pairs until all $\sigma_{V_{os,i}}$ have been obtained from simulations.
- 3. Adjust the areas $S_i$ based on the resulting $\sigma_{V_{os,i}}$ and the target relation Equ. 2.17.
- 4. Iterating the above steps eventually gives the best offset for a given area.
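The steps above can be sketched as a short loop. Everything named here is hypothetical: `simulate_sigma_single_pair` is only a stand-in for a real single-pair Monte-Carlo run, `HIDDEN_COEFFS` plays the role of the unknown $a_i$, and the update rule (resize in proportion to $\sigma_{V_{os,i}}\sqrt{S_i}$) is one possible way to reach Equ. 2.17, not necessarily the one used for the results in this thesis.

```python
import math

A_VT = 1.0                         # normalized matching coefficient (assumption)
HIDDEN_COEFFS = [0.6, 0.3, 0.1]    # stand-in for the unknown a_i of a real circuit

def simulate_sigma_single_pair(index, areas):
    """Hypothetical stand-in for a Monte-Carlo run in which only FET pair
    `index` is mismatched; a real flow would call the circuit simulator."""
    return HIDDEN_COEFFS[index] * A_VT / math.sqrt(areas[index])

def optimize_areas(s_tot, n_pairs, tol=1e-6, max_iter=20):
    areas = [s_tot / n_pairs] * n_pairs            # start from an equal split
    for _ in range(max_iter):
        # Steps 1-2: per-pair offset contributions sigma_{Vos,i}.
        sigmas = [simulate_sigma_single_pair(i, areas) for i in range(n_pairs)]
        # Step 3: sigma_i * sqrt(S_i) is proportional to a_i (Equ. 2.15), so
        # resizing in proportion to it targets S_i ~ a_i, i.e. Equ. 2.17.
        weights = [sig * math.sqrt(s) for sig, s in zip(sigmas, areas)]
        new_areas = [s_tot * w / sum(weights) for w in weights]
        # Step 4: repeat until the sizing stops changing.
        if max(abs(n - o) for n, o in zip(new_areas, areas)) < tol * s_tot:
            return new_areas
        areas = new_areas
    return areas

areas = optimize_areas(s_tot=1.0, n_pairs=3)
sigmas = [simulate_sigma_single_pair(i, areas) for i in range(3)]
total = math.sqrt(sum(s * s for s in sigmas))
print("areas:", [round(s, 4) for s in areas])
print("total sigma:", round(total, 4))
```

With this idealized stand-in the loop settles after one resize because the $a_i$ do not depend on the areas; in a real circuit that assumption holds only approximately, which is why the procedure is iterated.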

In many cases, it may not be economical to design for the optimum offset since doing so sacrifices too much of the circuit's other performance. Because only second-order terms remain near the minimum of a function (e.g. $\sigma_{V_{os}}$), setting the offset specification away from the optimum by a reasonably small amount (e.g. 10%) makes the design much easier. The argument is similar to inverter chain delay optimization and active power optimization in digital circuits. Using the inverter RC delay model, the minimum delay and the corresponding inverter sizing and number of stages can be derived mathematically for a given output-to-input capacitance ratio. Then, by relaxing the delay by a small amount, a relatively large active power reduction can be achieved.

Note that the derivation above also assumes that the coefficient $a_i$ is independent of the area $S_j$ for all $i, j \in \{1, 2, ..., N\}$.

Figure 2.1: (a) Cross-coupled inverter; (b) Differential-mode equivalent circuit; (c) Common-mode equivalent circuit.

### 2.2 Cross-Coupled Inverter

The cross-coupled inverter is a well-known, easily understood circuit, and it can also serve as a sense amplifier when proper switches are applied [13]. We will look at the operation and the optimum offset of this circuit as an example of the above analysis.

The cross-coupled inverter is shown in Fig. 2.1. The circuit can be divided into common-mode and differential-mode equivalent circuits as shown in Fig. 2.1. Assuming both FET pairs are in saturation, the KCL equations for the two equivalent circuits are given by:

$$C'_{L}\frac{dV_{CM}(t)}{dt} = \frac{1}{2}\beta_{P}\left(V_{DD} - V_{CM}(t) - |V_{tP}|\right)^{2} - \frac{1}{2}\beta_{N}\left(V_{CM}(t) - V_{tN}\right)^{2} \tag{2.18}$$

$$C_{L}\frac{dV_{DM}(t)}{dt} = \left(\beta_{P}\left(V_{DD} - V_{CM}(t) - |V_{tP}|\right) + \beta_{N}\left(V_{CM}(t) - V_{tN}\right)\right)V_{DM}(t) \tag{2.19}$$

By assuming  $\beta_P = \beta_N = \beta$ , Equ. 2.18 and Equ. 2.19 can be re-written as:

$$C'_{L}\frac{dV'_{CM}(t)}{dt} = -\beta V'_{DD}V'_{CM} \tag{2.20}$$

$$C_L \frac{dV_{DM}(t)}{dt} = \beta V'_{DD} V_{DM} \tag{2.21}$$

where $V'_{CM} = V_{CM} - (V_{DD} - |V_{tP}| + V_{tN})/2$ and $V'_{DD} = V_{DD} - |V_{tP}| - V_{tN}$. From Equ. 2.20 and Equ. 2.21, we see that the common mode has one negative pole $s_{-}$ and the differential mode has one regeneration pole $s_{+}$. If $C_L = C'_L$, we have:

$$|s_{-}| = s_{+} = \frac{\beta V'_{DD}}{C_L} \equiv \frac{G_m}{C_L} \tag{2.22}$$

which describes the operation of the cross-coupled inverter: the common-mode voltage settles while the differential voltage regenerates, both at the same rate.
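As a quick consistency check, the sketch below confirms numerically that the square-law common-mode current of Equ. 2.18 reduces to the linear form of Equ. 2.20 when $\beta_P = \beta_N$, and evaluates the pole of Equ. 2.22. All device values are illustrative assumptions rather than parameters from this thesis, and the test voltages are chosen so that both FET pairs remain in saturation.

```python
import math

# Illustrative values only (assumptions, not taken from the thesis).
BETA = 1e-3      # A/V^2, with beta_P = beta_N = beta
VDD = 0.9        # V
VTN = 0.3        # V
VTP_ABS = 0.3    # V, |V_tP|
C_L = 5e-15      # F, with C_L = C'_L

V_DD_PRIME = VDD - VTP_ABS - VTN            # V'_DD
V_SHIFT = (VDD - VTP_ABS + VTN) / 2         # V'_CM = V_CM - V_SHIFT

def cm_current_full(v_cm):
    """Right-hand side of Equ. 2.18 (square-law common-mode current)."""
    return (0.5 * BETA * (VDD - v_cm - VTP_ABS) ** 2
            - 0.5 * BETA * (v_cm - VTN) ** 2)

def cm_current_linear(v_cm):
    """Right-hand side of Equ. 2.20, written in terms of V'_CM."""
    return -BETA * V_DD_PRIME * (v_cm - V_SHIFT)

pole = BETA * V_DD_PRIME / C_L              # |s_-| = s_+ from Equ. 2.22

def v_dm(t, v_dm0):
    """Solution of Equ. 2.21: exponential regeneration."""
    return v_dm0 * math.exp(pole * t)

def v_cm_prime(t, v_cm_prime0):
    """Solution of Equ. 2.20: exponential settling toward V'_CM = 0."""
    return v_cm_prime0 * math.exp(-pole * t)

for v in (0.35, 0.45, 0.55):
    print(v, cm_current_full(v), cm_current_linear(v))
print("time constant 1/s_+ =", 1 / pole, "s")
```

The two current columns agree because the difference of the two squares factors exactly into $-\beta V'_{DD} V'_{CM}$; no small-signal approximation is involved in going from Equ. 2.18 to Equ. 2.20.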

Now we consider the offset voltage from FET threshold mismatches. From [16], the cross-coupled inverter offset can be written in terms of the $V_{TH}$ mismatch and the $G_m$ mismatch, where the inverter trip point $V_{TH}$ is defined by $dV_{CM}(t)/dt = 0$:

$$V_{TH} = (V_{DD} - |V_{tP}| + V_{tN})/2 \tag{2.23}$$

and the offset voltage can be expressed as:

$$V_{os} = \Delta V_{TH} - \frac{\Delta G_m}{2G_m} (V'_{CM}(0) - V_{TH}) \tag{2.24}$$

From the definitions of $V_{TH}$ and $G_m$, we have:

$$\Delta V_{TH} = \frac{\partial V_{TH}}{\partial |V_{tP}|} \Delta |V_{tP}| + \frac{\partial V_{TH}}{\partial V_{tN}} \Delta V_{tN} = \frac{\Delta V_{tN} - \Delta |V_{tP}|}{2} \tag{2.25}$$

$$\Delta G_m = \frac{\partial G_m}{\partial |V_{tP}|} \Delta |V_{tP}| + \frac{\partial G_m}{\partial V_{tN}} \Delta V_{tN} = -\beta (\Delta V_{tN} + \Delta |V_{tP}|) \tag{2.26}$$

Thus the offset can be expressed in terms of the FET threshold mismatches:

$$V_{os} = \frac{1}{2} \left(1 + \frac{V'_{CM}(0) - V_{TH}}{V'_{DD}}\right) \Delta V_{tN} + \frac{1}{2} \left(1 - \frac{V'_{CM}(0) - V_{TH}}{V'_{DD}}\right) \Delta V_{tP} \tag{2.27}$$

which can be re-written as:

$$V_{os} = a_N \Delta V_{tN} + a_P \Delta V_{tP} \tag{2.28}$$

where $a_N + a_P = 1$. Although all FETs are assumed to be in saturation in the derivation, it can easily be shown that $a_N + a_P = 1$ also holds when one FET pair is initially off. By applying Equ. 2.16, the minimum RMS offset is given by:

$$\sigma_{V_{os}} \ge \frac{A_{vt}}{\sqrt{S_N + S_P}} \tag{2.29}$$

which is independent of the initial condition  $V'_{CM}(0)$ . Although  $a_N$  and  $a_P$  are functions of  $V'_{CM}(0)$ , which is usually set by the application, the minimum RMS offset does not depend on it. Thus, for the cross-coupled inverter, the minimum RMS offset is set intrinsically by the circuit topology and the total area, independent of the operating conditions. To achieve the minimum offset, the area ratio should satisfy Equ. 2.14:

$$\frac{S_N}{S_P} = \frac{a_N}{a_P} \tag{2.30}$$
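The area rule in Equ. 2.30 can be checked numerically. The following is a minimal sketch, assuming the Pelgrom relation $\sigma_{\Delta V_t} = A_{vt}/\sqrt{S}$ with the same coefficient for NFETs and PFETs and independent mismatches; the values of `A_VT` and `S_TOTAL` and the helper names are illustrative and not taken from the thesis.

```python
import math

A_VT = 2.0      # mV*um, illustrative Pelgrom coefficient (assumed equal for N and P)
S_TOTAL = 1.0   # um^2, illustrative total gate area S_N + S_P


def sigma_vos(a_n, s_n, s_total=S_TOTAL, a_vt=A_VT):
    """RMS offset of V_os = a_N*dVtN + a_P*dVtP (Equ. 2.28) with a_N + a_P = 1."""
    a_p, s_p = 1.0 - a_n, s_total - s_n
    return a_vt * math.sqrt(a_n**2 / s_n + a_p**2 / s_p)


def best_split(a_n, steps=9999):
    """Brute-force search over the NFET area S_N for the lowest RMS offset."""
    candidates = (S_TOTAL * i / (steps + 1) for i in range(1, steps + 1))
    return min(candidates, key=lambda s_n: sigma_vos(a_n, s_n))


for a_n in (0.3, 0.5, 0.7):
    s_n = best_split(a_n)
    print(f"a_N={a_n:.1f}: S_N/S_P={s_n / (S_TOTAL - s_n):.3f}, "
          f"a_N/a_P={a_n / (1 - a_n):.3f}, sigma={sigma_vos(a_n, s_n):.4f} mV")
```

For every weight $a_N$ the search should land on $S_N/S_P \approx a_N/a_P$ and on the same minimum $A_{vt}/\sqrt{S_N + S_P}$, which is the content of Equ. 2.29 and Equ. 2.30.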

Fig. 2.2 shows Monte-Carlo simulation results for the minimum offset, obtained by adjusting the area ratio of the NFETs and PFETs while keeping the total area fixed, for different initial common-mode voltages  $V_{CM}(0)$ , which represent different possible applications of the cross-coupled inverter. To first order, the simulation agrees with the theory, which predicts an equal minimum offset for all  $V_{CM}(0)$ . The area ratio at minimum offset is also very close to the prediction of Equ. 2.30. The second-order discrepancy, a small variation in the minimum offset, arises from two assumptions made in the derivation: in reality  $A_{V_{tP}} > A_{V_{tN}}$ , which results in higher offset for lower  $V_{CM}(0)$ , and  $\beta$  mismatches also contribute to the offset, possibly with a different dependence on  $V_{CM}(0)$ .



Figure 2.2: Monte-Carlo simulation of cross-coupled inverter minimum offset voltage on different initial common-mode voltage  $V_{CM}(0)$ .

## CHAPTER 3

### Voltage-Mode Sense Amplifier Analysis

The VSA consists of a cross-coupled inverter, an NMOS foot switch transistor (ST) and a pair of PMOS access transistors (ATs), as shown in Fig. 3.1. It is closely related to a very simple circuit, the cross-coupled inverter.

### 3.1 Phase Plane and Separatrix

The analysis in this thesis uses some concepts from dynamical systems that may be unfamiliar to circuit designers; this section briefly introduces and explains them.

Although Laplace-domain analysis is widely used for Linear Time-Invariant (LTI) circuits, it is intrinsically unsuitable for nonlinear dynamical systems. In our case a more fundamental method, time-domain analysis, is needed to analyze regenerative circuits.

A memory-less system (one whose evolution depends only on its current state, which is different from a memory-less circuit) described by a vector of state variables (e.g. V, I)  $\mathbf{x}(t)$  of dimension N can be expressed as:

$$d\mathbf{x}(t)/dt = \mathbf{F}(\mathbf{x}(t)) \tag{3.1}$$

where **F** in general is a nonlinear function of  $\mathbf{x}(t)$ . The points  $\mathbf{x}_{\mathbf{n}}$  that satisfy:

$$\mathbf{F}(\mathbf{x}_{\mathbf{n}}) = \mathbf{0} \tag{3.2}$$

are called fixed points (e.g. DC operating points, bi-stable states, meta-stable states).

In the vicinity of a fixed point  $\mathbf{x_n}$ , the system can be linearized:

$$d\mathbf{x}(t)/dt = \mathbf{A}\mathbf{x}(t) \tag{3.3}$$

where **A** is an  $N \times N$  matrix and  $\mathbf{x}(t)$  is measured relative to  $\mathbf{x_n}$ . In circuits, the eigenvalues of **A** are called "poles". If all eigenvalues of **A** are negative (or, for complex eigenvalues, have negative real parts), the fixed point is stable, which means that states in the vicinity of  $\mathbf{x_n}$  satisfy  $\lim_{t\to\infty} \mathbf{x}(t) = \mathbf{x_n}$ . If **A** has both negative and positive eigenvalues, the fixed point is a saddle point, which is called a "meta-stable state" in circuits.

For example, the cross-coupled inverter, as a two-dimensional system, has a meta-stable state at the origin if  $V'_{CM}$  and  $V_{DM}$  are chosen as state variables; its linearized system equations are Equ. 2.20 and Equ. 2.21. With this choice the matrix **A** is diagonal, which is why we use the common-mode and differential-mode voltages rather than other state variables.
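The classification of a fixed point by its poles can be illustrated for a two-dimensional system. This is a minimal sketch in plain Python; the function name and the numeric value of `S` are hypothetical, and the diagonal example simply mirrors the structure of Equ. 2.20 and Equ. 2.21.

```python
import math


def classify_fixed_point(a11, a12, a21, a22):
    """Classify the fixed point of dx/dt = A x for a real 2x2 matrix A."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace * trace - 4.0 * det
    if disc < 0:
        # Complex-conjugate poles: stability is set by the real part trace/2.
        if trace < 0:
            return "stable"
        return "unstable" if trace > 0 else "marginal"
    root = math.sqrt(disc)
    low, high = (trace - root) / 2.0, (trace + root) / 2.0
    if low < 0 < high:
        return "saddle (meta-stable)"
    if high < 0:
        return "stable"
    if low > 0:
        return "unstable"
    return "marginal"  # at least one pole sits exactly at zero


S = 1e11  # hypothetical G_m / C_L in rad/s
# State vector (V'_CM, V_DM): diagonal A with poles -S and +S.
print(classify_fixed_point(-S, 0.0, 0.0, +S))
```

The diagonal matrix with one negative and one positive entry is reported as a saddle, which is the meta-stable state of the cross-coupled inverter.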

The phase plane (phase space in more than two dimensions) is the space whose coordinates are the state variables of the system. A trajectory is the time evolution of a state. For a point on the trajectory, the vector  $d\mathbf{x}/dt$  is always tangent to the trajectory, since it denotes the direction of time evolution.

For a meta-stable state, the stable manifold is defined as the set of states that evolve into the meta-stable state as  $t \to \infty$ , and the dimension of the stable manifold is equal to the number of negative poles of the linearized circuit. A one-dimensional stable manifold is the "separatrix" used in this thesis.

Fig. 3.3 is an example of a phase plane, trajectories and a separatrix, where the arrows denote the direction of time evolution. As we see, different initial states evolve to different final states as  $t \to \infty$ . What we are interested in is the line that separates the regions that evolve into different final states. In other words, we are interested in the line that evolves into the meta-stable state (no regeneration), which is the separatrix.



Figure 3.1: Voltage-Mode Sense Amplifier (VSA). Transistor sizing:  $M_P = 560n/60n, M_{ST} = M_{AT} = 2M_P, M_N = 4M_P.$ 

### 3.2 VSA Operations

The circuit is analyzed through its common-mode and differential-mode half-circuits.

The initial input common-mode voltage  $V_{CM}(t=0)$  of the VSA is near  $V_{DD}$ , while AT is on and ST is off. A differential-mode small signal  $V_{DM}(0)$  is applied at the input. The sensing begins when AT turns off with ST turning on simultaneously.

Since  $V_{CM}(0) = V_{DD}$ , the PFETs are initially off. Sensing starts with a pre-amplification stage provided by the cross-coupled NMOS pair. At the end of the pre-amplification stage,  $V_{CM} = V_{DD} - |V_{tP}|$  and the PFETs turn on. The VSA then enters the full regeneration stage provided by the cross-coupled inverter.

#### 3.2.1 Pre-Amplification Gain

The pre-amplification phase equivalent circuit is shown in Fig. 3.2. Assuming the square-law model, the KCL equations for the common-mode and differential-mode equivalent circuits are:

$$C'_{L}\frac{dV_{CM}(t)}{dt} = -\frac{1}{2}\beta_{N}\left(V_{CM}(t) - V_{tN}\right)^{2}$$
(3.4)



Figure 3.2: (a) Equivalent circuit of VSA pre-amplification phase; (b) Common-mode, (c) Differential-mode equivalent circuit.

$$C_L \frac{dV_{DM}(t)}{dt} = \beta_N \Big( V_{CM}(t) - V_{tN} \Big) V_{DM}(t)$$
(3.5)

where  $V_{tN}$  is the threshold voltage for NFET. For simplicity, let  $C'_L = C_L$ . Then the pre-amplification gain, which is defined as the ratio of the differential voltage when PFETs turn on  $V_{DM}(t')$  and initial value  $V_{DM}(0)$ , is derived from Equ. 3.5 and Equ. 3.4:

$$Gain \equiv \frac{V_{DM}(t')}{V_{DM}(0)} = \left(\frac{V_{CM}(0) - V_{tN}}{V_{CM}(t') - V_{tN}}\right)^2$$
(3.6)

where  $V_{CM}(t') = V_{DD} - |V_{tP}|$  when the PFETs turn on. In the 28nm TSMC technology, the gain is approximately 2 to 3.
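The closed-form gain of Equ. 3.6 can be cross-checked by integrating Equ. 3.4 and Equ. 3.5 directly. The sketch below uses forward Euler with $C'_L = C_L$; the supply, thresholds, $\beta_N$ and $C_L$ are hypothetical round numbers chosen only for illustration, not values from the 28nm process.

```python
V_DD = 1.0       # V, hypothetical supply
V_TN = 0.30      # V, hypothetical NFET threshold
V_TP_ABS = 0.25  # V, hypothetical |V_tP|


def preamp_gain_closed_form(v_dd, v_tn, v_tp_abs):
    """Equ. 3.6 with V_CM(0) = V_DD and V_CM(t') = V_DD - |V_tP|."""
    return ((v_dd - v_tn) / (v_dd - v_tp_abs - v_tn)) ** 2


def preamp_gain_numeric(v_dd, v_tn, v_tp_abs,
                        beta_n=1e-3, c_load=10e-15, dt=1e-15):
    """Integrate Equ. 3.4 and 3.5 (C'_L = C_L) until the PFETs turn on."""
    v_cm, v_dm0 = v_dd, 1e-3      # V_CM(0) = V_DD, small initial V_DM(0)
    v_dm = v_dm0
    while v_cm > v_dd - v_tp_abs:
        ov = v_cm - v_tn
        d_cm = -0.5 * beta_n * ov * ov / c_load   # Equ. 3.4
        d_dm = beta_n * ov * v_dm / c_load        # Equ. 3.5
        v_cm += d_cm * dt
        v_dm += d_dm * dt
    return v_dm / v_dm0


gain_closed = preamp_gain_closed_form(V_DD, V_TN, V_TP_ABS)
gain_num = preamp_gain_numeric(V_DD, V_TN, V_TP_ABS)
print(f"closed form: {gain_closed:.3f}, numerical: {gain_num:.3f}")
```

The result does not depend on `beta_n` or `c_load`, because time drops out when Equ. 3.5 is divided by Equ. 3.4; only the overdrive voltages at the start and end of pre-amplification matter.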

#### 3.2.2 Sensing Speed

The sensing speed is defined by how fast the differential signal grows. Since regeneration begins the moment the VSA is enabled, with no extra delay, the speed depends only on the regeneration poles of the pre-amplification stage and the full regeneration stage.

The regeneration pole in the pre-amplification stage is defined by  $V_{DM}(t) = V_{DM}(0)e^{+s_{pre}t}$ . Here  $s_{pre} \equiv \frac{\int_0^{t'} g_{mN}(t)dt}{t'} \times \frac{1}{C_L} \approx \beta_N (V'_{DD} + V_{tN}/2)/C_L$  is a time-averaged value, with the integral taken over the entire pre-amplification stage. The full regeneration stage is modeled as a cross-coupled inverter pair, as in [16]. Its regeneration pole is  $s_r = (g_{mN} + g_{mP})/C_L \approx \beta V'_{DD}/C_L$  and can be regarded as constant. These two poles are approximately equal.

### 3.3 Offset Voltage

Because of the gain of the pre-amplification stage (~ 2–3), mismatch in the NMOS pair dominates the offset, and the contribution of the PMOS pair can safely be ignored. The NMOS mismatch consists of two parts: threshold mismatch and  $\beta$  mismatch.

To analyze offset, add the mismatch current term  $I_{mis}$  in Equ. 3.5 (while Equ. 3.4 remains unchanged):

$$C_L \frac{dV_{DM}(t)}{dt} = g_{mN}(t)V_{DM}(t) + I_{mis}$$
(3.7)

#### 3.3.1 Threshold Mismatch

For threshold mismatch  $\Delta V_{tN}$ :

$$I_{mis}(t) = g_{mN}(t)\Delta V_{tN} \tag{3.8}$$

Substituting  $V_{DM} = V'_{DM} - \Delta V_{tN}$  into Equ. 3.7 yields an equation for  $V'_{DM}$  with the same mathematical form as Equ. 3.5, which gives:

$$V_{os,t} = \Delta V_{tN} \tag{3.9}$$

Since this offset does not depend on the initial condition  $V_{CM}(0)$ , it is also called the static offset.

#### **3.3.2** $\beta$ Mismatch

For  $\beta_N$  mismatch  $\Delta\beta_N$ :

$$I_{mis}(t) = \frac{1}{2}\beta_N \left( V_{CM}(t) - V_{tN} \right)^2 \times \frac{\Delta\beta_N}{\beta_N}$$
(3.10)

Now with  $\beta$  mismatch, the separatrix is tilted. Assume it is still a straight line (this is verified below):

$$\frac{V_{DM}}{V_{CM} - V_{tN}} = \frac{dV_{DM}}{dV_{CM}} \equiv k_{\beta} \tag{3.11}$$



Figure 3.3: Phase plane plot ( $V_{CM}$  vs.  $V_{DM}$  taking time t as a parameter) for 10%  $\beta$  mismatch.

where  $k_{\beta}$  is the slope of the separatrix in the  $V_{DM}$  -  $V_{CM}$  plot (phase plane). Solving Equ. 3.4 and Equ. 3.7 under the above metastability condition, we get:

$$k_{\beta} = \frac{1}{3} \frac{\Delta \beta_N}{\beta_N} \tag{3.12}$$

which is a constant, confirming the assumption of a straight separatrix. This gives:

$$V_{os,\beta} = \frac{1}{3} \frac{\Delta \beta_N}{\beta_N} (V_{DD} - V_{tN}). \tag{3.13}$$

This expression is consistent with the simulation result in Fig. 3.3, where the deviation at low  $V_{CM}$  comes from sub-threshold conduction. The factor 1/3 differs from the factor 1/2 in previous work [19], which uses the metastability condition  $\frac{dV_{DM}}{dt}|_{t=0} = 0$ ; that condition does not hold here, because  $V_{CM}$  keeps changing during pre-amplification.
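The factor 1/3 can be checked in a few lines. The sketch below assumes  $C'_L = C_L$ , uses the shorthand  $u \equiv V_{CM} - V_{tN}$  (introduced here only for brevity), takes  $g_{mN}(t) = \beta_N u$  as in Equ. 3.5, and keeps the signs exactly as written in Equ. 3.4, 3.7 and 3.10. Dividing Equ. 3.7 by Equ. 3.4 eliminates time:

$$\frac{dV_{DM}}{dV_{CM}} = \frac{\beta_N u V_{DM} + \frac{1}{2}\beta_N u^2 \frac{\Delta\beta_N}{\beta_N}}{-\frac{1}{2}\beta_N u^2} = -\left(\frac{2V_{DM}}{u} + \frac{\Delta\beta_N}{\beta_N}\right)$$

On a straight separatrix, Equ. 3.11 gives  $V_{DM} = k_{\beta} u$  and  $dV_{DM}/dV_{CM} = k_{\beta}$ , so:

$$k_{\beta} = -\left(2k_{\beta} + \frac{\Delta\beta_N}{\beta_N}\right) \quad\Rightarrow\quad |k_{\beta}| = \frac{1}{3}\left|\frac{\Delta\beta_N}{\beta_N}\right|$$

The magnitude agrees with Equ. 3.12; the sign obtained this way only reflects the polarity convention chosen for  $\Delta\beta_N$ . If  $V_{CM}$  were instead held fixed and  $dV_{DM}/dt = 0$  imposed on Equ. 3.7, the result would be  $|V_{DM}| = \frac{u}{2}\left|\frac{\Delta\beta_N}{\beta_N}\right|$ , which is where a factor of 1/2 would come from.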

Table 3.1: Measured RMS offset vs. calculated offset (28nm FDSOI:  $A_{V_{tN}} = 1.7mV \cdot \mu m, A_{\beta_N} = 0.5\% \cdot \mu m.$  [2])

|                         | VSA in [20]        |
|-------------------------|--------------------|
| Measured Offset         | $6.02 \mathrm{mV}$ |
| Calculated Offset       | $5.50 \mathrm{mV}$ |
| Due to $\Delta V_{tN}$  | $4.64 \mathrm{mV}$ |
| Due to $\Delta \beta_N$ | $2.96 \mathrm{mV}$ |

The assumption  $C'_L = C_L$  introduces a small inaccuracy. For  $C'_L \neq C_L$ , the factor 1/3 becomes  $\frac{1}{2+C_L/C'_L}$ , which tends to 1/2 when the common-mode node is held fixed ( $C'_L \to \infty$ ). However, we still use 1/3 as a reasonable approximation.

This offset component depends on the initial condition, and is called dynamic offset. Load capacitance mismatch can also cause dynamic offset, but it is not present in a typical VSA since the loads are carefully matched.

#### 3.3.3 Total Offset Voltage

Thus, the total input-referred offset formula of VSA is the summation of static offset and dynamic offset:

$$V_{os} = \Delta V_{tN} + \frac{1}{3} \frac{\Delta \beta_N}{\beta_N} (V_{DD} - V_{tN})$$
(3.14)

The comparison between a measurement reported in the literature and the theory is shown in Table 3.1, which shows first-order accuracy of the theory.
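The "Calculated Offset" row of Table 3.1 can be reproduced from the two component rows. A minimal sketch, assuming the threshold and  $\beta$  contributions are statistically independent, so that their RMS values add in quadrature:

```python
import math

sigma_vt = 4.64    # mV, "Due to dV_tN" row of Table 3.1
sigma_beta = 2.96  # mV, "Due to d(beta_N)" row of Table 3.1

# Independent contributions combine as a root-sum-square.
sigma_total = math.hypot(sigma_vt, sigma_beta)
print(f"calculated RMS offset = {sigma_total:.2f} mV")
```

The root-sum-square of 4.64 mV and 2.96 mV is about 5.50 mV, matching the table entry.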

# CHAPTER 4

## VSA Offset Reduction Methods

In reality, ST and AT are not ideal switches, and they influence the performance of the VSA. In a traditional VSA, only the NMOS pair is designed for good matching to lower the offset. Here we show that the ATs can also be used for offset reduction.

By adjusting the control signals of ST and AT, the ATs become part of the sense amplifier and influence the sensing operation, and thus speed and offset. We present two ways of adjusting the control signals; both reduce the total offset of the VSA, and they lead to similar formulas.

Due to the complexity of analysis together with the fact that we focus on the reduction (relative value) instead of absolute value, we reasonably sacrifice accuracy by only including threshold mismatch in offset analysis.

## 4.1 VSA Offset Reduction: SAE Transition

It is reported in [21] that the offset voltage can be reduced by adjusting the sense amplifier enable (SAE) transition time when ST and AT are controlled by the same signal SAE. During the SAE transition, the circuit is actually a time-variant circuit, shown in Fig. 4.1. The analysis is based on differential equations.

### 4.1.1 Operations

Figure 4.1: (a) Common-mode equivalent circuit, (b) Differential-mode equivalent circuit during SAE transition; (c) I decomposition.

During the SAE transition, both the NMOS pair and the AT pair are on. A differential current injected through the AT pair, caused by the different source and drain voltages of the two ATs, will impact regeneration and offset. The circuit equation is given by:

$$C_L \frac{dV_{DM}(t)}{dt} = \left(g_{mN}(t) - G_{AT}(t)\right) V_{DM}(t) + I(t)$$
(4.1)

where I(t) is the differential current due to transistor mismatch and AT imbalance, and t = 0 is defined by  $V_{SAE} = V_{tN}$ .

This equation has the form of a general first-order linear differential equation:

$$dX(t)/dt = a(t)X(t) + b(t)$$
 (4.2)

of which the general solution is:

$$X(t) = \left(X(0) + \int_0^t b(\tau)e^{-\int_0^\tau a(s)ds}d\tau\right)e^{\int_0^t a(\tau)d\tau}$$
(4.3)
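Equ. 4.3 can be sanity-checked against a direct numerical integration of Equ. 4.2. This is a minimal sketch with hypothetical, dimensionless choices of a(t) and b(t): a(t) is linear and changes sign, as assumed for the SAE transition, and b(t) is a constant.

```python
import math


def solve_closed_form(x0, a, b, t, n=4000):
    """Evaluate Equ. 4.3 by trapezoidal quadrature for callables a(t), b(t)."""
    h = t / n
    big_a = 0.0      # running inner integral of a(s) ds
    integral = 0.0   # running outer integral of b(tau) * exp(-A(tau))
    prev = b(0.0)    # integrand at tau = 0, where A(0) = 0
    for i in range(1, n + 1):
        tau = i * h
        big_a += 0.5 * h * (a(tau - h) + a(tau))
        cur = b(tau) * math.exp(-big_a)
        integral += 0.5 * h * (prev + cur)
        prev = cur
    return (x0 + integral) * math.exp(big_a)


def solve_rk4(x0, a, b, t, n=4000):
    """Integrate dX/dt = a(t) X + b(t) directly with classical RK4."""
    def f(s, x):
        return a(s) * x + b(s)

    h, x = t / n, x0
    for i in range(n):
        s = i * h
        k1 = f(s, x)
        k2 = f(s + h / 2, x + h * k1 / 2)
        k3 = f(s + h / 2, x + h * k2 / 2)
        k4 = f(s + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


def a_of_t(t):
    return 4.0 * (t - 0.5)   # negative first, positive after t = 0.5


def b_of_t(t):
    return 0.3               # constant forcing term (illustrative)


closed = solve_closed_form(0.1, a_of_t, b_of_t, 1.5)
direct = solve_rk4(0.1, a_of_t, b_of_t, 1.5)
print(f"Equ. 4.3: {closed:.6f}, direct integration: {direct:.6f}")
```

The two numbers should agree to within the quadrature error, which confirms the form of the integrating factor used in Equ. 4.3.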

Assume that SAE rises linearly from  $V_{tN}$  to  $V_{DD} - |V_{tP}|$  in duration  $t'_{tran}$  ( $t_{tran}$  if from 0 to  $V_{DD}$ ), so that a(t) is a linear function of t under the square-law MOSFET model. a(t) satisfies a(0) < 0 and  $a(t'_{tran}) > 0$ , which means regeneration is suppressed at the beginning and starts at some time  $t_1$  during the SAE transition. The solution to Equ. 4.1 is given by:

$$V_{DM}(t) = \left(V_{DM}(0) + \frac{1}{C_L} \int_0^t I(\tau) \times W(\tau) d\tau\right) W(t)^{-1}$$
(4.4)

where W(t) is a Gaussian-shaped window function centered at  $\mu \equiv t_1 = \frac{\beta_{AT}}{\sqrt{\beta_{ST}\beta_n/2/n+\beta_{AT}}} t'_{tran}$ , with standard deviation  $\sigma = \sqrt{t'_{tran} \times \tau_1}$ . Regeneration begins at  $t_1$ , and  $\tau_1 = \frac{C_L}{(\beta_{AT}+\beta_{ST}/2)V'_{DD}}$  is the time constant defined by ST and AT, where  $V'_{DD} = V_{DD} - V_{tn} - V_{tp}$ . The right-hand side of Equ. 4.4 can be interpreted as an equivalent initial condition (EIC) multiplied by the time evolution function  $W(t)^{-1}$ .
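That a linear a(t) produces a Gaussian window can be seen numerically. The sketch below assumes  $W(t) = e^{-\int_0^t a(s)ds}$ , as implied by comparing Equ. 4.3 with Equ. 4.4, and a linear a(t) that crosses zero at  $t_1$  with slope  $1/\sigma^2$ ; the values of  $t'_{tran}$ ,  $\tau_1$  and  $t_1$  are hypothetical.

```python
import math

T_TRAN = 100e-12   # s, hypothetical t'_tran
TAU_1 = 1e-12      # s, hypothetical tau_1
T_1 = 50e-12       # s, hypothetical onset of regeneration t_1
SIGMA = math.sqrt(T_TRAN * TAU_1)


def a_linear(t):
    """Assumed linear growth rate a(t): zero at t_1, slope 1/sigma^2."""
    return (t - T_1) / SIGMA**2


def window(t, n=2000):
    """W(t) = exp(-integral_0^t a(s) ds), midpoint rule (exact for linear a)."""
    h = t / n
    return math.exp(-sum(a_linear((i + 0.5) * h) for i in range(n)) * h)


def gaussian(t):
    """Gaussian with standard deviation sigma centred at t_1, scaled to W(t_1)."""
    return window(T_1) * math.exp(-((t - T_1) ** 2) / (2 * SIGMA**2))


for t in (30e-12, 50e-12, 70e-12):
    print(f"t = {t * 1e12:.0f} ps: W = {window(t):.4e}, Gaussian = {gaussian(t):.4e}")
```

With these numbers  $\sigma$  is 10 ps, one tenth of  $t'_{tran}$ , and the peak value  $W(t_1)$  is large, which is the regime in which the window is later approximated by a scaled delta function.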

Notice that the separatrix condition, where  $V_{DM}(t)$  does not regenerate, is EIC = 0. However, when  $\sigma \ll t'_{tran}$ , the injection current dominates the regeneration, which means the second term of the EIC dominates. In this case:

$$W(t) \approx k \times \delta(t - t_1) \tag{4.5}$$

where k is an unimportant large constant. The separatrix is thus given by I(t) at the peak of W(t):

$$I(t_1) = \frac{V_{os}}{R_{AT}(t_1)} + \Delta V_{CM}(t_1)\beta_{AT}V_{os} + I_{mis}(t_1) = 0$$
(4.6)

where  $I_{mis}$  is induced by transistor mismatch shown later. Note that here we use approximation  $\sigma \ll t'_{tran}$ , which leads to less accuracy when  $t'_{tran}$  is small.

After the SAE transition, the circuit becomes the same as the circuit analyzed in the previous chapter. Here we assume that the gain during the SAE transition is high, so that the subsequent operation has little impact on the separatrix condition (Equ. 4.6); otherwise the method of the previous chapter gives a good approximation.

#### 4.1.2 Offset Reduction

 $I_{mis}$  consists of two components, from the N pair and the AT pair:  $I_{mis,N}(t) = g_{mN}(t)\Delta V_{tn}$  and  $I_{mis,AT}(t) = \Delta V_{CM}(t)\beta_{AT}\Delta V_{t,AT}$ . Substituting into the separatrix condition (Equ. 4.6), the offset voltage is given by:

$$V_{os} = \frac{G_{AT}(t_1) \times \Delta V_{tn} + \Delta V_{CM}(t_1)\beta_{AT} \times \Delta V_{t,AT}}{G_{AT}(t_1) + \Delta V_{CM}(t_1)\beta_{AT}}$$
(4.7)

where  $\Delta V_{CM}$  is a function of  $t_{tran}$ , and will increase when  $t_{tran}$  increases until maximum is met.

From Equ. 2.16, Equ. 4.7 implies that, in the RMS sense (taking the area as fixed and  $\Delta V_{CM}(t_1)$  as the variable, or the opposite):

$$\sigma_{V_{os}} \ge \frac{A_{V_t}}{\sqrt{S_N + S_{AT}}} \tag{4.8}$$

which means that by slowing down the SAE transition, both the AT and NMOS pairs contribute to matching, instead of only the NMOS pair area  $S_N$  as for  $t_{tran} = 0$ , which gives:

$$\sigma_{V_{os}} = \frac{A_{V_t}}{\sqrt{S_N}} \tag{4.9}$$

Here the assumption is  $A_{V_{tN}} \approx A_{V_{tP}}$ .

However, it is hard to achieve the theoretical minimum because of the limited  $\Delta V_{CM}(t_1)$  tuning range that can be accomplished by changing  $t_{tran}$ . The real minimum  $V_{os,min}$  is achieved when  $\frac{\Delta V_{CM}\beta_{AT}}{G_{AT}} = \frac{\beta_{AT}n^2}{2\beta_N}$ .

Fig. 4.2 shows the offset reduction when changing SAE transition time and other design parameters. Roughly  $10\% \sim 20\%$  offset reduction can be achieved by this method.
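The quoted reduction can be compared with the bound of Equ. 4.8. The sketch below assumes equal Pelgrom coefficients, independent mismatches, and that the gate areas scale with the size multipliers of Fig. 3.1 ( $M_{AT} = 2M_P$ ,  $M_N = 4M_P$ ), so that  $S_{AT}/S_N = 1/2$ ; the function name is hypothetical.

```python
import math


def sigma_ratio(weight_n, area_ratio_at_to_n):
    """sigma(V_os) relative to the plain-VSA value A_Vt / sqrt(S_N).

    weight_n is the weight of dV_tn in Equ. 4.7; the AT pair gets 1 - weight_n.
    """
    weight_at = 1.0 - weight_n
    return math.sqrt(weight_n**2 + weight_at**2 / area_ratio_at_to_n)


AREA_RATIO = 2.0 / 4.0                       # S_AT / S_N from the Fig. 3.1 sizing
best_weight = 1.0 / (1.0 + AREA_RATIO)       # weights proportional to area
bound = 1.0 / math.sqrt(1.0 + AREA_RATIO)    # Equ. 4.8 divided by Equ. 4.9

print(f"best weight of the N pair: {best_weight:.3f}")
print(f"largest possible reduction: {100 * (1 - bound):.1f} %")
```

Under these assumptions the bound allows at most about an 18% reduction, so the reported 10% to 20% is close to what the added AT area can offer.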

#### 4.1.3 Sensing Speed

Since the SAE signal transitions in finite time, the sensing delay is longer than for an instantaneous SAE edge. The regeneration pole  $s_{re}(t) = a(t)$  increases linearly with time from  $t_1$  to the end of the transition. Thus the additional sensing delay from the onset of regeneration is:

$$t_{delay} = (t'_{tran} - t_1) - \frac{\int_0^{t'_{tran} - t_1} \frac{s_{max}}{t'_{tran} - t_1} t dt}{s_{max}} = \frac{t'_{tran} - t_1}{2} \approx \frac{t'_{tran}}{4}$$
(4.10)

where the assumption is the regeneration begins near middle of the SAE transition.
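The delay estimate can be read as "time actually spent on the ramp minus the time a full-speed pole would have needed for the same regeneration". A minimal sketch of that bookkeeping, with hypothetical timing values:

```python
def extra_delay(t_tran, t_1, s_max=1.0, n=1000):
    """Delay lost while the pole ramps linearly from 0 to s_max after t_1."""
    ramp = t_tran - t_1
    h = ramp / n
    # Integral of s(t) dt over the ramp; the midpoint rule is exact here.
    area = sum(s_max * (i + 0.5) * h / ramp for i in range(n)) * h
    # area / s_max is the equivalent time at the full-speed pole s_max.
    return ramp - area / s_max


T_TRAN = 100e-12   # s, hypothetical t'_tran
T_1 = 50e-12       # s, regeneration assumed to start mid-transition

delay = extra_delay(T_TRAN, T_1)
print(f"extra delay = {delay * 1e12:.1f} ps (t'_tran / 4 = {T_TRAN / 4 * 1e12:.1f} ps)")
```

The result is independent of `s_max`, since only the shape of the ramp enters the ratio.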

This additional sensing delay is the price paid for the offset reduction. We therefore introduce another method that reduces offset by similar principles but with almost negligible sensing delay.



Figure 4.2: Offset reduction vs. SAE transition time plot.

## 4.2 VSA Offset Reduction: Skewing PGB and SAE

Separate control signals for AT (PGB) and ST (SAE) are commonly used, for various reasons, to isolate the sense amplifier from the SRAM data line (DL) while the SA is not activated [9,20,22–24].

Simulation shows that if PGB is delayed relative to SAE, the offset is also reduced, and the mechanism is similar to what we discussed above. A delay of only one or two inverter delays is enough to reduce the offset. This timing strategy is actually employed in practice [9,22].

### 4.2.1 Operations

The sensing operation can be divided into two stages after ST turns on: 1. AT is on and ST is on; 2. AT is off and ST is on. The second stage was discussed in the previous chapter and will not be discussed here. We assume that the gain of the first stage is sufficiently large that the second stage has little impact on the separatrix; otherwise the delay between SAE and PGB makes little difference, and the offset can be estimated with the theory of the previous chapter, where SAE and PGB switch simultaneously.

Figure 4.3: (a) Common-mode equivalent circuit, (b) Differential-mode equivalent circuit during pre-amplification phase; (c) *I* decomposition.

The pre-amplification phase circuit is shown in Fig. 4.3. The circuit equation is given by:

$$C_L \frac{dV_{DM}(t)}{dt} = \left(g_{mN} - G_{AT}\right) V_{DM}(t) + I$$
(4.11)

where the solution is:

$$V_{DM}(t) = \left(V_{DM}(0) + \frac{1}{C_L} \int_0^t I \times e^{-\frac{(g_{mN} - G_{AT})}{C_L}\tau} d\tau \right) e^{\frac{(g_{mN} - G_{AT})}{C_L}t}$$
(4.12)

Here we linearize the circuit by using average values for the time-varying components, so that  $g_{mN}$ , I and  $G_{AT}$  are constants. We use this approximation here, but not in the previous method, because this differential circuit regenerates from the moment the NFETs are turned on, provided  $g_{mN} > G_{AT}$ .



Figure 4.4: Offset reduction by skewing PGB and SAE signals by 10ps.

From Equ. 4.12, the separatrix condition is thus given by:

$$V_{os} + \frac{I}{g_{mN} - G_{AT}} = 0 \tag{4.13}$$

### 4.2.2 Offset Reduction

From the separatrix condition we can derive the offset formula. Notice that the differential current I is also a function of  $V_{BL,DM} = V_{os}$ . The offset has a form similar to Equ. 4.7:

$$V_{os} = \frac{g_{mN} \times \Delta V_{tN} + \Delta V_{CM} \beta_{AT} \times \Delta V_{t,AT}}{g_{mN} + \Delta V_{CM} \beta_{AT}}$$
(4.14)

Equ. 4.14 has a mathematical form similar to Equ. 4.7, so a similar offset reduction can be achieved using this method. Fig. 4.4 shows the offset reduction by this method. Approximately 10% offset reduction can be achieved by delaying PGB by just  $1 \sim 2$  intrinsic inverter delays.

In cases that already use separate control signals for access transistor, sense amplifier enable and also sense amplifier pre-charge, this 10% offset reduction almost comes for free.

### 4.2.3 Sensing Speed

Assume the PGB delay is  $t_0$ . The regeneration pole is  $s_{re} \approx (g_{mN} - G_{AT})/C_L$ . If  $t_0 > 1/s_{re}$ , the above analysis holds.

The  $t_0$  needed to maximize offset reduction is much smaller than the SAE transition time needed in former case since  $\Delta V_{CM}$  now changes as fast as regeneration pole. This property will significantly reduce sensing delay.

Since  $s_{re}$  is slightly smaller than the original pole  $g_{mN}/C_L$ , there is a sensing delay estimated by:

$$t_{delay} \approx t_0 - t_0 \times \frac{g_{mN} - G_{AT}}{g_{mN}} = \frac{G_{AT}}{g_{mN}} \times t_0 \tag{4.15}$$

# CHAPTER 5

# **Current-Mode Sense Amplifier Analysis**

The Current-Mode Sense Amplifier (CSA) is based on the StrongArm latch that is widely used as a comparator in A/D converters. The CSA consists of an NMOS foot switch transistor (ST), an NMOS input pair (N1) and a pair of cross-coupled inverters (N2 and P), as shown in Fig. 5.1. As for the VSA, equivalent half-circuits are used for analysis. The method of analysis is the same as in [16]; however, the results are different because the same circuit operates under different conditions.

## 5.1 Operations

[16] shows that it operates in three phases: sampling phase (SP), propagation phase (PP) and regeneration phase (RP). Unlike comparators, the large input common-mode voltage of  $V_{DD}$  here forces the input NFET pair into triode during the propagation phase. The differential-mode equivalent circuit is shown in Fig. 5.2.

Initially, the internal nodes C (drain of N1) and output nodes O are precharged at  $V_{DD}$  and ST is off. The input common-mode voltage is near  $V_{DD}$  and a differential-mode signal  $V_{id}$  is applied at the input.

### 5.1.1 Sampling Phase

The sampling phase starts when ST turns on and ends when N1 has pulled the common-mode voltage of the internal nodes C,  $V_{C,CM}$ , down to  $V_{DD} - V_{tN}$ , at which point N2 turns on. The duration of this phase is given by:

$$\int_0^{T_{SP}} I_{SP} dt = C_C V_{tN} \tag{5.1}$$



Figure 5.1: Current-Mode Sense Amplifier (CSA). Transistor sizing:  $M_{ST} = 560n/60n, M_{N1} = M_{N2} = M_P = 2M_{ST}.$ 

which gives:

$$T_{SP} = C_C V_{tN} / I_{SP} \tag{5.2}$$

where

$$I_{SP} \approx \frac{1}{2} \beta_{N1} (V_{GS,N1} - V_{tN})^2 \equiv \frac{1}{2} \beta_{N1} V_{ov,N1}^2$$
(5.3)

Meanwhile, during SP an amplified differential voltage  $V_C(t) \equiv V_{C1} - V_{C2}$  develops on the internal nodes C as the small differential current integrates over  $T_{SP}$ :

$$V_C(T_{SP}) = -g_{mN1}V_{id}T_{SP}/C_C \tag{5.4}$$

Note that from square law there is:

$$I_{SP} = \frac{1}{2} g_{mN1} V_{ov,N1} \tag{5.5}$$

The gain of SP is therefore defined:

$$A_{SP} \equiv \frac{V_C(T_{SP})}{V_{id}} = -\frac{2V_{tN}}{V_{ov,N1}}$$
(5.6)



Figure 5.2: Differential-mode equivalent circuit of CSA: (a) Sampling Phase; (b) Propagation Phase; (c) Regeneration Phase.

It is important to mention that from Equ. 5.6, we know that  $|A_{SP}|$  is larger when  $V_{ov,N1}$  is lower which can be achieved by lower  $V_{DD}$ :

$$\frac{\partial |A_{SP}|}{\partial V_{DD}} = \frac{\partial |A_{SP}|}{\partial V_{ov,N1}} \times \frac{\partial V_{ov,N1}}{\partial V_{DD}} < 0$$
(5.7)

This means a larger amplification can be achieved for low  $V_{DD}$ , which suggests better performance of CSA in low  $V_{DD}$  SRAM.
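The trend of Equ. 5.7 can be tabulated directly from Equ. 5.6. The sketch below assumes  $V_{ov,N1} \approx V_{DD} - V_{tN}$  (input common mode at  $V_{DD}$  and the source of N1 near ground while ST is on) and a hypothetical threshold of 0.35 V; both are illustrative assumptions rather than values from the thesis.

```python
def sampling_gain(v_dd, v_tn=0.35):
    """|A_SP| = 2 V_tN / V_ov,N1 (Equ. 5.6), assuming V_ov,N1 = V_DD - V_tN."""
    v_ov = v_dd - v_tn
    if v_ov <= 0:
        raise ValueError("square-law model needs V_DD above V_tN")
    return 2.0 * v_tn / v_ov


for v_dd in (1.0, 0.8, 0.6):
    print(f"V_DD = {v_dd:.1f} V -> |A_SP| = {sampling_gain(v_dd):.2f}")
```

The gain grows as the supply drops because the overdrive shrinks while the voltage swing  $V_{tN}$  that ends the sampling phase stays the same.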

#### 5.1.2 Propagation Phase

Propagation phase begins when N2 pair is turned on and ends when the PMOS pair is turned on which gives common-mode voltage of output node  $V_{O,CM} = V_{DD} - |V_{tP}|$ . This period of time is given as:

$$T_{PP} = C_L' |V_{tP}| / I_{PP} \tag{5.8}$$

where  $I_{PP} = \frac{1}{2}g_{mN2}V_{ov,N2}$ .

The differential-mode equivalent circuit is shown in Fig. 5.2. This circuit has one negative pole  $s_{-} \approx -(g_{mN2} + g_{md1})/C_C$  and one positive pole  $s_{+} \approx (g_{md1}||g_{mN2})/C_L$ , which means a soft regeneration. Because the N1 pair enters the triode region, the charge on both nodes C will partially leak through  $g_{md1}$  and partially propagate to node O. In addition, the N1 pair differential current becomes weaker in the triode region, which makes the circuit very vulnerable to transistor mismatches.

Due to the nonlinear nature of the circuit, it is hard to analyze the dynamics of the circuit. To simplify this problem, averaged value is used for linearization of the circuit.



Figure 5.3: Differential equivalent circuit when mismatches are considered: (a) Sampling phase; (b) Propagation phase.

### 5.1.3 Regeneration Phase

The regeneration phase is essentially the regeneration of the output node differential voltage. It begins when PMOS pair is turned on. The differential-mode equivalent circuit is shown on Fig. 5.2, which is similar to a cross-coupled inverter. The regeneration pole is given by:

$$s_{RP} \approx (g_{mN2} + g_{mP})/C_L \tag{5.9}$$

#### 5.1.4 Sensing Speed

Unlike the VSA, where the output differential signal regenerates from the beginning, the CSA first experiences a signal pre-amplification in SP and a signal transfer in PP. After PP, the signal at the output node starts to regenerate. The sensing delay for SP and PP is given by  $T_{SP} + T_{PP}$ , and the regeneration pole in RP is  $s_{RP}$ .

## 5.2 Offset Voltage

Unlike VSA, the offset of CSA has contribution from both NMOS pairs. PMOS pair normally has little impact on offset because the output node signal is relatively large in RP phase. The differential equivalent circuit for offset analysis is shown in Fig. 5.3.

#### 5.2.1 N1 Pair

We notice in Fig. 5.3 that in both SP and PP:

$$I_{mis,N1} = g_{mN1} \Delta V_{tN1} + I_{SP} \frac{\Delta \beta_{N1}}{\beta_{N1}}$$
(5.10)

where  $I_{SP}$  is the current through N1.

The offset from the N1 pair is defined by  $V_{os,N1} = I_{mis,N1}/g_{mN1}$ , thus:

$$V_{os,N1} = \Delta V_{tN1} + \frac{V_{ov,N1}}{2} \frac{\Delta \beta_{N1}}{\beta_{N1}}$$
(5.11)

### 5.2.2 N2 Pair

N2 pair mismatch will mainly be problematic in PP. Equivalent circuit is shown in Fig. 5.3. Since we know this circuit has one positive pole and one negative pole, we can express the  $V_O$  by this linear combination:

$$V_O(t) = Ae^{-|s_-|t} + Be^{s_+t} + C \tag{5.12}$$

where A, B, C are unknown constants.

From initial condition, we have:

$$V_O(0) = 0 \tag{5.13}$$

$$C_L \frac{dV_O(t)}{dt}|_{t=0} = g_{mN2} V_C(0) + I_{mis,N2}$$
(5.14)

where  $V_C(0) = A_{SP}V_{id}$  is the internal node voltage at the end of SP, and  $I_{mis,N2} = g_{mN2} \left( \Delta V_{tN2} + \frac{V_{ov,N2}}{2} \frac{\Delta \beta_{N2}}{\beta_{N2}} \right)$ .

The separatrix condition is:

$$B = 0 \tag{5.15}$$

where regeneration is totally suppressed.

From Equ. 5.12, 5.13, 5.14, 5.15, we get the offset:

$$V_{os,N2} \equiv V_{id}|_{B=0} = \frac{1+|s_{-}| \times \frac{C_L}{g_{mN2}}}{|A_{SP}| + \frac{g_{mN1}}{g_{md1}} \times |s_{-}| \times \frac{C_L}{g_{mN2}}} \times \frac{I_{mis,N2}}{g_{mN2}} \equiv k_{N2} \times \frac{I_{mis,N2}}{g_{mN2}}$$
(5.16)

where  $|s_{-}| \approx (g_{mN2} + g_{md1})/C_C$ , and N2 offset contribution coefficient  $k_{N2}$  is defined.



Figure 5.4: CSA offset vs.  $C_L/C_C$ . (28nm CMOS:  $A_{V_{tN}} = 2.5mV \cdot \mu m$ ,  $A_{\beta_N} = 0.5\% \cdot \mu m$ . [1])

#### 5.2.3 Total Offset Voltage

The total offset is:

$$V_{os} \approx V_{mis,N1} + k_{N2}V_{mis,N2}$$
(5.17)

where  $V_{mis,N1} = \Delta V_{tN1} + \frac{V_{ov,N1}}{2}\frac{\Delta\beta_{N1}}{\beta_{N1}}$  and  $V_{mis,N2} = \Delta V_{tN2} + \frac{V_{ov,N2}}{2}\frac{\Delta\beta_{N2}}{\beta_{N2}}$ .

Because the internal gain is small, owing to charge leaking through  $g_{md1}$  of N1 (in triode), the contribution of N2 to the offset becomes very important and can even be the larger contribution, depending on the load capacitance, as shown in Fig. 5.4. Since the Pelgrom coefficients used for the 28nm CMOS technology are not taken directly from the TSMC process we are using, there may be a small systematic shift in the theoretical offset prediction in the plot.

Another important feature of CSA is that:

$$\frac{\partial k_{N2}}{\partial V_{DD}} = \frac{\partial k_{N2}}{\partial |A_{SP}|} \times \frac{\partial |A_{SP}|}{\partial V_{DD}} > 0$$
(5.18)

which means lower offset for lower  $V_{DD}$ . This is a good property of the CSA, which suggests better performance in low- $V_{DD}$  SRAM, since the VSA does not show a similar feature. We will show the comparison in the next chapter.
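The dependence of  $k_{N2}$  on  $|A_{SP}|$  in Equ. 5.18 can be seen by evaluating Equ. 5.16 directly. This sketch transcribes the formula as written; all small-signal values are hypothetical and are held fixed while  $|A_{SP}|$  is varied, which is a simplification since in a real circuit they also move with  $V_{DD}$ .

```python
def k_n2(a_sp_abs, gm_n1, gmd_1, gm_n2, c_c, c_l):
    """N2 offset coefficient k_N2, transcribed from Equ. 5.16."""
    s_minus = (gm_n2 + gmd_1) / c_c     # |s_-| as given below Equ. 5.16
    x = s_minus * c_l / gm_n2
    return (1.0 + x) / (a_sp_abs + (gm_n1 / gmd_1) * x)


# Hypothetical small-signal values, for illustration only.
PARAMS = dict(gm_n1=1e-3, gmd_1=1e-3, gm_n2=1e-3, c_c=5e-15, c_l=5e-15)

for a_sp in (1.0, 2.0, 3.0):
    print(f"|A_SP| = {a_sp:.1f} -> k_N2 = {k_n2(a_sp, **PARAMS):.3f}")
```

 $|A_{SP}|$  appears only in the denominator, so a larger sampling gain (lower  $V_{DD}$ ) always lowers  $k_{N2}$  when the other parameters are fixed.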

# CHAPTER 6

# VSA and CSA Comparison

## 6.1 Offset Voltage Comparison

Now let us look at the minimum RMS offset voltage of circuits we analyzed in previous chapters using the theory in Chapter 2:

- 1. Cross-Coupled Inverter
  - Both the PFET and NFET pairs contribute to the offset, thus the number of FET pairs is N = 2 (the N of Equ. 2.16, not the subscript N that denotes NFET). Based on Equ. 2.27, the minimum RMS offset is given by:

$$\sigma_{V_{os}} \ge \frac{A_{vt}}{\sqrt{S_N + S_P}} \tag{6.1}$$

Given the initial common-mode input  $V_{CM}(0)$ , we can choose the best area proportion of NFET and PFET to achieve this minimum offset. A larger common-mode input voltage results in a larger NFET area in the optimal design; however, the optimal offset is physically determined and is unchanged as long as the total area is the same. Smaller effects such as  $\beta$  mismatch may affect the minimum offset, but the prediction is quite accurate, as shown in Fig. 2.2.

- 2. Voltage-Mode Sense Amplifier
  - Only NFET contributes to the offset, thus N = 1. Based on Equ. 3.14, the minimum RMS offset is obviously given by:

$$\sigma_{V_{os}} \ge \frac{A_{vt}}{\sqrt{S_N}} \tag{6.2}$$

- 3. Offset-Reduced Voltage-Mode Sense Amplifier
  - Only NFET and AT pair contribute to the offset, thus N = 2. Based on Equ. 4.7 and Equ. 4.14, the minimum RMS offset for two reduction methods is the same and is given by:

$$\sigma_{V_{os}} \ge \frac{A_{vt}}{\sqrt{S_N + S_{AT}}} \tag{6.3}$$

Compared to Equ. 6.2, the minimum offset now includes the area of the ATs. This shows exactly why and how the offset voltage of the VSA can be reduced by implementing a proper timing scheme. However, because the degrees of freedom in design are limited, the optimum condition may not be met, and we end up with a partial offset reduction that does not make full use of the area.

- 4. Current-Mode Sense Amplifier
  - Only N1 and N2 pair contribute to the offset, thus N = 2. Based on Equ. 5.17, the minimum RMS offset is given by:

$$\sigma_{V_{os}} \ge (1 + k_{N2}) \frac{A_{vt}}{\sqrt{S_{N1} + S_{N2}}} \tag{6.4}$$

For the CSA, Equ. 6.4 shows directly that the minimum offset is larger than that of the VSA for the same total area. This is determined by the circuit topology and is an intrinsic property of the sense amplifier: based on the above analysis, a CSA cannot outperform a VSA in offset voltage. As a good estimate,  $k_{N2} \approx 1$ , and the CSA has twice the offset of the VSA when both are optimally designed.

$k_{N2}$  can be made much smaller in a StrongArm latch used in data converters, since the input common-mode voltage there does not force the N1 pair into triode [16]. This suggests that the CSA as used in SRAM is not the optimal utilization of this circuit topology with regard to offset.
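The bound in Equ. 6.4 can be reproduced with a short calculation. As assumptions made here for illustration, only threshold mismatch is kept, both NMOS pairs share one Pelgrom coefficient so that  $\sigma_{V_{mis,N1}} = A_{vt}/\sqrt{S_{N1}}$  and  $\sigma_{V_{mis,N2}} = A_{vt}/\sqrt{S_{N2}}$ , the two mismatches are independent, and  $k_{N2}$  is treated as independent of the sizing. Equ. 5.17 then gives:

$$\sigma_{V_{os}}^2 = A_{vt}^2\left(\frac{1}{S_{N1}} + \frac{k_{N2}^2}{S_{N2}}\right)$$

By the Cauchy-Schwarz inequality,

$$\left(\frac{1}{S_{N1}} + \frac{k_{N2}^2}{S_{N2}}\right)\left(S_{N1} + S_{N2}\right) \ge \left(1 + k_{N2}\right)^2$$

with equality when  $S_{N2}/S_{N1} = k_{N2}$ . Taking the square root recovers Equ. 6.4, and for  $k_{N2} \approx 1$  the optimum is an equal split of the area with  $\sigma_{V_{os}} \approx 2A_{vt}/\sqrt{S_{N1} + S_{N2}}$ , twice the VSA value of Equ. 6.2 at the same total area.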

In conclusion, the CSA has a much larger offset voltage than the VSA when occupying a similar area. This is clearly an advantage for the VSA; however, the difference is reduced when  $V_{DD}$  is lowered.

### 6.2 Sensing Speed Comparison

The sensing speed comparison can be done naturally given the previous analysis on SA operations. For VSA, the sensing speed is given by the regeneration pole and it is approximately  $s_+ \approx G_m/C_L$ . For CSA, it is more complicated. However, since positive poles for propagation (mild regeneration) and regeneration phases (strong regeneration) are both smaller than that in VSA assuming same load  $C_L$ , plus the additional sampling stage delay for CSA, we can safely conclude that sensing speed of CSA is slower than VSA.

### 6.3 FoM Comparison and Supply Dependency

The "yield" of a sense amplifier is defined as the fraction of correct decisions it makes across a large population of memories. As stated in Chapter 1, yield is limited by two effects. The first is the input-referred offset voltage  $V_{os}$  of the sense amplifier, arising mainly from FET mismatch that disturbs circuit symmetry. The second is attenuation of the voltage that a memory cell induces on the bitline, caused by the sense amplifier's input capacitance  $C_{in}$  and by possible signal delay such as RC delay. There is a clear trade-off: scaling up the sense amplifier FETs lowers mismatch but increases input capacitance. It also increases the amplifier surface area  $S_{tot}$, which is undesirable for chip density. We propose a figure-of-merit to compare one sense amplifier circuit against another:

$$\text{FoM}^{-1} = \left(V_{os} \times \sqrt{S_{tot}}\right) \times \left(\frac{C_{BL} + C_{in}}{C_{BL}} \times \eta\right) \tag{6.5}$$

where  $C_{BL}$  is the bitline capacitance, consisting of FET capacitance and wire capacitance;  $\eta$  is the RC delay attenuation ratio, which satisfies  $\eta \leq 1$; and  $S_{tot}$  accounts for all active area, including the precharge FETs and the extra timing circuit (for PGB). The first factor on the right-hand side of Equ. 6.5 represents area efficiency, and the second represents signal attenuation.
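The two factors of Equ. 6.5 can be evaluated separately, as in the minimal sketch below. The function transcribes Equ. 6.5 term by term as printed, including the way $\eta$ enters. The function name, argument names, and all numerical inputs are hypothetical and chosen only to show the arithmetic.

```python
import math

def inverse_fom(sigma_vos_mv, s_tot_um2, c_bl_ff, c_in_ff, eta):
    """Transcription of Equ. 6.5 as printed:
    FoM^-1 = (V_os * sqrt(S_tot)) * ((C_BL + C_in) / C_BL * eta)."""
    area_term = sigma_vos_mv * math.sqrt(s_tot_um2)            # mV*um
    attenuation_term = (c_bl_ff + c_in_ff) / c_bl_ff * eta     # dimensionless
    return area_term * attenuation_term

# Hypothetical inputs for illustration only (not from the thesis).
value = inverse_fom(sigma_vos_mv=10.0, s_tot_um2=4.0,
                    c_bl_ff=10.0, c_in_ff=10.0, eta=1.0)
print(f"FoM^-1 = {value:.1f} mV*um")  # 10 * 2 * 2 * 1 = 40.0
```

Keeping the area term and the attenuation term as separate intermediate values makes it easy to see which of the two dominates for a given design.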



Figure 6.1: FoM comparison of VSA and CSA at different supply voltages.

The simulation results are shown in Fig. 6.1. The simulation settings are as follows:

- The diffusion capacitance on the bitlines is estimated from 128 NFETs with W/L = 100 nm/30 nm.
- The input capacitance is extracted from simulations.
- The wire capacitance is estimated as  $C_{wire} = 128 \times 0.3\,\mu\mathrm{m} \times 0.18\,\mathrm{fF}/\mu\mathrm{m} = 6.9\,\mathrm{fF}$.
- The RC attenuation ratio is estimated as  $\eta = 1 - RC/T = 0.79$  for VSA and  $\eta = 1$  for CSA, where T (Fig. 1.2) is the time needed to develop a 100 mV differential signal, a value commonly used in SRAM.
- The VSA sizing is the same as in [20], and the CSA is sized for a similar matching area [15], as shown in Fig. 3.1 and Fig. 5.1.
- The area of the extra timing logic for VSA is estimated based on [22].
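The arithmetic behind two of these settings can be checked with a short sketch. The constants 128, 0.3 um, 0.18 fF/um, and $\eta = 0.79$ come from the text above. Reading 0.3 um as the bitline wire length per cell is an interpretation, and the variable and function names are hypothetical.

```python
# Constants quoted in the simulation settings.
N_CELLS = 128                # cells on one bitline
WIRE_LEN_PER_CELL_UM = 0.3   # um per cell (interpreted as wire length per cell)
WIRE_CAP_FF_PER_UM = 0.18    # fF per um of wire

c_wire_ff = N_CELLS * WIRE_LEN_PER_CELL_UM * WIRE_CAP_FF_PER_UM
print(f"C_wire = {c_wire_ff:.3f} fF")  # 6.912 fF, quoted as 6.9 fF

def rc_attenuation(rc_over_t):
    """eta = 1 - RC/T, the RC attenuation ratio used in the settings."""
    return 1.0 - rc_over_t

# eta = 0.79 for VSA corresponds to RC/T = 0.21; eta = 1 for CSA to RC/T = 0.
print(f"eta (VSA) = {rc_attenuation(0.21):.2f}")
print(f"eta (CSA) = {rc_attenuation(0.0):.2f}")
```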

Surprisingly, although VSA has nearly half the offset voltage of CSA, the overall yield of the two is at a similar level once signal attenuation is considered. Because an SA usually has a much larger area than an SRAM cell in order to achieve a good input-referred offset, the capacitance associated with the SA is not negligible. CSA isolates the SA input from the output load very well and loads each bitline with only one NFET gate. In VSA, by contrast, each bitline is loaded by both NFET and PFET gates, by the diffusion and gate-drain capacitors, and by the output logic gates. For this reason, the sensing signal applied at the input of VSA is smaller than that of CSA. Furthermore, the access transistors in VSA add an RC delay that further attenuates the sensing signal, an issue that CSA does not have. Thus, the two SAs end up with similar FoM, as shown in our simulation.

The simulation shows that CSA achieves nearly 20% to 30% FoM improvement over VSA as  $V_{DD}$  is reduced. The earlier analysis of CSA explains this by the larger sampling-phase gain ( $|A_{SP}|$ ) of CSA, and thus lower offset, at lower supply voltage. The theoretical prediction and the simulation of the  $V_{DD}$  dependency are consistent: CSA is the better choice in low- $V_{DD}$  SRAMs, and VSA is better in high- $V_{DD}$  SRAM applications.

# CHAPTER 7

# Conclusion

This thesis provides a simple but intuitive analysis of two commonly used types of sense amplifier: VSA and CSA. A time-domain method is used to precisely predict the sense amplifier operation, sensing speed, and offset voltage of both. We start by deriving and proving a theoretical minimum offset for each sense amplifier, which is set by circuit topology, and present an optimization method for achieving it. We illustrate how this method is applied to optimize the cross-coupled inverter and use it later for the SA comparison. We explain how the timing arrangement of the SA enable signals affects VSA offset, and to what degree. It is shown that an offset reduction of around 10% can be achieved with almost no penalty if the control signals are already separated. We explain how CSA offset is affected by supply voltage.

We identify three effects on SA yield: offset, capacitance, and RC delay. Based on the above analysis, we conclude by assembling these results into a figure-of-merit, which shows that although the VSA is superior at high  $V_{DD}$, the CSA prevails below 0.6 to 0.7 V. Predictions from the analysis match simulations very well.

### References

- [1] L. Rahhal, A. Bajolet, J.-P. Manceau, J. Rosa, S. Ricq, S. Lassere, and G. Ghibaudo, "A comparative mismatch study of the 20nm Gate-Last and 28nm Gate-First bulk CMOS technologies," *Solid-State Electronics*, vol. 108, pp. 53–60, 2015.
- [2] L. Rahhal, "Analysis and modeling of mismatch phenomena for advanced MOSFETs," Ph.D. dissertation, Université Grenoble Alpes, 2014.
- [3] L. Chang, D. M. Fried, J. Hergenrother, J. W. Sleight, R. H. Dennard, R. K. Montoye, L. Sekaric, S. J. McNab, A. W. Topol, C. D. Adams *et al.*, "Stable SRAM cell design for the 32 nm node and beyond," in *VLSI Technology*, 2005. Digest of Technical Papers. 2005 Symposium on. IEEE, 2005, pp. 128–129.
- [4] L. Chang, R. K. Montoye, Y. Nakamura, K. A. Batson, R. J. Eickemeyer, R. H. Dennard, W. Haensch, and D. Jamsek, "An 8T-SRAM for variability tolerance and low-voltage operation in high-performance caches," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 4, pp. 956–963, 2008.
- [5] N. Verma and A. P. Chandrakasan, "A 256 kb 65 nm 8T subthreshold SRAM employing sense-amplifier redundancy," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 1, pp. 141–149, 2008.
- [6] J. P. Kulkarni, J. Keane, K.-H. Koo, S. Nalam, Z. Guo, E. Karl, and K. Zhang, "5.6 Mb/mm<sup>2</sup> 1R1W 8T SRAM arrays operating down to 560 mV utilizing small-signal sensing with charge shared bitline and asymmetric sense amplifier in 14nm FinFET CMOS technology," *IEEE Journal of Solid-State Circuits*, 2016.
- [7] Y. Sinangil and A. P. Chandrakasan, "A 128 kbit SRAM with an embedded energy monitoring circuit and sense-amplifier offset compensation using body biasing," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 11, pp. 2730–2739, 2014.
- [8] T. Song, W. Rim, S. Park, Y. Kim, G. Yang, H. Kim, S. Baek, J. Jung, B. Kwon, S. Cho *et al.*, "A 10 nm FinFET 128 Mb SRAM With Assist Adjustment System for Power, Performance, and Area Optimization," *IEEE Journal of Solid-State Circuits*, 2016.
- [9] Y.-H. Chen, S.-Y. Chou, Q. Li, W.-M. Chan, D. Sun, H.-J. Liao, P. Wang, M.-F. Chang, and H. Yamauchi, "Compact measurement schemes for bit-line swing, sense amplifier offset voltage, and word-line pulse width to characterize sensing tolerance margin in a 40 nm fully functional embedded SRAM," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 4, pp. 969–980, 2012.

- [10] M. H. Abu-Rahma, K. Chowdhury, J. Wang, Z. Chen, S. S. Yoon, and M. Anis, "A methodology for statistical estimation of read access yield in SRAMs," in 2008 45th ACM/IEEE Design Automation Conference, June 2008, pp. 205–210.
- [11] B. Mohammad, P. Dadabhoy, K. Lin, and P. Bassett, "Comparative study of current mode and voltage mode sense amplifier used for 28nm SRAM," in *Microelectronics (ICM), 2012 24th International Conference on*. IEEE, 2012, pp. 1–6.
- [12] C.-H. Hong, Y.-W. Chiu, J.-K. Zhao, S.-J. Jou, W.-T. Wang, and R. Lee, "A low-power charge sharing hierarchical bitline and voltage-latched sense amplifier for SRAM macro in 28 nm CMOS technology," in *System-on-Chip Conference (SOCC), 2014 27th IEEE International.* IEEE, 2014, pp. 160–164.
- [13] T. Na, S.-H. Woo, J. Kim, H. Jeong, and S.-O. Jung, "Comparative study of various latch-type sense amplifiers," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 22, no. 2, pp. 425–429, 2014.
- [14] S.-H. Woo, H. Kang, K. Park, and S.-O. Jung, "Offset voltage estimation model for latch-type sense amplifiers," *IET circuits, devices & systems*, vol. 4, no. 6, pp. 503–513, 2010.
- [15] M. H. Abu-Rahma, Y. Chen, W. Sy, W. L. Ong, L. Y. Ting, S. S. Yoon, M. Han, and E. Terzioglu, "Characterization of SRAM sense amplifier input offset for yield prediction in 28nm CMOS," in *Custom Integrated Circuits Conference (CICC)*, 2011 IEEE. IEEE, 2011, pp. 1–4.
- [16] A. Abidi and H. Xu, "Understanding the regenerative comparator circuit," in *Custom Integrated Circuits Conference (CICC)*, 2014 IEEE Proceedings of the. IEEE, 2014, pp. 1–8.
- [17] C. C. Enz, F. Krummenacher, and E. A. Vittoz, "An analytical MOS transistor model valid in all regions of operation and dedicated to low-voltage and low-current applications," *Analog integrated circuits and signal processing*, vol. 8, no. 1, pp. 83–114, 1995.
- [18] M. J. Pelgrom, A. C. Duinmaijer, and A. P. Welbers, "Matching properties of MOS transistors," *IEEE Journal of solid-state circuits*, vol. 24, no. 5, pp. 1433–1439, 1989.
- [19] B. Razavi and B. A. Wooley, "Design techniques for high-speed, high-resolution comparators," *IEEE journal of solid-state circuits*, vol. 27, no. 12, pp. 1916–1926, 1992.

- [20] M. Khayatzadeh, F. Frustaci, D. Blaauw, D. Sylvester, and M. Alioto, "A reconfigurable sense amplifier with 3X offset reduction in 28nm FDSOI CMOS," in VLSI Circuits (VLSI Circuits), 2015 Symposium on. IEEE, 2015, pp. C270–C271.
- [21] R. Singh and N. Bhat, "An offset compensation technique for latch type sense amplifiers in high-speed low-power SRAMs," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 12, no. 6, pp. 652–657, 2004.
- [22] P.-F. Chiu, B. Zimmer, and B. Nikolić, "A double-tail sense amplifier for low-voltage SRAM in 28nm technology," in *Solid-State Circuits Conference* (A-SSCC), 2016 IEEE Asian. IEEE, 2016, pp. 181–184.
- [23] M. E. Sinangil, J. W. Poulton, M. R. Fojtik, T. H. Greer III, S. G. Tell, A. J. Gotterba, J. Wang, J. Golbus, B. Zimmer, W. J. Dally *et al.*, "A 28 nm 2 Mbit 6 T SRAM with highly configurable low-voltage write-ability assist implementation and capacitor-based sense-amplifier input offset compensation," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 2, pp. 557–567, 2016.
- [24] L. Wen, X. Cheng, K. Zhou, S. Tian, and X. Zeng, "Bit-interleaving-enabled 8T SRAM with shared data-aware write and reference-based sense amplifier," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 63, no. 7, pp. 643–647, 2016.