NOISE-SHAPING SENSE AMPLIFIER FOR CROSS-POINT ARRAYS

by

Matthew B. Leslie

A thesis
Submitted in partial fulfillment
Of the requirements for the degree of
Master of Science in Electrical Engineering, Integrated Circuits
Boise State University

October 2007
The thesis presented by Matthew B. Leslie entitled Noise-Shaping Sense Amplifier for Cross-Point Arrays is hereby approved:

___________________________________________
Dr. R. Jacob Baker    Date

___________________________________________
Dr. Scott Smith    Date

___________________________________________
Dr. Jim Browning    Date

___________________________________________
Dr. John R. (Jack) Pelton    Date
DEDICATION

This research is dedicated to my mother, Elizabeth Leslie, who passed away during its completion.
ACKNOWLEDGEMENTS

I would like to acknowledge Dr. R. Jacob Baker for his assistance and valuable input during the course of my undergraduate and graduate career as well as during my research. I would also like to thank Dr. Barney Smith for her encouragement throughout my time at Boise State University.
AUTOBIOGRAPHICAL SKETCH

In the fall of 1998 Matthew B. Leslie matriculated at Boise State as a Civil Engineering major. Two and a half years later, Matthew changed majors to Electrical Engineering. Matthew graduated summa cum laude in May 2003 and was named one of Boise State’s Top Ten Scholars.

During his time as an undergraduate, Matthew focused on integrated circuits courses offered by Dr. R. Jacob Baker. In summer of 2003, he performed research which partially formed the basis for what would eventually be his thesis. Matthew returned to Boise State in the spring of 2007 to complete his research and also to assist as an instructor.

Matthew currently works at Marvell Semiconductor as an ASIC design engineer.
ABSTRACT

A sensing technique using a voltage-mode architecture, noise-shaping modulator, and digital filter (a counter) is presented for use with cross-point MRAM arrays and magnetic tunnel junction memory cells. The presented technique eliminates the need for precision components and calibration, and it reduces the effects of power supply noise. To obviate the effects of cell-to-cell variations in the array, a digital self-referencing scheme using the counter is presented. Traditional transfer function analysis techniques are applied to gain a rudimentary understanding of the sense amplifier’s desired operation. Further insight results from behavioral simulations performed in Simulink. These simulations also dictate the block-level requirements for overall operation. The individual blocks are designed at the transistor level. Finally, the blocks are combined and the sense amplifier’s operation (at the transistor level) is evaluated.
# TABLE OF CONTENTS

- Dedication
- Acknowledgements
- Autobiographical Sketch
- Abstract
- Table of Contents
- List of Tables
- List of Figures
- List of Symbols
- Introduction
- Volatile and Non-Volatile Memories
- Array Organization
- Current Mode Sensing
- Voltage Mode Sensing
- Practical Sensing Considerations
- The Proposed Sense Algorithm
- Overall Design Strategy
- The Sense Amplifier
- Block Diagram
- Qualitative Description of Sense Amplifier Operation
- Sense Amplifier Signal and Noise Transfer Functions
- Behavioral Modeling of the Sense Amplifier
- Introduction
- Important Non-Idealities
- Integrators
- Comparator
- The Simulink Model
- Ideal Simulation
- Input Voltage Sweep
- Examination of Key Sense Amplifier Nodes
- The Effect of Non-Idealities on Sense Amplifier Performance
- Finite Integrator Gain (and Unity-Gain Frequency)
- Comparator Offset and Hysteresis
- Comparator Delay
- Comparator Non-Idealities in Unison
- All Non-Idealities in Unison
- Input Noise
- White Noise
- Flicker Noise
- Transistor Level Model of Sense Amplifier
- Introduction
- DC Biasing
- Operational Transconductance Amplifier
- General Integrator Analysis
- Maximizing Integrator Bandwidth
- Maximizing Integrator Gain
- Other Concerns Regarding the Integrator
- First OTA Design and Performance
- The Second OTA Design and Performance
- Differential Comparator
- Input Stage
- Decision Making Stage
- Output Latch
- Transient Behavior
- The Sense Amplifier
- Sense Amplifier Performance Evaluation
- Introduction
- Basic Operation
- Noiseless Performance
- Performance in the Presence of Noise on the Input Voltage
- Performance in the Presence of Noise on the Power Supply
- Power Consumption
- Summary
- References
- Glossary of terms

LIST OF TABLES

Table 1: POTA and NOTA Performance
## LIST OF FIGURES

- Figure 1: An Example Cross-point Array
- Figure 2: Current Mode Sense of a Cross-point Array
- Figure 3: Voltage Mode Sense of a Cross-point Array
- Figure 4: Voltage Divider Formed by Sneak Resistance
- Figure 5: The Sense Amplifier Block Diagram
- Figure 6: Example Output Digital Waveform for 5mV Input
- Figure 7: Example Output Digital Waveform for 15mV Input
- Figure 8: Ideal Input-Output Relationship for Sense Amplifier
- Figure 9: Example STF and NTF
- Figure 10: Ideal vs. Real Integrators
- Figure 11: The Simulink Model
- Figure 12: The Differential Comparator Sub-Subsystem
- Figure 13: Performance of Ideal Sense Amplifier
- Figure 14: The Integrator Output Nodes
- Figure 15: The Comparator's Positive Output and Average Value
- Figure 16: Final Count Output
- Figure 17: Final Count for Varying Integrator Gains
- Figure 18: Final Count for Varying Integrator Gains and Unity Gain Frequencies
- Figure 19: Final Count for Varying Comparator Offset Voltages
- Figure 20: The Integrator Output Nodes with a 50mV Comparator Offset
- Figure 21: Hysteresis
- Figure 22: Final Count for Varying Comparator Hysteresis Voltages
- Figure 23: The Integrator Output Nodes with a 50mV Comparator Hysteresis
- Figure 24: The Comparator's Positive Output and Average Value with a 50mV Hysteresis
- Figure 25: Final Count Output with a 50mV Comparator Hysteresis
- Figure 26: Final Count for Varying Comparator Delays
- Figure 27: Final Count for 25mV Comparator Offset, 50mV Hysteresis, and 3ns Delay
- Figure 28: Final Count with All Non-Idealities Included
- Figure 29: Schematic Used for Determining Input Noise Voltage
- Figure 30: Final Count with Varying Input Noise Variances
- Figure 31: Final Count Histogram for $V_{noise,rms}$ = 1mV
- Figure 32: Final Count Histogram for $V_{noise,rms}$ = 4mV
- Figure 33: The Wide-Swing Biasing Topology (LMIN=0.9µm)
- Figure 34: A Transconductance Based Integrator (with Reference Direction Indicated)
- Figure 35: Integrator Frequency Response
- Figure 36: The First OTA (POTA)
- Figure 37: POTA Magnitude Response ($I_{BIAS}$ = 0.5µA) with C=500fF
- Figure 38: POTA Magnitude Response ($I_{BIAS}$ = 1µA) with C=500fF
- Figure 39: The Second OTA (NOTA)
- Figure 40: NOTA Magnitude Response ($I_{BIAS}$ = 0.5µA) with C=500fF
- Figure 41: NOTA Magnitude Response ($I_{BIAS}$ = 1µA) with C=500fF
- Figure 42: The Differential Comparator
- Figure 43: The Comparator Delay
- Figure 44: The Comparator Delay (Zoomed)
- Figure 45: Comparator Offset for Mismatched Input Devices
- Figure 46: The Comparator Hysteresis
- Figure 47: The Sense Amplifier Design (Block Level)
- Figure 48: Current pulled from $V_1$ by $V_{IN}$ via the POTA
- Figure 49: The Integrator Output Nodes $V_1$ and $V_2$
- Figure 50: The Comparator Outputs $V_{OUTP}$ ($V_{UP}$) and $V_{OUTM}$ ($V_{DOWN}$)
- Figure 51: $V_{OUTP}$ and the Counter
- Figure 52: Feedback Currents ($G_{m3}$) as Controlled by the Differential Comparator
- Figure 53: Sense Amplifier Performance
- Figure 54: Output Count Histogram for $V_{noise,rms}$ = 4mV
- Figure 55: Output Count Box and Whisker Plot for $V_{noise,rms}$ = 4mV
- Figure 56: 50mV RMS Supply Voltage Noise Affects Integrator Nodes
- Figure 57: Sense Amplifier Performance with 50mV of RMS noise on VDD
- Figure 58: Sense Amplifier Current Draw
### LIST OF SYMBOLS

- $\phi$: Clock input signal; sometimes shown as $V_{CLOCK}$ or $V_{CLK}$
- $\omega_n$: Integrator’s 3-dB bandwidth; sometimes shown as $f_n$
- $\omega_u$: Integrator’s unity-gain bandwidth; sometimes shown as $f_u$
- $A_{OLN}$: Open-loop DC gain of NOTA
- $A_{OLP}$: Open-loop DC gain of POTA
- $C_1$: Capacitor present on the output of the first OTA (POTA)
- $C_2$: Capacitor present on the output of the second OTA (NOTA)
- $E_p$: Quantization noise associated with the comparator’s positive output
- $E_m$: Quantization noise associated with the comparator’s negative output
- $FNN$: Flicker noise number
- $f_H$: Upper frequency limit associated with flicker noise calculations
- $f_L$: Lower frequency limit associated with flicker noise calculations
- $f_n$: Integrator’s 3-dB bandwidth; sometimes shown as $\omega_n$
- $f_u$: Integrator’s unity-gain bandwidth; sometimes shown as $\omega_u$
- $G_{m1}$: Transconductance of the first OTA (POTA)
- $G_{m2}$: Transconductance of the second OTA (NOTA)
- $G_{m3}$: Transconductance of the feedback current devices
- $g_m$: Individual MOSFET small-signal transconductance
- $H_{INT}$: Integrator transfer function
- $I_{BIAS}$: Bias current
- $I_D$: MOSFET drain current
- $I_{FB1}$: Feedback current to $V_1$; produced by $G_{m3}$
- $I_{FB2}$: Feedback current to $V_2$; produced by $G_{m3}$
- $K_{CM}$: Constant associated with flicker noise for current mode sensing
- $K_P$: Inherent MOSFET transconductance
- $K_{VM}$: Constant associated with flicker noise for voltage mode sensing
- $L$: MOSFET device length
- $NTF$: Noise transfer function
- $pre\_voutm$: Comparator’s internal unbuffered negative output
- $pre\_voutp$: Comparator’s internal unbuffered positive output
- $R_{AP}$: Resistance of the MTJ in its anti-parallel state
- $R_O$: Output resistance of OTA
- $r_O$: Individual MOSFET output resistance
- $R_P$: Resistance of the MTJ in its parallel state
- $STF$: Signal transfer function
- $t_{delay}$: Delay associated with comparator
- $V_1$: Output of the first OTA/integrator
- $V_2$: Output of the second OTA/integrator
- $V_{IN}$: Input voltage to the sense amp; sometimes shown as $V_{SENSE}$
- $V_{DD}$: Supply voltage; 5V for this work
- $V_{DS}$: NMOS drain-to-source voltage
- $V_{GS}$: NMOS gate-to-source voltage
- $V_{HYS}$: Comparator hysteresis voltage
- $V_{INP}$: Comparator’s positive input terminal
- $V_{INM}$: Comparator’s negative input terminal
- $V_{OFF}$: Comparator offset voltage
- $V_{OUT}$: Digital counter’s output voltage; also referred to as the final count
- $V_{OUTP}$: Differential comparator’s positive output
- $V_{OUTM}$: Differential comparator’s negative output
- $V_{RMS,CM}$: Noise voltage present at input to current-mode sense
- $V_{RMS,VM}$: Noise voltage present at input to voltage-mode sense
- $V_{SAT}$: MOSFET saturation voltage
- $V_{SG}$: PMOS source-to-gate voltage
- $V_{SD}$: PMOS source-to-drain voltage
- $V_{SENSE}$: Input voltage to the sense amp; sometimes shown as $V_{IN}$
- $V_{UP}$: Counter input terminal which enables/disables counter increment
- $V_{1/f}^2$: Flicker noise power
- $W$: MOSFET device width

INTRODUCTION

Volatile and Non-Volatile Memories

Resistive memory is a type of non-volatile memory which may one day supplant older volatile technologies such as DRAM and SRAM. Non-volatile memory does not require a constant power source to maintain its information; it therefore offers a potentially huge energy savings over volatile memory. Traditional non-volatile memories, however, have significant limitations. Hard drives have relatively large capacities but also contain mechanical (moving) parts which present power and mobility limitations. Resistive memory is a solid-state technology; there are no moving parts and virtually no wear-out mechanisms, the latter being an advantage over FLASH. These advantages in power and durability (hence mobility) make resistive memory the logical choice to eventually replace today’s volatile memory technologies.

Resistive memory operates on the somewhat rudimentary principle that information can be stored in the resistance of a material. A relatively low resistance may be chosen to represent a logic “low” or 0, and a relatively high resistance may be chosen to represent a logic “high” or 1. These definitions are arbitrary. To construct a resistive memory then, a material which has a changeable resistance is required. The fashion in which the resistance is modulated then determines the type of resistive memory. Phase-change RAM (PC-RAM) exploits the material properties of chalcogenide glass. When in its crystalline state, chalcogenide is a low-resistance material. However, the application of heat causes the lattice to lose its organized state, and as such, it
presents a higher resistance to electrical current passing through. The state of the chalcogenide glass then (crystalline or amorphous) is responsible for the storage of a bit. Another nascent technology is magnetic RAM or MRAM. MRAM is based around a quantum phenomenon known as electron spin. If two adjacent magnetic materials possess the same “spin”, they present an overall lower resistance, compared to when the materials possess the opposite electron spin. Therefore, setting the “spin” to be aligned or opposed (parallel and anti-parallel, respectively) programs a lower or higher resistance and essentially amounts to the storage of a bit.

**Array Organization**

Typically, resistive memories can be organized into arrays in two fashions. The first method employs isolation devices, typically NMOS transistors, to separate the desired resistive memory cell from the remainder of the array. Benefits of the isolation technique include eliminating the effect of neighboring memory cells and allowing a more straightforward sensing technique. There is a price to pay for the apparent simplicity: the area required to implement the isolation devices drastically reduces the density of the memory chip [1]. In addition, the use of isolation devices introduces active devices into a layer of the chip which is otherwise free of semiconductors.

The second method of organization is the cross-point array. An example cross-point array with four bitlines and four rowlines is visible in Figure 1. The cross-point array is a relatively simple structure; it is composed of row lines (horizontal metal lines) and bit lines (vertical metal lines). At every intersection
there exists one resistive cell or one bit of memory storage. Thus, the number of row lines multiplied by the number of bit lines determines the overall memory capacity. A cross-point array can be arranged to allow for a cell size of $4F^2$ ($2F \times 2F$), where $F$ is the minimum feature size; this is the maximum density possible for any two-dimensional structure. Further increases in memory density are possible by expanding the cross-point array into the z-direction [2]. The tradeoff for greatly increased capacity is a commensurate increase in sensing complexity. Gone are the isolation devices which prevented neighboring cells from affecting the behavior of the desired memory cell.

![Figure 1: An Example Cross-point Array](image)
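
The capacity and area relationships described above can be checked with a short calculation. The following is a minimal sketch only: the function names, the 1024 x 1024 array size, and the 0.1µm feature size are illustrative assumptions and are not values taken from this work.

```python
def crosspoint_capacity_bits(n_rowlines: int, n_bitlines: int) -> int:
    """One resistive cell (one bit) sits at every row line / bit line intersection."""
    return n_rowlines * n_bitlines


def crosspoint_area(n_rowlines: int, n_bitlines: int, feature_size: float) -> float:
    """Array area assuming the ideal 4F^2 (2F x 2F) cell, in units of feature_size**2."""
    cell_area = (2.0 * feature_size) ** 2
    return crosspoint_capacity_bits(n_rowlines, n_bitlines) * cell_area


if __name__ == "__main__":
    # Hypothetical example values (assumptions, not from the thesis).
    rowlines, bitlines, f_um = 1024, 1024, 0.1
    bits = crosspoint_capacity_bits(rowlines, bitlines)
    area_um2 = crosspoint_area(rowlines, bitlines, f_um)
    print(f"capacity: {bits} bits, ideal array area: {area_um2:.1f} um^2")
```

The sketch ignores the area of drivers, decoders, and sense amplifiers, which a real memory would also require.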

**Current Mode Sensing**

The conventional way of sensing the state of any desired resistive cell in a cross-point array will be referred to as current-mode sensing and is detailed in Figure 2. The sense amplifier (or sense-amp) is connected to the bitline of the desired cell. All rowlines are connected to $V_{DD}/2$ except for the rowline
corresponding to the desired memory cell, which is grounded. A current flows from the output of the op-amp and through the integrating capacitor [3]. If the cell in question has a relatively low resistance, more current will be generated as the op-amp attempts to maintain equal voltages on its input terminals and \( V_{\text{OUT}} \) will rise quickly. If instead the cell in question has a relatively large resistance, the current will be smaller, and \( V_{\text{OUT}} \) will not rise as quickly. To determine if the examined cell was a “0” or a “1” the sense circuitry need only measure the time it takes for \( V_{\text{OUT}} \) to go from ground to \( V_{\text{DD}}/2 \).

**Figure 2: Current Mode Sense of a Cross-point Array**

The chief advantage of using this precision sensing technique is speed; the sense can be completed in tens of nanoseconds. However, this method is not without its shortcomings. The op-amp must be calibrated in order to minimize its offset voltage. In reality, some amount of offset will always exist. This offset
voltage produces error currents that pass through all the other memory cells connected to the grounded rowline. These error currents cause the current passing through the integrating capacitor to differ from its nominal value. As the array size increases, so does the sum of the error currents. These error currents thus make current-mode sensing more difficult for large array sizes.

**Voltage Mode Sensing**

Voltage-mode sensing, as is seen in Figure 3, relies upon determining the voltage produced at the input of the sense-amp, which is connected to the bitline of the desired memory cell. All rowlines are grounded except for the one corresponding to the desired memory cell. When $V_{DD}$ is applied, a current flows from the supply voltage through the desired memory cell. The sense amp used in this scheme has infinite input impedance, so the current must flow back through the other $N-1$ resistors on the same bit line. These $N-1$ resistors in parallel are referred to as the “sneak resistance” of the array. It should be noted that these sneak currents occur not just for the bitline in question, but for all bitlines in the array. These sneak paths make voltage mode sensing more complex than current mode sensing [4].
As seen in Figure 4, the sneak resistance, along with the resistance of the cell in question, forms a voltage divider.

$$V_{OUT} = \frac{R/(N-1)}{R/(N-1) + R} \cdot V_{DD} = \frac{V_{DD}}{N} \quad (1)$$

Figure 3: Voltage Mode Sense of a Cross-point Array

Figure 4: Voltage Divider Formed by Sneak Resistance
The output voltage $V_{\text{OUT}}$ is given in (1), which assumes that every cell on the bitline has the same resistance $R$. As the number of rowlines in the array increases, the input voltage to the sense-amp decreases proportionately. For an array with 1024 rowlines and a $V_{\text{DD}}$ of 5V, the input voltage to the sense-amp will be less than 5mV (about 4.9mV). Note that sensing a cell programmed as parallel will produce a relatively larger voltage when compared to a cell programmed as anti-parallel. Also note that the sneak resistances, although unknown (they will be a parallel combination of parallel and anti-parallel states), remain constant for any given sense.
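
As a numerical check of (1), the divider can be evaluated directly. This is a minimal sketch under stated assumptions: the function name `sense_voltage` is hypothetical, the 100kΩ and 130kΩ resistances are illustrative, and the unselected cells are all assumed to be in the parallel state, which will not be true of a real array.

```python
def sense_voltage(v_dd: float, r_cell: float, r_others: list) -> float:
    """Bitline voltage of the divider in Figure 4.

    The selected cell connects V_DD to the bitline; the remaining cells on
    the bitline connect it to grounded rowlines and appear in parallel
    (the "sneak resistance").
    """
    g_sneak = sum(1.0 / r for r in r_others)   # parallel conductance
    r_sneak = 1.0 / g_sneak
    return v_dd * r_sneak / (r_sneak + r_cell)


V_DD = 5.0                   # supply voltage used in this work
N = 1024                     # number of rowlines
R_P, R_AP = 100e3, 130e3     # illustrative MTJ resistances (assumed)

others = [R_P] * (N - 1)     # assumption: all unselected cells are parallel
v_parallel = sense_voltage(V_DD, R_P, others)
v_antiparallel = sense_voltage(V_DD, R_AP, others)

if __name__ == "__main__":
    print(f"selected cell parallel:      {v_parallel * 1e3:.3f} mV")
    print(f"selected cell anti-parallel: {v_antiparallel * 1e3:.3f} mV")
```

With equal resistances the result reduces to $V_{DD}/N$, and the anti-parallel (higher resistance) cell produces the smaller input voltage, as described in the text.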

**Practical Sensing Considerations**

Because of variations in the process step used to deposit the materials which constitute the Magnetic Tunnel Junctions (MTJs), the resistances of the parallel and anti-parallel states can vary by more than 100% across the array. In other words, one section of the array might have an $R_P$ and $R_{AP}$ of 100kΩ and 130kΩ respectively, while another section might have an $R_P$ and $R_{AP}$ of 200kΩ and 260kΩ. While the ratio of $R_{AP}$ to $R_P$ remains constant (at about 130% [5]), the individual resistance values are widely distributed. A robust and self-referencing sense algorithm will neutralize the effect of these undesired variations.

**The Proposed Sense Algorithm**

In this work, the sense amplifier (which is the subject of the remainder of this thesis) will convert a relatively small input voltage into an 8-bit digital word (sometimes referred to as the count). The larger the input voltage, the bigger the digital number produced. The sense-amp can be programmed to have a 1µs
sense time or longer if more accuracy is required. Because the parallel state will produce the higher input voltage and thus the higher digital number, it will represent a logic “1”. The anti-parallel state then will represent logic “0”.

A self-referenced scheme is important to ensure the robustness of the sense-amplifier [6]. To implement a self-referenced sensing scheme, it will be necessary to read in the unknown cell count and then compare it to a known state. This is called a destructive read, and when the sense is finished, the correct value must be re-written back into the cell.

The complete sense algorithm will be illustrated by way of example. Let the count be 63 for a parallel cell and 57 for an anti-parallel:

1. Read the unknown cell twice, sum the count and store: 63 + 63 = 126
2. Write a known “0” to the cell, subtract from step one: 126 - 57 = 69
3. Write a known “1” to the cell, subtract from step two: 69 - 63 = 6
4. If the result is positive, the cell was a “1”. If it is negative, the cell was a “0”
5. Re-write the cell’s correct state
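
The five steps above can be sketched in a few lines of code. This is an illustrative sketch only, not the implementation used in this work: the `ToyCell` class and the function name `self_referenced_read` are hypothetical, and the cell model simply returns the example counts of 63 (parallel) and 57 (anti-parallel) with no noise.

```python
class ToyCell:
    """Hypothetical noiseless cell model using the example counts from the text."""

    def __init__(self, bit: int, count_parallel: int = 63, count_antiparallel: int = 57):
        self.bit = bit
        self.count_parallel = count_parallel
        self.count_antiparallel = count_antiparallel

    def read_count(self) -> int:
        # Parallel state ("1") gives the larger input voltage, hence the larger count.
        return self.count_parallel if self.bit == 1 else self.count_antiparallel

    def write(self, bit: int) -> None:
        self.bit = bit


def self_referenced_read(cell: ToyCell):
    """Return (sensed_bit, final_result) following steps 1 to 5."""
    result = 2 * cell.read_count()    # step 1: unknown count, doubled
    cell.write(0)
    result -= cell.read_count()       # step 2: subtract the known "0" count
    cell.write(1)
    result -= cell.read_count()       # step 3: subtract the known "1" count
    bit = 1 if result > 0 else 0      # step 4: sign of the result decides
    cell.write(bit)                   # step 5: restore the original state
    return bit, result


if __name__ == "__main__":
    print(self_referenced_read(ToyCell(1)))   # cell originally parallel
    print(self_referenced_read(ToyCell(0)))   # cell originally anti-parallel
```

Because only the sign of the final result is used, doubling both counts (as in a region of the array with different absolute resistances) changes the magnitude of the result but not the decision.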

Note that the cell-to-cell variation in MTJ resistance does not hinder the sense algorithm’s operation. So long as the parallel state produces a higher digital count than the anti-parallel state, it will function correctly. If the array becomes so large that the digital count produced for each state is the same, the algorithm will fail.

The proposed voltage sensing technique offers several advantages over current mode sensing. Foremost among these is accuracy. It will be shown that the voltage-mode technique is capable of accurately sensing a much larger array than the current-mode technique can handle. Furthermore, the resolution (count/mV) of the sense-amp in this work can be programmed for greater
accuracy (but less dynamic range) when required. Voltage mode sensing can also achieve a larger signal-to-noise ratio (SNR) when allowed to sense for a longer period of time. The current mode sensing algorithm described earlier does not gain SNR from a longer sense time because of flicker noise [7].

Due to the nature of the proposed sense amplifier, it can be operated at very low power. Because it employs an averaging technique, the required precision of the individual components of the sense-amp is relaxed, even though the overall sense-amp is quite precise.

These benefits come at the expense of speed. A complete sense requires three reads (the first read can simply be multiplied by two) and two writes. The result is a sense on the order of tens of microseconds. To obviate this speed barrier, the sense amplifiers can run in parallel, yielding a 32-bit or 64-bit output read per sense.

**Overall Design Strategy**

The goal of this work is to create a sense amplifier which, through the techniques of voltage mode sensing and noise-shaping, can accurately sense a very small input voltage (from the memory array). The block diagram of the proposed sense-amp is discussed in Chapter II. In Chapter III, Simulink is employed to explore the sense amplifier in further detail. More specifically, the effect of several key parameters on the overall sense-amp performance is explored. Finally, Chapter IV examines the transistor level design of the sense-amp and Chapter V demonstrates transient performance.
THE SENSE AMPLIFIER

Block Diagram

As outlined in the introduction, the proposed sense amplifier is able to resolve very small input voltages (on the order of a few to tens of millivolts). To accomplish this degree of precision, a quasi second-order delta-sigma modulator has been employed. Figure 5 is a block diagram which illustrates the architecture of the sense-amp.

![Block Diagram of the Sense Amplifier](image)

Figure 5: The Sense Amplifier Block Diagram

The sense-amp is composed of several sub-systems:

- Two operational transconductance amplifiers (OTAs), represented by the $G_{m1}$ and $G_{m2}$ blocks
- A clocked comparator
- Two voltage controlled current sources, represented by the $G_{m3}$ blocks
- An 8-bit digital counter with enable and up/down count functionality
Qualitative Description of Sense Amplifier Operation

In essence, the sense-amp converts a small voltage present at $V_{IN}$ into an 8-bit digital number. This is accomplished in two steps. First, the delta-sigma portion of the sense-amp generates a digital waveform with a duty cycle proportional to the input voltage. This digital waveform is then used to enable the counter. The more often this waveform is high, the more often it enables the counter, and the higher the resulting digital count. As an example, imagine that the sense amp had an input range from 0mV to 20mV. If 5mV is present on the sense-amp input, we would expect an output waveform which has a duty cycle of $\frac{5\,\text{mV}}{20\,\text{mV}} \cdot 100\% = 25\%$. Figure 6 illustrates this relationship. If the input voltage were raised to 15mV, we would expect the output duty cycle to rise to 75%, as seen in Figure 7.

![Figure 6: Example Output Digital Waveform for 5mV Input](image-url)
Figure 7: Example Output Digital Waveform for 15mV Input

The sense amplifier’s input voltage to output digital number transfer function should be linear, as shown in Figure 8. This figure uses the example values from earlier in this section and also assumes the maximum count for any given sense is 100.

Figure 8: Ideal Input-Output Relationship for Sense Amplifier
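
The duty-cycle-to-count relationship can be illustrated with a very simple behavioral model. The sketch below is an assumption-laden stand-in: it uses a generic first-order, discrete-time accumulator rather than the quasi second-order modulator of this work, and the function name `ideal_count` is hypothetical. It uses the example values from this section (0mV to 20mV input range, maximum count of 100), with voltages expressed in millivolts.

```python
def ideal_count(v_in_mv: float, v_full_scale_mv: float = 20.0, n_clocks: int = 100) -> int:
    """Count the clock cycles in which a first-order modulator output is high.

    Each clock the input is accumulated; whenever the accumulator reaches
    full scale, the output goes high for that cycle (enabling the counter)
    and full scale is subtracted as feedback.
    """
    accumulator = 0.0
    count = 0
    for _ in range(n_clocks):
        accumulator += v_in_mv
        if accumulator >= v_full_scale_mv:
            accumulator -= v_full_scale_mv   # feedback
            count += 1                       # counter enabled this cycle
    return count


if __name__ == "__main__":
    for v in (0.0, 5.0, 10.0, 15.0, 20.0):
        print(f"{v:4.1f} mV -> count {ideal_count(v)}")
```

For the 5mV and 15mV examples this model yields counts of 25 and 75 out of 100 clocks, matching the 25% and 75% duty cycles discussed above and the linear relationship of Figure 8.
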
Sense Amplifier Signal and Noise Transfer Functions

The delta-sigma portion of the sense amplifier takes a DC voltage and adds quantization noise but also keeps intact the original input voltage. The purpose of the digital counter then is to filter out the quantization noise by adding (averaging) the number of times the modulator’s output waveform is high over a given sense time.

The best way to formally characterize how the sense amplifier processes the input voltage is to create a transfer function. This is not a straightforward affair; the usage of $V_{OUTM}$ along with $V_{OUTP}$ complicates the analysis. To begin, we write equations for the two integrator output nodes, $V_1$ and $V_2$:

$$V_1 = -V_{IN} \frac{G_{m1}}{sC_1} + V_{OUTP} \frac{G_{m3}}{sC_1}$$  \hspace{1cm} (2)

$$V_2 = -V_1 \frac{G_{m2}}{sC_2} + V_{OUTM} \frac{G_{m3}}{sC_2}$$  \hspace{1cm} (3)

It is desirable to achieve an expression which relates $V_{OUTP}$ to $V_{IN}$ and $E_P$ (the quantization noise produced by the comparator’s positive output) without involving the comparator’s negative output $V_{OUTM}$ or $E_M$ (the quantization noise produced by the comparator’s negative output). $E_P$ and $E_M$ are defined as the voltages the comparator must add to its input $V_2 - V_1$ to produce $V_{OUTP}$ and $V_{OUTM}$:

$$V_{OUTP} = V_2 - V_1 + E_P$$  \hspace{1cm} (4)

$$V_{OUTM} = V_2 - V_1 + E_M$$  \hspace{1cm} (5)
$E_p$ and $E_m$ are actually an attempt to linearize the inherently non-linear behavior of a comparator. For our purposes, we may regard them as additive white noise sources. Now, if (2) and (3) were inserted into (4), we would still be left with $V_{outm}$ and $E_m$ in the expression. To this end, it is helpful to note that:

$$E_p + E_m = V_{outp} \quad (6)$$

Equation (6) merely states that the quantization noise due to the comparator’s positive and negative outputs are correlated. Solving (6) for $E_m$ and substituting into (5) yields:

$$V_{outm} = V_2 - V_1 + (V_{outp} - E_p) \quad (7)$$

Using (2) and (3) once again to substitute for $V_1$ and $V_2$ in (7):

$$V_{outm} = \left( -V_1 \frac{G_{m2}}{sC_2} + V_{outm} \frac{G_{m3}}{sC_2} \right) - \left( -V_{in} \frac{G_{m1}}{sC_1} + V_{outp} \frac{G_{m3}}{sC_1} \right) + (V_{outp} - E_p) \quad (8)$$

Solving (8) for $V_{outm}$ gives:

$$V_{outm} = V_{in} \frac{G_{m1}/sC_1}{1 - G_{m3}/sC_2} + V_{outp} \frac{1 - G_{m3}/sC_1}{1 - G_{m3}/sC_2} - V_1 \frac{G_{m2}/sC_2}{1 - G_{m3}/sC_2} - \frac{E_p}{1 - G_{m3}/sC_2} \quad (9)$$

Equation (9) may now be used as a substitute for $V_{outm}$ in (7). Doing so and solving for $V_{outp}$ yields:

$$V_{outp} = V_{in} \cdot STF(s) + NTF(s) \cdot E_p(s) \quad (10)$$
The signal transfer function multiplies $V_{in}$ while the noise transfer function multiplies $E_p$:

$$STF(s) = \frac{\frac{G_{m1}G_{m2}}{C_1C_2}}{s^2 + \frac{G_{m3}}{C_1C_2}(C_1 - 2C_2)s + \frac{G_{m2}G_{m3}}{C_1C_2}} \quad (11)$$

$$NTF(s) = \frac{s^2 - 2\frac{G_{m3}}{C_2}s}{s^2 + \frac{G_{m3}}{C_1C_2}(C_1 - 2C_2)s + \frac{G_{m2}G_{m3}}{C_1C_2}} \quad (12)$$

This is a useful result, and a property common to any type of $\Delta\Sigma$ modulator. The input signal is low-passed while the quantization noise introduced by the comparator is high-passed. Because the sense-amp in this work is only used for DC measurements, it is informative to examine the DC gain of the sense-amp:

$$STF(s = 0) = \frac{\frac{G_{m1}G_{m2}}{C_1C_2}}{\frac{G_{m2}G_{m3}}{C_1C_2}} = \frac{G_{m1}}{G_{m3}} \quad (13)$$

Equation (13) predicts that the input voltage to the sense-amp will be “gained-up” by a ratio of transconductances. The output of the $\Delta\Sigma$ modulator (but before the counter) will not be purely DC because of the quantization noise; however, the purpose of the counter is to filter the noise and provide a digital measurement of the DC component.
Figure 9 shows a Bode plot for typical signal and noise transfer functions. The low and high pass characteristics are visible in the shape of the plots. Notice that the low-frequency gain of the STF is not 0dB, but 40dB. Equation (13) states that the DC gain of the STF (40dB) is equal to the ratio of $G_{m1}$ to $G_{m3}$. Thus,

$$G_{m1}/G_{m3} = 100.$$
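
The conversion between the 40dB read from the Bode plot and the transconductance ratio of 100 is a one-line calculation. The sketch below shows it in both directions; the helper names `ratio_to_db` and `db_to_ratio` are illustrative only.

```python
import math


def ratio_to_db(ratio):
    """Convert a voltage gain (here the transconductance ratio) to decibels."""
    return 20.0 * math.log10(ratio)


def db_to_ratio(gain_db):
    """Convert a gain in decibels back to a voltage ratio."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    print(f"40 dB corresponds to a ratio of {db_to_ratio(40.0):.0f}")
    print(f"a ratio of 100 corresponds to {ratio_to_db(100.0):.1f} dB")
```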
BEHAVIORAL MODELING OF THE SENSE AMPLIFIER

Introduction

Although Chapter II outlined the system-level behavior of the $\Delta\Sigma$ modulator used in the sense-amp, hand analysis alone has shortcomings that behavioral modeling addresses:

1. Non-idealities: In reality, it is not possible to build an analog integrator with infinite DC gain. Modeling non-idealities such as these by hand is impractical.

2. Speed: Behavioral models simulate more quickly than full-scale transistor models allowing for swift analysis of sense-amp behavior.

3. Parameterization: There are key characteristics of the sense-amp (such as the bandwidth of the first $G_m$ stage) whose effects can be examined in more detail with the aid of modeling software.

Behavioral modeling represents a step away from abstraction and towards transistor-level implementation. To this end, Simulink was selected as the tool used to model the sense-amp.

Important Non-Idealities

Integrators

The sense-amp employs two integrators to perform the $\Delta\Sigma$ modulation, and these are simplistically modeled in Chapter II as having infinite DC gain:

$$H_{\text{INT,ideal}}(s) = \frac{G_m}{sC} = \frac{\omega_u}{s}, \quad \omega_u = \frac{G_m}{C}$$  \hspace{1cm} (14)
Because all transistors have finite output resistance, there is a frequency below which the output capacitance no longer dominates the overall output impedance. At frequencies lower than this, the gain of the integrator remains constant.

The integrator will instead be modeled as a single-pole system with finite DC gain, as given in (15):

\[
H_{INT,real}(s) = \frac{A_{OL}}{1 + \frac{s}{\omega_{3dB}}} = \frac{A_{OL}}{1 + \frac{s}{\omega_u / A_{OL}}}
\tag{15}
\]

\(A_{OL}\) is the open-loop DC gain of the OTA, and \(\omega_{3dB}\) is the 3-dB bandwidth of the OTA. Because the gain-bandwidth product is constant during roll-off, it is also acceptable to specify the integrator’s pole location as \(\omega_u / A_{OL}\), where \(\omega_u\) is the unity-gain frequency from (14). Note only two of these three characteristics are independent. If a DC gain and unity-gain bandwidth are selected, the 3-dB frequency is determined.

Figure 10 shows, in graphical form, the effect of finite gain. Each integrator has the same unity-gain bandwidth of 1MHz, and only their gains vary. As the gain decreases from infinity, the pole location pushes out towards \(\omega_u\). Equation (15) is in agreement with this observation.
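
The trend just described can be reproduced numerically. The sketch below assumes the single-pole form \(A_{OL}/(1 + s/\omega_{3dB})\) with the pole placed at the unity-gain frequency divided by \(A_{OL}\); the function names and the particular gain values are illustrative choices, not values taken from Figure 10.

```python
import math


def pole_frequency(f_u_hz, a_ol):
    """3-dB (pole) frequency implied by a unity-gain frequency and a DC gain."""
    return f_u_hz / a_ol


def ideal_integrator_gain(f_hz, f_u_hz):
    """Magnitude of the ideal integrator, |w_u / (j w)|; requires f_hz > 0."""
    return f_u_hz / f_hz


def real_integrator_gain(f_hz, f_u_hz, a_ol):
    """Magnitude of the finite-gain, single-pole integrator model."""
    f_3db = pole_frequency(f_u_hz, a_ol)
    return a_ol / math.sqrt(1.0 + (f_hz / f_3db) ** 2)


if __name__ == "__main__":
    f_u = 1e6  # 1 MHz unity-gain bandwidth, as in the discussion of Figure 10
    for a_ol in (10, 100, 1000, 10000):
        print(f"A_OL = {a_ol:5d}: pole at {pole_frequency(f_u, a_ol):9.1f} Hz, "
              f"gain at 10 Hz = {real_integrator_gain(10.0, f_u, a_ol):8.1f}")
```

Lowering the DC gain moves the pole up in frequency, while the magnitude near the unity-gain frequency is nearly unchanged for any large \(A_{OL}\).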
In addition to finite gain, a transistor-based integrator cannot produce an output voltage from \(-\infty\) to \(+\infty\). In reality, an OTA can only produce output voltages within a finite range, typically from ground to the supply voltage. This non-linearity would be impossible to account for in hand calculations, but is easily modeled in Simulink with the saturation block.

These two non-idealities have the greatest impact on OTA performance for this work. Although there are many more ways a transistor-based integrator can differ from the ideal case, they do not degrade the performance by the same magnitude.

Comparator

The role of the comparator within the sense-amp is, on the rising clock edge, to decide which integrator output node \((V_1\) or \(V_2\)) is greater. In reality, transistors cannot be fabricated with precise and identical dimensions. As a result, the comparator will suffer from an offset which will require one input to be greater than the other input by some finite amount \(V_{\text{OFF}}\) (ideally \(V_{\text{OFF}}\) is zero).
Transistors introduce non-linear capacitances which can cause hysteresis in the comparator's outputs. Additionally, the comparator takes time to make a decision, and that delay is modeled as well.

**The Simulink Model**

Incorporating the block diagram from the previous chapter, and the non-idealities from earlier in this section, the Simulink model shown in Figure 11 provides an expedient and accurate platform to further examine the behavior and limitations of the sense amplifier.

![Figure 11: The Simulink Model](image)

Starting at the very left, there is a DC input voltage (a constant) as well as a block for adding white noise. These two voltages sum and enter the negative input terminal of the first OTA. The transfer function modeling the OTA exhibits the non-idealities discussed in the previous sub-section. The positive input of the first OTA is zero. The two closely-grouped blue blocks on the left perform the first integration, or the first $G_{m}$ stage. The OTA output is summed with the feedback stage, which will be described momentarily. The summation is subjected to a saturation block which, as described earlier, models the finite output range of the OTA. The output of the first OTA (summed with the feedback) becomes the input to the second OTA, whose output is likewise fed into a saturation block.

The OTA output nodes, $V_1$ and $V_2$, are fed into the inputs of the differential comparator which is modeled as the orange block. To model the comparator and its non-idealities, several Simulink blocks are needed, as seen in Figure 12.

![Figure 12: The Differential Comparator Sub-Subsystem](image)

The positive input is subtracted from the negative input, and the result is fed into a zero-order hold (which effectively samples the subtraction). The relay block can be used to model the hysteresis or decision offset voltage. The comparator then produces a positive and negative output signal. Referring back to Figure 11, the comparator’s positive output, $V_{OUTP}$, and negative output, $V_{OUTM}$, are used to control the feedback currents. These feedback currents are modeled with green blocks; $V_{OUTP}$ controls the pull-up current for $V_1$ and $V_{OUTM}$ controls the pull-up current for $V_2$. Because the comparator outputs are digital - that is, either on or off - the feedback currents are relatively simple. Either $I_{FB1}$ is being pushed into $V_1$ or the feedback current is off. The same situation applies for $V_2$, although when one integrator output node is receiving feedback current, the other is not.

Finally, the positive comparator output is used to enable the 8-bit digital counter. During a $1\mu s$ sense, for example, the counter is clocked 100 times (the clock is running at 100MHz). As described in the previous chapter, the more often the $\Delta \Sigma$ output is “high”, the more often the counter output counts up. In this scenario, the lowest possible count would be zero, and the highest possible count would be 100 (the counter counts up every chance it gets).

### Ideal Simulation

**Input Voltage Sweep**

In order to best understand the limitations imposed by the various non-idealities discussed above, it behooves us to first examine the behavior of an ideal sense amplifier. The parameters listed below have been carefully selected to provide practically ideal performance:

- $f_{un} = f_{up} = 20MHz$ (the unity-gain bandwidth of the N- and P-OTA)
- $A_{OLN} = A_{OLP} = 10^6$ (the DC gain of the N- and P-OTA)
- $C_1 = C_2 = 0.5\,pF$ (the integrating capacitors for $V_1$ and $V_2$)
- $G_{m3} = G_{m1}/250$

The performance of this ideal configuration can be seen in Figure 13. The plot shows the final output count for a sense that was performed at each input voltage. That is, each point on the graph (there are 200) corresponds to a $1\mu s$ sense. Also of note is the way in which the final count output “saturates” at 100
for all input voltages above 20mV. This is no accident, and it relates to a
discussion in the previous chapter. The upper limit for the final count should be
obvious; over 1μs a 100MHz clock will only have 100 rising edges. Thus, there
are only 100 opportunities to increment the counter.

![Graph: Performance of Ideal Sense Amplifier](image)

*Figure 13: Performance of Ideal Sense Amplifier*

Less obvious is the reason why 20mV should be the maximum input voltage.
Let’s return to the block diagram as seen in Figure 5. If we examine the node \( V_1 \),
we notice that there are two methods for changing the voltage. Either current can
be forced into \( V_1 \) via the feedback transconductance \( G_{m3} \), or the first OTA \( G_{m1} \)
can pull current out of the node.
It stands to reason that if our input voltage to output count relationship is linear, there is an input voltage which will create a 50% duty cycle on the output, that is, \( V_{OUTP} \). Because charge is conserved, the amount of current entering \( V_1 \) must equal the amount of current exiting as well. Putting this in equation form yields:

\[
V_{IN} \cdot G_{m1} = \frac{50}{100} \cdot V_{OUTP} \cdot G_{m3}
\]

(16)

The left side of (16) refers to the current being pulled from the node; the right side corresponds to the current being pushed into the node. Let us choose 10mV to be the midpoint voltage (as it is in Figure 13). Solving for the ratio of transconductances we obtain:

\[
\frac{G_{m1}}{G_{m3}} = \frac{\frac{50}{100} \cdot V_{OUTP}}{V_{IN}} = \frac{\frac{1}{2} \cdot 5}{10 \times 10^{-3}} = 250
\]

(17)

Note that this ratio of \( G_{m1} \) to \( G_{m3} \) is familiar; it is the DC gain of the signal transfer function. We independently derived this in (13). In addition, we can mentally verify that \( 20mV \cdot 250 = 5V \) which is equivalent to \( V_{OUTP} \) having a 100% duty cycle. Note that \( G_{m2} \) is conspicuously absent from (17). It turns out that \( G_{m2} \) has less effect on the overall transfer function; its primary effect is to alter the average value that \( V_1 \) has.
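
The arithmetic of (16) and (17) is easy to check numerically. The sketch below assumes, as in (17), that the high level of \( V_{OUTP} \) is the 5V supply; the function names `gm_ratio` and `full_scale_input` are illustrative only.

```python
import math


def gm_ratio(v_mid, v_high=5.0, duty=0.5):
    """Ratio Gm1/Gm3 from the current balance V_IN * Gm1 = duty * V_OUTP * Gm3."""
    return duty * v_high / v_mid


def full_scale_input(ratio, v_high=5.0):
    """Input voltage at which the output duty cycle reaches 100%."""
    return v_high / ratio


if __name__ == "__main__":
    ratio = gm_ratio(v_mid=10e-3)
    print(f"Gm1/Gm3 = {ratio:.0f}")
    print(f"full-scale input = {full_scale_input(ratio) * 1e3:.0f} mV")
    assert math.isclose(ratio, 250.0)
```

Choosing a different midpoint voltage rescales the ratio directly; for example, a 5mV midpoint would call for a ratio of 500 under the same assumptions.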
Examination of Key Sense Amplifier Nodes

When exploring the effect of a particular non-ideality, we are first interested in how the sense-amp as a system is affected. For example, lowering the DC gain of an OTA might cause the final count to be lower than that of the ideal case. But if we wish to understand why that non-ideality caused a problem, we must peer inside the sense-amp. That is why this section will explore the behavior of various internal sense-amp nodes. When the sense-amp is simulated with a non-ideality in place, comparing the result with the known ideal result will yield valuable insight.

Because 10mV is the midpoint of our sense-amp’s input range, we would expect $V_1$ and $V_2$ to spend an equal amount of time being the greater voltage. Although it isn’t easily visible in Figure 14, that is indeed the case. This would allow the comparator to produce a $V_{OUTP}$ signal which is logic “1” half the time and logic “0” during the other half. Notice that the two signals have drifted up above $V_{DD}/2$ which is the positive input to the second OTA. This makes sense, as the feedback current to the second output ($G_{m3}$) can only push current into $V_2$, which raises the voltage. If the input to the second OTA is above $V_{DD}/2$, it pulls current out of $V_2$, lowering the voltage.

It is also instructive to view the comparator’s output $V_{OUTP}$. Again, because 10mV is the midpoint of our input range, we expect that $V_{OUTP}$ will be at 50% duty cycle. The average value of a 50% duty cycle signal is $V_{DD}/2 = 2.5V$, which is
what the gain of our sense amp multiplied by the input voltage is also equal to:

\[ 250 \cdot 10 mV = 2.5 V. \]

Figure 14: The Integrator Output Nodes

In Figure 15 our discussion is validated. \( V_{OUTP} \) appears to have a 50% duty cycle, and indeed its average is 2.5V.
Finally, the 8-bit counter’s output is plotted during the sense in Figure 16. The Simulink model is set up not to begin sensing until $2\mu$s. The sense completes at time equal to $3\mu$s. If we zoom into a portion of the graph which is increasing, we see that every 20ns the count increments. Because the clock period is 10ns, this corresponds to an increment every other clock cycle. This corresponds to our understanding of Figure 15.
Now that we have an understanding of these important portions of the sense-amp, we are more likely to understand the effects of the non-idealities which will be introduced.

**The Effect of Non-Idealities on Sense Amplifier Performance**

How a given non-ideality affects the sense-amp performance will be the goal of this subsection. All sense-amp parameters will be ideal, of course, with the exception of the particular non-ideality in question.

*Finite Integrator Gain (and Unity-Gain Frequency)*
Figure 17 is very helpful for determining how much gain is needed from both OTAs. An open-loop DC gain of 10,000 ensures virtually ideal performance from the sense-amp, but relaxing the gain constraint to 3000 hardly affects the operation while dramatically simplifying the transistor level design of the OTA. A gain of 10,000 would probably require gain-boosting or multiple gain stages which would use additional power. As the gain decreases below 3000, sense-amp performance begins to noticeably suffer. For an open-loop gain of only 10, the sense-amp fails to function.

![Figure 17: Final Count for Varying Integrator Gains](image-url)
There is one further note regarding Figure 17. For the $A_{OL} = 500$ and 1000 cases, the behavior seems erratic as $V_{IN}$ increases. This behavior is caused by the failure of the sense amp to reach steady-state before the 2μs warm-up is complete. Obviously, this is not acceptable performance. Therefore, a gain of 1000 or less is not permitted.

However, while analyzing the DC gain requirement, we have neglected to consider the unity-gain bandwidth of the OTAs. Thus, a more complete analysis of the required integrator characteristics is seen in Figure 18.

Figure 18 paints a more complete picture of the specifications demanded by the integrators. The curve color represents the unity gain frequency, and the data point marker represents the open-loop gain from a maximum of 5000 to a minimum of 500 (a legend specifying each curve color and marker was too large to fit on the plot). A unity gain frequency of 100kHz, as indicated by the blue curves, is too low for proper sense-amp operation. At the other end of the scale, if the OTA bandwidth is more than 10MHz, the OTA requires significantly more DC gain. Therefore, Figure 18 indicates that the optimum integrator bandwidth is between 1MHz and 5MHz, and the optimum integrator gain is between 500 and 2000. The optimization points appear to be broad, meaning that setting the exact bandwidth and DC gain is not critical. This is another advantage of the ΔΣ.
Although the unity gain frequency of an integrator is not a non-ideality, in this case it was a consideration that must be made in parallel with that of the integrator gain. We see from Figure 17 that we would have overestimated the amount of gain required had we not also brought $f_u$ into consideration.

**Comparator Offset and Hysteresis**

As discussed earlier, it is impossible to build a perfect comparator. One type of comparator non-ideality is offset. Comparator offset is the result of one
comparator input being stronger than the other. This is typically caused by a mismatch in transistor widths.

Immediately, we see in Figure 19 that comparator offset has virtually no effect on performance. Although this is a positive result, the reason behind it is less clear. To get a better idea of why comparator offset is practically irrelevant, we should examine Figure 20.

![Figure 19: Final Count for Varying Comparator Offset Voltages](image)

Figure 20 should be contrasted with Figure 14. The only change to the sense-amp was the addition of a 50mV comparator offset. Now it is relatively easy to recognize why comparator offset has little effect. Because the quantity \( V_2 - V_1 \) must exceed 50mV (not zero, as was the case before), \( V_2 \) must initially
charge up to roughly 50mV above \( V_1 \). If the digital counter were running during this time, it would not be incrementing when it should be. However, because the counter does not begin counting until \( 2 \mu s \), the final count is unaffected. As long as the comparator offset remains constant during a given sense, we have shown that the sense-amp is practically unaffected.

![Graph](image)

*Figure 20: The Integrator Output Nodes with a 50mV Comparator Offset*

Another possible source of error within a comparator is hysteresis, which is illustrated in Figure 21.
Figure 21: Hysteresis

If the output is negative, that is $V_{OUT} < 0$, then the input $(V_2 - V_1)$ must increase beyond $V_{HYSTP}$ in order to cause a positive output. Similarly, if the output is positive, the input must decrease below $V_{HYSTM}$ in order to produce a negative output.

Hysteresis is a non-linear phenomenon which would not easily be modeled by hand. Further, it is interesting to note that comparator offset is a more trivial case of hysteresis in which $V_{HYSTP} = V_{HYSTM} \neq 0$. Figure 22 examines the effect of various hysteresis voltages on the sense-amp performance.
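
The switching rule described above can be captured in a small state machine, in the spirit of the relay block used in the Simulink model. This is a minimal sketch: the class name, the boolean output, and the 50mV thresholds in the usage example are assumptions made for illustration.

```python
class HysteresisComparator:
    """Comparator whose decision depends on its previous output.

    v_hystp: level the input (V2 - V1) must exceed to switch the output high.
    v_hystm: level the input must fall below to switch the output low.
    Setting v_hystp == v_hystm reduces the model to a plain offset.
    """

    def __init__(self, v_hystp, v_hystm, initial_output=False):
        self.v_hystp = v_hystp
        self.v_hystm = v_hystm
        self.output = initial_output

    def decide(self, v_diff):
        """Update and return the output for one sampled value of V2 - V1."""
        if self.output:
            if v_diff < self.v_hystm:
                self.output = False
        elif v_diff > self.v_hystp:
            self.output = True
        return self.output


if __name__ == "__main__":
    comp = HysteresisComparator(v_hystp=0.05, v_hystm=-0.05)
    samples = [0.0, 0.04, 0.06, 0.0, -0.04, -0.06, 0.0]
    print([comp.decide(v) for v in samples])
```

In the usage example the output stays low until the input exceeds +50mV and then stays high until the input falls below -50mV, so small excursions around zero do not toggle the decision.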
The sense-amp also has resilience to comparator hysteresis, although the degree of resilience is not as great as with comparator offset. At 500mV of hysteresis the final count is not smoothly linear. Again, to examine why the comparator hysteresis doesn’t drastically affect sense-amp performance, we turn our attention to inside the sense-amp.

Figure 23 indicates that the OTA outputs have a greater voltage swing than before (see Figure 14). At first glance, it seems difficult to understand how this could produce the same final count as in the case without hysteresis. However, we do observe that although the waveforms are covering more voltage,
the term $V_2 - V_1$ appears to be equally positive and negative over the sense time.

Therefore, Figure 24 comes as no surprise.
Figure 24: The Comparator's Positive Output and Average Value with a 50mV Hysteresis

Although the quantization noise spectrum has a different shape (this is another way of stating that the output waveform is lower in frequency when compared to Figure 15), the average value remains the same. We can then predict how final count will track versus time based on Figure 24. Instead of counting up every other clock cycle, as in Figure 16, the final count with hysteresis will spend a larger amount of time incrementing, and then a larger amount of time idle. The conclusion is that for comparator hysteresis voltages smaller than approximately 100mV, there is very little effect.
As predicted, the counter alternates between incrementing and idling, although much more slowly than in Figure 16.

**Comparator Delay**

No real comparator implementation is possible without introducing some amount of delay. However, too much delay in the comparator’s decision will allow $V_1$ and $V_2$ to drift apart, possibly even too far to recover during the sense time.

In Figure 26, we observe that the sense-amp can tolerate a substantial amount of delay from the comparator. In fact, even a half-period of delay (5ns)
causes a relatively insignificant effect on the overall final count. No more than half a period of delay is possible, as the comparator design will show.

**Figure 26: Final Count for Varying Comparator Delays**

**Comparator Non-Idealities in Unison**

Up until this point, we have considered each non-ideality in an isolated case. In order to ensure proper performance, we should select a reasonable amount of offset, hysteresis, and delay and determine if the sense-amp still performs properly.

We can see from Figure 27 that even when the non-idealities are combined, the effect is minimal. Figure 27 has a comparator offset of 25mV, a
hysteresis of 50mV, and a delay of 3ns; yet it performs almost perfectly when compared to the completely ideal scenario presented in Figure 13.

Figure 27: Final Count for 25mV Comparator Offset, 50mV Hysteresis, and 3ns Delay

All Non-Idealities in Unison

The final step towards ensuring robustness of the sense-amp design is to simulate the sense-amp with all significant non-idealities (as well as a selection for $f_u$). This would include:

- $A_{OL} = 2000$
- $f_u = 3MHz$
- $V_{\text{OFFSET}} = 10mV$
- $V_{\text{HYS}} = 20mV$
- $t_{\text{delay}} = 2\text{ns}$

Not only does Figure 28 indicate that the sense-amp architecture is fairly robust, but it gives a starting point for transistor design specifications. As long as the OTA and comparator blocks meet or exceed these specifications, the sense-amp should be in proper working order.

Figure 28: Final Count with All Non-Idealities Included
Input Noise

White Noise

When performing an actual sense, there will be noise present on the input signal $V_{IN}$. One of the noise sources comes from the resistive array. Each of the $N$ resistors present on the selected bitline produces a noise current with a power spectral density of $\overline{i^2} / \Delta f = 4kT / R$. To calculate the total root-mean-square voltage that is produced on the sense-amp input, it is instructive to examine Figure 29:

![Figure 29: Schematic Used for Determining Input Noise Voltage](image)

All the noise current sources combine in parallel, and all the resistances combine in parallel to give:

$$\frac{V_{IN}^2}{\Delta f} = \left(i_{RS}^2 + i_{R1}^2 + i_{R2}^2 + \cdots + i_{R(N-1)}^2\right) \left(R_S \parallel R_1 \parallel R_2 \parallel \cdots \parallel R_{N-1}\right)^2 \tag{18}$$

In order to simplify the calculation, we can assume that $R_S = R_1 = R_2 = \cdots = R_{N-1} = R$. This is not a cavalier assumption; for any given sense there is no way to actually know the resistance values of the individual sneak resistances or of the cell to be sensed (if we did, we wouldn’t need to sense). Instead, we can approximate every resistance as $R$. When a numerical
result is desired, we can choose $R$ to be the average of the parallel and anti-parallel resistances. Using this assumption then, we have:

$$\frac{V_{in}^2}{\Delta f} = N \cdot \left( i^2 \right) \cdot \left( \frac{R}{N} \right)^2 = N \cdot \left( \frac{4kT}{R} \right) \cdot \left( \frac{R}{N} \right)^2 = \frac{4kTR}{N}$$

(19)
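
Equation (19) can be evaluated directly. The sketch below does so for the equal-resistance assumption; the 300K temperature and the bandwidth argument are assumptions of the example (the text takes the noise-equivalent bandwidth from Figure 9 and does not state a number), and the function names are illustrative.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def noise_density(r_ohms, n_cells, temp_k=300.0):
    """Input noise voltage density in V/sqrt(Hz), i.e. sqrt(4kTR/N)."""
    return math.sqrt(4.0 * BOLTZMANN * temp_k * r_ohms / n_cells)


def rms_noise(r_ohms, n_cells, bandwidth_hz, temp_k=300.0):
    """RMS input noise over an assumed noise-equivalent bandwidth."""
    return noise_density(r_ohms, n_cells, temp_k) * math.sqrt(bandwidth_hz)


if __name__ == "__main__":
    density = noise_density(r_ohms=400e3, n_cells=1024)
    print(f"density = {density * 1e9:.2f} nV/sqrt(Hz)")
    # Quadrupling the number of cells on the bitline halves the density.
    print(f"ratio   = {noise_density(400e3, 4096) / density:.2f}")
```

Because the density scales as the inverse square root of $N$, a larger array presents less thermal noise to the sense-amp input for the same cell resistance.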

Figure 9 provides us with a way to estimate the noise-equivalent bandwidth for the input noise. If we also assume an array with $N=1024$ rowlines and $R = 400\,k\Omega$, we can then solve for $V_{in,\text{RMS}}$:

$$V_{in,\text{RMS}} = \sqrt{\Delta f \cdot \frac{4kT(400\,k\Omega)}{1024}} = 4.2\,mV$$

(20)

This RMS voltage will only decrease as the array size increases. Fortunately, this sense-amp fares well against white noise at its input. This is because the digital counter which sits at the end of the sense-amp circuit is essentially an extremely low-pass filter. Thus, only a very small noise bandwidth actually passes through all the way to the final count. Using different variances for the input noise in Simulink, we see in Figure 30 that the sense-amp is able to correctly function even in the presence of noise which has an RMS value that is equal to or greater than that of the input signal.

Figure 30 is useful for verifying the fact that the larger the input noise becomes, the more difficult an accurate sense is to obtain. Therefore, it is useful to introduce another graphical method for evaluating the sense-amp’s performance.
Each curve in Figure 31 represents an input voltage from 0 to 20mV. The horizontal axis is the final output count, and the vertical axis represents normalized frequency - if for four counts out of five an input voltage produced an output count of 60, the data point would be at 0.8. Because the curves do not touch, each input voltage (from 0mV to 20mV by 1mV steps) corresponds to a unique set of output counts. This means that the sense amplifier can correctly distinguish the relatively small input voltage levels.
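
The histogram construction can be sketched as follows. The function names and the sample data are purely illustrative (they are not simulation results); the example only reproduces the "four counts out of five" case described above and the test for whether two curves touch.

```python
from collections import Counter


def normalized_histogram(final_counts):
    """Map each observed final count to the fraction of senses that produced it."""
    total = len(final_counts)
    tally = Counter(final_counts)
    return {count: n / total for count, n in sorted(tally.items())}


def curves_touch(counts_a, counts_b):
    """True if two input voltages ever produce the same final count."""
    return bool(set(counts_a) & set(counts_b))


if __name__ == "__main__":
    # Hypothetical results of five senses at one input voltage.
    senses = [60, 60, 60, 60, 61]
    print(normalized_histogram(senses))
    # Hypothetical results for a neighboring input voltage.
    print(curves_touch(senses, [65, 65, 66, 65, 65]))
```

When `curves_touch` is false for every pair of neighboring input voltages, each final count maps back to exactly one input level.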
If the input noise voltage is increased to 4mV RMS, the histogram is then represented by Figure 32.

Figure 31: Final Count Histogram for $V_{noise,rms}=1mV$

Figure 32: Final Count Histogram for $V_{noise,rms}=4mV$
Again, notice that the curves do not overlap. This means that even with the presence of 4mV of RMS noise on the input, the sense amplifier would function correctly.

**Flicker Noise**

The array is not the only source of noise. Because the sense-amp is built from transistors, input-referred noise will be of concern. Due to the nature of the oxide-semiconductor junction in the MOSFET, the input-referred noise will not be purely white. At low frequencies, flicker noise will dominate. Flicker noise power is proportional to \(1/f\), a fact which causes difficulty for traditional sensing techniques.

Figure 2 illustrates current-mode sensing. Flicker noise on the input of the sense-amp, which in this case is an integrator, is of the form:

\[
V_{iff}^2(f) = \frac{FNN}{f}
\]  

(21)

The constant \(FNN\) (which stands for flicker noise number) is a function of integrator bias currents as well as process technology. To determine the integrator's output noise, the flicker noise is multiplied by the magnitude squared of the integrator's transfer function:

\[
V_{out}^2(f) = \left| \omega_0 / \omega \right|^2 \cdot \frac{FNN}{f} = \frac{K_{CM}}{f^3}
\]  

(22)
$K_{CM}$ accounts for all the constants that appear in (22). The effect of integrating the flicker noise is to magnify its power at low frequencies. Given a sense time of $T$, the RMS value of the flicker noise is then:

$$V_{\text{RMS,CM}} = \sqrt{\int_{f_L}^{f_H} V_{\text{OUT}}^2(f) \, df} = \sqrt{\int_{1/T}^{\infty} K_{\text{CM}} \frac{df}{f^3}} = \sqrt{\left[ \frac{-K_{\text{CM}}}{2f^2} \right]_{1/T}^{\infty}} = T \sqrt{\frac{K_{\text{CM}}}{2}} \tag{23}$$

The limits of integration correspond to the sense time; the upper limit is placed at infinity for convenience, its effect is negligible. We estimate the lowest frequency detectable by our integrator as the inverse of the sense time. Equation (23) shows the drawback of directly integrating flicker noise. As the sense time increases, the RMS value of the output noise increases linearly. Therefore, increasing the time allotted for a current mode sense will not lead to an increase in measurement signal-to-noise ratio (SNR) because both the signal and the output noise grow linearly with time.

In contrast to the current-mode sense technique, the proposed voltage-mode sense does not integrate the flicker noise. The input flicker noise is instead multiplied by the DC gain given by (13):

$$V_{\text{OUT,VM}}^2(f) = \left[ \frac{G_{m1}}{G_{m3}} \right]^2 \frac{F_{\text{NN}}}{f} = \frac{K_{\text{VM}}}{f} \tag{24}$$

$K_{VM}$ accounts for all the constants that appear in (24). As before, to obtain the RMS value of the sense-amp’s output noise, we integrate over the relevant bandwidth:

$$V_{\text{RMS,VM}} = \sqrt{\int_{1/T}^{f_H} K_{VM} \frac{df}{f}} = \sqrt{K_{VM} \left( \ln(f_H) + \ln(T) \right)} \tag{25}$$

For Equation (25), \( f_H \) cannot be allowed to equal infinity. In (23) it was permissible because it had little effect on the overall noise. We may refer to Figure 9 to verify that the input noise bandwidth will not exceed 1GHz, thus the term \( \ln(f_H) \) has an upper bound. The output noise voltage, given by Equation (25), will increase over time. However, because the desired voltage will increase linearly with time, the result is an SNR which increases with increasing sense time [8]:

\[
SNR_{VM} = \frac{V_{IN} \cdot T}{\sqrt{K_{VM} (\ln(f_H) + \ln(T))}} \tag{26}
\]
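
The contrast between the two sensing techniques can be illustrated numerically. The sketch below evaluates the closed forms \( T\sqrt{K_{CM}/2} \) for the current-mode noise and \( \sqrt{K_{VM}(\ln f_H + \ln T)} \) for the voltage-mode noise; the constants, the 1GHz upper frequency, the signal slope, and the function names are placeholder assumptions chosen only to show the trend with sense time.

```python
import math


def rms_noise_current_mode(k_cm, t_sense):
    """RMS output noise when the flicker noise is integrated directly."""
    return t_sense * math.sqrt(k_cm / 2.0)


def rms_noise_voltage_mode(k_vm, f_h, t_sense):
    """RMS output noise for the voltage-mode sense; requires f_h * t_sense > 1."""
    return math.sqrt(k_vm * (math.log(f_h) + math.log(t_sense)))


def snr_current_mode(signal_slope, k_cm, t_sense):
    """Signal grows linearly with time, and so does the noise."""
    return signal_slope * t_sense / rms_noise_current_mode(k_cm, t_sense)


def snr_voltage_mode(v_in, k_vm, f_h, t_sense):
    """Signal grows linearly with time, noise only logarithmically."""
    return v_in * t_sense / rms_noise_voltage_mode(k_vm, f_h, t_sense)


if __name__ == "__main__":
    for t in (1e-6, 2e-6, 4e-6):
        cm = snr_current_mode(1.0, 2.0, t)        # placeholder constants
        vm = snr_voltage_mode(10e-3, 1e-12, 1e9, t)
        print(f"T = {t * 1e6:.0f} us: SNR_CM = {cm:.3f}, SNR_VM = {vm:.3e}")
```

With these placeholder values the current-mode SNR is the same at every sense time, while the voltage-mode SNR nearly doubles each time the sense time doubles.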
TRANSISTOR LEVEL MODEL OF SENSE AMPLIFIER

Introduction

In Chapter III, the sense-amp was divided into several key blocks. Each block had one or several characteristics which directly affected the overall performance of the sense amplifier. Through behavioral simulation, the requirements of these individual blocks were established. In this way, the transistor-level design phase is greatly reduced in difficulty. We are no longer building a sense amplifier outright but rather a set of small circuit blocks which are assembled to create the complete system.

AMI’s C5N (0.5µm, 5V supply) digital CMOS process was the chosen technology for the design. This is a proof-of-concept work; scaling down the sense-amp to smaller technology nodes would in all likelihood be possible once the initial design is demonstrated. Additionally, AMI makes the C5N process readily available to Boise State University through the MOSIS program.

DC Biasing

To enable maximum output voltage swing for the two integrators, a wide-swing biasing topology is employed, as seen in Figure 33.
It is important to note that Chapter III does not address the biasing stage for the sense-amplifier.

The reference current enters via the RefInput pin. The leftmost branch of NMOS devices mirrors the reference current to M3 and M4. M6 is a long-L device intended to have its $V_{SG}$ equal to $V_{THP} + 2V_{SATP}$. This would allow an integrator output voltage to rise as high as $V_{DD} - 2V_{SATP}$, while staying in the saturation region. Realistically, biasing this close to the edge of saturation significantly reduces the OTA gain for $V_{OUT} \approx V_{DD} - 2V_{SATP}$.

Looking back to Figure 14, we see that both integrators’ outputs remain fairly close to $V_{DD}/2$. This indicates that sacrificing gain for the maximum possible output voltage range is unnecessary. Thus, we would be better served by sacrificing some output swing for a larger DC gain.

To that end, the length of M6 is increased to 15µm. The result is a $V_{SG}$ of just over $V_{THP} + 4V_{SATP}$. This extra voltage headroom ensures that the topmost PMOS device in the OTA will not easily leave saturation. The same line of reasoning applies to M14: a long channel length raises the $V_{GS}$ enough to ensure that the bottommost NMOS device in the OTA will not easily leave saturation either.

The relatively small device widths hint at a very low-power approach to the sense-amp design. This is indeed the intent, but note that at this stage the design does not mandate a particular value for $I_{BIAS}$. All PMOS bulks are tied to $V_{DD}$ and all NMOS bulks to ground, as is appropriate.

**Operational Transconductance Amplifier**

An OTA receives an input voltage and produces an output current. If that output current is directed into a capacitor, an integrator is formed. At the heart of the $\Delta\Sigma$ modulator architecture are two integrators, thus the need for two OTAs in the design.

**General Integrator Analysis**

Chapter III explains the specifications required of both OTAs. Therefore, this sub-section is dedicated to the realization of those integrators in transistor form. Before selecting a topology, it would be instructive to review the performance of a transconductance-based integrator.

The fact that Figure 34 represents an integrator is easily verified. If $V_{IN}$ is constant for example, it generates an $I_{OUT}$ equal to $V_{IN} \cdot G_m$ which enters the
capacitor's top plate. The capacitor will behave according to
\[ I_c = C \frac{dV_C}{dt}, \]
which yields an output voltage of:
\[ V_C(t) = V_{OUT}(t) = V_{IN} \cdot \frac{G_m}{C} \cdot t \quad (27) \]

![Figure 34: A Transconductance Based Integrator (with Reference Direction Indicated)](image)

So, a constant voltage input produces a linearly rising output voltage. We may say that the input voltage is being integrated over time by the combination of a \( G_m \) (transconductance) stage followed by a capacitor. Note that this \( G_m \) stage is an ideal OTA with infinite gain, a detail which will be addressed shortly.
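
As a quick numerical check of equation (27), the following Python sketch compares the closed-form ramp with a step-by-step integration of the capacitor equation. The component values (a 10mV input, a 10µS transconductance, and a 500fF capacitor) and the function names are illustrative assumptions chosen for this example.

```python
# Ideal Gm-C integrator driven by a constant input (equation 27).
# All numeric values below are illustrative assumptions.

def integrator_ramp(v_in, gm, c, t):
    """Closed-form output voltage V_OUT(t) = V_IN * (Gm / C) * t."""
    return v_in * gm / c * t

def integrator_euler(v_in, gm, c, t_end, steps=1000):
    """Step-by-step integration of I_C = C * dV_C/dt with I_C = Gm * V_IN."""
    dt = t_end / steps
    v_c = 0.0
    for _ in range(steps):
        i_out = gm * v_in       # OTA output current into the capacitor
        v_c += i_out / c * dt   # dV = (I / C) * dt
    return v_c

v_in, gm, c = 10e-3, 10e-6, 500e-15   # 10 mV, 10 uS, 500 fF (assumed)
print(integrator_ramp(v_in, gm, c, 1e-6))
print(integrator_euler(v_in, gm, c, 1e-6))
```

With these assumed values the slope $V_{IN} \cdot G_m / C$ is 0.2V/µs, so both functions return approximately 0.2V after 1µs.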

Of more interest to us is the frequency performance of the integrator. We again analyze Figure 34, but in the Laplace domain:
\[ V_{OUT}(s) = V_{IN}(s) \cdot G_m \cdot \frac{1}{sC} \quad (28) \]

Let's concentrate on the magnitude of \( V_{OUT}(s)/V_{IN}(s) \), the integration transfer function, as shown in (29).
\[ |H_{INT}(s)| = \left| \frac{V_{OUT}(s)}{V_{IN}(s)} \right| = \frac{\omega_u}{\omega}, \quad \omega_u = \frac{G_m}{C} \quad (29) \]
\( \omega_u \) is the unity-gain bandwidth of the integrator. Equation (29) shows that there are two methods of increasing the integrator’s bandwidth: either increase the transconductance or decrease the capacitance.

Thermal noise \( (kT/C \text{ noise}) \) and capacitive feed-through place a practical lower limit on the integration capacitor size. This design employs 500fF capacitors. They allow for increased bandwidth without introducing signal integrity or mismatch issues.
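
To give a sense of scale for the thermal noise limit, the sketch below evaluates the standard $\sqrt{kT/C}$ expression for the RMS noise voltage on a capacitor. The 300K temperature is an assumption, and the function name is hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def ktc_noise_rms(c_farads, temp_kelvin=300.0):
    """RMS thermal noise voltage sqrt(kT/C) on a capacitor."""
    return math.sqrt(BOLTZMANN * temp_kelvin / c_farads)

# Compare the 500 fF integration capacitor with smaller and larger values.
for c in (100e-15, 500e-15, 2e-12):
    print(f"C = {c * 1e15:6.0f} fF  ->  {ktc_noise_rms(c) * 1e6:6.1f} uV RMS")
```

For 500fF at the assumed 300K this evaluates to roughly 91µV RMS, and the noise grows as the square root of any reduction in capacitance.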

**Maximizing Integrator Bandwidth**

The only way then to increase integrator bandwidth involves increasing the OTA transconductance. Most fully differential OTA topologies have a transconductance equal to the transconductance of the input diff-pair. Thus, it is desirable that we seek to maximize the \( g_m \) of the input devices in order to obtain a high integrator bandwidth.

The transconductance of a MOSFET device is given by:

\[
g_m = \sqrt{2 \cdot KP \frac{W}{L} I_D} \tag{30}
\]

\( KP \) is the MOSFET’s transconductance parameter and is fixed by the technology node. Thus, for a given drain current \( I_D \), increasing the transistor width is the only viable way of boosting the transistor’s \( g_m \). Increased width translates to smaller gate overdrive voltage for a given drain current. Extremely large widths will result in the input diff-pair being biased close to sub-threshold operation.
The emphasis on maximizing $g_m$ apart from $I_D$ stems from the fact that this is a low-power design. Drain current will be set independently from OTA considerations (unless obtaining the necessary $g_m$ becomes impossible).
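
The trade-off between width, transconductance, and gate overdrive can be sketched numerically using equation (30). The $KP$ value and the drain current below are placeholders chosen for illustration; they are not C5N process parameters or design values from this work.

```python
import math

KP_ASSUMED = 40e-6  # A/V^2, placeholder value, NOT a C5N process parameter

def gm_square_law(w_over_l, i_d, kp=KP_ASSUMED):
    """Equation (30): gm = sqrt(2 * KP * (W/L) * I_D)."""
    return math.sqrt(2.0 * kp * w_over_l * i_d)

def overdrive_voltage(w_over_l, i_d, kp=KP_ASSUMED):
    """Square-law gate overdrive: V_ov = sqrt(2 * I_D / (KP * W/L))."""
    return math.sqrt(2.0 * i_d / (kp * w_over_l))

i_d = 0.25e-6  # assumed drain current per input device
for w_over_l in (10, 40, 160):
    gm = gm_square_law(w_over_l, i_d)
    v_ov = overdrive_voltage(w_over_l, i_d)
    print(f"W/L={w_over_l:4d}  gm={gm * 1e6:6.2f} uS  V_ov={v_ov * 1e3:6.1f} mV")
```

Under the square-law model, each fourfold increase in width doubles $g_m$ and halves the overdrive voltage. The model stops being valid as the overdrive approaches sub-threshold operation, which is the regime the text warns about for extremely large widths.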

**Maximizing Integrator Gain**

Although the ideal integrator has infinite gain at DC, this is not attainable in practice. In reality, the integrator will have a magnitude response which flattens out for low enough frequencies, as illustrated in Figure 35.

![Figure 35: Integrator Frequency Response](image)

Previously, we concerned ourselves with maximizing $\omega_u = G_m / C$.

However, Chapter III also covered the importance of the integrator’s DC gain. Figure 35 refers to this gain as $A_{OLDC}$ (open-loop DC gain). The open loop gain of the OTA is given by:

$$A_{OLDC} = G_m R_o \quad (31)$$
Again, $G_m$ is the overall transconductance of the OTA, and $R_o$ is the output resistance of the OTA. Note that $\omega_{3dB}$ is not independent of $\omega_u$ and $A_{OLDC}$, because the gain-bandwidth product is constant during “roll-off”:

$$\omega_{3dB} = \frac{\omega_u}{A_{OLDC}}$$

It is for this reason that $\omega_{3dB}$ is not explicitly discussed.

Back to equation (31), we see that we need to maximize both OTA transconductance and output resistance. But these seem like conflicting demands because $g_m \propto \sqrt{I_D}$ but $r_o \propto 1/I_D$ (the individual transconductance and output resistance, respectively). Fortunately, a deeper look reveals a solution.

The OTA is a cascoded structure, meaning that the output resistance is enhanced via localized feedback. The result is that $R_o$ is a function not only of individual transistor output resistance $r_o$, but also of the $g_m$. For all cascoded structures:

$$R_o \propto g_m r_o^2 \quad (32)$$

Although the $g_m$ from equation (32) is not necessarily related to the $G_m$ from equation (31), they still track each other. Therefore, we can describe the relationship between $A_{OLDC}$ and $I_D$ more accurately.

$$A_{OLDC} = G_m R_o \propto G_m \, g_m r_o^2 \propto \left(\sqrt{I_D}\right) \left(\sqrt{I_D}\right) \left(\frac{1}{I_D}\right)^2 \propto \frac{1}{I_D} \quad (33)$$

Equation (33) indicates that by reducing the bias current flowing through the OTA, the small-signal DC gain is actually increased. This happens because
the output resistance increases faster than the transconductance decreases when reducing $I_D$. Attempting low-power operation then is not as large an impediment to large DC gain as it is to large unity gain bandwidth.
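
The scaling argument of equations (31) through (33) can be checked with a small square-law model. This is only a sketch: the $KP$, $W/L$, and channel-length modulation values are arbitrary assumptions, $r_o$ is modeled as $1/(\lambda I_D)$, and the absolute gain numbers it produces are not meaningful. Only the trend with $I_D$ is of interest.

```python
import math

def cascode_dc_gain(i_d, kp=100e-6, w_over_l=10.0, lam=0.05):
    """Square-law estimate of A_OLDC = Gm * Ro with Ro ~ gm * ro^2.

    kp, w_over_l and lam are illustrative assumptions, not process data.
    """
    gm = math.sqrt(2.0 * kp * w_over_l * i_d)  # gm proportional to sqrt(I_D)
    ro = 1.0 / (lam * i_d)                      # ro proportional to 1 / I_D
    r_out = gm * ro ** 2                        # cascode output resistance
    return gm * r_out                           # equation (31)

for i_d in (0.25e-6, 0.5e-6, 1e-6):
    print(f"I_D = {i_d * 1e6:4.2f} uA  ->  relative gain {cascode_dc_gain(i_d):.3e}")
```

In this model, halving the bias current doubles the DC gain, in agreement with equation (33). The model assumes saturation throughout, so it does not capture devices operating near sub-threshold.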

*Other Concerns Regarding the Integrator*

As shown in Figure 11, the input voltage from the cross-point is relatively small. Additionally, the array voltage is compared with ground. The common-mode voltage to our OTA then is essentially zero. This eliminates using NMOS devices for the input diff-pair so PMOS devices must be used.

The output of the OTA is single-ended because there is only one integrator output. Therefore, the OTA topology exhibits a Miller op-amp style gate-drain connection which mirrors output current from the unused branch to the designated output branch. This technique doubles the available gain.

Finally, the first OTA output should be DC biased to serve as one input for the next OTA stage. A reasonable value for that DC voltage is $V_{DD}/2$ which would easily enable using more “powerful” NMOS devices on the input of the second OTA.

Figure 36 shows the design implemented for the first OTA. The biasing for the OTA is detailed in the DC Biasing sub-section. Because the input devices are PMOS, it is also referred to as the POTA.
The connection from node Va3 to the gates of M6 and M8 accomplishes the differential to single-ended conversion. The widths of M3 and M4 are multiplied by two to account for incoming current from the input diff-pair.

Also of note are the bulk connections for M11 and M12. Normally the bulk connection is made to \( V_{DD} \) for PMOS devices. In an effort to boost input diff-pair \( g_m \) (and thus overall transconductance \( G_m \)) the bulks are connected to the sources, eliminating body effect for those devices. This is acceptable because the input to the POTA is DC. The additional capacitance placed on the source of M11 and M12 does not affect low-frequency performance.

To generate the frequency response plot seen in Figure 37, a bias current of 0.5µA was used. Inspection of the plot gives a gain of 66.2dB (or
approximately 2000) and a unity-gain frequency of 3.4MHz. This bandwidth is well within the required range set in Chapter III.
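
The conversion from decibels to linear gain, and the corner frequency implied by the constant gain-bandwidth relationship, can be reproduced with the short sketch below. The single-pole magnitude model and the function names are assumptions made for illustration; the 66.2dB and 3.4MHz figures are those quoted above.

```python
import math

def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)

def corner_frequency(f_u, a_dc):
    """f_3dB = f_u / A_OLDC for a single-pole roll-off."""
    return f_u / a_dc

def integrator_magnitude(f, f_u, a_dc):
    """Magnitude of a single-pole integrator with finite DC gain."""
    f_3db = corner_frequency(f_u, a_dc)
    return a_dc / math.sqrt(1.0 + (f / f_3db) ** 2)

a_dc = db_to_linear(66.2)
print(f"A_OLDC  = {a_dc:.0f}")
print(f"f_3dB   = {corner_frequency(3.4e6, a_dc):.0f} Hz")
print(f"|H(f_u)| = {integrator_magnitude(3.4e6, 3.4e6, a_dc):.4f}")
```

Under this model the POTA's response flattens out below roughly 1.7kHz, and the magnitude at the unity-gain frequency is essentially one.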

If additional bandwidth is required, the bias current can be doubled to 1µA which yields Figure 38.

Figure 37: POTA Magnitude Response ($I_{\text{BIAS}}=0.5\mu\text{A}$) with $C=500\text{fF}$

Figure 38: POTA Magnitude Response ($I_{\text{BIAS}}=1\mu\text{A}$) with $C=500\text{fF}$
The bandwidth has increased to 6.5MHz which is approximately doubled from Figure 37. It is interesting to note that the DC gain has increased slightly (instead of decreasing as we had predicted earlier).

The Second OTA Design and Performance

The input to the second OTA, shown in Figure 39, is the same node as the output of the POTA. This node is biased at \( V_{dd}/2 \); the second OTA then can have NMOS input devices. The second OTA then is referred to as the NOTA. Aside from that difference, the NOTA is just a mirrored version of the POTA.

![Figure 39: The Second OTA (NOTA)](image)

In order to ensure that the output of the POTA fluctuates around \( V_{dd}/2 \), the positive input of the NOTA is connected to \( V_{dd}/2 \) in the same way that the
positive input of the POTA is connected to ground. The magnitude response then for the integrator is seen in Figure 40.

Figure 40: NOTA Magnitude Response ($I_{\text{BIAS}}=0.5\mu\text{A}$) with $C=500\text{fF}$

The NOTA has a DC gain of 68.5dB (approximately 2500) and a unity-gain bandwidth of 4MHz. Note that the NOTA has better performance for a given bias current. This is because the transconductance parameter $KP$ for the NMOS device is inherently larger. It would be preferable to use NMOS input devices in both OTAs, but the zero-volt input common mode is a requirement for the first integrator. With increased bias current, we generate Figure 41.

Figure 41: NOTA Magnitude Response ($I_{\text{BIAS}}=1\mu\text{A}$) with $C=500\text{fF}$

The NOTA responds similarly to a doubling of bias current. The DC gain increases slightly to 69dB and the bandwidth increases to 7.75MHz.

<table>
<thead>
<tr>
<th>$I_{\text{BIAS}}$</th>
<th>POTA $f_u$ (MHz)</th>
<th>POTA $A_{\text{OLDC}}$ (dB)</th>
<th>POTA $G_m$ (µS)</th>
<th>NOTA $f_u$ (MHz)</th>
<th>NOTA $A_{\text{OLDC}}$ (dB)</th>
<th>NOTA $G_m$ (µS)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.5µA</td>
<td>3.4</td>
<td>66.2</td>
<td>10.7</td>
<td>4</td>
<td>68.5</td>
<td>12.6</td>
</tr>
<tr>
<td>1µA</td>
<td>6.5</td>
<td>68</td>
<td>20.4</td>
<td>7.75</td>
<td>69</td>
<td>24.3</td>
</tr>
</tbody>
</table>

Table 1 summarizes the integrator performance.

It is interesting to note that the POTA and NOTA transconductance almost double with a doubling in bias current. Earlier we stated that $G_m$ is proportional to
the square root of $I_D$. But Table 1 indicates a linear relationship. This indicates that the input devices are straddling the saturation and sub-threshold region. The $G_m$ for each configuration was calculated using equation (29).
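
The $G_m$ entries in Table 1 follow from rearranging equation (29) as $G_m = 2\pi f_u C$. The sketch below recomputes them from the simulated unity-gain frequencies and compares the scaling with bias current against the $\sqrt{2}$ factor expected from the square-law model. The dictionary layout and function name are assumptions made for this example.

```python
import math

C_INT = 500e-15  # integration capacitor, 500 fF

# Unity-gain frequencies from Table 1, keyed by (OTA, bias current in uA).
F_U = {
    ("POTA", 0.5): 3.4e6,
    ("POTA", 1.0): 6.5e6,
    ("NOTA", 0.5): 4.0e6,
    ("NOTA", 1.0): 7.75e6,
}

def gm_from_fu(f_u, c=C_INT):
    """Rearranged equation (29): Gm = 2 * pi * f_u * C."""
    return 2.0 * math.pi * f_u * c

for (ota, i_bias), f_u in F_U.items():
    print(f"{ota} at {i_bias} uA: Gm = {gm_from_fu(f_u) * 1e6:.1f} uS")

for ota in ("POTA", "NOTA"):
    ratio = gm_from_fu(F_U[(ota, 1.0)]) / gm_from_fu(F_U[(ota, 0.5)])
    print(f"{ota}: Gm ratio for doubled bias = {ratio:.2f} "
          f"(square law predicts {math.sqrt(2.0):.2f})")
```

The ratios come out close to 1.9 for both OTAs, much nearer to 2 than to $\sqrt{2}$, which is the observation that motivates the remark about operation near sub-threshold.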

**Differential Comparator**

The design of the differential comparator is in some ways more straightforward than that of the integrators. As long as delay and hysteresis are minimized, the comparator does not have a large effect on the overall sense-amp operation. Figure 42 shows the comparator design.

![Figure 42: The Differential Comparator](image-url)
The operation and design configuration of the differential comparator will now be discussed.

**Input Stage**

The input stage for the comparator closely resembles that of the POTA. A cascoded current source is used to bias a PMOS input differential pair. The widths of M13 and M14 are not made as wide as in the case of the POTA. This input pair steers current into the gate-drain connected transistors M11 and M12 which act like resistors.

If the inputs to the comparator did not pass through the input stage but rather were connected directly to the gates of M3 and M4, the comparator would not be able to accept a wide range of input common mode voltage. The input would likely be too high, pushing M3 and M4 into the linear region of operation, or too low, shutting M3 and M4 off. The input stage, because of the cascoded current source biasing the diff-pair, allows the integrator output nodes (the inputs to the differential comparator) to drift over a much larger range of voltages.

**Decision Making Stage**

The decision-making stage of the comparator has pre-outputs labeled pre_voutm and pre_voutp. While the clock is low, M7 and M8 pull these pre-outputs high. At the same time, M5 and M6 isolate the inputs from the differential comparator’s outputs. When the clock goes high, M5-M8 essentially disappear from the comparator, leaving only a cross-coupled inverter pair set at their switching point. The level of the input voltage on M3 and M4 determines which
inverter will pull its output up to $V_{dd}$ and which will be forced to ground. If, for example, the input voltage on M3 was slightly greater than the input voltage on M4 (meaning that M12 has more current flowing through it than M11 does), M10 would need to source more current. To produce that extra current, M10 must reduce its drain voltage, which is pre_voutm. That drain voltage is also the input voltage to the opposite inverter - lowering it pulls up pre_voutp at the same time pre_voutm is being pulled to ground.

**Output Latch**

The final stage of the differential comparator serves two purposes. While the clock is low, the decision-making stage pulls both pre_voutp and pre_voutm high. The cross-coupled NAND gates, configured as an SR-latch, hold their previous state when both inputs are high. When the clock rises, the decision-making stage of the comparator pulls either pre_voutp or pre_voutm low, causing a change on the appropriate latch output.
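
The hold-and-update behavior of a cross-coupled NAND latch can be illustrated with a small logic-level model. This is a generic sketch of an active-low SR latch, not a model of the transistor-level circuit in Figure 42: the mapping of the two pre-output nodes onto the `s_n` and `r_n` inputs is hypothetical, and the buffer inverters are not modeled.

```python
def nand(a, b):
    """Two-input NAND on 0/1 logic values."""
    return 0 if (a and b) else 1

def nand_latch(s_n, r_n, q, q_n):
    """Settle a cross-coupled NAND latch with active-low set/reset inputs.

    q and q_n are the latch outputs before the inputs are applied.
    Returns the settled (q, q_n) pair.
    """
    for _ in range(4):  # a few passes are enough for the loop to settle
        q_new = nand(s_n, q_n)
        q_n_new = nand(r_n, q)
        if (q_new, q_n_new) == (q, q_n):
            break
        q, q_n = q_new, q_n_new
    return q, q_n

state = (0, 1)
# Clock low: both inputs high, so the latch holds its previous state.
state = nand_latch(1, 1, *state)
print("hold :", state)
# Clock high, one input pulled low: the latch output changes.
state = nand_latch(0, 1, *state)
print("set  :", state)
# Clock low again: the new state is remembered.
state = nand_latch(1, 1, *state)
print("hold :", state)
```

The model shows why precharging both pre-outputs high during the low clock phase is harmless: with both inputs high the latch simply retains the previous decision.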

Inverters are used as buffers between the pre_voutp and pre_voutm nodes and the SR-latch inputs. Without this buffering, the latch’s own positive feedback can produce “kickback” noise that adversely affects the comparator’s decision.

The inverters that have inputs connected to pre_voutp and pre_voutm are designed to have a lower than $V_{dd}/2$ switching point. This prevents the comparator’s outputs from glitching, especially when the input voltages are very similar.
Transient Behavior

In Chapter III, we modeled three key comparator characteristics: delay, voltage offset, and hysteresis. Now these characteristics will be observed in the transistor-level comparator implementation.

To test the comparator’s delay, the negative input is held constant at \( V_{DD}/2 \) while the positive input is slowly “ramped up” until it is greater than the negative input. In Figure 43, the comparator is clocked at 100MHz, or every 10ns.

Figure 43: The Comparator Delay
In Figure 43, the comparator’s positive input becomes larger than its negative input at t=1.00µs. Because the comparator only clocks every 10ns, the first opportunity to assert $V_{OUTP}$ comes at 1.01µs. Figure 44 has a zoomed-in look at $V_{OUTP}$, $V_{OUTM}$, and $V_{CLOCK}$. By inspection, we see that it takes approximately 1.6ns from rising clock edge to valid comparator output. Referring to Figure 26, we see that this is perfectly acceptable.

Figure 43 also allows evaluation of the comparator offset. Over 2µs of time, $V_{INP}$ rises from 2.4V to 2.6V. This equates to a rate of 0.1mV/ns. When the comparator asserts $V_{OUTP}$ just after the clock edge at t=1.01µs, $V_{INP}$ is 1mV larger than $V_{INM}$. Therefore, the comparator’s offset voltage could not be any larger than 1mV. According to Figure 19, this is more than acceptable for an offset voltage. Such analysis depends on the perfect matching of input devices. In reality, there
is never a perfect matching. A more likely scenario is a percentage mismatch of input device widths (M13 and M14 in Figure 42).
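
The ramp-rate arithmetic behind the 1mV offset bound can be reproduced as follows. The function names are hypothetical; the voltages, ramp duration, and 10ns clock period are the values quoted in the text.

```python
def ramp_rate(v_start, v_end, duration):
    """Slope of the input ramp in V/s."""
    return (v_end - v_start) / duration

def crossing_time(v_start, v_ref, rate):
    """Time at which the ramp reaches the reference voltage."""
    return (v_ref - v_start) / rate

def max_offset_from_ramp(rate, clock_period):
    """Largest input difference that can build up between clock edges."""
    return rate * clock_period

rate = ramp_rate(2.4, 2.6, 2e-6)           # V_INP ramps 2.4 V to 2.6 V in 2 us
t_cross = crossing_time(2.4, 2.5, rate)    # V_INM is held at V_DD/2 = 2.5 V
bound = max_offset_from_ramp(rate, 10e-9)  # comparator clocked every 10 ns

print(f"ramp rate      = {rate * 1e-6:.2f} mV/ns")
print(f"crossing time  = {t_cross * 1e6:.2f} us")
print(f"offset bound   = {bound * 1e3:.2f} mV")
```

Because the input moves by only 1mV per clock period, an output that asserts on the first clock edge after the crossing bounds the offset at 1mV. A slower ramp or a faster clock would tighten this bound.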

In Figure 45, M13 (which has $V_{\text{INP}}$ connected to its gate) is reduced from 20µm to 18µm whereas M14 (which has $V_{\text{INM}}$ connected to its gate) is increased from 20µm to 22µm. The result is an input offset voltage of 7mV. Again referring to Figure 19, an offset of this size will not affect the overall sense-amp performance.

![Figure 45: Comparator Offset for Mismatched Input Devices](image)

To evaluate comparator hysteresis, we examine Figure 46. It is primarily the same as Figure 43, except that $V_{\text{INP}}$ begins at a greater voltage than $V_{\text{INM}}$, and “ramps down”.

From Figure 46, we see that the comparator exhibits no hysteresis - the comparator ($V_{\text{OUTP}}$) transitions from high to low at the same voltage as it transitions from low to high. If there were any hysteresis voltage, it would be less
than 2mV, which according to Figure 22 is insignificant. The comparator then meets all the requirements detailed in Chapter III.

Figure 46: The Comparator Hysteresis

The Sense Amplifier

The sense amplifier consists primarily of the blocks described in the above sub-sections. Figure 47 illustrates the overall design.
There are components in Figure 47 which have not previously been discussed. In Figure 5 there are two feedback transconductances (both with a value of $G_{m3}$) which are implemented at the transistor level as simple two-transistor current sources. Because the current source can only have two output currents ($I_{FB}$ or 0), it was possible to model it as a linear transconductance. The value of the feedback current is:

$$I_{FB} = G_{m3} \cdot V_{DD} \quad (34)$$
The feedback current is switched on by applying $V_{DD}$ to the top of the current source, and switched off by driving it to ground. The alternative is to use pass gates to allow for $V_{BIAS3}$ and $V_{BIAS4}$ to drive the current source gates. However, this technique causes feed-through which can disturb the high-impedance integrator output nodes. The employed technique places the switching node well away from the sensitive integrator outputs.

For a very brief period before performing a sense, the two integrator output nodes are equalized to $V_{DD}/2$. If these nodes were not equalized prior to sensing, it is possible a previous sense could drive the nodes so far apart as to affect the next sense. Providing the OTAs themselves with the drive strength to recover would not be feasible under a low-power scheme, and so it is accomplished with these equalization devices (seen as M3 and M4 in Figure 47).

Visible in the bottom right of Figure 47 are the digital counter and digital-to-analog converter (DAC). The counter is the final stage of the sense amplifier, and the DAC is present to make the counter’s 8-bit output easier to interpret. Both digital blocks were modeled with ideal switches.
SENSE AMPLIFIER PERFORMANCE EVALUATION

Introduction

While the performance of the individual transistor-level blocks which comprise the sense-amp has been discussed in the previous chapter, the full system-level performance has not yet been detailed.

Basic Operation

The purpose of this section is to verify the operation of the sense amplifier by examining key voltages and currents. We'll assume a $V_{IN}$ of 10mV is present on the sense amplifier input for this section.

We begin at the input, where the 10mV input voltage produces an output current of approximately 100nA or 0.1µA. This current is seen in Figure 48.

![Figure 48: Current pulled from $V_1$ by $V_{IN}$ via the POTA](image)
The current direction is considered to be positive if entering the POTA, and based on Figure 47 this makes intuitive sense because the only input to the POTA is negative. The noise present in the current is the result of switching occurring elsewhere in the sense-amp. An input voltage of 10mV and an output current of approximately 0.1µA gives a $G_m$ of 10µS.

Next, we look at the two integrator output node voltages in Figure 49: $V_1$ and $V_2$. Because these inputs feed directly into the differential comparator, their behavior offers insight into the sense-amp’s overall performance.

![Figure 49: The Integrator Output Nodes $V_1$ and $V_2$](image)

Because the input voltage is approximately in the middle of the allowable range, we would expect a final output count of approximately 50 (out of a possible 100). That would lead us to believe that $V_1$ and $V_2$ should be approximately equal to each other during a sense. That is in fact what we observe in Figure 49. It is interesting to note that when $V_2$ becomes larger than $V_1$, the comparator “kicks in” and pushes the nodes apart for a moment. When the comparator turns off, the nodes start drifting towards each other once again.

Figure 50 shows the outputs of the differential comparator. As expected, \( V_{OUTP} \) and \( V_{OUTM} \) both have approximately 50% duty cycles. Notice also that the
comparator’s outputs only have the opportunity to change on the clock’s rising
edge. Figure 50 is not shown on the same time scale as Figure 49.

Figure 50: The Comparator Outputs $V_{OUTP}$ ($V_{UP}$) and $V_{OUTM}$ ($V_{DOWN}$)

The comparator’s positive output is what enables and disables the
counter. While it is high, the counter counts up for every rising clock edge. If
\( V_{OUTP} \) is low, the counter will remain at its current count until \( V_{OUTP} \) goes high
once more. Figure 51 illustrates that functionality.
It is easy to observe that while $V_{OUTP}$ ($V_{UP}$ in the figure) remains low, the counter does not increment. The astute reader will note that $V_{OUT}$ in Figure 51 is not an 8-bit digital value but an analog voltage. It is analog because the counter’s digital output is passed through an ideal DAC to allow for more direct observation.
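
The gating of the counter by the comparator output, followed by conversion to an analog level, can be sketched in a few lines. The DAC scaling used here (full scale equal to an assumed 5V reference divided into $2^8$ steps) is a hypothetical choice for illustration; the scaling of the ideal DAC used in the simulations is not specified in the text.

```python
def gated_count(v_outp_samples):
    """Count rising clock edges on which the comparator output is high.

    v_outp_samples holds one 0/1 value per rising clock edge.
    Returns the running count after each edge.
    """
    count = 0
    trace = []
    for bit in v_outp_samples:
        if bit:
            count += 1      # counter enabled: increment
        trace.append(count)  # counter disabled: hold the current count
    return trace

def ideal_dac(count, v_ref=5.0, bits=8):
    """Map a count to a voltage (assumed scaling: v_ref * count / 2**bits)."""
    return v_ref * count / (2 ** bits)

bitstream = [1, 0, 0, 1, 1, 0, 1, 0]   # made-up comparator decisions
trace = gated_count(bitstream)
print(trace)
print([ideal_dac(c) for c in trace])
```

The running count only steps on edges where the comparator output is high, which produces the staircase shape described for the DAC output.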

The comparator’s outputs are used in two portions of the sense-amp: to control the digital counter and to provide feedback currents to the integrator output nodes. Figure 52 shows the feedback currents being directed by the comparator’s outputs.
The negative feedback present can be easily verified. If $V_2$ were larger than $V_1$, it would produce a positive output: $V_{OUTP} = V_{DD}$ and $V_{OUTM} = 0$. To remedy this imbalance, the comparator uses $V_{OUTP}$ to drive a positive current into $V_1$, which pulls the node up. The comparator does not drive an input current into $V_2$ (that is to say it sources 0A into $V_2$). This allows the NOTA, which from Figure 49 has an average value above $V_{DD}/2$, to pull current from $V_1$. If $V_{OUTP} = 0$ and $V_{OUTM} = V_{DD}$, the situation is reversed and the comparator works to raise $V_2$ and lower $V_1$.

Also observable in Figure 52 is the value of $G_{m3}$. A 5V input produces an output current of approximately 250nA. This is equivalent to a $G_{m3}$ of 0.05µS. As mentioned earlier, because the feedback currents are either zero or $I_{FB}$ (approximately 250nA here), they can be modeled as a linear transconductance.
Noiseless Performance

The noiseless, purely deterministic SPICE simulation of the sense amplifier is shown in Figure 53. The sense-amp exhibits a very linear relationship between input voltage and final output count. Figure 53 should be compared with the ideal case in Figure 8 and the behavioral simulation result from Figure 28.
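
The expected linear relationship between input voltage and output count can be illustrated with a deliberately simplified charge-balance model. Note the assumptions: this sketch uses a single integrator (a first-order loop) rather than the two-integrator architecture of the actual sense amplifier, currents are normalized to $G_{m3}$, and the nominal ratio $G_{m1}/G_{m3} = 250$, $V_{DD} = 5\text{V}$, and 100 clock cycles are taken from the discussion in this chapter. The function name is hypothetical.

```python
def sense_count(v_in, n_clocks=100, gm_ratio=250.0, v_dd=5.0):
    """First-order charge-balance estimate of the final count.

    Each clock the input contributes (Gm1/Gm3) * V_IN to the accumulator.
    When the accumulator is positive, the comparator fires, one unit of
    feedback (V_DD, i.e. I_FB / Gm3) is removed, and the counter increments.
    """
    i_in = gm_ratio * v_in   # input current normalized to Gm3
    i_fb = v_dd              # feedback current normalized to Gm3
    acc = 0.0
    count = 0
    for _ in range(n_clocks):
        acc += i_in
        if acc > 0.0:        # comparator decision
            acc -= i_fb      # feedback pulls the integrator back
            count += 1
    return count

for v_mv in (0, 5, 10, 15, 20):
    print(f"V_IN = {v_mv:2d} mV  ->  count = {sense_count(v_mv * 1e-3)}")
```

In this model the count tracks $N \cdot V_{IN} \cdot (G_{m1}/G_{m3}) / V_{DD}$ to within one count, so a 10mV input lands near 50 out of 100. It does not reproduce the transconductance saturation seen in the SPICE results.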

![Sense Amplifier Performance](image)

Figure 53: Sense Amplifier Performance

By inspecting Figure 53, we immediately find that the DC gain of $G_{m1}/G_{m3}$ is not precisely equal to 250. The gain appears to be:

$$
\frac{G_{m1}}{G_{m3}} = \frac{(100/100) \cdot 5\,\text{V}}{21\,\text{mV}} \approx 238 \quad (35)
$$
The gain is reduced because the OTA transconductance is not a purely linear function of $V_{\text{IN}}$ and tends to saturate for large input voltages.

**Performance in the Presence of Noise on the Input Voltage**

Input noise will cause output count variations, so a single input voltage will produce a range of possible output count values. The more closely grouped these count values are, the more precise the sense amplifier may be considered. As the input noise voltage becomes greater, eventually the sense-amp will not be able to resolve differences in the actual input signal voltage from the array.

Equation (20) predicts the worst-case amount of noise voltage that will appear on the sense amp input due to the resistances which comprise the memory array (for a given array size).

Figure 54 attempts to relate the output count variation to the input voltage with an additive white noise source that has an RMS value of 1mV. Figure 54 actually is a distribution; it relates the number of times a particular output count was recorded (out of 10 total) for sense-amp inputs from 0 to 20mV. Each curve represents a different input voltage; thus there are 21 curves present.
Of chief importance in Figure 54 is the fact that none of the curves overlap. That is to say, no two input voltages ever register the same output count. If this were to happen, the sense-amp may have difficulty resolving the difference between the two inputs, resulting in a failed sense.

Figure 54 should be directly compared with Figure 32. The former was produced using SPICE, and the latter was produced using Simulink. The behavioral model correctly predicted that 4mV of RMS noise on the input would not adversely affect sense-amp performance.

Another way to visualize the sense-amp performance is seen in Figure 55.
The final count for each $V_{IN}$ is self-contained, meaning that its minimum is not less than the count generated by $1\text{mV}$ less than $V_{IN}$ and that its maximum is not greater than the count generated by $1\text{mV}$ more than $V_{IN}$.
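
One way to express this criterion programmatically is to check that the count ranges recorded for adjacent input voltages never touch. The sketch below uses the stricter non-overlap reading of the criterion, and the example ranges are made up purely to exercise the function; they are not simulation results from this work.

```python
def counts_are_separable(count_ranges):
    """Check that count ranges for successive input voltages do not overlap.

    count_ranges is a list of (min_count, max_count) tuples ordered by
    increasing V_IN. Returns True when every range lies strictly below
    the next one.
    """
    for (_, hi_a), (lo_b, _) in zip(count_ranges, count_ranges[1:]):
        if hi_a >= lo_b:
            return False
    return True

# Hypothetical (min, max) counts for three adjacent input voltages.
well_separated = [(4, 6), (9, 11), (14, 15)]
overlapping = [(4, 9), (9, 11), (14, 15)]

print(counts_are_separable(well_separated))
print(counts_are_separable(overlapping))
```

A check of this kind could be applied to the minimum and maximum counts collected over repeated senses at each input voltage.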

**Performance in the Presence of Noise on the Power Supply**

Sensitive analog measurements are susceptible to noise. The input to the sense amplifier is not the only source of this undesired interference; the power supply can also introduce noise into the sense-amp. This is especially true when the majority of the gates on the chip are performing digital functions. All digital gates produce a so-called “crowbar current” from the supply; this current peaks while the gate is in the process of switching and that usually happens around a
clock edge. Because much of the gate switching activity is coordinated by the clock edges, the power supply voltage can fluctuate significantly even over one clock period.

Precision sensing techniques are vulnerable to this supply noise simply because they depend on a quiescent supply voltage. Fortunately, this design exhibits near immunity to a reasonable amount of supply noise. Because the sense-amp employs a technique that averages the input signal over time, interfering noise sources that have a zero mean will average out. The noise can be sourced from the input terminal or from the supply voltage (or from ground), but the result is the same - the noise averages out. Furthermore, the more time averaging techniques are allowed, the more effectively they neutralize zero-mean noise sources.
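
The benefit of time averaging can be illustrated with a toy experiment that is independent of the circuit itself. The sketch assumes independent, zero-mean Gaussian noise samples, for which the RMS error of an $N$-sample average falls as $1/\sqrt{N}$. The 10mV level and 50mV RMS noise are simply convenient numbers, and direct averaging stands in for the averaging performed by the modulator and counter.

```python
import math
import random

def rms_after_averaging(noise_rms, n_samples):
    """RMS of the mean of n independent zero-mean noise samples."""
    return noise_rms / math.sqrt(n_samples)

def averaged_reading(signal, noise_rms, n_samples, rng):
    """Average n noisy observations of a constant signal."""
    total = 0.0
    for _ in range(n_samples):
        total += signal + rng.gauss(0.0, noise_rms)
    return total / n_samples

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (1, 100, 10000):
    reading = averaged_reading(10e-3, 50e-3, n, rng)
    predicted = rms_after_averaging(50e-3, n)
    print(f"N = {n:5d}: reading = {reading * 1e3:7.2f} mV, "
          f"predicted RMS error = {predicted * 1e3:5.2f} mV")
```

Under these assumptions, averaging 100 samples reduces 50mV of RMS noise to 5mV, and each further hundredfold increase in averaging time gains another factor of ten.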

Figure 56 shows the sense-amp performing a sense while in the presence of a 50mV, RMS noise source on the supply voltage. Note that disturbances on $V_{DD}$ are also seen on the integrator output nodes.
Figure 56: 50mV, RMS Supply Voltage Noise Affects Integrator Nodes

Figure 57 demonstrates how little effect the 50mV RMS supply noise had on the sense-amp's performance.

Figure 57: Sense Amplifier Performance with 50mV of RMS noise on VDD
Notice that Figure 57 details performance that is almost indistinguishable from Figure 53.

**Power Consumption**

The current drawn by the sense amplifier is shown in Figure 58.

![Figure 58: Sense Amplifier Current Draw](image)

The average current drawn while sensing is approximately 80µA. The power consumption then is:

\[ P_{AVG} = V_{DD} \cdot I_{AVG} = 5V \cdot 80\mu A = 400\mu W \quad (36) \]

The majority of the power is consumed in the comparator during its decision-making. It is clear from Figure 58 that after the clock edges, the current draw levels out to either 6µA or 8.5µA, depending on if the clock is high or low. Therefore, the stand-by power-draw is considerably lower:

\[ P_{STANDBY} = V_{DD} \cdot I_{STANDBY} = 5V \cdot 9\mu A = 45\mu W \quad (37) \]
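
The power figures in equations (36) and (37) can be reproduced directly, along with the values for the two individual quiescent current levels quoted above. The function name is hypothetical.

```python
def power_watts(v_dd, current):
    """Average power P = V_DD * I."""
    return v_dd * current

V_DD = 5.0
cases = {
    "sensing (80 uA average)": 80e-6,
    "standby, rounded (9 uA)": 9e-6,
    "quiescent level (6 uA)": 6e-6,
    "quiescent level (8.5 uA)": 8.5e-6,
}
for label, current in cases.items():
    print(f"{label:26s} -> {power_watts(V_DD, current) * 1e6:6.1f} uW")
```

The 45µW stand-by figure corresponds to rounding the larger quiescent level up to 9µA; the two quoted levels themselves give 30µW and 42.5µW.
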
SUMMARY

The performance of the sense-amp meets the requirements set forth in Chapter III. The sense-amp can accurately resolve voltages as small as 1mV using the proposed sensing method.

Future design revisions would focus on even greater reductions in power while simultaneously achieving an increase in speed. Scaling down the technology used to fabricate the sense-amp would provide such performance enhancements. However, there are other changes which would have a significant effect on the sense amp’s performance.

The design of the comparator was not optimized in terms of power consumption, and as a result, the bulk of the power is consumed by the digital portion of the sense-amp. Comparator re-design would focus on conserving power during the “decision-making” phase of operation.

Optimizing the clock speed depending on the required precision is another area of interest. If lower accuracy is desired, the sense-amp should be able to reduce the amount of time required to perform a sense. If maximum precision is required, the sense time could be increased. Applying additional creativity and ingenuity to the sense-amp design in order to reduce the required sense time would be of utmost importance.

In general, the sense amplifier detailed in this thesis can be characterized as “slow but accurate.” Indeed, the sense-amp can even improve its accuracy given more time. However, this shortcoming in speed ultimately will limit the available market space for noise-shaping sense amplifiers. While current-mode sensing may not be as accurate, the memory array can be partitioned into sub-arrays until sensing is feasible. Noise-shaping sense technology could find a niche in the space between high-speed memory and high-density memory.

Additional extenuating circumstances have slowed the impetus behind developing ultra high precision sense technologies. Because FLASH memory has continued to scale well below 65nm, it currently outstrips the density of anything offered by MRAM or PCRAM. Until FLASH cannot be scaled further, the available densities of these nascent technologies will likely remain several generations behind.
REFERENCES


GLOSSARY OF TERMS

AMI: Semiconductor manufacturer located in Pocatello, Idaho.

DRAM: Dynamic random access memory where the memory element is capacitive charge.

dB: Decibels

FLASH: A type of non-volatile memory with read speeds similar to DRAM but write times on the order of milliseconds

MRAM: Magnetic random access memory

MTJ: Magnetic tunnel junction

NMOS: n-type MOS transistor

NOTA: OTA with an NMOS input differential pair

OTA: Operational transconductance amplifier

PC-RAM: Phase-change random access memory

PMOS: p-type MOS transistor

POTA: OTA with a PMOS input differential pair

RMS: Root mean-squared

SPICE: A general purpose analog circuit simulator

SNR: Signal-to-noise ratio; often measured in decibels.

SRAM: Static random access memory where the memory element is a pair of cross-coupled inverters.