OPERATION OF DIMMS USING DDR4

ECG721 (Memory Circuit Design)
Vikas Vinayaka
April 24th, 2017
Outline

• Computer memory organization
• DDR interface
• DDR4 advantages
• DDR4 techniques
• Types of DIMM
• The future
Memory terminologies

- DDR – Double Data Rate
  - Data changes on both rising and falling edge
- SDRAM – Synchronous DRAM
  - An input clock times all data input and output, unlike asynchronous DRAM, whose timing depends on internal latencies
- Memory bank
  - A collection of small DRAM arrays that has its own peripheral circuitry
- Memory configuration - x4, x8
  - Memory is arranged so that 4 bits (for x4) are read from the columns at a time, each from a separate memory array in the same chip
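
The device width also determines how many chips share the module's data bus. The sketch below assumes a 64-bit data bus per rank (72-bit with ECC), which is not stated on this slide; `chips_per_rank` is a hypothetical helper name.

```python
def chips_per_rank(device_width: int, bus_width: int = 64) -> int:
    """Number of DRAM chips needed to fill one rank of the data bus.

    bus_width defaults to 64 bits (assumed non-ECC DIMM); pass 72 for ECC.
    """
    if bus_width % device_width != 0:
        raise ValueError("bus width must be a multiple of the device width")
    return bus_width // device_width


for width in (4, 8, 16):
    print(f"x{width}: {chips_per_rank(width)} chips per 64-bit rank")
```

Under these assumptions an x8 configuration needs 8 chips per rank, and an x4 configuration needs 16.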
Computer memory organization

- **Northbridge** (memory controller hub)
  - Graphics card slot
  - High-speed graphics bus (AGP or PCI Express)
- **Southbridge** (I/O controller hub)
  - PCI Bus
  - PCI Slots
  - IDE SATA
  - USB
  - Ethernet
  - Audio Codec
  - CMOS Memory
- **Super I/O**
  - Serial Port
  - Parallel Port
  - Floppy Disk
  - Keyboard
  - Mouse
- **SDRAM DIMMs**
- **CPU**
  - Clock Generator
  - Front-side bus
- **Cables and ports leading off-board**
Computer memory packaging evolution

- Discrete DRAM memory chips
- SIMM (Single Inline Memory Module)
- DIMM (Dual Inline Memory Module)
DDR interface block diagram

- Basic DDR SDRAM interface:
Why DDR4?

- Low voltage leading to low power
- Higher data transfer speeds
- Higher module density
- Maximum theoretical capacity per DIMM of 512GiB
## DDR comparison

<table>
<thead>
<tr>
<th></th>
<th>DDR1</th>
<th>DDR2</th>
<th>DDR3</th>
<th>DDR4</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>VDD [V]</strong></td>
<td>2.5</td>
<td>1.8</td>
<td>1.5</td>
<td>1.2</td>
</tr>
<tr>
<td><strong>Data Rate [bps/pin]</strong></td>
<td>200M~400M</td>
<td>400M~800M</td>
<td>800M~2.1G</td>
<td>1.6G~3.2G</td>
</tr>
<tr>
<td><strong>Pre-Fetch</strong></td>
<td>2 bit</td>
<td>4 bit</td>
<td>8 bit</td>
<td>8 bit</td>
</tr>
<tr>
<td><strong>STROBE</strong></td>
<td>Single DQS</td>
<td>Differential DQS, DQSB</td>
<td>Differential DQS, DQSB</td>
<td>Differential DQS, DQSB</td>
</tr>
<tr>
<td><strong>Interface</strong></td>
<td>SSTL_2</td>
<td>SSTL_18</td>
<td>SSTL_15</td>
<td>POD_12</td>
</tr>
<tr>
<td><strong>New Feature</strong></td>
<td></td>
<td>OCD calibration, ODT</td>
<td>Dynamic ODT, ZQ calibration, write leveling</td>
<td>CA parity, DBI*, CRC*, PDA*, CAL*, FGREF*, TCAR*, bank grouping</td>
</tr>
</tbody>
</table>

* DBI: Data bus inversion
* CRC: Cyclic redundancy check
* CAL: Command address latency

* PDA: Per DRAM addressability
* FGREF: Fine granularity refresh
* TCAR: Temperature controlled array refresh
• DDR4 modules feature a curved bottom edge that eases insertion and reduces stress on the PCB.
## DDR4 Speed Grades

<table>
<thead>
<tr>
<th>Standard name</th>
<th>Memory clock (MHz)</th>
<th>I/O bus clock (MHz)</th>
<th>Data rate (MT/s)</th>
<th>Module name</th>
<th>Peak transfer rate (MB/s)</th>
<th>Timings, CL-tRCD-tRP</th>
<th>CAS latency (ns)</th>
</tr>
</thead>
<tbody>
<tr>
<td>DDR4-1600J*</td>
<td>200</td>
<td>800</td>
<td>1600</td>
<td>PC4-12800</td>
<td>12800</td>
<td>10-10-10</td>
<td>12.5</td>
</tr>
<tr>
<td>DDR4-1600K</td>
<td>200</td>
<td>800</td>
<td>1600</td>
<td>PC4-12800</td>
<td>12800</td>
<td>11-11-11</td>
<td>13.75</td>
</tr>
<tr>
<td>DDR4-1600L</td>
<td>200</td>
<td>800</td>
<td>1600</td>
<td>PC4-12800</td>
<td>12800</td>
<td>12-12-12</td>
<td>15</td>
</tr>
<tr>
<td>DDR4-1866L*</td>
<td>233.33</td>
<td>933.33</td>
<td>1866.67</td>
<td>PC4-14900</td>
<td>14933.33</td>
<td>12-12-12</td>
<td>12.857</td>
</tr>
<tr>
<td>DDR4-1866M</td>
<td>233.33</td>
<td>933.33</td>
<td>1866.67</td>
<td>PC4-14900</td>
<td>14933.33</td>
<td>13-13-13</td>
<td>13.929</td>
</tr>
<tr>
<td>DDR4-1866N</td>
<td>233.33</td>
<td>933.33</td>
<td>1866.67</td>
<td>PC4-14900</td>
<td>14933.33</td>
<td>14-14-14</td>
<td>15</td>
</tr>
<tr>
<td>DDR4-2133N*</td>
<td>266.67</td>
<td>1066.67</td>
<td>2133.33</td>
<td>PC4-17000</td>
<td>17066.67</td>
<td>14-14-14</td>
<td>13.125</td>
</tr>
<tr>
<td>DDR4-2133P</td>
<td>266.67</td>
<td>1066.67</td>
<td>2133.33</td>
<td>PC4-17000</td>
<td>17066.67</td>
<td>15-15-15</td>
<td>14.063</td>
</tr>
<tr>
<td>DDR4-2133R</td>
<td>266.67</td>
<td>1066.67</td>
<td>2133.33</td>
<td>PC4-17000</td>
<td>17066.67</td>
<td>16-16-16</td>
<td>15</td>
</tr>
<tr>
<td>DDR4-2400P*</td>
<td>300</td>
<td>1200</td>
<td>2400</td>
<td>PC4-19200</td>
<td>19200</td>
<td>15-15-15</td>
<td>12.5</td>
</tr>
<tr>
<td>DDR4-2400R</td>
<td>300</td>
<td>1200</td>
<td>2400</td>
<td>PC4-19200</td>
<td>19200</td>
<td>16-16-16</td>
<td>13.33</td>
</tr>
<tr>
<td>DDR4-2400U</td>
<td>300</td>
<td>1200</td>
<td>2400</td>
<td>PC4-19200</td>
<td>19200</td>
<td>18-18-18</td>
<td>15</td>
</tr>
<tr>
<td>DDR4-3200</td>
<td>400</td>
<td>1600</td>
<td>3200</td>
<td>PC4-25600</td>
<td>25600</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
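
The columns of the table are related by simple arithmetic, sketched below. The sketch assumes a 64-bit (8-byte) module data bus and uses the 8n prefetch from the comparison table; `speed_grade` is a hypothetical helper name.

```python
PREFETCH = 8    # DDR4 prefetch depth (8n)
BUS_BYTES = 8   # assumed 64-bit module data bus, no ECC


def speed_grade(data_rate_mts: float, cas_cycles: int) -> dict:
    """Derive the other table columns from the data rate and CAS latency."""
    io_clock_mhz = data_rate_mts / 2             # two transfers per I/O clock
    memory_clock_mhz = data_rate_mts / PREFETCH  # core array clock
    peak_mb_s = data_rate_mts * BUS_BYTES        # MT/s times bytes per transfer
    cas_ns = cas_cycles * 1000 / io_clock_mhz    # cycles converted to time
    return {
        "memory_clock_mhz": memory_clock_mhz,
        "io_clock_mhz": io_clock_mhz,
        "peak_mb_s": peak_mb_s,
        "cas_ns": cas_ns,
    }


print(speed_grade(2400, 15))
```

For a data rate of 2400 MT/s with CL = 15 this reproduces the DDR4-2400P row: 300 MHz memory clock, 1200 MHz I/O clock, 19200 MB/s and 12.5 ns.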
## DDR4 command set

<table>
<thead>
<tr>
<th>Command</th>
<th>CS</th>
<th>BG1</th>
<th>BG0, BA1–0</th>
<th>ACT</th>
<th>A17</th>
<th>A16 RAS</th>
<th>A15 CAS</th>
<th>A14 WE</th>
<th>A13</th>
<th>A12</th>
<th>A11</th>
<th>A10</th>
<th>A9–0</th>
</tr>
</thead>
<tbody>
<tr>
<td>Deselect (no operation)</td>
<td>H</td>
<td colspan="12">X</td>
</tr>
<tr>
<td>Active (activate): open a row</td>
<td>L</td>
<td colspan="2">Bank</td>
<td>L</td>
<td colspan="9">Row address</td>
</tr>
<tr>
<td>No operation</td>
<td>L</td>
<td colspan="2">V</td>
<td>H</td>
<td>V</td>
<td>H</td>
<td>H</td>
<td>H</td>
<td colspan="5">V</td>
</tr>
<tr>
<td>ZQ calibration</td>
<td>L</td>
<td colspan="2">V</td>
<td>H</td>
<td>V</td>
<td>H</td>
<td>H</td>
<td>L</td>
<td colspan="3">V</td>
<td>Long</td>
<td>V</td>
</tr>
<tr>
<td>Read (BC, burst chop)</td>
<td>L</td>
<td colspan="2">Bank</td>
<td>H</td>
<td>V</td>
<td>H</td>
<td>L</td>
<td>H</td>
<td>V</td>
<td>BC</td>
<td>V</td>
<td>AP</td>
<td>Column</td>
</tr>
<tr>
<td>Write (AP, auto-precharge)</td>
<td>L</td>
<td colspan="2">Bank</td>
<td>H</td>
<td>V</td>
<td>H</td>
<td>L</td>
<td>L</td>
<td>V</td>
<td>BC</td>
<td>V</td>
<td>AP</td>
<td>Column</td>
</tr>
<tr>
<td>Unassigned, reserved</td>
<td>L</td>
<td colspan="2">V</td>
<td>H</td>
<td>V</td>
<td>L</td>
<td>H</td>
<td>H</td>
<td colspan="5">V</td>
</tr>
<tr>
<td>Precharge all banks</td>
<td>L</td>
<td colspan="2">V</td>
<td>H</td>
<td>V</td>
<td>L</td>
<td>H</td>
<td>L</td>
<td colspan="3">V</td>
<td>H</td>
<td>V</td>
</tr>
<tr>
<td>Precharge one bank</td>
<td>L</td>
<td colspan="2">Bank</td>
<td>H</td>
<td>V</td>
<td>L</td>
<td>H</td>
<td>L</td>
<td colspan="3">V</td>
<td>L</td>
<td>V</td>
</tr>
<tr>
<td>Refresh</td>
<td>L</td>
<td colspan="2">V</td>
<td>H</td>
<td>V</td>
<td>L</td>
<td>L</td>
<td>H</td>
<td colspan="5">V</td>
</tr>
<tr>
<td>Mode register set (MR0–MR6)</td>
<td>L</td>
<td>L</td>
<td>Register</td>
<td>H</td>
<td>L</td>
<td>L</td>
<td>L</td>
<td>L</td>
<td>L</td>
<td colspan="4">Data</td>
</tr>
</tbody>
</table>

**Signal level: H, high • L, low • V, either low or high (a valid signal) • X, irrelevant**
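
As a reading aid for the command table, the decoding can be sketched as a small lookup. This is a minimal illustration, not a controller implementation: `decode_command` is a hypothetical name, pin levels are passed as the strings "H" and "L", and only the pins needed to tell the commands apart are modeled.

```python
# (RAS, CAS, WE) levels when CS is low and ACT is high
_COMMANDS = {
    ("H", "H", "H"): "No operation",
    ("H", "H", "L"): "ZQ calibration",
    ("H", "L", "H"): "Read",
    ("H", "L", "L"): "Write",
    ("L", "H", "H"): "Unassigned, reserved",
    ("L", "H", "L"): "Precharge",
    ("L", "L", "H"): "Refresh",
    ("L", "L", "L"): "Mode register set",
}


def decode_command(cs: str, act: str, ras: str, cas: str, we: str,
                   a10: str = "L") -> str:
    """Name the DDR4 command encoded by the given pin levels."""
    if cs == "H":
        return "Deselect"
    if act == "L":
        # With ACT low, A16-A14 carry row address bits, not RAS/CAS/WE.
        return "Activate"
    name = _COMMANDS[(ras, cas, we)]
    if name == "Precharge":
        return "Precharge all banks" if a10 == "H" else "Precharge one bank"
    if name == "ZQ calibration":
        return "ZQ calibration (long)" if a10 == "H" else "ZQ calibration (short)"
    if name in ("Read", "Write") and a10 == "H":
        return name + " with auto-precharge"
    return name


print(decode_command("L", "H", "H", "L", "H"))        # Read
print(decode_command("L", "H", "L", "H", "L", "H"))   # Precharge all banks
```

Activate is the only command selected by the ACT pin alone, which is what frees A16 to A14 to carry row address bits during that command.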
DDR4 techniques for high-speed

• Prefetch/burst-length
• Bank groups
• Smaller row sizes ease bank switching
Prefetch

- Number of data words fetched for a single column command
- Prefetch is used to increase the data input/output rate even when the core memory speed remains the same
Prefetch (DDR3)

- $t_{CCD} = \text{Time between CAS to CAS commands}$
• DDR4 has 8n prefetch but uses bank groups to increase throughput
• Data should be read/written to different bank groups alternately to take advantage of lower tCCD_S
Prefetch (DDR4)

- $t_{CCD\_S}$ (short) is the $t_{CCD}$ that applies when consecutive column commands go to different bank groups
- $t_{CCD\_L}$ (long) is the $t_{CCD}$ that applies when consecutive column commands stay in the same bank group
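
The benefit of alternating bank groups can be sketched with a toy scheduler. The timing values below (4 and 6 clocks) are assumptions chosen for illustration; the real $t_{CCD\_L}$ depends on the speed grade and should be taken from the datasheet. `issue_cycles` is a hypothetical helper name.

```python
T_CCD_S = 4  # assumed clocks between column commands to different bank groups
T_CCD_L = 6  # assumed clocks between column commands within one bank group


def issue_cycles(bank_groups):
    """Return the earliest clock cycle at which each column command can issue."""
    cycles = []
    for i, group in enumerate(bank_groups):
        if i == 0:
            cycles.append(0)
        else:
            # Same bank group as the previous command: wait the long delay.
            gap = T_CCD_L if group == bank_groups[i - 1] else T_CCD_S
            cycles.append(cycles[-1] + gap)
    return cycles


print(issue_cycles([0, 0, 0, 0]))  # same bank group every time
print(issue_cycles([0, 1, 0, 1]))  # alternating bank groups
```

With these assumed values, four accesses to one bank group take until cycle 18 to issue, while alternating between two bank groups finishes issuing at cycle 12.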
Techniques for low power

- Reduced VDDQ (power supply voltage)
- POD12 (Pseudo Open Drain at 1.2V)
- DBI (Data Bus Inversion)
- Smaller row sizes reduce activation currents
POD12 (Pseudo Open Drain 1.2V)

- POD reduces switching current when driving data, since only 0s consume power
- Additional switching current savings can be realized with DBI enabled
- JEDEC JESD8-24 standard
DBI (Data Bus Inversion)

- Adopted for power savings
- The DQ byte is inverted if it contains more than four 0s
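
The inversion rule can be sketched for a single byte lane as follows. This is a minimal illustration: `dbi_encode` and `dbi_decode` are hypothetical names, and the inversion flag is returned as a boolean rather than modeling the polarity of the DBI pin.

```python
from typing import Tuple


def dbi_encode(byte: int) -> Tuple[int, bool]:
    """Return (byte to drive on DQ, inverted flag) for one 8-bit lane."""
    byte &= 0xFF
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        # More than four 0s: drive the complement so fewer lines are low.
        return (~byte) & 0xFF, True
    return byte, False


def dbi_decode(dq: int, inverted: bool) -> int:
    """Recover the original byte at the receiver."""
    return (~dq) & 0xFF if inverted else dq & 0xFF


for value in (0x00, 0x0F, 0x07):
    dq, inverted = dbi_encode(value)
    print(f"{value:#04x} -> DQ {dq:#04x}, inverted={inverted}")
```

A byte with exactly four 0s, such as 0x0F, is sent unchanged, while 0x07 (five 0s) is sent as 0xF8.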
DDR4 SDRAM chip block diagram

<table>
<thead>
<tr>
<th>Symbol</th>
<th>Parameter</th>
<th>Rating (min, typ, max)</th>
<th>Unit</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>$V_{DD}$</td>
<td>Supply voltage</td>
<td>1.14, 1.2, 1.26</td>
<td>V</td>
<td>1, 2, 3, 4, 5</td>
</tr>
<tr>
<td>$V_{DDQ}$</td>
<td>Supply voltage for output</td>
<td>1.14, 1.2, 1.26</td>
<td>V</td>
<td>1, 2, 6</td>
</tr>
<tr>
<td>$V_{PP}$</td>
<td>Wordline supply voltage</td>
<td>2.375, 2.5, 2.750</td>
<td>V</td>
<td>7</td>
</tr>
</tbody>
</table>
PCB microstrip transmission line

- DQ[x:0] and DQS should arrive at the receiver at the same time
- Any wire carrying an AC signal whose period is less than the propagation delay of the wire needs to be considered as a transmission line
- PCB microstrip signal speed ≈ 6 inch/ns on FR-4 (roughly half the speed of light in vacuum)
- Target impedance for DDR4 microstrip transmission line is 100 to 120 ohms
PCB trace length matching

- Figure panels: unmatched versus length-matched traces
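
To see why length matching matters, a length mismatch can be converted into timing skew and compared with the width of one data bit. In this sketch the propagation speed is an input supplied by the caller; the 6 inch/ns used in the example is an assumed typical FR-4 microstrip figure, and the 0.1 inch mismatch is purely illustrative.

```python
def skew_ps(length_mismatch_in: float, speed_in_per_ns: float) -> float:
    """Timing skew in picoseconds caused by a trace length mismatch in inches."""
    return length_mismatch_in / speed_in_per_ns * 1000.0


def unit_interval_ps(data_rate_mts: float) -> float:
    """Duration of one data bit in picoseconds at the given data rate."""
    return 1e6 / data_rate_mts


speed = 6.0      # inch/ns, assumed value for illustration
mismatch = 0.1   # inches of DQ-to-DQS length mismatch, illustrative
ui = unit_interval_ps(3200)
skew = skew_ps(mismatch, speed)
print(f"UI = {ui:.1f} ps, skew = {skew:.1f} ps ({100 * skew / ui:.1f}% of UI)")
```

At 3200 MT/s a bit lasts 312.5 ps, so even a small mismatch consumes a visible share of the timing budget.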
DDR4 DIMM PCB

- 8 layer PCB
- 8 SDRAM chips
- FBGA footprint (Fine Pitch Ball Grid Array)
Types of DDR4 DIMM modules

- Mainstream
  - UDIMM (Unbuffered DIMM)
  - LRDIMM (Load-reduced DIMM)
  - RDIMM (Registered DIMM)
  - FBDIMM (Fully-buffered DIMM)
  - NVDIMM (Non-volatile DIMM)

- Physical sizes
  - DIMM
  - SODIMM (Small-outline DIMM)
  - VLP-DIMM (Very low profile DIMM)
  - Mini-DIMM
UDIMM (Unbuffered DIMM)

- Regular DIMMs used in desktop computers
- Memory controller interfaces directly to SDRAM chips
- SPD (Serial Presence Detect) ROM provides information about the DIMM during the power-on self-test
RDIMM (Registered DIMM)

- Buffers command and address signals. Data signals connect unbuffered
- Contains RCD (Registering Clock Driver)
- More robust than UDIMMs
- ECC option available
LRDIMM (Load-Reduced DIMM)

- Trades off latency for capacity
- Buffers both data and command/address lines
- Contains RCD (Registering Clock Driver) and DB (Data Buffer) for distributed buffering of data
- ECC available
FB-DIMM (Fully-Buffered DIMM)

- Scalable to very large memory sizes, at the cost of latency and speed
- Uses serial data link to transfer data
- Additional logic on DIMM converts serial to parallel
- JEDEC JESD206
- Used in servers
NVDIMM (Non-volatile DIMM)

- DDR4 DIMMs can contain memory other than DRAM
- Called “Hybrid Memory Modules”
- JEDEC JESD248 standard for NAND-backed DRAM
- Another example is Intel and Micron’s 3D XPoint DDR4 DIMM
- Mainly used in servers
DDR4 DIMM sizes

UDIMM/RDIMM/LRDIMM

SODIMM/ SORDIMM

VLP-UDIMM/VLP-RDIMM

VLP-Mini UDIMM
Implementing a DDR4 DIMM system

- JEDEC standards dictate almost all aspects of a DDR4 system, which ensures interoperability
- List of standards
  - JESD79-4 : DDR4 SDRAM specs
  - JESD8-24 : POD12
  - JESD82-32 : DB specs
  - JESD82-31 : RCD specs
  - JESD248 : NVDIMM specs
  - SPD4.1.2 : SPD specs
  - Etc…
- Ad-hoc standards such as DFI (DDR PHY Interface)
Other types of DDR memory

- **GDDR – Graphics DDR**
  - Optimized for large bandwidth and soldered on board. Has large IO like x16/x32
  - Used in conjunction with GPUs
- **LPDDR – Low Power DDR (also called Mobile DDR)**
  - Optimized for low power consumption and used as PoP or SiP
  - Used in mobile devices
The future

- DDR4 is the latest mainstream memory standard (as of April 2017)
- DDR5 development is in progress. JEDEC is working on the specification and plans to release it in 2018 (estimated), with DIMMs available to end users around 2020
- Hybrid Memory Cube seeks to extend the life of DRAM by using stacked dies connected by TSVs
- Memory technologies such as 3D XPoint, PCM and RRAM are possible successors
“That’s all Folks!”
References

• Micron DDR4 SDRAM MT40A1G4, MT40A512M8, MT40A256M16 datasheet
• ISSCC2013_T2, Chulwoo Kim, "High-Bandwidth Memory Interface Design"
• JEDEC JESD79-4
• JEDEC JESD8-24
• https://en.wikipedia.org/wiki/DDR4_SDRAM
• https://en.wikipedia.org/wiki/Synchronous_dynamic_random-access_memory
• https://www.micron.com/products/dram/ddr3-to-ddr4
• https://www.synopsys.com/designware-ip/technical-bulletin/ddr4-bank-groups.html
• http://frankdenneman.nl/2015/02/25/memory-deep-dive-ddr4/
• https://en.wikipedia.org/wiki/DIMM
• https://en.wikipedia.org/wiki/Memory_controller#SCRAMBLING
• http://www.anandtech.com/show/3851/everything-you-always-wanted-to-know-about-sdram-memory-but-were-afraid-to-ask/
• https://en.wikipedia.org/wiki/Registered_memory
• https://en.wikipedia.org/wiki/Fully_Buffered_DIMM
• https://en.wikipedia.org/wiki/Northbridge_(computing)
• https://en.wikipedia.org/wiki/NVDIMM
QUESTIONS?