Dynamic random-access memory

A die photograph of the Micron Technology MT4C1024 DRAM integrated circuit (1994). It has a capacity of 1 megabit, equivalent to 2^20 bits or 128 KiB.[1]
Motherboard of the NeXTcube computer, 1990, with 64 MiB main memory DRAM (top left) and 256 KiB of VRAM[2] (lower edge, right of middle)

Dynamic random-access memory (dynamic RAM or DRAM) is a type of random-access semiconductor memory that stores each bit of data in a memory cell, usually consisting of a tiny capacitor and a transistor, both typically based on metal–oxide–semiconductor (MOS) technology. While most DRAM memory cell designs use a capacitor and transistor, some only use two transistors. In the designs where a capacitor is used, the capacitor can either be charged or discharged; these two states are taken to represent the two values of a bit, conventionally called 0 and 1. The electric charge on the capacitors gradually leaks away; without intervention the data on the capacitor would soon be lost. To prevent this, DRAM requires an external memory refresh circuit which periodically rewrites the data in the capacitors, restoring them to their original charge. This refresh process is the defining characteristic of dynamic random-access memory, in contrast to static random-access memory (SRAM) which does not require data to be refreshed. Unlike flash memory, DRAM is volatile memory (as opposed to non-volatile memory), since it loses its data quickly when power is removed. However, DRAM does exhibit limited data remanence.

DRAM typically takes the form of an integrated circuit chip, which can consist of dozens to billions of DRAM memory cells. DRAM chips are widely used in digital electronics where low-cost and high-capacity computer memory is required. One of the largest applications for DRAM is the main memory (colloquially called the RAM) in modern computers and graphics cards (where the main memory is called the graphics memory). It is also used in many portable devices and video game consoles. In contrast, SRAM, which is faster and more expensive than DRAM, is typically used where speed is of greater concern than cost and size, such as the cache memories in processors.

The need to refresh DRAM demands more complicated circuitry and timing than SRAM. This complexity is offset by the structural simplicity of DRAM memory cells: only one transistor and a capacitor are required per bit, compared to four or six transistors in SRAM. This allows DRAM to reach very high densities with a simultaneous reduction in cost per bit. Refreshing the data consumes power, so a variety of techniques are used to manage the overall power consumption. For this reason, DRAM usually operates with a memory controller; the controller must know DRAM parameters, especially memory timings, to initialize DRAMs, and these parameters vary by manufacturer and part number.

DRAM saw a 47% increase in price-per-bit in 2017, the largest jump in 30 years since the 45% jump in 1988, while in recent years the price has been going down.[3] In 2018, a "key characteristic of the DRAM market is that there are currently only three major suppliers — Micron Technology, SK Hynix and Samsung Electronics" that are "keeping a pretty tight rein on their capacity".[4] Kioxia (the former Toshiba Memory Corporation, spun off in 2017) does not manufacture DRAM. Other companies make and sell DIMMs but not the DRAM chips in them, such as Kingston Technology; some sell stacked DRAM (used e.g. in the fastest exascale supercomputers) separately, such as Viking Technology; and others sell DRAM integrated into their own products, such as Fujitsu into its CPUs, AMD in GPUs, and Nvidia, with HBM2 in some of its GPU chips.

History

Precursors
A schematic drawing depicting the cross-section of the original one-transistor, one-capacitor NMOS DRAM cell. It was patented in 1968.

The cryptanalytic machine code-named Aquarius used at Bletchley Park during World War II incorporated a hard-wired dynamic memory. Paper tape was read and the characters on it "were remembered in a dynamic store." The store used a large bank of capacitors, which were either charged or not, a charged capacitor representing cross (1) and an uncharged capacitor dot (0). Since the charge gradually leaked away, a periodic pulse was applied to top up those still charged (hence the term 'dynamic').[5]

In November 1965, Toshiba introduced a bipolar dynamic RAM for its electronic calculator Toscal BC-1411.[6][7][8] In 1966, Tomohisa Yoshimaru and Hiroshi Komikawa from Toshiba applied for a Japanese patent on a memory circuit composed of several transistors and a capacitor; in 1967 they applied for a patent in the US.[9]

The earliest forms of DRAM mentioned above used bipolar transistors. While bipolar DRAM offered improved performance over magnetic-core memory, it could not compete with the lower price of the then-dominant core memory.[10] Capacitors had also been used for earlier memory schemes, such as the drum of the Atanasoff–Berry Computer, the Williams tube and the Selectron tube.

Single MOS DRAM

In 1966, Dr. Robert Dennard invented the modern DRAM architecture, in which a single MOS transistor is paired with each capacitor,[11] at the IBM Thomas J. Watson Research Center, while he was working on MOS memory and was trying to create an alternative to SRAM, which required six MOS transistors for each bit of data. While examining the characteristics of MOS technology, he found it was capable of building capacitors, and that storing a charge or no charge on the MOS capacitor could represent the 1 and 0 of a bit, while the MOS transistor could control writing the charge to the capacitor. This led to his development of the single-transistor MOS DRAM memory cell.[12] He filed a patent in 1967, and was granted U.S. patent number 3,387,286 in 1968.[13] MOS memory offered higher performance, was cheaper, and consumed less power than magnetic-core memory.[14] The patent describes the invention: "Each cell is formed, in one embodiment, using a single field-effect transistor and a single capacitor."[15]

MOS DRAM chips were commercialized in 1969 by Advanced Memory Systems, Inc. of Sunnyvale, CA. This 1024-bit chip was sold to Honeywell, Raytheon, Wang Laboratories, and others. The same year, Honeywell asked Intel to make a DRAM using a three-transistor cell that they had developed. This became the Intel 1102 in early 1970.[16] However, the 1102 had many problems, prompting Intel to begin work on their own improved design, in secrecy to avoid conflict with Honeywell. This became the first commercially available DRAM, the Intel 1103, in October 1970, despite initial problems with low yield until the fifth revision of the masks. The 1103 was designed by Joel Karp and laid out by Pat Earhart. The masks were cut by Barbara Maness and Judy Garcia.[17] MOS memory overtook magnetic-core memory as the dominant memory technology in the early 1970s.[14]

The first DRAM with multiplexed row and column address lines was the Mostek MK4096 4 Kbit DRAM designed by Robert Proebsting and introduced in 1973. This addressing scheme uses the same address pins to receive the low half and the high half of the address of the memory cell being referenced, switching between the two halves on alternating bus cycles. This was a radical advance, effectively halving the number of address lines required, which enabled it to fit into packages with fewer pins, a cost advantage that grew with every jump in memory size. The MK4096 proved to be a very robust design for customer applications. At the 16 Kbit density, the cost advantage increased; the 16 Kbit Mostek MK4116 DRAM,[18][19] introduced in 1976, achieved greater than 75% worldwide DRAM market share. However, as density increased to 64 Kbit in the early 1980s, Mostek and other US manufacturers were overtaken by Japanese DRAM manufacturers, which dominated the US and worldwide markets during the 1980s and 1990s.

Early in 1985, Gordon Moore decided to withdraw Intel from producing DRAM.[20] By 1986, many, but not all, United States chip makers had stopped making DRAMs.[21] Micron Technology and Texas Instruments continued to produce them commercially, and IBM produced them for internal use.

In 1985, when 64K DRAM memory chips were the most common memory chips used in computers, and when more than 60 percent of those chips were produced by Japanese companies, semiconductor makers in the United States accused Japanese companies of export dumping for the purpose of driving makers in the United States out of the commodity memory chip business. Prices for the 64K product plummeted to as low as 35 cents apiece from $3.50 within 18 months, with disastrous financial consequences for some U.S. firms. On 4 December 1985 the US Commerce Department's International Trade Administration ruled in favor of the complaint.[22][23][24][25]

Synchronous dynamic random-access memory (SDRAM) was developed by Samsung. The first commercial SDRAM chip was the Samsung KM48SL2000, which had a capacity of 16 Mb,[26] and was introduced in 1992.[27] The first commercial DDR SDRAM (double data rate SDRAM) memory chip was Samsung's 64 Mb DDR SDRAM chip, released in 1998.[28]

Later, in 2001, Japanese DRAM makers accused Korean DRAM manufacturers of dumping.[29][30][31][32]

In 2002, US computer makers made claims of DRAM price fixing.

Principles of operation
The principles of operation for reading a simple 4×4 DRAM array
Basic structure of a DRAM cell array

DRAM is usually arranged in a rectangular array of charge storage cells consisting of one capacitor and transistor per data bit. The figure to the right shows a simple example with a four-by-four cell matrix. Some DRAM matrices are many thousands of cells in height and width.[33][34]

The long horizontal lines connecting each row are known as word-lines. Each column of cells is composed of two bit-lines, each connected to every other storage cell in the column (the illustration to the right does not include this important detail). They are generally known as the + and − bit lines.

A sense amplifier is essentially a pair of cross-connected inverters between the bit-lines. The first inverter is connected with input from the + bit-line and output to the − bit-line. The second inverter's input is from the − bit-line with output to the + bit-line. This results in positive feedback which stabilizes after one bit-line is fully at its highest voltage and the other bit-line is at the lowest possible voltage.

Operations to read a data bit from a DRAM storage cell
  1. The sense amplifiers are disconnected.[35]
  2. The bit-lines are precharged to exactly equal voltages that are in between high and low logic levels (e.g., 0.5 V if the two levels are 0 and 1 V). The bit-lines are physically symmetrical to keep the capacitance equal, and therefore at this time their voltages are equal.[35]
  3. The precharge circuit is switched off. Because the bit-lines are relatively long, they have enough capacitance to maintain the precharged voltage for a brief time. This is an example of dynamic logic.[35]
  4. The desired row's word-line is then driven high to connect a cell's storage capacitor to its bit-line. This causes the transistor to conduct, transferring charge from the storage cell to the connected bit-line (if the stored value is 1) or from the connected bit-line to the storage cell (if the stored value is 0). Since the capacitance of the bit-line is typically much higher than the capacitance of the storage cell, the voltage on the bit-line increases very slightly if the storage cell's capacitor is charged and decreases very slightly if it is discharged (e.g., to 0.54 and 0.45 V in the two cases; see the charge-sharing sketch after this list). As the other bit-line holds 0.50 V there is a small voltage difference between the two twisted bit-lines.[35]
  5. The sense amplifiers are now connected to the bit-lines pairs. Positive feedback then occurs from the cross-connected inverters, thereby amplifying the small voltage difference between the odd and even row bit-lines of a particular column until one bit line is fully at the lowest voltage and the other is at the maximum high voltage. Once this has happened, the row is open (the desired cell data is available).[35]
  6. All storage cells in the open row are sensed simultaneously, and the sense amplifier outputs latched. A column address then selects which latch bit to connect to the external data bus. Reads of different columns in the same row can be performed without a row opening delay because, for the open row, all data has already been sensed and latched.[35]
  7. While reading of columns in an open row is occurring, current is flowing back up the bit-lines from the output of the sense amplifiers and recharging the storage cells. This reinforces (i.e. refreshes) the charge in the storage cell by increasing the voltage in the storage capacitor if it was charged to begin with, or by keeping it discharged if it was empty. Note that due to the length of the bit-lines there is a fairly long propagation delay for the charge to be transferred back to the cell's capacitor. This takes significant time past the end of sense amplification, and thus overlaps with one or more column reads.[35]
  8. When done with reading all the columns in the current open row, the word-line is switched off to disconnect the storage cell capacitors (the row is closed) from the bit-lines. The sense amplifier is switched off, and the bit-lines are precharged again.[35]
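
As a concrete illustration of steps 2–5, the following sketch computes the bit-line voltage after charge sharing by conservation of charge. The 25 fF cell, the 10× bit-line capacitance and the 0.5 V precharge are assumed round numbers for illustration, not values from any particular device.

```c
/* Charge-sharing sketch: what happens to the bit-line voltage when a
 * DRAM cell is connected to it (step 4 above). Assumed illustrative
 * values, not taken from a datasheet. */
#include <stdio.h>

int main(void) {
    const double c_cell = 25e-15;      /* storage cell capacitance, farads (assumed) */
    const double c_bitline = 250e-15;  /* bit-line capacitance, ~10x the cell */
    const double v_precharge = 0.5;    /* bit-line precharged midway between 0 and 1 V */

    /* Conservation of charge: V = (Cbl*Vbl + Ccell*Vcell) / (Cbl + Ccell) */
    for (int bit = 0; bit <= 1; bit++) {
        double v_cell = bit ? 1.0 : 0.0;
        double v_shared = (c_bitline * v_precharge + c_cell * v_cell)
                        / (c_bitline + c_cell);
        printf("stored %d -> bit-line settles at %.3f V (delta %+.3f V)\n",
               bit, v_shared, v_shared - v_precharge);
    }
    return 0;
}
```

With these numbers the bit-line lands at about 0.545 V for a stored 1 and 0.455 V for a stored 0, matching the 0.54/0.45 V figures in step 4; the sense amplifier then amplifies this roughly ±45 mV difference to full logic levels.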

To write to memory
Writing to a DRAM cell

To store data, a row is opened and a given column's sense amplifier is temporarily forced to the desired high or low-voltage state, thus causing the bit-line to charge or discharge the cell storage capacitor to the desired value. Due to the sense amplifier's positive feedback configuration, it will hold a bit-line at stable voltage even after the forcing voltage is removed. During a write to a particular cell, all the columns in a row are sensed simultaneously just as during reading, so although only a single column's storage-cell capacitor charge is changed, the entire row is refreshed (written back in), as illustrated in the figure to the right.[35]

Refresh rate

Typically, manufacturers specify that each row must be refreshed every 64 ms or less, as defined by the JEDEC standard.

Some systems refresh every row in a burst of activity involving all rows every 64 ms. Other systems refresh one row at a time staggered throughout the 64 ms interval. For example, a system with 2^13 = 8,192 rows would require a staggered refresh rate of one row every 7.8 μs (64 ms divided by 8,192 rows). A few real-time systems refresh a portion of memory at a time determined by an external timer function that governs the operation of the rest of the system, such as the vertical blanking interval that occurs every 10–20 ms in video equipment.
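
The staggered-refresh arithmetic is simple enough to check directly; this small sketch just reproduces the 7.8 μs per-row figure from the example above.

```c
/* Distributed-refresh arithmetic: a 64 ms refresh window spread evenly
 * across 2^13 = 8,192 rows. */
#include <stdio.h>

int main(void) {
    const double refresh_window_ms = 64.0;
    const unsigned rows = 1u << 13;                 /* 8,192 rows */
    double per_row_us = refresh_window_ms * 1000.0 / rows;
    printf("one row every %.1f us\n", per_row_us);  /* prints 7.8 us */
    return 0;
}
```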

The row address of the row that will be refreshed next is maintained by external logic or a counter within the DRAM. A system that provides the row address (and the refresh command) does so to have greater control over when to refresh and which row to refresh. This is done to minimize conflicts with memory accesses, since such a system has both knowledge of the memory access patterns and the refresh requirements of the DRAM. When the row address is supplied by a counter within the DRAM, the system relinquishes control over which row is refreshed and only provides the refresh command. Some modern DRAMs are capable of self-refresh; no external logic is required to instruct the DRAM to refresh or to provide a row address.

Under some conditions, most of the data in DRAM can be recovered even if the DRAM has not been refreshed for several minutes.[36]

Memory timing

Many parameters are required to fully describe the timing of DRAM operation. Here are some examples for two timing grades of asynchronous DRAM, from a data sheet published in 1998:[37]

Asynchronous DRAM typical timing

| Parameter | "50 ns" | "60 ns" | Description |
|-----------|---------|---------|-------------|
| tRC  | 84 ns | 104 ns | Random read or write cycle time (from one full /RAS cycle to another) |
| tRAC | 50 ns | 60 ns  | Access time: /RAS low to valid data out |
| tRCD | 11 ns | 14 ns  | /RAS low to /CAS low time |
| tRAS | 50 ns | 60 ns  | /RAS pulse width (minimum /RAS low time) |
| tRP  | 30 ns | 40 ns  | /RAS precharge time (minimum /RAS high time) |
| tPC  | 20 ns | 25 ns  | Page-mode read or write cycle time (/CAS to /CAS) |
| tAA  | 25 ns | 30 ns  | Access time: Column address valid to valid data out (includes address setup time before /CAS low) |
| tCAC | 13 ns | 15 ns  | Access time: /CAS low to valid data out |
| tCAS | 8 ns  | 10 ns  | /CAS low pulse width minimum |

Thus, the generally quoted number is the /RAS low to valid data out time (tRAC in the table above). This is the time to open a row, settle the sense amplifiers, and deliver the selected column data to the output. It is also the minimum /RAS low time, which includes the time for the amplified data to be delivered back to recharge the cells. The time to read additional bits from an open page is much shorter, defined by the /CAS to /CAS cycle time. The quoted number is the clearest way to compare the performance of different DRAM memories, as it sets the slower limit regardless of the row length or page size. Bigger arrays inevitably have larger bit-line capacitance and longer propagation delays, which increase this time, since the sense amplifier settling time depends on both the capacitance and the propagation latency. Modern DRAM chips counter this by integrating many smaller, complete DRAM arrays within a single chip, accommodating more capacity without becoming too slow.

When such a RAM is accessed by clocked logic, the times are generally rounded up to the nearest clock cycle. For example, when accessed by a 100 MHz state machine (i.e. a 10 ns clock), the 50 ns DRAM can perform the first read in five clock cycles, and additional reads within the same page every two clock cycles. This was generally described as "5-2-2-2" timing, as bursts of four reads within a page were common.
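
A minimal sketch of that rounding, using the tRAC and tPC values from the table above and the 100 MHz controller clock from the example; the figures reproduce the "5-2-2-2" timing.

```c
/* Rounding asynchronous DRAM timings up to whole cycles of a 100 MHz
 * (10 ns) controller clock. Compile with -lm for ceil(). */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double clk_ns = 10.0;   /* 100 MHz state machine */
    const double t_rac = 50.0;    /* first access: /RAS low to data out */
    const double t_pc  = 20.0;    /* page-mode cycle: /CAS to /CAS */
    printf("first read: %.0f cycles\n", ceil(t_rac / clk_ns)); /* 5 */
    printf("page reads: %.0f cycles\n", ceil(t_pc  / clk_ns)); /* 2 */
    return 0;
}
```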

When describing synchronous memory, timing is described by clock cycle counts separated by hyphens. These numbers represent tCL-tRCD-tRP-tRAS in multiples of the DRAM clock cycle time. Note that the DRAM clock rate is half the data transfer rate when double data rate signaling is used. JEDEC standard PC3200 timing is 3-4-4-8[38] with a 200 MHz clock, while premium-priced high performance PC3200 DDR DRAM DIMMs might be operated at 2-2-2-5 timing.[39]

Synchronous DRAM typical timing

| Parameter | Grade | PC-3200 (DDR-400) | PC2-6400 (DDR2-800) | PC3-12800 (DDR3-1600) | Description |
|-----------|-------|-------------------|---------------------|-----------------------|-------------|
| tCL  | Typical | 3 cycles (15 ns) | 5 cycles (12.5 ns) | 9 cycles (11.25 ns)  | /CAS low to valid data out (equivalent to tCAC) |
| tCL  | Fast    | 2 cycles (10 ns) | 4 cycles (10 ns)   | 8 cycles (10 ns)     | |
| tRCD | Typical | 4 cycles (20 ns) | 5 cycles (12.5 ns) | 9 cycles (11.25 ns)  | /RAS low to /CAS low time |
| tRCD | Fast    | 2 cycles (10 ns) | 4 cycles (10 ns)   | 8 cycles (10 ns)     | |
| tRP  | Typical | 4 cycles (20 ns) | 5 cycles (12.5 ns) | 9 cycles (11.25 ns)  | /RAS precharge time (minimum precharge to active time) |
| tRP  | Fast    | 2 cycles (10 ns) | 4 cycles (10 ns)   | 8 cycles (10 ns)     | |
| tRAS | Typical | 8 cycles (40 ns) | 16 cycles (40 ns)  | 27 cycles (33.75 ns) | Row active time (minimum active to precharge time) |
| tRAS | Fast    | 5 cycles (25 ns) | 12 cycles (30 ns)  | 24 cycles (30 ns)    | |

Minimum random access time has improved from tRAC = 50 ns to tRCD + tCL = 22.5 ns, and even the premium 20 ns variety is only 2.5 times faster than the asynchronous DRAM. CAS latency has improved even less, from tCAC = 13 ns to 10 ns. However, the DDR3 memory does achieve 32 times higher bandwidth; due to internal pipelining and wide data paths, it can output two words every 1.25 ns (1600 Mword/s), while the EDO DRAM can output one word per tPC = 20 ns (50 Mword/s).
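
These figures follow directly from the cycle counts and clock rates; here is a quick check for the PC3-12800 (DDR3-1600) column, using its 800 MHz I/O clock.

```c
/* Converting the cycle counts quoted for synchronous DRAM into
 * nanoseconds, for the PC3-12800 (DDR3-1600) example above. */
#include <stdio.h>

int main(void) {
    const double clk_mhz = 800.0;              /* DDR3-1600 I/O clock */
    const double cycle_ns = 1000.0 / clk_mhz;  /* 1.25 ns per cycle */
    const int t_rcd = 9, t_cl = 9;             /* "typical" grade, in cycles */

    printf("random access: tRCD + tCL = %.2f ns\n",
           (t_rcd + t_cl) * cycle_ns);         /* 22.50 ns */
    /* Two words transfer per clock with double data rate signalling: */
    printf("peak rate: %.0f Mword/s\n", 2.0 * clk_mhz);  /* 1600 */
    return 0;
}
```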

Timing abbreviations
  • tCL – CAS latency
  • tCR – Command rate
  • tPTP – Precharge to precharge delay
  • tRAS – RAS active time
  • tRCD – RAS to CAS delay
  • tREF – Refresh period
  • tRFC – Row refresh cycle time
  • tRP – RAS precharge
  • tRRD – RAS to RAS delay
  • tRTP – Read to precharge delay
  • tRTR – Read to read delay
  • tRTW – Read to write delay
  • tWR – Write recovery time
  • tWTP – Write to precharge delay
  • tWTR – Write to read delay
  • tWTW – Write to write delay

Memory cell design

Each bit of data in a DRAM is stored as a positive or negative electrical charge in a capacitive structure. The structure providing the capacitance, as well as the transistors that control access to it, is collectively referred to as a DRAM cell. They are the fundamental building block in DRAM arrays. Multiple DRAM memory cell variants exist, but the most commonly used variant in modern DRAMs is the one-transistor, one-capacitor (1T1C) cell. The transistor is used to admit current into the capacitor during writes, and to discharge the capacitor during reads. The access transistor is designed to maximize drive strength and minimize transistor-transistor leakage (Kenner, p. 34).

The capacitor has two terminals, one of which is connected to its access transistor, and the other to either ground or VCC/2. In modern DRAMs, the latter case is more common, since it allows faster operation. In modern DRAMs, a voltage of +VCC/2 across the capacitor is required to store a logic one, and a voltage of −VCC/2 across the capacitor is required to store a logic zero. The resultant charge is Q = ±(VCC/2)·C, where Q is the charge in coulombs and C is the capacitance in farads.[40]
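
A worked instance of Q = ±(VCC/2)·C follows. The 25 fF cell capacitance and 1.2 V supply are assumed typical values chosen for illustration, not figures from a specific process.

```c
/* Stored signal charge for a single DRAM cell, Q = (VCC/2) * C,
 * with assumed values: C = 25 fF, VCC = 1.2 V. */
#include <stdio.h>

int main(void) {
    const double c = 25e-15;                 /* cell capacitance, farads (assumed) */
    const double vcc = 1.2;                  /* supply voltage, volts (assumed) */
    const double q = (vcc / 2.0) * c;        /* signal charge, coulombs */
    const double electrons = q / 1.602e-19;  /* in elementary charges */
    printf("Q = %.1f fC (~%.0f electrons)\n", q * 1e15, electrons);
    return 0;
}
```

The result, about 15 fC or fewer than 100,000 electrons, gives a sense of why leakage and radiation-induced charge loss make refresh and error correction necessary.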

Reading or writing a logic one requires the wordline be driven to a voltage greater than the sum of VCC and the access transistor's threshold voltage (VTH). This voltage is called VCC pumped (VCCP). The time required to discharge a capacitor thus depends on what logic value is stored in the capacitor. A capacitor containing logic one begins to discharge when the voltage at the access transistor's gate terminal is above VCCP. If the capacitor contains a logic zero, it begins to discharge when the gate terminal voltage is above VTH.[41]

Capacitor design

Up until the mid-1980s, the capacitors in DRAM cells were co-planar with the access transistor (they were constructed on the surface of the substrate), and thus were referred to as planar capacitors. The drive to increase both density and, to a lesser extent, performance, required denser designs. This was strongly motivated by economics, a major consideration for DRAM devices, especially commodity DRAMs. The minimization of DRAM cell area can produce a denser device and lower the cost per bit of storage. Starting in the mid-1980s, the capacitor was moved above or below the silicon substrate in order to meet these objectives. DRAM cells featuring capacitors above the substrate are referred to as stacked or folded plate capacitors. Those with capacitors buried beneath the substrate surface are referred to as trench capacitors. In the 2000s, manufacturers were sharply divided by the type of capacitor used in their DRAMs, and the relative cost and long-term scalability of both designs have been the subject of extensive debate. The majority of DRAMs, from major manufacturers such as Hynix, Micron Technology, and Samsung Electronics, use the stacked capacitor structure, whereas smaller manufacturers such as Nanya Technology use the trench capacitor structure (Jacob, pp. 355–357).

The capacitor in the stacked capacitor scheme is constructed above the surface of the substrate. The capacitor is constructed from an oxide-nitride-oxide (ONO) dielectric sandwiched in between two layers of polysilicon plates (the top plate is shared by all DRAM cells in an IC), and its shape can be a rectangle, a cylinder, or some other more complex shape. There are two basic variations of the stacked capacitor, based on its location relative to the bitline—capacitor-under-bitline (CUB) and capacitor-over-bitline (COB). In the former, the capacitor is underneath the bitline, which is usually made of metal, and the bitline has a polysilicon contact that extends downwards to connect it to the access transistor's source terminal. In the latter, the capacitor is constructed above the bitline, which is almost always made of polysilicon, but is otherwise identical to the CUB variation. The advantage the COB variant possesses is the ease of fabricating the contact between the bitline and the access transistor's source as it is physically close to the substrate surface. However, this requires the active area to be laid out at a 45-degree angle when viewed from above, which makes it difficult to ensure that the capacitor contact does not touch the bitline. CUB cells avoid this, but suffer from difficulties in inserting contacts in between bitlines, since the size of features this close to the surface are at or near the minimum feature size of the process technology (Kenner, pp. 33–42).

The trench capacitor is constructed by etching a deep hole into the silicon substrate. The substrate volume surrounding the hole is then heavily doped to produce a buried n+ plate with low resistance. A layer of oxide-nitride-oxide dielectric is grown or deposited, and finally the hole is filled by depositing doped polysilicon, which forms the top plate of the capacitor. The top of the capacitor is connected to the access transistor's drain terminal via a polysilicon strap (Kenner, pp. 42–44). A trench capacitor's depth-to-width ratio in DRAMs of the mid-2000s can exceed 50:1 (Jacob, p. 357).

Trench capacitors have numerous advantages. Since the capacitor is buried in the bulk of the substrate instead of lying on its surface, the area it occupies can be minimized to what is required to connect it to the access transistor's drain terminal without decreasing the capacitor's size, and thus capacitance (Jacob, pp. 356–357). Alternatively, the capacitance can be increased by etching a deeper hole without any increase to surface area (Kenner, p. 44). Another advantage of the trench capacitor is that its structure is under the layers of metal interconnect, allowing them to be more easily made planar, which enables it to be integrated in a logic-optimized process technology, which has many levels of interconnect above the substrate. The fact that the capacitor is under the logic means that it is constructed before the transistors are. This allows high-temperature processes to fabricate the capacitors, which would otherwise degrade the logic transistors and their performance. This makes trench capacitors suitable for constructing embedded DRAM (eDRAM) (Jacob, p. 357). Disadvantages of trench capacitors are difficulties in reliably constructing the capacitor's structures within deep holes and in connecting the capacitor to the access transistor's drain terminal (Kenner, p. 44).

Historical cell designs

First-generation DRAM ICs (those with capacities of 1 Kbit), such as the archetypical Intel 1103, used a three-transistor, one-capacitor (3T1C) DRAM cell with separate read and write circuitry. The write wordline drove a write transistor which connected the capacitor to the write bitline just as in the 1T1C cell, but there was a separate read wordline and read transistor which connected an amplifier transistor to the read bitline. By the second generation, the drive to reduce cost by fitting the same number of bits in a smaller area led to the almost universal adoption of the 1T1C DRAM cell, although a couple of devices with 4 and 16 Kbit capacities continued to use the 3T1C cell for performance reasons (Kenner, p. 6). These performance advantages included, most significantly, the ability to read the state stored by the capacitor without discharging it, avoiding the need to write back what was read out (non-destructive read). A second performance advantage relates to the 3T1C cell's separate transistors for reading and writing; the memory controller can exploit this feature to perform atomic read-modify-writes, where a value is read, modified, and then written back as a single, indivisible operation (Jacob, p. 459).

Proposed cell designs

The one-transistor, zero-capacitor (1T, or 1T0C) DRAM cell has been a topic of research since the late-1990s. 1T DRAM is a different way of constructing the basic DRAM memory cell, distinct from the classic one-transistor/one-capacitor (1T/1C) DRAM cell, which is also sometimes referred to as 1T DRAM, particularly in comparison to the 3T and 4T DRAM which it replaced in the 1970s.

In 1T DRAM cells, the bit of data is still stored in a capacitive region controlled by a transistor, but this capacitance is no longer provided by a separate capacitor. 1T DRAM is a "capacitorless" bit cell design that stores data using the parasitic body capacitance that is inherent to silicon on insulator (SOI) transistors. Considered a nuisance in logic design, this floating body effect can be used for data storage. This gives 1T DRAM cells the greatest density as well as allowing easier integration with high-performance logic circuits since they are constructed with the same SOI process technologies.[42]

Refreshing of cells remains necessary, but unlike with 1T1C DRAM, reads in 1T DRAM are non-destructive; the stored charge causes a detectable shift in the threshold voltage of the transistor.[43] Performance-wise, access times are significantly better than capacitor-based DRAMs, but slightly worse than SRAM. There are several types of 1T DRAMs: the commercialized Z-RAM from Innovative Silicon, the TTRAM[44] from Renesas and the A-RAM from the UGR/CNRS consortium.

Array structures
Self-aligned storage node locations simplify the fabrication process in modern DRAM.[45]

DRAM cells are laid out in a regular rectangular, grid-like pattern to facilitate their control and access via wordlines and bitlines. The physical layout of the DRAM cells in an array is typically designed so that two adjacent DRAM cells in a column share a single bitline contact to reduce their area. DRAM cell area is given as nF², where n is a number derived from the DRAM cell design, and F is the smallest feature size of a given process technology. This scheme permits comparison of DRAM size over different process technology generations, as DRAM cell area scales at linear or near-linear rates with respect to feature size. The typical area for modern DRAM cells varies between 6 and 8 F².
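
The nF² bookkeeping in the paragraph above can be evaluated directly; F = 20 nm here is an assumed feature size for illustration only.

```c
/* Cell-area bookkeeping: area = n * F^2, for the 6F^2 and 8F^2 cells
 * mentioned above, at an assumed F = 20 nm feature size. */
#include <stdio.h>

int main(void) {
    const double f_nm = 20.0;  /* process feature size, nm (assumed) */
    for (int n = 6; n <= 8; n += 2) {
        double area_nm2 = n * f_nm * f_nm;
        printf("%dF^2 cell at F = %.0f nm: %.0f nm^2\n", n, f_nm, area_nm2);
    }
    return 0;
}
```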

The horizontal wire, the wordline, is connected to the gate terminal of every access transistor in its row. The vertical bitline is connected to the source terminal of the transistors in its column. The lengths of the wordlines and bitlines are limited. The wordline length is limited by the desired performance of the array, since propagation time of the signal that must traverse the wordline is determined by the RC time constant. The bitline length is limited by its capacitance (which increases with length), which must be kept within a range for proper sensing (as DRAMs operate by sensing the charge of the capacitor released onto the bitline). Bitline length is also limited by the amount of operating current the DRAM can draw and by how power can be dissipated, since these two characteristics are largely determined by the charging and discharging of the bitline.

Bitline architecture

Sense amplifiers are required to read the state contained in the DRAM cells. When the access transistor is activated, the electrical charge in the capacitor is shared with the bitline. The bitline's capacitance is much greater than that of the capacitor (approximately ten times). Thus, the change in bitline voltage is minute. Sense amplifiers are required to resolve the voltage differential into the levels specified by the logic signaling system. Modern DRAMs use differential sense amplifiers, and are accompanied by requirements as to how the DRAM arrays are constructed. Differential sense amplifiers work by driving their outputs to opposing extremes based on the relative voltages on pairs of bitlines. The sense amplifiers function effectively and efficiently only if the capacitance and voltages of these bitline pairs are closely matched. Besides ensuring that the lengths of the bitlines and the number of DRAM cells attached to them are equal, two basic architectures of array design have emerged to provide for the requirements of the sense amplifiers: open and folded bitline arrays.

Open bitline arrays

DRAM ICs from the first generation (1 Kbit) up until the 64 Kbit generation (and some 256 Kbit generation devices) had open bitline array architectures. In these architectures, the bitlines are divided into multiple segments, and the differential sense amplifiers are placed in between bitline segments. Because the sense amplifiers are placed between bitline segments, routing their outputs outside the array requires an additional layer of interconnect placed above those used to construct the wordlines and bitlines.

The DRAM cells that are on the edges of the array do not have adjacent segments. Since the differential sense amplifiers require identical capacitance and bitline lengths from both segments, dummy bitline segments are provided. The advantage of the open bitline array is a smaller array area, although this advantage is slightly diminished by the dummy bitline segments. The disadvantage that caused the near disappearance of this architecture is the inherent vulnerability to noise, which affects the effectiveness of the differential sense amplifiers. Since each bitline segment does not have any spatial relationship to the other, it is likely that noise would affect only one of the two bitline segments.

Folded bitline arrays

The folded bitline array architecture routes bitlines in pairs throughout the array. The close proximity of the paired bitlines provides superior common-mode noise rejection characteristics over open bitline arrays. The folded bitline array architecture began appearing in DRAM ICs during the mid-1980s, beginning with the 256 Kbit generation. This architecture is favored in modern DRAM ICs for its superior noise immunity.

This architecture is referred to as folded because it takes its basis from the open array architecture from the perspective of the circuit schematic. The folded array architecture appears to remove DRAM cells in alternate pairs (because two DRAM cells share a single bitline contact) from a column, then move the DRAM cells from an adjacent column into the voids.

The location where the bitline twists occupies additional area. To minimize area overhead, engineers select the simplest and most area-minimal twisting scheme that is able to reduce noise under the specified limit. As process technology improves to reduce minimum feature sizes, the signal-to-noise problem worsens, since coupling between adjacent metal wires is inversely proportional to their pitch. The array folding and bitline twisting schemes that are used must increase in complexity in order to maintain sufficient noise reduction. Schemes that have desirable noise immunity characteristics for a minimal impact in area are the topic of current research (Kenner, p. 37).

Future array architectures

Advances in process technology could result in open bitline array architectures being favored if they can offer better long-term area efficiency, since folded array architectures require increasingly complex folding schemes to match each advance in process technology. The relationship between process technology, array architecture, and area efficiency is an active area of research.

Row and column redundancy

The first DRAM integrated circuits did not have any redundancy. An integrated circuit with a defective DRAM cell would be discarded. Beginning with the 64 Kbit generation, DRAM arrays have included spare rows and columns to improve yields. Spare rows and columns provide tolerance of minor fabrication defects which have caused a small number of rows or columns to be inoperable. The defective rows and columns are physically disconnected from the rest of the array by triggering a programmable fuse or by cutting the wire with a laser. The spare rows or columns are substituted in by remapping logic in the row and column decoders (Jacob, pp. 358–361).

Error detection and correction

Electrical or magnetic interference inside a computer system can cause a single bit of DRAM to spontaneously flip to the opposite state. The majority of one-off ("soft") errors in DRAM chips occur as a result of background radiation, chiefly neutrons from cosmic ray secondaries, which may change the contents of one or more memory cells or interfere with the circuitry used to read/write them.

The problem can be mitigated by using redundant memory bits and additional circuitry that uses these bits to detect and correct soft errors. In most cases, the detection and correction are performed by the memory controller; sometimes, the required logic is transparently implemented within DRAM chips or modules, enabling ECC memory functionality for otherwise ECC-incapable systems.[46] The extra memory bits are used to record parity and to enable missing data to be reconstructed by an error-correcting code (ECC). Parity allows the detection of all single-bit errors (actually, any odd number of wrong bits). The most common error-correcting code, a SECDED Hamming code, allows a single-bit error to be corrected and, in the usual configuration with an extra parity bit, double-bit errors to be detected.[47]
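
To make the SECDED idea concrete, here is a toy sketch of a Hamming code with an added overall parity bit, operating on a 4-bit value rather than the 64-bit words that real ECC DIMMs protect; the function names and the (8,4) geometry are illustrative only, not any standard's layout.

```c
/* A toy SECDED code: Hamming(7,4) plus an overall parity bit, protecting
 * one 4-bit nibble. Real DRAM ECC applies the same construction to 64-bit
 * words with 8 check bits; this sketch only illustrates the mechanism. */
#include <stdint.h>
#include <stdio.h>

/* Encode 4 data bits into an 8-bit codeword. Bit positions 1, 2 and 4
 * hold the Hamming parity bits; bit 0 holds the overall parity. */
static uint8_t secded_encode(uint8_t d)
{
    uint8_t c = 0;
    c |= (uint8_t)(((d >> 0) & 1) << 3);  /* d0 -> position 3 */
    c |= (uint8_t)(((d >> 1) & 1) << 5);  /* d1 -> position 5 */
    c |= (uint8_t)(((d >> 2) & 1) << 6);  /* d2 -> position 6 */
    c |= (uint8_t)(((d >> 3) & 1) << 7);  /* d3 -> position 7 */
    c |= (uint8_t)((((c >> 3) ^ (c >> 5) ^ (c >> 7)) & 1) << 1); /* p1: 3,5,7 */
    c |= (uint8_t)((((c >> 3) ^ (c >> 6) ^ (c >> 7)) & 1) << 2); /* p2: 3,6,7 */
    c |= (uint8_t)((((c >> 5) ^ (c >> 6) ^ (c >> 7)) & 1) << 4); /* p4: 5,6,7 */
    uint8_t overall = 0;
    for (int i = 1; i < 8; i++) overall ^= (c >> i) & 1;
    return (uint8_t)(c | overall);        /* overall parity -> bit 0 */
}

/* Returns 0 (clean), 1 (single error corrected) or -1 (double error
 * detected, uncorrectable). The decoded nibble is written to *d. */
static int secded_decode(uint8_t c, uint8_t *d)
{
    int s = 0;                            /* syndrome = error position */
    s |= (((c >> 1) ^ (c >> 3) ^ (c >> 5) ^ (c >> 7)) & 1) << 0;
    s |= (((c >> 2) ^ (c >> 3) ^ (c >> 6) ^ (c >> 7)) & 1) << 1;
    s |= (((c >> 4) ^ (c >> 5) ^ (c >> 6) ^ (c >> 7)) & 1) << 2;
    int overall = 0;
    for (int i = 0; i < 8; i++) overall ^= (c >> i) & 1;
    if (s != 0 && overall == 0) return -1;  /* two bits flipped */
    if (s != 0) c ^= (uint8_t)(1 << s);     /* flip the bad bit back */
    *d = (uint8_t)(((c >> 3) & 1) | ((c >> 5) & 1) << 1 |
                   ((c >> 6) & 1) << 2 | ((c >> 7) & 1) << 3);
    return s != 0;
}

int main(void)
{
    uint8_t word = secded_encode(0xB), out;    /* store the nibble 1011 */
    int r1 = secded_decode(word ^ 0x40, &out); /* one bit flipped */
    printf("single flip: status %d, data 0x%X\n", r1, out);  /* 1, 0xB */
    int r2 = secded_decode(word ^ 0x41, &out); /* two bits flipped */
    printf("double flip: status %d\n", r2);    /* -1: detected only */
    return 0;
}
```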

Recent studies give widely varying error rates, with over seven orders of magnitude difference, ranging from 10^−10 to 10^−17 error/bit·h (roughly one bit error per hour per gigabyte of memory to one bit error per century per gigabyte of memory).[48][49][50] The Schroeder et al. 2009 study reported a 32% chance that a given computer in their study would suffer from at least one correctable error per year, and provided evidence that most such errors are intermittent hard rather than soft errors and that trace amounts of radioactive material that had gotten into the chip packaging were emitting alpha particles and corrupting the data.[51] A 2010 study at the University of Rochester also gave evidence that a substantial fraction of memory errors are intermittent hard errors.[52] Large scale studies on non-ECC main memory in PCs and laptops suggest that undetected memory errors account for a substantial number of system failures: the 2011 study reported a 1-in-1700 chance per 1.5% of memory tested (extrapolating to an approximately 26% chance for total memory) that a computer would have a memory error every eight months.[53]

Security

Data remanence

Although dynamic memory is only specified and guaranteed to retain its contents when supplied with power and refreshed every short period of time (often 64 ms), the memory cell capacitors often retain their values for significantly longer time, particularly at low temperatures.[54] Under some conditions most of the data in DRAM can be recovered even if it has not been refreshed for several minutes.[55]

This property can be used to circumvent security and recover data stored in the main memory that is assumed to be destroyed at power-down. The computer could be quickly rebooted and the contents of the main memory read out; or a computer's memory modules could be removed, cooled to prolong data remanence, and then transferred to a different computer to be read out. Such an attack was demonstrated to circumvent popular disk encryption systems, such as the open source TrueCrypt, Microsoft's BitLocker Drive Encryption, and Apple's FileVault.[54] This type of attack against a computer is often called a cold boot attack.

Memory corruption

Dynamic memory, by definition, requires periodic refresh. Furthermore, reading dynamic memory is a destructive operation, requiring a recharge of the storage cells in the row that has been read. If these processes are imperfect, a read operation can cause soft errors. In particular, there is a risk that some charge can leak between nearby cells, causing the refresh or read of one row to cause a disturbance error in an adjacent or even nearby row. The awareness of disturbance errors dates back to the first commercially available DRAM in the early 1970s (the Intel 1103). Despite the mitigation techniques employed by manufacturers, commercial researchers proved in a 2014 analysis that commercially available DDR3 DRAM chips manufactured in 2012 and 2013 are susceptible to disturbance errors.[56] The associated side effect that led to observed bit flips has been dubbed row hammer.

Packaging

Memory module

Dynamic RAM ICs can be packaged in molded epoxy cases, with an internal lead frame for interconnections between the silicon die and the package leads. The original IBM PC design used ICs, including those for DRAM, packaged in dual in-line packages (DIP), soldered directly to the main board or mounted in sockets. As memory density skyrocketed, the DIP package was no longer practical. For convenience in handling, several dynamic RAM integrated circuits may be mounted on a single memory module, allowing installation of 16-bit, 32-bit or 64-bit wide memory in a single unit, without the requirement for the installer to insert multiple individual integrated circuits. Memory modules may include additional devices for parity checking or error correction. Over the evolution of desktop computers, several standardized types of memory module have been developed. Laptop computers, game consoles, and specialized devices may have their own formats of memory modules not interchangeable with standard desktop parts for packaging or proprietary reasons.

Embedded

DRAM that is integrated into an integrated circuit designed in a logic-optimized process (such as an application-specific integrated circuit, microprocessor, or an entire system on a chip) is called embedded DRAM (eDRAM). Embedded DRAM requires DRAM cell designs that can be fabricated without preventing the fabrication of fast-switching transistors used in high-performance logic, and modification of the basic logic-optimized process technology to accommodate the process steps required to build DRAM cell structures.

Versions

Since the fundamental DRAM cell and array structure has maintained the same basic form for many years, the types of DRAM are mainly distinguished by the many different interfaces for communicating with DRAM chips.

Asynchronous DRAM

The original DRAM, now known by the retronym asynchronous DRAM, was the first type of DRAM in use. From its origins in the late 1960s, it was commonplace in computing up until around 1997, when it was mostly replaced by synchronous DRAM. In the present day, manufacture of asynchronous RAM is relatively rare.[57]

Principles of operation

An asynchronous DRAM chip has power connections, some number of address inputs (typically 12), and a few (typically one or four) bidirectional data lines. There are three main active-low control signals:

  • RAS, the Row Address Strobe. The address inputs are captured on the falling edge of RAS, and select a row to open. The row is held open as long as RAS is low.
  • CAS, the Column Address Strobe. The address inputs are captured on the falling edge of CAS, and select a column from the currently open row to read or write.
  • WE, Write Enable. This signal determines whether a given falling edge of CAS is a read (if high) or write (if low). If low, the data inputs are also captured on the falling edge of CAS. If high, the data outputs are enabled by the falling edge of CAS and produce valid output after the internal access time.

This interface provides direct control of internal timing: when RAS is driven low, a CAS cycle must not be attempted until the sense amplifiers have sensed the memory state, and RAS must not be returned high until the storage cells have been refreshed. When RAS is driven high, it must be held high long enough for precharging to complete.

Although the DRAM is asynchronous, the signals are typically generated by a clocked memory controller, which limits their timing to multiples of the controller's clock cycle.
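
As an illustration of that sequencing, here is a sketch of one read cycle as a memory controller might perform it. set_pin, put_addr, read_data and delay_ns are hypothetical board-support helpers, and the delays stand in for the datasheet parameters (tRCD, tCAC, tRP) rather than being exact values.

```c
/* A sketch of the classic asynchronous read sequence described above,
 * written as if the controller could drive the pins directly. All
 * helper functions are hypothetical; delays are illustrative. */
#include <stdint.h>

extern void set_pin(int pin, int level);  /* hypothetical pin driver */
extern void put_addr(uint16_t addr);      /* drive multiplexed address pins */
extern uint8_t read_data(void);           /* sample the data pins */
extern void delay_ns(unsigned ns);

enum { RAS, CAS, WE };                    /* active-low control pins */

uint8_t dram_read(uint16_t row, uint16_t col)
{
    set_pin(WE, 1);           /* high: this /CAS cycle is a read */
    put_addr(row);
    set_pin(RAS, 0);          /* falling edge latches the row address */
    delay_ns(15);             /* wait tRCD before asserting /CAS */
    put_addr(col);
    set_pin(CAS, 0);          /* falling edge latches the column address */
    delay_ns(15);             /* wait tCAC for valid data out */
    uint8_t value = read_data();
    set_pin(CAS, 1);
    set_pin(RAS, 1);          /* closing the row starts precharge */
    delay_ns(30);             /* honour tRP before the next cycle */
    return value;
}
```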

For completeness, we mention two other control signals which are not essential to DRAM operation, but are provided for the convenience of systems using DRAM:

  • CS, Chip Select. When this is high, all other inputs are ignored. This makes it easy to build an array of DRAM chips which share the same control signals. Just as DRAM internally uses the word lines to select one row of storage cells to connect to the shared bit lines and sense amplifiers, CS is used to select one row of DRAM chips to connect to the shared control, address, and data lines.
  • OE, Output Enable. This is an additional signal that (if high) inhibits output on the data I/‍O pins, while allowing all other operations to proceed normally. In many applications, OE can be permanently connected low (output enabled whenever CS, RAS and CAS are low and WE is high), but in high-speed applications, judicious use of OE can prevent bus contention between two DRAM chips connected to the same data lines. For example, it is possible to have two interleaved memory banks sharing the address and data lines, but each having their own RAS, CAS, WE and OE connections. The memory controller can begin a read from the second bank while a read from the first bank is in progress, using the two OE signals to only permit one result to appear on the data bus at a time.
RAS-only refresh

Classic asynchronous DRAM is refreshed by opening each row in turn.

The refresh cycles are distributed across the entire refresh interval in such a way that all rows are refreshed within the required interval. To refresh one row of the memory array using RAS-only refresh (ROR), the following steps must occur:

  1. The row address of the row to be refreshed must be applied at the address input pins.
  2. RAS must switch from high to low. CAS must remain high.
  3. At the end of the required amount of time, RAS must return high.

This can be done by supplying a row address and pulsing RAS low; it is not necessary to perform any CAS cycles. An external counter is needed to iterate over the row addresses in turn.[58] In some designs, the CPU handled RAM refresh. The Zilog Z80 is perhaps the best known example, as it has an internal row counter R which supplies the address for a special refresh cycle generated after each instruction fetch.[59] In other systems, especially home computers, refresh was handled by the video circuitry as a side effect of its periodic scan of the frame buffer.[60]
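
The same three steps can be expressed as a loop over every row; again a sketch using the hypothetical set_pin/put_addr/delay_ns helpers from the read-cycle example above, with illustrative delays in place of real tRAS/tRP values.

```c
/* RAS-only refresh as a loop over all rows: /CAS held high, row
 * address presented, /RAS pulsed low (steps 1-3 above). */
#include <stdint.h>

extern void set_pin(int pin, int level);  /* hypothetical, as before */
extern void put_addr(uint16_t addr);
extern void delay_ns(unsigned ns);

enum { RAS, CAS, WE };

void ror_refresh(unsigned num_rows)
{
    set_pin(CAS, 1);              /* /CAS stays high: no data access */
    for (unsigned row = 0; row < num_rows; row++) {
        put_addr((uint16_t)row);  /* step 1: row address on the pins */
        set_pin(RAS, 0);          /* step 2: /RAS low opens and senses the row */
        delay_ns(50);             /* hold for tRAS so the cells are rewritten */
        set_pin(RAS, 1);          /* step 3: /RAS back high */
        delay_ns(30);             /* allow tRP precharge before the next row */
    }
}
```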

CAS before RAS refresh

For convenience, the counter was quickly incorporated into the DRAM chips themselves. If the CAS line is driven low before RAS (normally an illegal operation), then the DRAM ignores the address inputs and uses an internal counter to select the row to open.[58][61] This is known as CAS-before-RAS (CBR) refresh. This became the standard form of refresh for asynchronous DRAM, and is the only form generally used with SDRAM.

Hidden refresh

Given support of CAS-before-RAS refresh, it is possible to deassert RAS while holding CAS low to maintain data output. If RAS is then asserted again, this performs a CBR refresh cycle while the DRAM outputs remain valid. Because data output is not interrupted, this is known as hidden refresh.[61] Hidden refresh is no faster than a normal read followed by a normal refresh, but does maintain the data output valid during the refresh cycle.

Page mode DRAM

Page mode DRAM is a minor modification to the first-generation DRAM IC interface which improves the performance of reads and writes to a row by avoiding the inefficiency of precharging and opening the same row repeatedly to access a different column. In page mode DRAM, after a row is opened by holding RAS low, the row can be kept open, and multiple reads or writes can be performed to any of the columns in the row. Each column access is initiated by presenting a column address and asserting CAS. For reads, after a delay (tCAC), valid data appears on the data out pins, which are held at high-Z before the appearance of valid data. For writes, the write enable signal and write data is presented along with the column address.[62]

Page mode DRAM was in turn later improved with a small modification which further reduced latency. DRAMs with this improvement are called fast page mode DRAMs (FPM DRAMs). In page mode DRAM, the chip does not capture the column address until CAS is asserted, so column access time (until data out was valid) begins when CAS is asserted. In FPM DRAM, the column address can be supplied while CAS is still deasserted, and the main column access time (tAA) begins as soon as the address is stable. The CAS signal is only needed to enable the output (the data out pins were held at high-Z while CAS was deasserted), so time from CAS assertion to data valid (tCAC) is greatly reduced.[63] Fast page mode DRAM was introduced in 1986 and was used with the Intel 80486.

Static column is a variant of fast page mode in which the column address does not need to be latched, but rather the address inputs may be changed with CAS held low, and the data output will be updated accordingly a few nanoseconds later.[63]

Nibble mode is another variant in which four sequential locations within the row can be accessed with four consecutive pulses of CAS. The difference from normal page mode is that the address inputs are not used for the second through fourth CAS edges but are generated internally starting with the address supplied for the first CAS edge.[63] The predictable addresses let the chip prepare the data internally and respond very quickly to the subsequent CAS pulses.

Extended data out DRAM
A pair of 32 MB EDO DRAM modules

Extended data out DRAM (EDO DRAM) was invented and patented in the 1990s by Micron Technology, which then licensed the technology to many other memory manufacturers.[64] EDO RAM, sometimes referred to as hyper page mode enabled DRAM, is similar to fast page mode DRAM with the additional feature that a new access cycle can be started while keeping the data output of the previous cycle active. This allows a certain amount of overlap in operation (pipelining), allowing somewhat improved performance.[65] It is up to 30% faster than FPM DRAM,[66] which it began to replace in 1995 when Intel introduced the 430FX chipset with EDO DRAM support. Irrespective of the performance gains, FPM and EDO SIMMs can be used interchangeably in many (but not all) applications.[67][68]

To be precise, EDO DRAM begins data output on the falling edge of CAS but does not disable the output when CAS rises again. Instead, it holds the current output valid (thus extending the data output time) even as the DRAM begins decoding a new column address, until either a new column's data is selected by another CAS falling edge, or the output is switched off by the rising edge of RAS. (Or, less commonly, a change in CS, OE, or WE.)

This ability to start a new access even before the system has received the preceding column's data made it possible to design memory controllers which could carry out a CAS access (in the currently open row) in one clock cycle, or at least within two clock cycles instead of the previously required three. EDO's capabilities were able to partially compensate for the performance lost due to the lack of an L2 cache in low-cost, commodity PCs. More expensive notebooks also often lacked an L2 cache due to size and power limitations, and benefitted similarly. Even for systems with an L2 cache, the availability of EDO memory improved the average memory latency seen by applications over earlier FPM implementations.

Single-cycle EDO DRAM became very popular on video cards toward the end of the 1990s. It was very low cost, yet nearly as efficient for performance as the far more costly VRAM.

Burst EDO DRAM

An evolution of EDO DRAM, burst EDO DRAM (BEDO DRAM) could process four memory addresses in one burst, for a maximum of 5-1-1-1 timing, saving an additional three clocks over optimally designed EDO memory. This was done by adding an address counter on the chip to keep track of the next address. BEDO also added a pipeline stage, allowing the page-access cycle to be divided into two parts. During a memory-read operation, the first part accessed the data from the memory array to the output stage (second latch). The second part drove the data bus from this latch at the appropriate logic level. Since the data is already in the output buffer, quicker access time is achieved (up to 50% for large blocks of data) than with traditional EDO.

Although BEDO DRAM showed additional optimization over EDO, by the time it was available the market had made a significant investment towards synchronous DRAM, or SDRAM.[69] Even though BEDO RAM was superior to SDRAM in some ways, the latter technology quickly displaced BEDO.

Synchronous dynamic RAM

Synchronous dynamic RAM (SDRAM) significantly revises the asynchronous memory interface, adding a clock (and a clock enable) line. All other signals are received on the rising edge of the clock.

The RAS and CAS inputs no longer act as strobes, but are instead, along with WE, part of a 3-bit command:

SDRAM command summary

| CS | RAS | CAS | WE | Address | Command |
|----|-----|-----|----|---------|---------|
| H | x | x | x | x      | Command inhibit (no operation) |
| L | H | H | H | x      | No operation |
| L | H | H | L | x      | Burst terminate: stop a read or write burst in progress |
| L | H | L | H | Column | Read from currently active row |
| L | H | L | L | Column | Write to currently active row |
| L | L | H | H | Row    | Activate a row for read and write |
| L | L | H | L | x      | Precharge (deactivate) the current row |
| L | L | L | H | x      | Auto refresh: refresh one row of each bank, using an internal counter |
| L | L | L | L | Mode   | Load mode register: address bus specifies DRAM operation mode |

The OE line's function is extended to a per-byte DQM signal, which controls data input (writes) in addition to data output (reads). This allows DRAM chips to be wider than 8 bits while still supporting byte-granularity writes.

Many timing parameters remain under the control of the DRAM controller. For example, a minimum time must elapse between a row being activated and a read or write command. One important parameter must be programmed into the SDRAM chip itself, namely the CAS latency. This is the number of clock cycles allowed for internal operations between a read command and the first data word appearing on the data bus. The Load mode register command is used to transfer this value to the SDRAM chip. Other configurable parameters include the length of read and write bursts, i.e. the number of words transferred per read or write command.

The most significant change, and the primary reason that SDRAM has supplanted asynchronous RAM, is the support for multiple internal banks inside the DRAM chip. Using a few bits of bank address that accompany each command, a second bank can be activated and begin reading data while a read from the first bank is in progress. By alternating banks, a single SDRAM device can keep the data bus continuously busy, in a way that asynchronous DRAM cannot.

Single data rate synchronous DRAM

Single data rate SDRAM (SDR SDRAM or SDR) is the original generation of SDRAM; it made a single transfer of data per clock cycle.

Double data rate synchronous DRAM
The die of a Samsung DDR-SDRAM 64-MBit package

Double data rate SDRAM (DDR SDRAM or DDR) was a later development of SDRAM, used in PC memory beginning in 2000. Subsequent versions are numbered sequentially (DDR2, DDR3, etc.). DDR SDRAM internally performs double-width accesses at the clock rate, and uses a double data rate interface to transfer one half on each clock edge. DDR2 and DDR3 increased this factor to 4× and 8×, respectively, delivering 4-word and 8-word bursts over 2 and 4 clock cycles, respectively. The internal access rate is mostly unchanged (200 million per second for DDR-400, DDR2-800 and DDR3-1600 memory), but each access transfers more data.
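
The relationship between the roughly constant internal access rate and the interface rate can be checked with one line of arithmetic per generation; the 2n/4n/8n prefetch depths below are the standard figures for DDR, DDR2 and DDR3.

```c
/* Prefetch arithmetic: the DRAM core keeps roughly the same access
 * rate while each generation fetches a wider word internally and
 * serializes it over a faster double-data-rate interface. */
#include <stdio.h>

int main(void) {
    const double core_maccesses = 200.0;   /* million internal accesses/s */
    const int prefetch[] = {2, 4, 8};      /* DDR, DDR2, DDR3 prefetch depth */
    const char *name[] = {"DDR-400", "DDR2-800", "DDR3-1600"};
    for (int i = 0; i < 3; i++)
        printf("%s: %.0f M accesses/s x %dn prefetch = %.0f MT/s\n",
               name[i], core_maccesses, prefetch[i],
               core_maccesses * prefetch[i]);
    return 0;
}
```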

Direct Rambus DRAM

Direct Rambus DRAM (DRDRAM) was developed by Rambus. First supported on motherboards in 1999, it was intended to become an industry standard, but was outcompeted by DDR SDRAM, making it technically obsolete by 2003.

Reduced Latency DRAM

Reduced Latency DRAM (RLDRAM) is a high performance double data rate (DDR) SDRAM that combines fast, random access with high bandwidth, mainly intended for networking and caching applications.

Graphics RAM


Graphics RAMs are asynchronous and synchronous DRAMs designed for graphics-related tasks such as texture memory and framebuffers, found on video cards.

Video DRAM


Video DRAM (VRAM) is a dual-ported variant of DRAM that was once commonly used to store the frame buffer in some graphics adapters.

Window DRAM


Window DRAM (WRAM) is a variant of VRAM that was once used in graphics adapters such as the Matrox Millennium and ATI 3D Rage Pro. WRAM was designed to perform better and cost less than VRAM. WRAM offered up to 25% greater bandwidth than VRAM and accelerated commonly used graphical operations such as text drawing and block fills.[70]

Multibank DRAM

MoSys MDRAM MD908

Multibank DRAM (MDRAM) is a type of specialized DRAM developed by MoSys. It is constructed from small memory banks of 256 kB, which are operated in an interleaved fashion, providing bandwidths suitable for graphics cards at a lower cost than memories such as SRAM. MDRAM also allows operations to two banks in a single clock cycle, permitting multiple concurrent accesses when the accesses are independent. MDRAM was primarily used in graphics cards, such as those featuring the Tseng Labs ET6x00 chipsets. Boards based upon this chipset often had the unusual capacity of 2.25 MB, because MDRAM's small banks allow capacities that are not powers of two. A graphics card with 2.25 MB of MDRAM had enough memory to provide 24-bit color at a resolution of 1024×768—a very popular setting at the time.
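The framebuffer arithmetic behind that 2.25 MB figure works out exactly, as this small check shows (the bank math follows from the 256 kB bank size stated above):

```python
# 1024x768 pixels at 24-bit colour needs exactly 2.25 MB, which MDRAM's
# 256 kB banks can match without rounding up to 4 MB.

pixels = 1024 * 768
bytes_needed = pixels * 3            # 24-bit colour = 3 bytes per pixel
print(bytes_needed / 2**20)          # 2.25 (MB)
print(bytes_needed / (256 * 2**10))  # 9.0  (banks of 256 kB)
```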

Synchronous graphics RAM


Synchronous graphics RAM (SGRAM) is a specialized form of SDRAM for graphics adapters. It adds functions such as bit masking (writing to a specified bit plane without affecting the others) and block write (filling a block of memory with a single color). Unlike VRAM and WRAM, SGRAM is single-ported. However, it can open two memory pages at once, which simulates the dual-port nature of other video RAM technologies.

Graphics double data rate SDRAM

A 512-MBit Qimonda GDDR3 SDRAM package
Inside a Samsung GDDR3 256-MBit package

Graphics double data rate SDRAM (GDDR SDRAM) is a type of specialized DDR SDRAM designed to be used as the main memory of graphics processing units (GPUs). GDDR SDRAM is distinct from commodity types of DDR SDRAM such as DDR3, although they share some core technologies. Its primary characteristics are higher clock frequencies for both the DRAM core and the I/O interface, which provide greater memory bandwidth for GPUs. As of 2025, there are eight successive generations of GDDR: GDDR2, GDDR3, GDDR4, GDDR5, GDDR5X, GDDR6, GDDR6X and GDDR7.

Pseudostatic RAM

1 Mbit high speed CMOS pseudostatic RAM, made by Toshiba

Pseudostatic RAM (PSRAM or PSDRAM) is dynamic RAM with built-in refresh and address-control circuitry to make it behave similarly to static RAM (SRAM). It combines the high density of DRAM with the ease of use of true SRAM. PSRAM is used in the Apple iPhone and other embedded systems such as XFlar Platform.[71]

Some DRAM components have a self-refresh mode. While this involves much of the same logic that is needed for pseudo-static operation, this mode is often equivalent to a standby mode. It is provided primarily to allow a system to suspend operation of its DRAM controller to save power without losing data stored in DRAM, rather than to allow operation without a separate DRAM controller, as PSRAM does.

An embedded variant of PSRAM was sold by MoSys under the name 1T-SRAM. It consists of a set of small DRAM banks with an SRAM cache in front, making it behave much like a true SRAM. It is used in the Nintendo GameCube and Wii video game consoles.

Cypress Semiconductor's HyperRAM[72] is a type of PSRAM supporting a JEDEC-compliant 8-pin HyperBus[73] or Octal xSPI interface.

from Grokipedia

Dynamic random-access memory (DRAM) is a type of volatile memory that stores each bit of data as an electric charge in an array of capacitors integrated into a single chip, with each capacitor paired to an access transistor in a one-transistor-one-capacitor (1T1C) configuration. The "dynamic" aspect arises because the stored charge in the capacitors leaks over time due to inherent imperfections, requiring periodic refreshing by a dedicated circuit to restore the charge before it dissipates. Invented by Robert Dennard at IBM's Thomas J. Watson Research Center, with a patent application filed in 1967 and granted on June 4, 1968 (U.S. 3,387,286), DRAM achieved commercial viability in 1970 through Intel's production of the first 1-kilobit chip, enabling vastly higher memory densities and lower per-bit costs than static random-access memory (SRAM) due to its simpler cell structure using fewer transistors per bit. This technology underpins the primary system memory in virtually all modern computers, servers, and electronic devices, supporting scalable capacities from megabits to terabits through iterative advancements like synchronous DRAM (SDRAM) and double data rate (DDR) variants.

Fundamentals and Principles of Operation

Storage Mechanism and Physics

The storage mechanism in dynamic random-access memory (DRAM) relies on a one-transistor, one-capacitor (1T1C) cell, where each bit is represented by the presence or absence of charge on a small capacitor. The access transistor, typically an n-channel MOSFET, controls connectivity between the storage capacitor and the bit line, while the capacitor holds the charge corresponding to the data bit. In the charged state (logical '1'), the storage node of the capacitor is driven to a voltage near the supply voltage VCC, storing a charge Q ≈ Cs · VCC, where Cs is the storage capacitance; the discharged state (logical '0') holds negligible charge. To optimize sensing and reduce voltage stress, the capacitor's reference plate is often biased at VCC/2, resulting in effective charge levels of Q = ±(VCC/2) · Cs.

The physics of charge storage depends on the electrostatic field across the capacitor's dielectric, which separates the conductive plates or electrodes to maintain the potential difference. Capacitance follows Cs = ε · A / d, where ε is the permittivity of the dielectric, A is the effective plate area, and d is the separation distance; modern DRAM cells achieve Cs values of 20–30 fF through high-k dielectrics and three-dimensional structures that counteract scaling limitations.

However, charge retention is imperfect due to leakage mechanisms, including dielectric tunneling, junction leakage from carrier generation-recombination, and subthreshold conduction through the off-state access transistor. These currents, often on the order of 1 fA per cell at room temperature, cause gradual loss of stored charge, with the voltage dropping as ΔV = −(Ileak · t) / Cs over time t. Retention time, defined as the duration until stored charge falls below a detectable threshold (typically 50–70% of the initial voltage), ranges from milliseconds to seconds depending on temperature, process variations, and cell design, but standard DRAM specifications mandate refresh intervals of 64 ms to ensure data integrity across the array. This dynamic nature stems from the causal primacy of charge leakage governed by semiconductor physics, where minority carrier generation rates increase exponentially with temperature (following Arrhenius behavior), necessitating active refresh to counteract entropy-driven dissipation. Lower temperatures extend retention by reducing leakage, as observed in cryogenic applications where retention times exceed room-temperature limits by orders of magnitude.
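A worked example of the leakage relations above, using illustrative round numbers (Cs = 25 fF, Ileak = 1 fA, VCC = 1.1 V) rather than the parameters of any specific device:

```python
# Retention estimate from dV = Ileak * t / Cs, solved for the time until the
# stored level decays to ~50% of its initial value.

Cs = 25e-15      # storage capacitance, farads
Ileak = 1e-15    # cell leakage current, amperes
VCC = 1.1        # supply voltage, volts

V0 = VCC / 2                 # initial signal level with VCC/2 plate biasing
V_min = 0.5 * V0             # detection threshold: ~50% of initial voltage
t_retention = Cs * (V0 - V_min) / Ileak
print(f"{t_retention:.2f} s")   # ~6.9 s for this nominal cell

# Worst-case cells leak far more than 1 fA, which is why the standard
# mandates a 64 ms refresh interval despite typical cells lasting seconds.
```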

Read and Write Operations

In the conventional 1T1C DRAM cell, a write operation stores data by charging or discharging the storage capacitor through the n-channel access transistor. The bit line is driven to VDD (typically 1–1.8 V in modern processes) to represent logic '1', charging the capacitor to store positive charge Q ≈ C × VDD, or to ground (0 V) for logic '0', where C is the cell capacitance (around 20–30 fF in sub-10 nm nodes). The word line is pulsed high to turn on the transistor, transferring charge bidirectionally until equilibrium, with write time determined by the RC delay of the bit line resistance and capacitance. This process overwrites the prior cell state without sensing, enabling fast writes limited mainly by drive strength and plate voltage biasing to minimize voltage droop.

Read operations in 1T1C cells are destructive due to charge sharing between the capacitor and the precharged bit line. The bit line pair (BL and BL-bar) is equilibrated to VDD/2 via equalization transistors, minimizing offset errors. Asserting the row address strobe (RAS) activates the word line, connecting the cell capacitor to the bit line; for a '1' state, charge redistribution raises the BL voltage by ΔV ≈ (VDD/2) × Ccell / (Ccell + CBL), typically 100–200 mV given CBL >> Ccell (bit line capacitance ~200–300 fF). A differential latch-based sense amplifier then resolves this small differential, cross-coupling PMOS loads for positive feedback and NMOS drivers to pull low, latching BL to the full rails (VDD or 0 V) while BL-bar inverts, enabling column access via the column address strobe (CAS). The sensed value is restored to the cell by driving the bit line back through the still-open transistor, compensating for leakage-induced loss (retention time ~64 ms at 85°C).

Sense amplifiers, often shared across 512–1024 cells per bit line in folded bit line arrays, incorporate reference schemes or open bit line pairing to reject common-mode noise, with timing constrained by tRCD (the RAS-to-CAS delay, ~10–20 ns) and access times of ~30–50 ns in DDR4/5 modules. Write-after-read restore ensures data persistence within refresh cycles, but it also amplifies errors from process variations or alpha particle strikes, necessitating error-correcting codes (ECC). In advanced nodes, dual-contact cell designs separate the read and write paths in some embedded DRAM variants to mitigate read disturb, though standard commodity DRAM retains the single-port 1T1C cell for density.
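The charge-sharing estimate from the read description above, computed with illustrative capacitances (not tied to a specific process node):

```python
# dV = (VDD/2) * Ccell / (Ccell + CBL): the small differential the sense
# amplifier must resolve.

VDD = 1.8        # volts
C_cell = 25e-15  # cell capacitance, farads
C_bl = 200e-15   # bitline capacitance, farads

dV = (VDD / 2) * C_cell / (C_cell + C_bl)
print(f"{dV*1000:.0f} mV")  # ~100 mV differential on the bit line
```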

Refresh Requirements and Timing

Dynamic random-access memory (DRAM) cells store data as charge on a capacitor, which inevitably leaks over time due to mechanisms such as subthreshold leakage in the access transistor and junction leakage at the capacitor's storage node, necessitating periodic refresh operations to prevent data loss. Refresh involves activating the wordline to read the cell's charge state via a sense amplifier, which detects and amplifies the voltage differential on the bitlines, followed by rewriting the sensed data back to the capacitor to replenish the charge, typically to the full rail voltage for a logic '1' or ground for a '0'. This destructive readout inherent to DRAM operation makes refresh a read-modify-write cycle that consumes bandwidth and power, with the array's rows distributed across the refresh interval to minimize impact.

JEDEC standards mandate that all rows in a DRAM device retain data for a minimum of 64 milliseconds at operating temperatures from 0°C to 85°C, reduced to 32 milliseconds above 85°C to account for accelerated leakage at higher temperatures, ensuring reliability across worst-case cells with the shortest retention times. To meet this, modern devices require 8192 auto-refresh commands per 64 ms interval, each command refreshing 32 or more rows depending on density and architecture, resulting in an average inter-refresh interval (tREFI) of 7.8 microseconds for the DDR3 and DDR4 generations. Systems issue these commands periodically via the memory controller, often in a distributed manner to spread the overhead evenly; burst refresh—completing all rows consecutively—is possible but produces latency spikes.

While the specification conservatively assumes uniform worst-case retention, empirical studies reveal significant variation across cells, with many retaining data for seconds rather than milliseconds, enabling techniques like retention-aware refresh to skip stable rows and reduce energy overhead by up to 79% in optimized systems. However, compliance requires refreshing every cell at least once within the interval, as failure to do so risks bit errors from charge decay below the sense amplifier's threshold, typically around a 100–200 mV differential. Self-refresh mode, entered via a dedicated command, shifts responsibility to the DRAM's internal circuitry, using on-chip timers and oscillators to maintain refreshes during low-power states like system sleep, with exit timing requiring stabilization periods of at least tPDEX plus 200 clock cycles.
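The tREFI arithmetic from the paragraph above, checked directly (values are the JEDEC DDR3/DDR4 figures just cited):

```python
# All rows must be refreshed within the 64 ms window via 8192 commands.

tREFW_ms = 64          # refresh window at 0-85 C
refresh_commands = 8192

tREFI_us = tREFW_ms * 1000 / refresh_commands
print(f"tREFI = {tREFI_us:.4f} us")   # 7.8125 us between auto-refresh commands

# Above 85 C the window halves to 32 ms, doubling the command rate:
print(f"hot tREFI = {tREFI_us/2:.4f} us")
```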

Historical Development

Precursors and Early Concepts

The concept of random-access memory originated in the mid-20th century with non-semiconductor technologies that enabled direct addressing of data without sequential access. The Williams–Kilburn tube, demonstrated on June 11, 1947, at the University of Manchester, represented the first functional electronic random-access memory, storing bits as electrostatic charges on a cathode-ray tube's screen, with read operations erasing the data and necessitating rewriting. This volatile storage offered speeds up to 3,000 accesses per second but suffered from low capacity (typically 1,000–2,000 bits) and instability due to charge decay. Magnetic-core memory, introduced in 1951 by Jay Forrester's team at MIT for the Whirlwind computer, used arrays of ferrite toroids threaded with wires to store bits magnetically, providing non-destructive reads, capacities scaling to kilobits, and reliabilities exceeding one million hours in mainframes. By the 1960s, core memory dominated computing but faced escalating costs (around $1 per bit) and fabrication challenges as densities approached 64 kilobits, prompting searches for solid-state alternatives.

Semiconductor memory concepts emerged in the early 1960s, building on bipolar transistor advancements to replace core's bulk and power demands. Robert Norman's patent on monolithic integrated circuits for random-access storage, filed in 1963 and granted in 1968, outlined the use of bipolar junction transistors in flip-flop configurations, emphasizing planar processing for scalability. Initial commercial bipolar static RAM (SRAM) chips appeared in 1965, including Signetics' 8-bit device for Scientific Data Systems' Sigma 7 and IBM's 16-bit SP95 for the System/360 Model 95, both employing multi-transistor cells for bistable storage without refresh needs but at higher power (tens of milliwatts per bit) and die-area costs. These offered access times under 1 microsecond, outperforming core's 1–2 microseconds, yet their six to eight transistors per bit limited density to tens of bits per chip.

Metal–oxide–semiconductor field-effect transistor (MOSFET) technology, refined from Mohamed Atalla's 1960 silicon-surface passivation work at Bell Labs, introduced lower-power alternatives by the mid-1960s. Fairchild Semiconductor produced a 64-bit p-channel MOS SRAM in 1964 under John Schmidt, using four-transistor cells for static storage on a single die, followed by 256-bit and 1,024-bit MOS SRAMs by 1968 for systems like the Burroughs B1700. MOS designs reduced cell complexity and cut standby power to microwatts per bit, but they retained static architectures, capping densities due to cell area and susceptibility to soft errors from cosmic rays.

The physics of charge storage in MOS structures—leveraging capacitance for temporary bit representation—hinted at dynamic approaches, where a single transistor could gate access to a capacitor holding charge representing data, trading stability (via refresh cycles every few milliseconds to counter leakage governed by defects and thermal generation) for drastic area savings and cost reductions toward cents per bit. This paradigm shift addressed core memory's scaling barriers, driven by exponential demand for mainframe capacities exceeding megabits.

Invention of MOS DRAM

The invention of metal-oxide-semiconductor (MOS) dynamic random-access memory (DRAM) is credited to Robert H. Dennard, an engineer at IBM's Thomas J. Watson Research Center. In 1966, Dennard conceived the single-transistor memory cell, which stores a bit of data as charge on a capacitor gated by a MOS field-effect transistor (MOSFET). This design addressed the limitations of prior memory technologies by enabling higher density and lower cost through semiconductor integration. Dennard filed a patent application for the MOS DRAM cell in 1967, which was granted as U.S. Patent 3,387,286 on June 4, 1968, titled "Field-Effect Transistor Memory." The cell consists of one transistor and one capacitor per bit, where the transistor acts as a switch to read or write charge to the capacitor, representing binary states via voltage levels. Unlike static RAM, the charge leaks over time, necessitating periodic refresh, but the simplicity allowed planar fabrication compatible with MOS integrated circuits. This innovation laid the foundation for scalable semiconductor memory, supplanting magnetic-core memory in computing systems.

The MOS DRAM cell's efficiency stemmed from leveraging MOS technology's advantages in power consumption and scaling; Dennard also formulated the scaling principles by which MOS transistor density could increase without a proportional rise in power. Initial prototypes were developed at IBM, demonstrating the feasibility of detecting minute charge differences with sense amplifiers. By reducing the transistor count per bit relative to multi-transistor designs, MOS DRAM enabled exponential memory-capacity growth, pivotal for the computing revolution that followed.

Commercial Milestones and Scaling Eras

The Intel 1103, introduced in October 1970, was the first commercially available DRAM chip, offering 1 kilobit of storage organized as 1024 × 1 bits on an 8-micrometer process. Its low cost and compact size relative to magnetic-core memory enabled rapid adoption, surpassing core memory sales by 1972 and facilitating the transition to semiconductor-based main memory in computers. Early scaling progressed quickly, with 4-kilobit DRAMs entering production around 1974, exemplified by Mostek's MK4096, which introduced address multiplexing to reduce the pin count from 22 to 16, lowering packaging costs and improving system integration efficiency. This era saw densities double roughly every two years through process shrinks and layout optimizations, reaching 16 kilobits by 1976 and 64 kilobits by 1979, primarily using planar one-transistor-one-capacitor cells; these chips powered minicomputers and early microcomputer systems like the Altair 8800.

The 1980s marked a shift to higher volumes and PC adoption, with 256-kilobit DRAMs commercialized around 1984 and 1-megabit chips by 1986, as seen in designs integrated into IBM's Model 3090 mainframe, which stored approximately 100 double-spaced pages per chip. Japanese firms dominated production amid U.S. exits such as Intel's in 1985 due to pricing pressures, while single in-line memory modules (SIMMs) standardized packaging for capacities up to 4 megabits, aligning with density doublings every 18–24 months via sub-micrometer lithography.

The 1990s introduced synchronous DRAM (SDRAM) for pipelined operation, starting with 16-megabit chips around 1993, followed by Samsung's 64-megabit double data rate (DDR) SDRAM in 1998, which doubled bandwidth by transferring data on both clock edges; dual in-line memory modules (DIMMs) supported chips up to 128 megabits, enabling gigabyte-scale systems. DDR evolutions (DDR2 in 2003, DDR3 in 2007, DDR4 in 2014) sustained scaling to gigabit densities using stacked and trench capacitors, with Korean manufacturers Samsung and SK Hynix leading alongside Micron.

Into the 2010s and beyond, process nodes advanced to 10–14 nanometer classes (e.g., the 1x, 1y, 1z nm generations), achieving 8–24 gigabit densities per die by 2024 through EUV lithography and high-k dielectrics, though scaling slowed to 30–40% gains every two years due to leakage limits. DDR5, standardized in 2020, supports speeds over 8 gigatransfers per second for servers and PCs, while high-bandwidth memory (HBM) variants address AI demands; emerging 3D stacking proposals aim to extend viability beyond 2030 despite physical scaling barriers.

Memory Cell and Array Design

Capacitor Structures and Materials

The storage capacitor in a DRAM cell, paired with an access transistor in the canonical 1T1C configuration, must provide sufficient charge capacity—typically 20–30 fF per cell in modern nodes—to maintain signal margins despite leakage, while fitting within shrinking footprints dictated by scaling laws. Early implementations relied on planar metal-oxide-semiconductor (MOS) capacitors, where the dielectric—often silicon dioxide (SiO₂, dielectric constant k ≈ 3.9)—separated a polysilicon storage electrode from the p-type substrate, limiting capacitance to a value proportional to the cell area divided by the oxide thickness. These structures sufficed for densities up to 256 Kbit but failed to scale further without excessive dielectric thinning, which exacerbated leakage via quantum tunneling.

To increase effective surface area without expanding lateral dimensions, trench capacitors emerged in the early 1980s, etched vertically into the silicon substrate to form deep cylindrical or rectangular depressions lined with a thin dielectric (initially ONO: oxide-nitride-oxide stacks, for improved endurance) and a polysilicon counter-electrode. The first experimental trench cells appeared in 1-Mbit DRAM prototypes around 1982, with commercial adoption by several manufacturers in mid-1980s 1-Mbit products, achieving up to 3–5 times the capacitance of planar designs at depths of 4–6 μm. However, trenches introduced parasitic capacitances to adjacent cells and substrate coupling, complicating isolation and increasing soft-error susceptibility from alpha particles.

Stacked capacitors addressed these drawbacks by fabricating the capacitor atop the access transistor and bitline, leveraging chemical vapor deposition (CVD) of polysilicon electrodes in fin, crown, or cylindrical geometries to multiply surface area—often by factors of 10–20 via sidewall extensions. Introduced conceptually in the late 1970s and scaled for 4-Mbit DRAM by the late 1980s (e.g., Hitachi's implementations), stacked cells evolved into metal-insulator-metal (MIM) stacks by the 2000s, with TiN electrodes enabling higher work functions and reduced depletion effects compared to polysilicon. Modern variants, such as pillar- or cylinder-type capacitors in vertical arrays, further densify by lateral staggering and high-aspect-ratio etching (up to 100:1), supporting sub-10 nm nodes.

Dielectric materials have paralleled this structural progression to elevate capacitance density (targeting >100 fF/μm²) while curbing leakage below 10⁻⁷ A/cm² at 1 V. Initial ONO films (effective k ≈ 6–7) gave way to tantalum pentoxide (Ta₂O₅, k ≈ 25) in the 1990s for stacked cells, but its hygroscopicity and crystallization-induced defects prompted exploration of perovskites like barium strontium titanate (BST, k > 200). BST trials faltered due to poor thermal stability and interface traps, yielding to atomic-layer-deposited (ALD) high-k oxides: zirconium dioxide (ZrO₂, k ≈ 40 in the tetragonal phase) dominates current DRAM, often in ZrO₂/Al₂O₃/ZrO₂ (ZAZ) laminates where thin Al₂O₃ (k ≈ 9) barriers suppress leakage via band-offset engineering and defect passivation. Hafnium dioxide (HfO₂, k ≈ 20–25) serves in doped or alloyed forms (e.g., HfO₂-ZrO₂) for enhanced phase stability, with silicon or aluminum doping mitigating crystallization risks in paraelectric applications. These materials, deposited conformally via ALD, enable filling of high-aspect-ratio 3D structures, though reliability challenges persist, such as dielectric breakdown and leakage drift under 10¹² read/write cycles. Future candidates include TiO₂-based or SrTiO₃ dielectrics for k > 100, contingent on resolving leakage and integration with sub-5 nm electrodes.

Cell Architectures: Historical and Modern

Early semiconductor DRAM implementations favored multi-transistor cells to simplify sensing and mitigate the destructive readout issues inherent to charge-based storage. The Intel 1103, the first commercial DRAM chip, released in October 1970 with 1 Kbit capacity, utilized a 3T1C (three-transistor, one-capacitor) architecture, in which two transistors facilitated write operations and one enabled non-destructive read via a reference-capacitor scheme, though this increased cell area and power consumption compared to later designs. This configuration eased sensing requirements by providing stable readout without immediate data restoration post-read.

The 1T1C (one-transistor, one-capacitor) cell, patented by Robert Dennard of IBM in 1968 following its conception in 1966, revolutionized density by reducing the transistor count, with the single access transistor gating the storage capacitor to the bitline for both read and write. Read operations in 1T1C involve sharing charge between the capacitor and the bitline, causing destructive sensing that requires restoration via rewrite, thus mandating periodic refresh cycles every few milliseconds to combat leakage governed by dielectric properties and temperature. Despite these refresh demands—arising from finite charge retention times, typically 64 ms in modern variants—the 1T1C's minimal footprint enabled a 50-fold capacity increase over core equivalents, supplanting 3T1C designs by the 4 Kbit generation in 1973 and driving Moore's Law-aligned scaling through planar integration.

By the 1980s, 1T1C had solidified as the canonical architecture for commodity DRAM, with refinements like buried strap contacts and vertical transistors emerging in the 2000s to counter short-channel effects at sub-micron nodes. Modern high-bandwidth DRAM, such as DDR5 released in 2020, retains the 1T1C core but incorporates recessed channel array transistors (RCAT) or fin-like structures for sub-20 nm densities, achieving cell sizes around 6 F² (where F is the minimum feature size) through aggressive scaling and materials like high-k dielectrics. These evolutions prioritize leakage reduction and coupling minimization over architectural overhaul, as alternatives like 2T gain cells—employing floating-body effects for capacitorless storage—exhibit insufficient retention (microseconds) and too much variability for standalone gigabit-scale arrays, confining them to low-density embedded DRAM.

Emerging proposals, including 3D vertically channeled 1T1C variants demonstrated in 2024 using IGZO transistors for improved retention, signal potential extensions beyond planar limits, yet as of 2025, production DRAM universally adheres to planar or quasi-planar 1T1C amid capacitor scaling challenges below 10 nm. This persistence underscores the causal trade-off: 1T1C's simplicity facilitates cost-effective fabrication at terabit densities, outweighing refresh overheads mitigated by on-chip controllers, while multi-transistor cells remain niche for applications demanding zero-refresh operation, like some SRAM hybrids.

Array Organizations and Redundancy Techniques

In DRAM, memory cells are arranged in a two-dimensional grid within subarrays (also called mats), where rows are selected by wordlines and columns by bitlines, enabling random access to individual cells. Subarrays are hierarchically organized into banks to balance density, access speed, and power efficiency, with sense amplifiers typically shared between adjacent subarrays to minimize area overhead.

The primary array organizations differ in how bitlines are paired at the sense amplifiers, influencing noise immunity, density, and susceptibility to coupling effects. In open bitline architectures, each sense amplifier connects to one bitline from an adjacent subarray on each side, allowing bitline pairs to straddle the sense amplifier array; this configuration supports higher cell densities (e.g., enabling 6F² cell sizes in advanced variants) but increases vulnerability to noise from wordline-to-bitline coupling and reference-bitline imbalances, as the true and complementary bitlines are physically separated. Open bitline designs dominated early DRAM generations, from 1 Kbit up to 64 Kbit (and some 256 Kbit) devices, due to their area efficiency during initial scaling phases. In contrast, folded bitline architectures route both the true and complementary bitlines within the same subarray, twisting them to terminate at a single sense amplifier per pair, which enhances differential sensing and common-mode rejection by equalizing parasitic capacitances and reducing imbalance errors. This organization trades density for reliability, typically yielding 8F² cell sizes, and became prevalent from the 256 Kbit generation onward to mitigate scaling-induced noise in denser arrays. Hybrid open/folded schemes have been proposed for ultra-high-density DRAMs, combining open-bitline density in core arrays with folded sensing for improved noise immunity, though adoption remains limited by manufacturing complexity.

Redundancy techniques in DRAM address manufacturing defects and field failures by incorporating spare elements to replace faulty rows, columns, or cells, thereby boosting yield without discarding entire chips. Conventional approaches provision 2–8 spare rows and columns per bank or subarray, programmed via laser fuses or electrical fuses during manufacturing test to map defective lines to spares, with replacement logic redirecting addresses to the spares transparently to the memory controller (a minimal remapping sketch follows below). This row/column redundancy handles the clustered defects common in fabrication, occupying approximately 5% of chip area in high-density designs (e.g., 5.8 mm² in a 1.6 GB/s DRAM example). Advanced built-in self-repair (BISR) schemes extend this by enabling runtime or post-packaging diagnosis and repair, using on-chip analyzers to identify faults and allocate spares at finer granularities, such as intra-subarray row segments or individual bits, which improves repair coverage for clustered errors over global row/column swaps. For instance, BISR with 2 spare rows, 2 spare columns, and 8 spare bits per subarray has demonstrated higher yield rates in simulations compared to fuse-only methods, particularly for multi-bit faults. These techniques integrate with error-correcting codes (ECC) for synergistic reliability, though they increase control-logic overhead by 1–2% of array area.
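A minimal sketch of the row-remapping idea described above; the class, spare-row addresses, and repair policy are illustrative stand-ins for fuse-programmed repair logic, not any vendor's implementation:

```python
# Fuse-programmed row redundancy: accesses to a defective row are steered
# to a spare row, invisibly to the rest of the system.

class RowRedundancy:
    def __init__(self, spare_rows):
        self.free_spares = list(spare_rows)   # e.g. rows beyond the main array
        self.remap = {}                       # defective row -> spare row

    def repair(self, bad_row: int) -> bool:
        if not self.free_spares:
            return False                      # out of spares: die unrepairable
        self.remap[bad_row] = self.free_spares.pop(0)
        return True

    def translate(self, row: int) -> int:
        return self.remap.get(row, row)       # transparent address steering

rr = RowRedundancy(spare_rows=[4096, 4097])
rr.repair(123)                                # row 123 found defective at test
assert rr.translate(123) == 4096 and rr.translate(124) == 124
```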

Reliability and Error Management

Detection and Correction Mechanisms

Dynamic random-access memory (DRAM) is susceptible both to transient soft errors, primarily induced by cosmic rays and alpha particles that cause bit flips through charge deposition in the storage capacitor or substrate, and to permanent hard errors from manufacturing defects or wear-out mechanisms. Soft error rates in DRAM have been measured in the range of 10⁻⁹ to 10⁻¹² errors per bit-hour under terrestrial conditions, escalating with density scaling as cell capacitance decreases and susceptibility to single-event upsets increases. Detection mechanisms typically employ parity checks or syndrome generation to identify discrepancies between stored data and redundant check bits, while correction relies on error-correcting codes (ECC) that enable reconstruction of the original data.

The foundational ECC scheme for DRAM, the Hamming code, supports single-error correction (SEC) by appending on the order of log₂(m) parity bits to m data bits, allowing detection and correction of any single-bit error within the codeword through syndrome decoding. Extended to SECDED configurations using an overall parity bit, it detects double-bit errors while correcting single-bit ones, a standard adopted in server-grade DRAM modules since the 1980s. In practice, external ECC at the module level interleaves check bits across multiple DRAM chips, as in 72-bit-wide modules (64 data + 8 ECC), enabling chipkill variants like orthogonal Latin square codes that correct entire chip failures—including multi-bit upsets confined to a single chip—by distributing data stripes.

With DRAM scaling beyond 10 nm-class nodes, raw bit error rates have risen, prompting integration of on-die ECC directly within DRAM chips to mask internal errors before data reaches the memory controller. Introduced in low-power variants like LPDDR4 around 2014 and standardized in the DDR5 specifications from 2020, on-die ECC typically employs shortened BCH or Reed-Solomon codes operating on 128–512-bit bursts, correcting 1–2 bits per codeword internally without latency visible to the system. This internal mechanism reduces effective error rates by up to 100× for single-bit failures but does not address inter-chip errors, necessitating complementary system-level ECC for comprehensive protection in high-reliability applications. Advanced proposals, such as collaborative on-die and in-controller ECC, further enhance correction capacity for the multi-bit error patterns emerging in field data.
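To make the syndrome-decoding idea concrete, here is a toy SECDED code—Hamming(7,4) plus an overall parity bit—rather than the wider (72,64) codes used in real modules; the mechanics (syndrome points at the flipped bit, overall parity distinguishes single from double errors) are the same:

```python
def encode(data4):
    c = [0] * 8                     # c[1..7]: Hamming(7,4); c[0]: overall parity
    c[3], c[5], c[6], c[7] = data4
    c[1] = c[3] ^ c[5] ^ c[7]       # parity over positions 3,5,7
    c[2] = c[3] ^ c[6] ^ c[7]       # parity over positions 3,6,7
    c[4] = c[5] ^ c[6] ^ c[7]       # parity over positions 5,6,7
    c[0] = sum(c[1:]) % 2           # overall parity enables double-error detection
    return c

def decode(c):
    syndrome = sum(p for p in (1, 2, 4)
                   if sum(c[i] for i in range(1, 8) if i & p) % 2)
    overall_bad = (sum(c[1:]) % 2) != c[0]
    if syndrome and overall_bad:    # single-bit error: syndrome is its position
        c[syndrome] ^= 1
        return "corrected", [c[3], c[5], c[6], c[7]]
    if syndrome:                    # overall parity consistent: two bits flipped
        return "double error detected", None
    if overall_bad:                 # only the overall parity bit itself flipped
        c[0] ^= 1
        return "corrected", [c[3], c[5], c[6], c[7]]
    return "ok", [c[3], c[5], c[6], c[7]]

cw = encode([1, 0, 1, 1])
cw[6] ^= 1                          # a single bit flip in storage
print(decode(cw))                   # ('corrected', [1, 0, 1, 1])
```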

Built-in Redundancy and Testing

DRAM employs built-in redundancy primarily through arrays of spare rows and columns, which replace defective primary lines identified during manufacturing testing, thereby improving die yield in the face of the defect densities inherent to scaled processes. This technique, recognized as essential from DRAM's early commercialization by entities including IBM and Bell Laboratories, allows faulty word lines or bit lines to be remapped via laser fusing or electrical programming of fuses and anti-fuses, preserving functionality without discarding entire chips. Redundancy allocation occurs after fault detection, often using built-in redundancy analysis (BIRA) circuits that implement algorithms to match defects to available spares, optimizing repair rates; for instance, enhanced fault-collection schemes in 1 Mb embedded RAMs with up to 10 spares per array can boost repair effectiveness by up to 5%. Configurations typically include 2–4 spare rows and columns per bank or subarray, alongside occasional spare bits for finer granularity, with hierarchical or flexible mapping reducing area overhead to around 3% in multi-bank designs.

Testing integrates built-in self-test (BIST) mechanisms, which generate deterministic patterns such as March algorithms to probe for stuck-at, transition, and coupling faults across the cell array, sense amplifiers, and decoders, often with programmable flexibility for embedded or commodity DRAM variants (a March C- sketch follows below). In commercial implementations, such as 16 Gb DDR4 devices on 10-nm-class nodes, in-DRAM BIST achieves coverage equivalent to traditional methods while cutting test time by 52%, minimizing reliance on costly external automated test equipment (ATE). BIST circuits handle refresh operations during evaluation and support diagnostic modes that localize faults for precise redundancy steering. Wafer-level and packaged-part test sequences encompass retention-time verification, speed binning, and redundancy repair trials, with BIRA evaluating post-BIST fault maps to determine salvageable yield; unrepairable dies are marked for rejection, while simulations confirm that scaling the number of spares alone does not linearly improve outcomes without advanced allocation logic. These integrated approaches sustain economic viability for gigabit-scale DRAM production, where defect clustering demands multi-level redundancy across global, bank, and local scopes.
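The March C- sketch referenced above, run against a simulated array; the March elements follow the published algorithm, while the memory model and injected stuck-at fault are purely for demonstration:

```python
class FaultyMem:
    """Simulated array with one injected stuck-at fault for demonstration."""
    def __init__(self, n, stuck=None):
        self.cells = [0] * n
        self.stuck = stuck                    # (address, stuck_value) or None
    def write(self, a, v):
        self.cells[a] = v
        if self.stuck and a == self.stuck[0]:
            self.cells[a] = self.stuck[1]     # the faulty cell ignores writes
    def read(self, a):
        return self.cells[a]

def march_c_minus(mem, n):
    """March C-: {any(w0); up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0); any(r0)}"""
    up, down = list(range(n)), list(range(n - 1, -1, -1))
    for a in up:
        mem.write(a, 0)
    for addrs, expect, write in [(up, 0, 1), (up, 1, 0), (down, 0, 1), (down, 1, 0)]:
        for a in addrs:
            if mem.read(a) != expect:
                return a                      # fault detected at this address
            mem.write(a, write)
    for a in down:
        if mem.read(a) != 0:
            return a
    return None                               # array passed the test

print(march_c_minus(FaultyMem(16, stuck=(5, 1)), 16))   # -> 5
```

The detected address would then feed the BIRA logic, which decides whether a spare row or column can cover the fault.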

Security Vulnerabilities

Data Remanence and Retention Issues

Data remanence in DRAM arises from the gradual discharge of storage capacitors following power removal or attempted erasure, allowing residual charge to represent logical states for a finite period. This persistence stems from the inherently low leakage currents of modern CMOS processes, where capacitors can retain charge for seconds at ambient temperatures and longer when chilled, enabling forensic recovery of sensitive data such as encryption keys. Unlike non-volatile memories, DRAM's volatility is not absolute, as demonstrated in empirical tests showing bit error rates below 0.1% for up to 30 seconds after power-off at 25°C in DDR2 modules.

Retention issues exacerbate security risks through temperature-dependent decay dynamics, where charge loss accelerates exponentially with heat—retention time roughly halves for every 10–15°C rise due to increased subthreshold and gate-induced drain leakage in the access transistors. In operational contexts, DRAM cells require refresh cycles every 64 ms to prevent data loss from these mechanisms, but post-power-off remanence defies expectations of immediate volatility. The 2008 cold boot attack exploited this by spraying canned air to cool modules to near-freezing, then transferring the chips to a reader machine; tests on 37 DDR2 DIMMs recovered memory contents largely intact for 1–5 minutes at −20°C, and partial data up to 10 minutes in chilled states, directly extracting AES and RSA keys.

Modern DRAM generations introduce partial mitigations like address and data scramblers in DDR4, intended to randomize bit patterns and hinder pattern-based recovery, yet analyses confirm that vulnerabilities persist. A 2017 IEEE study on DDR4 modules showed that scrambled states could be reverse-engineered via error-correcting codes and statistical analysis of multiple cold-boot samples, achieving over 90% key-recovery rates despite scrambling. Retention variability across cells—ranging from 10 ms to over 1 second in unrefreshed states—further complicates secure erasure, as uneven leakage can leave mosaics of recoverable data even after overwriting. These issues underscore the causal reliance on physical leakage rather than assumed instant volatility, with research indicating that low-temperature remanence remains a practical vector for physical memory attacks.
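A simple decay model of the temperature dependence described above; the halving step of 12.5°C and the 30 s room-temperature baseline are assumed illustrative values within the ranges quoted, not measured constants:

```python
# Remanence roughly halves per ~10-15 C rise; equivalently, it doubles per
# equal step of cooling. Real cells vary widely around any such model.

def remanence_seconds(t_celsius, t_ref=25.0, remanence_ref=30.0, halving_step=12.5):
    return remanence_ref * 2 ** ((t_ref - t_celsius) / halving_step)

for t in (25, 0, -20, -50):
    print(f"{t:>4} C: ~{remanence_seconds(t):.0f} s of usable remanence")
# Cooling from 25 C to -50 C extends remanence by a factor of ~64 in this model,
# which is why cold boot attacks chill the modules first.
```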

Rowhammer Attacks and Bit Flipping

Rowhammer attacks exploit a read-disturbance effect in DRAM whereby repeated activations of a specific row, known as the aggressor row, induce bit flips in adjacent victim rows without directly accessing them. The phenomenon arises from the dense packing of memory cells, where electrical disturbances from frequent row activations—such as voltage transients on wordlines and inter-cell coupling—accelerate charge leakage in neighboring capacitors, potentially altering stored bit values if the charge drops below the sense amplifier's detection threshold. The effect was first systematically characterized in a 2014 study by researchers including Yoongu Kim, demonstrating bit error rates exceeding 200 flips per minute in vulnerable DDR3 modules under aggressive access patterns exceeding 100,000 activations per row.

Bit flipping in Rowhammer occurs primarily through two causal mechanisms: electric-field coupling, in which fields from the aggressor row's wordline disturb victim cell capacitors, and carrier-injection effects from repeated charge pumping, with the former dominating in modern scaled DRAM geometries below 20 nm. Experiments on commodity DDR3 and DDR4 chips from 2010 to 2016 showed that single-sided hammering (targeting one adjacent row) could flip bits with probabilities up to 1 in 10⁵ accesses in worst-case cells, while double-sided hammering—alternating between two aggressor rows flanking a victim—amplifies flips by concentrating disturbances, achieving deterministic errors in as few as 54,000 cycles on susceptible hardware. These flips manifest as 0-to-1 or 1-to-0 transitions, with 1-to-0 more common due to charge loss in undercharged cells, and vulnerability varies by DRAM manufacturer, density, and refresh interval; for instance, certain modules exhibited up to 64× higher error rates than others under identical conditions.

The security implications of Rowhammer extend to privilege escalation, data exfiltration, and denial of service, as attackers can craft software to hammer rows from user space, bypassing isolation in virtualized or multi-tenant environments. Notable demonstrations include the 2016 Drammer attack on ARM devices, which flipped bits to gain root privileges via Linux kernel pointer corruption, succeeding on 18 of 19 tested smartphones, in optimized scenarios with remarkably few activation cycles per bit flip. Further exploits, such as those corrupting page-table entries to leak cryptographic keys or manipulate JavaScript engines in browsers, show how bit flips enable cross-VM attacks in cloud settings, with success rates exceeding 90% on unmitigated DDR4 systems when targeting ECC-weak spots. Despite mitigations like increased refresh rates, variants such as TRRespass evade them by exploiting timing-based row remapping, underscoring persistent risks in scaled DRAM, where cell interference scales inversely with feature size.
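A simulation sketch of the double-sided access pattern described above—alternating activations of the two rows flanking a victim. This is a probabilistic model for illustration only (the per-activation flip probability is an assumed figure of the order reported for worst-case cells), not attack code:

```python
import random

def double_sided_hammer(victim_row, activations, p_flip_per_act=1e-5):
    """Model: each activation of a flanking row disturbs the victim slightly."""
    flips = 0
    for i in range(activations):
        aggressor = victim_row - 1 if i % 2 == 0 else victim_row + 1
        _ = aggressor                    # stands in for an ACTIVATE to that row
        if random.random() < p_flip_per_act:
            flips += 1                   # victim cell decayed past the threshold
    return flips

random.seed(0)
print(double_sided_hammer(victim_row=1000, activations=200_000))
```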

Mitigation Strategies and Hardware Protections

Target Row Refresh (TRR) is a primary hardware mitigation deployed in modern DDR4 and DDR5 DRAM modules to counter Rowhammer-induced bit flips: the DRAM controller or on-chip logic monitors access patterns and proactively refreshes the victim rows adjacent to any row whose activation count exceeds a threshold, typically hundreds to thousands of accesses within a short window. Manufacturers such as Samsung, SK Hynix, and Micron integrate TRR variants in their chips, often combining per-bank or per-subarray counters with probabilistic or deterministic refresh scheduling to balance protection against refresh-latency overheads of 1–5%. Despite its effectiveness against classical Rowhammer patterns, advanced attacks like TRRespass demonstrate that TRR implementations can be evaded by exploiting refresh-interval timing or multi-bank hammering, prompting refinements such as ProTRR, which uses principled counter designs for provable guarantees under bounded overhead.

Error-correcting code (ECC) DRAM provides an additional layer of protection by detecting and correcting single- or multi-bit errors induced by hammering, with server-grade modules using on-die or module-level ECC to mask flips that evade refresh-based mitigations, though it increases cost and latency by 5–10% and offers limited resilience against multi-bit bursts. In-DRAM trackers, as explored in recent research, employ lightweight Bloom-filter-like structures or counter arrays within the DRAM periphery to identify aggressor rows with minimal area overhead (under 1% of die space) and trigger targeted refreshes, outperforming traditional CPU-side monitoring by reducing false positives and scaling to denser DDR5 hierarchies (a counter-based sketch follows below). Proposals like DEACT introduce deactivation counters that throttle or isolate over-accessed rows entirely, providing deterministic defense without relying on probabilistic refresh, though adoption remains limited by compatibility concerns with existing standards.

For data-remanence vulnerabilities, where residual charge in DRAM capacitors enables recovery of cleared data for seconds to minutes after power-off—especially at sub-zero temperatures—hardware protections emphasize rapid discharge circuits and retention-time-aware refresh optimizations integrated into the memory controller. Techniques such as variable refresh rates, calibrated via on-chip retention monitors, mitigate retention failures by refreshing weak cells more frequently while extending intervals for stable ones, reducing overall power draw by up to 20% without compromising security. System-level hardware such as secure enclaves (e.g., Intel SGX) incorporates memory encryption and integrity checks to render remanent data useless even if extracted, though these rely on processor integration rather than standalone DRAM features. Comprehensive defenses often combine these with post-manufacture testing for retention variability, ensuring modules meet standards for a minimum 64 ms retention at 85°C, thereby minimizing risks in exposed environments.
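The counter-based tracker sketch referenced above, in the spirit of TRR-style mitigation; the threshold value, class shape, and refresh-window reset are illustrative assumptions, not a standardized mechanism:

```python
from collections import Counter

class TRRTracker:
    """Count activations per row; flag neighbours of hot rows for refresh."""
    def __init__(self, threshold=4096):
        self.threshold = threshold
        self.counts = Counter()

    def on_activate(self, row):
        self.counts[row] += 1
        if self.counts[row] >= self.threshold:
            self.counts[row] = 0
            return [row - 1, row + 1]   # victim rows to refresh proactively
        return []

    def on_refresh_window_end(self):
        self.counts.clear()             # counters reset each refresh window

trr = TRRTracker(threshold=3)           # tiny threshold for demonstration
for _ in range(3):
    victims = trr.on_activate(1000)
print(victims)                          # [999, 1001]
```

Real implementations must bound counter storage (hence the Bloom-filter-like and sampling designs mentioned above), which is exactly the weakness that multi-bank attacks probe.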

Variants and Technological Evolutions

Asynchronous DRAM Variants

Asynchronous DRAM variants use control signals such as the Row Address Strobe (RAS) and Column Address Strobe (CAS) to manage access timing without reliance on a system clock, enabling compatibility with varying processor speeds in early computing systems. These variants evolved to address the performance limitations of basic DRAM by optimizing accesses within the same row, reducing latency for page-mode operations. Key types include Fast Page Mode (FPM), Extended Data Out (EDO), and Burst EDO (BEDO), which progressively improved throughput through architectural refinements in data output and addressing mechanisms.

Fast Page Mode (FPM) DRAM enhances standard DRAM by latching a row address once via RAS, then allowing multiple column addresses via repeated CAS cycles without reasserting RAS, minimizing overhead for accesses within the same page. This mode achieved typical timings of 6-3-3-3 (bus cycles for the first access and each of three subsequent accesses), where initial access latency is higher but subsequent page hits are faster, making it the dominant type in personal computers from the late 1980s through the mid-1990s. FPM provided measurable speed gains over non-page-mode DRAM by exploiting spatial locality in accesses, though it required wait states at higher bus speeds like 33 MHz.

Extended Data Out (EDO) DRAM builds on FPM by keeping output data valid even after CAS deasserts, permitting the next memory cycle's address setup to overlap with data latching and thus eliminating certain wait states. This yields approximately 30% higher peak data rates than equivalent FPM modules, with support for bus speeds up to 66 MHz without added latency in many configurations. EDO DRAM, introduced in the mid-1990s, offered backward compatibility with FPM systems while enabling tighter timings like 5-2-2-2, though the full benefits required chipset support.

Burst EDO (BEDO) DRAM extends EDO with a burst mode that internally generates up to three additional addresses following the initial one, processing four locations in a single sequence with timings such as 5-1-1-1. This pipelined approach reduced cycle times by avoiding repeated CAS assertions for sequential bursts, potentially doubling performance over FPM and improving roughly 50% on standard EDO in supported setups (see the cycle-count comparison below). Despite these advantages, BEDO saw limited adoption in the late 1990s owing to insufficient chipset support and integration, overshadowed by emerging synchronous DRAM technologies.
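The cycle-count comparison referenced above, using the nominal x-y-y-y timings from the text; the 66 MHz bus frequency is an assumed example:

```python
# Bus cycles for a 4-beat burst under each variant's nominal timings.

timings = {"FPM": (6, 3, 3, 3), "EDO": (5, 2, 2, 2), "BEDO": (5, 1, 1, 1)}

for kind, beats in timings.items():
    cycles = sum(beats)
    ns = cycles * 1000 / 66        # one 66 MHz bus cycle is ~15.2 ns
    print(f"{kind}: {cycles} cycles ~= {ns:.0f} ns per 4-beat burst")
# FPM 15 cycles, EDO 11, BEDO 8 -- roughly the relative gains cited above.
```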

Synchronous DRAM Generations

Synchronous dynamic random-access memory (SDRAM) synchronizes internal operations with an external clock, enabling burst modes, pipelining, and command queuing for improved throughput over asynchronous DRAM. Initial single data rate (SDR) SDRAM, which transfers data only on the clock's rising edge, emerged commercially in 1993 from manufacturers such as Samsung and was standardized by JEDEC in 1997, supporting clock speeds up to 133 MHz and capacities starting at 16 Mb per chip.

The shift to double data rate (DDR) SDRAM doubled bandwidth by capturing data on both clock edges, with prototypes demonstrated by Samsung in 1996 and JEDEC ratification of the DDR1 (JESD79) standard in June 2000, specifying a 2.5 V operating voltage, initial speeds of 200–400 MT/s, and a 2n prefetch architecture. DDR2, standardized in September 2003 under JESD79-2 at 1.8 V, introduced a 4n prefetch, on-die termination (ODT) for better signal integrity, and speeds up to 800 MT/s, while reducing power through differential strobe signaling. DDR3, ratified by JEDEC in 2007 (JESD79-3) at 1.5 V (with a later 1.35 V low-voltage variant), extended the prefetch to 8n, added a fly-by topology for reduced latency in multi-rank modules, and achieved speeds up to 2133 MT/s, prioritizing power efficiency with features like auto self-refresh.

DDR4, introduced in 2014 via JESD79-4 at 1.2 V, incorporated bank groups for parallel access, further latency optimizations, and data rates exceeding 3200 MT/s, enabling higher densities up to 128 Gb per die through 3D stacking precursors. DDR5, finalized by JEDEC in July 2020 (JESD79-5) at 1.1 V, introduces on-die error correction (ECC) for reliability, decision feedback equalization for signal integrity at speeds over 8400 MT/s, power management ICs (PMICs) for per-rank voltage regulation, and support for densities up to 2 Tb per module, addressing scaling challenges in advanced process nodes.
SDRAM generation summary
Generation JEDEC standard year Voltage (V) Max data rate (MT/s) Prefetch Key innovations
SDR 1997 3.3 133 1n Clock synchronization, burst mode
DDR1 2000 2.5 400 2n Dual-edge transfer, DLL for timing
DDR2 2003 1.8 800 4n ODT, prefetch increase
DDR3 2007 1.5 2133 8n Fly-by CK/ADDR, ZQ calibration
DDR4 2014 1.2 3200+ 8n Bank groups, gear-down mode
DDR5 2020 1.1 8400+ 16n On-die ECC, PMIC, CA parity
Each generation has prioritized architectural advances over backward compatibility, with DDR5 emphasizing resilience and bandwidth amid node shrinks below 10 nm, though as of 2025 its adoption still lags in some markets due to cost and platform maturity.

Specialized Types: Graphics, Mobile, and High-Bandwidth

Graphics double data rate (GDDR) synchronous DRAM is a specialized DRAM variant tailored to the high-throughput demands of graphics processing units (GPUs), prioritizing bandwidth over latency through features like prefetch buffering and on-die error correction. The initial GDDR iteration emerged in 2000 to address GPU bandwidth needs surpassing those of standard DDR SDRAM. Subsequent generations, including GDDR5 based on DDR3 architecture and GDDR6 aligned with DDR4, have scaled pin data rates progressively; GDDR6 achieves up to 24 Gbps per pin, yielding device bandwidths around 1.1 TB/s in optimized configurations. The latest GDDR7 standard, published by JEDEC in March 2024, doubles per-device bandwidth to 192 GB/s via PAM3 signaling, supporting escalating requirements for AI-accelerated rendering and high-resolution displays.

Low-Power Double Data Rate (LPDDR) DRAM adapts synchronous DRAM principles for battery-limited mobile and embedded systems, incorporating lower core and I/O voltages—such as 0.6 V for LPDDR4X I/O—to minimize active and standby power draw while maintaining competitive speeds. The LPDDR5 specification, updated by JEDEC in 2019, enables I/O rates up to 6400 MT/s, a 50% increase over initial LPDDR4, with built-in deep sleep modes and adaptive voltage scaling for 5–10% gains in battery life relative to predecessors. LPDDR5X extensions further enhance efficiency by up to 24% through refined clocking and channel architectures, supporting capacities up to 64 GB for applications like 8K video processing in smartphones and automotive infotainment.

High Bandwidth Memory (HBM) employs vertical stacking of multiple DRAM dies using through-silicon vias (TSVs) and a base logic die, creating ultra-wide interfaces—typically 1024 or 2048 bits per stack—for parallel data access in high-performance computing, AI accelerators, and premium GPUs. HBM3, finalized by JEDEC in early 2022, provides stack bandwidths up to 819 GB/s at 6.4 Gbps per pin, with capacities reaching 64 GB via 16-high configurations, enabling terabyte-scale addressing for data-intensive workloads. Enhanced HBM3E variants, introduced commercially around 2023, extend pin speeds beyond 9 Gb/s, delivering more than 1.2 TB/s per stack and roughly 2.5 times the bandwidth of prior generations through refined error correction and signaling. This architecture trades manufacturing complexity for superior density and latency reduction compared to discrete GDDR dies.
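The interface-width arithmetic behind the HBM figures quoted above is straightforward to verify (the 9.6 Gbps pin rate in the second line is one illustrative HBM3E-class value within the ">9 Gb/s" range cited):

```python
# Stack bandwidth = bus width (bits) x per-pin rate (Gbps) / 8 bits per byte.

def stack_bandwidth_GBps(bus_bits: int, pin_rate_gbps: float) -> float:
    return bus_bits * pin_rate_gbps / 8

print(stack_bandwidth_GBps(1024, 6.4))    # HBM3: 819.2 GB/s per stack
print(stack_bandwidth_GBps(1024, 9.6))    # HBM3E-class: ~1229 GB/s per stack
```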

Challenges, Limitations, and Comparisons

Scaling and Physical Limits

As DRAM technology has scaled from micron-scale features in the 1970s to sub-20 nm nodes by the 2020s, density has increased exponentially, with bit cell sizes shrinking from 6F² to 4F² geometries, where F represents the minimum feature size. However, planar DRAM scaling confronts fundamental physical constraints around 10–12 nm, beyond which continued lateral shrinkage yields diminishing returns due to quantum tunneling, increased leakage currents, and insufficient signal margins.

The core limitation stems from the DRAM cell's reliance on a capacitor to store the charge representing binary states, typically requiring 10–20 fF of capacitance for reliable sensing, yet scaling reduces capacitor volume, with capacitance dropping to a projected 5–6 fF per cell in advanced nodes like D1c. This exacerbates charge leakage through thinner dielectrics, governed by exponential increases in tunneling currents as oxide thickness approaches atomic scales (e.g., below 5 nm equivalent thickness), shortening retention times from milliseconds to microseconds and necessitating more frequent refreshes—up to every 32 ms in modern devices—which consume power and bandwidth. Access transistors face short-channel effects, including drain-induced barrier lowering and threshold-voltage variability, which degrade subthreshold swing and increase off-state leakage, further eroding the charge-to-noise ratio essential for distinguishing logic states amid thermal noise and read-disturb errors. Interference between adjacent cells intensifies in denser arrays, with bitline coupling and wordline resistance rising, limiting effective scaling below 10 nm half-pitch.

These barriers have slowed DRAM density gains to 20–30% per generation since the 1x nm nodes (circa 2016–2018), compared to historical 50–100% doublings, compelling innovations like high-k dielectrics and metal electrodes; yet physical realities—such as the field nonuniformity in scaled capacitors dictated by the Poisson equation—impose thermodynamic and electrostatic limits that preclude indefinite planar extension without reliability failures.

Power, Density, and Performance Trade-offs

As DRAM technology scales to higher densities, cell sizes shrink—with capacitances falling below 10 fF per cell in recent nodes like D1z and D1a—enabling greater storage capacity per die but introducing challenges with charge retention and leakage currents. This reduction in capacitance shortens retention intervals to around 32–64 ms under typical conditions, compelling more frequent refresh operations to prevent bit errors. Consequently, refresh mechanisms, such as auto-refresh in the DDRx standards, account for 25–30% of total DRAM power in high-density devices exceeding 32 Gb, with background power rising proportionally to capacity and row count per bank.

Power efficiency deteriorates further as density increases, because static leakage grows with smaller transistors while dynamic power from row activations and column accesses scales less favorably than capacity gains. Historical scaling delivered 100-fold density increases per decade through the 2000s, but gains have slowed to roughly twofold over the past ten years, partly due to these power walls, where refresh overhead dominates and limits net energy-per-bit improvements. Lowering supply voltages—such as from 1.5 V in DDR3 to 1.2 V in DDR4—helps curb dynamic power but exacerbates retention variability, often necessitating voltage boosts or adaptive refresh to maintain reliability, thereby trading standby efficiency for operational stability.

Performance trade-offs manifest as increased latency from refresh-induced bus contention, degrading effective throughput by over 30% in dense, high-speed configurations like 32 Gb modules. While planar scaling boosts density at the cost of speed limits from interconnect resistance and capacitance, vertical stacking in variants like HBM achieves higher effective density and bandwidth (e.g., up to a projected 1.5 TB/s for HBM4) but elevates cost and thermal-management demands, with interface circuitry reported to account for over 95% of total power in some HBM stack analyses. These dynamics compel designers to balance metrics, for instance by prioritizing low-power modes like self-refresh for mobile applications, which cut idle power by disabling I/O but cap peak performance.

Versus SRAM, Flash, and Emerging Memories

DRAM employs a single transistor and capacitor per bit (1T1C), enabling bit densities far exceeding those of SRAM's six-transistor (6T) cells, which limit SRAM to approximately 30 Mbit/mm² even at advanced 3-nm logic nodes. This density advantage allows DRAM to achieve cost-effective capacities in the gigabyte range for system memory, with per-bit costs around $50/GB, compared to SRAM's prohibitive expense—often thousands of dollars per equivalent capacity—due to SRAM's cell complexity and lower scalability. SRAM, however, delivers sub-10 ns access times without refresh overhead, versus DRAM's 10–60 ns latencies plus periodic row refresh every 64 ms to combat charge leakage, making SRAM preferable for low-latency applications like processor caches despite its higher static power draw from continuous biasing.

In contrast to non-volatile NAND Flash, DRAM is inherently volatile, requiring constant power to retain data as stored charge, whereas Flash uses floating-gate or charge-trap mechanisms for persistence without a power supply. DRAM provides read/write speeds orders of magnitude faster—typically 100 times those of NAND—suited to frequent, byte-addressable operations in main memory, but it lacks Flash's persistence for archival storage; NAND cells, for their part, withstand only 10³–10⁵ program/erase cycles before degradation. Flash achieves higher sequential densities for bulk storage at lower per-bit costs over time, though its erase-before-write latency and wear-leveling overheads render it unsuitable for DRAM's role in working memory.

Emerging non-volatile memories (eNVMs) like spin-transfer torque MRAM (STT-MRAM), resistive RAM (ReRAM), and phase-change memory (PCM) aim to supplant DRAM by combining non-volatility, SRAM-like speeds (sub-10 ns), and densities approaching or exceeding DRAM's, while eliminating refresh power—potentially halving system energy in data centers. As of 2025, ReRAM leads in cost-effective scalability for embedded and storage-class applications due to simpler fabrication, with MRAM targeting high-end cache replacement via practically unlimited endurance (>10¹² cycles) and PCM suited to dense, multilevel cells despite higher write voltages. These technologies remain niche, with yield and integration challenges delaying widespread DRAM substitution until beyond 2030, though they would address DRAM's scaling barriers from capacitor shrinkage and leakage at sub-10 nm nodes.

Applications, Market Dynamics, and Future Directions

Role in Computing Systems

Dynamic random-access memory (DRAM) serves as the main memory in most computing systems, storing the data and program instructions that the central processing unit (CPU) accesses during execution. This memory type provides random access with latencies typically ranging from 50 to 100 nanoseconds, positioning it between faster on-chip caches and slower secondary storage in the memory hierarchy. By holding active workloads close to the processor, DRAM enables efficient program execution and multitasking, with capacities scaling to gigabytes or terabytes in modern configurations.

In personal computers and workstations, DRAM modules, often in dual in-line memory module (DIMM) form factors, form the bulk of system RAM, directly interfacing with the CPU via memory controllers to support operating systems, applications, and temporary file storage. Servers and data centers rely on high-density DRAM for handling large-scale computations, virtualization, and databases, where error-correcting code (ECC) variants mitigate bit errors in mission-critical environments; average DRAM capacity per server grew by 12.1% year-over-year in 2023 to accommodate AI and other data-intensive workloads. Mobile devices and embedded systems employ specialized low-power DRAM, such as LPDDR variants, optimized for energy efficiency and compact integration, powering smartphones, tablets, and IoT devices with capacities tailored to real-time processing needs. Across these platforms, DRAM's cost-effectiveness and density underpin overall system performance, though its refresh requirement means continuous power is needed to retain data, distinguishing it from non-volatile alternatives.

Manufacturing, Economics, and Supply Chain

DRAM manufacturing fabricates integrated circuits on wafers through a sequence of processes including photolithography for patterning, thin-film deposition, etching, doping, and chemical mechanical planarization, building multilayer structures with billions of memory cells per die. Wafer fabrication facilities (fabs) operate in cleanrooms to minimize particle contamination, starting from 300 mm wafers sliced from silicon ingots and applying 20 or more patterned layers to form the capacitors and access transistors of the 1T1C cell. Advanced 10 nm-class nodes increasingly apply extreme ultraviolet (EUV) lithography to five or more critical layers to pattern the finest features, alongside innovations such as low-contact-resistivity schemes and shallow doping for performance. Subsequent packaging and assembly use silver in solder alloys such as Sn-Ag-Cu (SAC305) to improve joint reliability and thermal-fatigue resistance, and in silver-filled conductive epoxy adhesives that attach the die to the substrate with high thermal and electrical conductivity. Wafer fabrication is the primary bottleneck in producing RAM modules; module assembly, which packages chips onto printed circuit boards, is simpler and less capacity-limiting (a rough model of the underlying wafer economics appears at the end of this subsection).

The DRAM industry is an oligopoly dominated by three firms, Samsung Electronics, SK Hynix, and Micron Technology, which collectively control over 95% of global production capacity. In Q2 2025, SK Hynix held 38.2% of DRAM revenue, followed by Samsung at approximately 33%, with Micron trailing at roughly 25%, reflecting SK Hynix's gains from high-bandwidth memory (HBM) demand for AI applications; shares in specialized segments such as HBM are more volatile, depending on platform launches, ramp timings, and supply commitments.

Global DRAM revenue reached $115.89 billion in 2024 and is projected to grow to $121.83 billion in 2025 amid AI-driven demand. The market nonetheless exhibits high cyclicality, with boom-bust cycles caused by volatile demand (particularly from servers, data centers, PCs, and mobile devices) and slow supply adjustments, since new fab capacity takes two to three years to come online. These cycles are intensified by explosive demand for HBM in AI data centers, which prompts Samsung, SK Hynix, and Micron to prioritize HBM over consumer DRAM such as DDR5 and GDDR, given HBM's higher gross margins (approximately 50-60%, versus 35-40% for standard DRAM) and its consumption of advanced manufacturing capacity. Advanced DRAM fabs can switch production lines between HBM, which requires extra stacking and processing steps, and commodity products like DDR5; shifting back to standard DRAM is comparatively straightforward, enabling quick responses to market swings. Meanwhile, the gradual phase-out of older standards such as DDR4 has depleted inventories rapidly, and capacity expansion remains cautious given the long build times for new production lines, leaving supply growth lagging demand and producing shortages, supply-demand imbalances, and quarterly price increases of 10-50% for PC memory. DRAM's commodity-like traits have spurred attempts to develop futures trading mechanisms to hedge against price volatility, though no sustained markets have emerged owing to difficulties in contract standardization.
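To make the fabrication economics concrete, the sketch below applies the classic gross-dies-per-wafer approximation to a 300 mm wafer and converts the result into a per-gigabyte cost. Every number here (die area, wafer cost, yield) is an assumed illustrative value, not a figure from this article or any manufacturer:

```python
# Rough cost model for DRAM dies on a 300 mm wafer.
# All parameters are illustrative assumptions.
import math

WAFER_DIAMETER_MM = 300.0
DIE_AREA_MM2 = 65.0       # assumed area of a 16 Gbit (2 GB) die
WAFER_COST_USD = 6000.0   # assumed fully processed wafer cost
YIELD = 0.85              # assumed fraction of good dies

# Gross dies per wafer: usable area minus a correction for edge losses
r = WAFER_DIAMETER_MM / 2.0
gross = math.floor(math.pi * r**2 / DIE_AREA_MM2
                   - math.pi * WAFER_DIAMETER_MM / math.sqrt(2 * DIE_AREA_MM2))

good = gross * YIELD
cost_per_die = WAFER_COST_USD / good
print(f"{gross} gross dies, ~{good:.0f} good dies per wafer")
print(f"~${cost_per_die:.2f} per die, ~${cost_per_die / 2:.2f} per GB")
```

Under these assumptions a wafer yields on the order of a thousand dies at a few dollars per gigabyte, which illustrates why small shifts in yield or wafer cost move margins substantially in a commodity market.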
The industry's financial dynamics, including segment revenue mix and gross margins, significantly influence investment valuations for the major producers. Prices fluctuate sharply: during the transition from DDR4 to DDR5, DDR4 prices surged over 40% and overtook DDR5 levels in mid-2025 as manufacturers delayed phase-outs and reallocated capacity to HBM, exacerbating shortages amid constrained capacity expansion. Conventional DRAM contract prices rose 8-13% quarter-over-quarter in Q3 2025 (15-18% including HBM), fueled by supply tightness and AI server demand outpacing annual bit shipment growth of 11-17%. Projections for 2026 indicate further escalation: DRAM contract prices are expected to rise by up to 50% in Q1 amid acute shortages driven by surging AI demand, with cumulative contract and spot price increases of 50-100% across late 2025 and early 2026 as AI infrastructure strains supply. Citibank research has revised its DRAM outlook upward, forecasting an 88% increase in 2026 average selling prices, up from a prior estimate of 53%. The surge is expected to persist through the year and could produce 15-20% hikes in PC prices from the second half of 2026, as rising DRAM costs inflate bill-of-materials expenses (memory can exceed 20% of the total) and lift end-user prices by 5-15%, pressuring demand especially in mid-range and budget consumer segments as supply shifts toward higher-margin AI products (a worked example of this pass-through appears after this subsection's supply-chain discussion). According to TrendForce, memory prices are projected to rise sharply again in the first quarter of 2026, affecting consumer markets such as smartphones and notebooks. For major buyers like Apple, the expiration of long-term DRAM supply contracts with Samsung and SK Hynix in early 2026 brings exposure to elevated spot-market rates amid ongoing shortages, potentially raising prices for devices including iPhones and MacBooks. Industry executive Raja Koduri has observed that if the DRAM-to-SRAM price-per-byte ratio falls below 5x because of such irrational price increases, AI system designs may shift toward SRAM, despite the three to four years needed to redesign memory subsystems.

Supply chains for DRAM are heavily concentrated in East Asia: South Korea hosts the majority of advanced fabs for Samsung and SK Hynix, while Micron maintains facilities in the US (Idaho and Virginia), Singapore, and Japan. This concentration creates vulnerabilities to geopolitical tensions, including US-China trade restrictions and risks in the Taiwan Strait, though DRAM production is less Taiwan-dependent than logic chips, relying more on Korean capacity for leading-edge nodes. Key dependencies include EUV lithography tools from ASML (Netherlands), high-purity chemicals, and silicon wafers from Japan and the US; disruptions such as natural disasters or export controls amplify cyclical shortages, as evidenced by 2025 price hikes of up to 30% amid AI-induced demand imbalances. Efforts to diversify, including US subsidies under the CHIPS Act for domestic fabs, aim to mitigate these risks but face delays in scaling advanced DRAM production outside Asia.
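As the worked example promised above: if memory is about 20% of a PC's bill of materials (the article's figure) and DRAM contract prices rise about 50% (the projected Q1 2026 increase), simple pass-through arithmetic lands inside the quoted 5-15% end-price band. The BOM total and retail markup below are assumed illustrative values:

```python
# Pass-through from DRAM contract prices to PC retail prices.
# memory_share and dram_price_rise come from the article; the BOM total
# and retail markup are illustrative assumptions.

bom_usd = 600.0          # assumed bill of materials for a mid-range PC
memory_share = 0.20      # memory as a share of BOM (from the text)
dram_price_rise = 0.50   # projected Q1 2026 contract price increase

extra_cost = bom_usd * memory_share * dram_price_rise   # +$60 of BOM
new_bom = bom_usd + extra_cost

markup = 1.4             # assumed retail price = 1.4 x BOM, held constant
old_price = bom_usd * markup
new_price = new_bom * markup
print(f"BOM +${extra_cost:.0f}; retail ${old_price:.0f} -> ${new_price:.0f} "
      f"(+{new_price / old_price - 1:.1%})")  # +10%, inside the 5-15% band
```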

AI-Driven Advancements and Emerging Technologies

The surge in artificial intelligence (AI) workloads, particularly large-scale model training and inference, has accelerated DRAM innovation by demanding higher bandwidth and capacity to mitigate data bottlenecks; the same demand is a major driver of the 2026 price surge described above. High-bandwidth memory (HBM), which stacks multiple DRAM dies vertically with through-silicon vias for parallel data access, has seen explosive growth, with revenues projected to double from $17 billion in 2024 to $34 billion in 2025, driven primarily by AI accelerators such as GPUs. HBM3E variants offer up to 1.2 TB/s of bandwidth per stack (an arithmetic consequence of the wide HBM interface; see the sketch at the end of this section), enabling efficient handling of terabyte-scale datasets in AI systems and outperforming traditional GDDR in latency-sensitive tasks. Graphics double data rate (GDDR) memory has likewise evolved with AI demands: GDDR7, introduced in 2024 with signaling rates of 32 Gbps and above per pin, roughly doubles the bandwidth of GDDR6X, and its PAM3 signaling carries more data per clock cycle while reducing energy per bit transferred. This directly addresses the needs of generative AI, where data movement can account for up to 70% of energy costs in conventional von Neumann systems. Processing-in-memory (PIM) represents a more radical shift, embedding compute logic directly within DRAM arrays to execute operations such as matrix multiplications near the data, slashing latency and power for AI algorithms. Samsung's HBM-PIM, first announced in 2021 and refined in subsequent prototypes, integrates programmable computing units into HBM stacks, achieving up to 2.4 TFLOPS per stack for sparse AI workloads while retaining DRAM's density advantage over discrete processors. Research prototypes such as PIM-DRAM demonstrate intra-bank accumulation for vector operations, accelerating inference by factors of 10-20x in bandwidth-bound scenarios. AI is also applied upstream in DRAM development, optimizing layout synthesis and defect detection; Samsung reported in 2025 that machine learning models improved DRAM yield by 15% in fabrication. Emerging architectures such as NEO Semiconductor's X-HBM, unveiled in August 2025, propose 32K-bit-wide interfaces in 3D-stacked DRAM for AI chips, sidestepping the pin-count and scaling limits of conventional 2D interfaces. Similarly, imec's 3D charge-coupled-device buffers with IGZO channels offer high-density data retention for AI systems, with prototypes showing 10x the density of conventional SRAM caches. These technologies collectively attack the "memory wall," where memory bandwidth fails to keep pace with gains in compute throughput.
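The per-stack HBM3E figure above follows directly from interface geometry: HBM gets its bandwidth from a very wide bus at moderate per-pin rates, whereas GDDR7 uses a narrow bus at much higher per-pin rates. The pin counts and data rates below are nominal published values; treat the comparison as an illustrative sanity check:

```python
# Bandwidth sanity check: wide-and-slow (HBM3E) vs narrow-and-fast (GDDR7).
# Nominal interface parameters; actual products vary by speed grade.

HBM3E_BUS_WIDTH_BITS = 1024   # bits per stack, split across many channels
HBM3E_PIN_RATE_GBPS = 9.6     # data rate per pin (Gbit/s)

hbm_gbs = HBM3E_BUS_WIDTH_BITS * HBM3E_PIN_RATE_GBPS / 8   # gigabytes/second
print(f"one HBM3E stack: {hbm_gbs / 1000:.2f} TB/s")       # ~1.23 TB/s

GDDR7_BUS_WIDTH_BITS = 32     # per device
GDDR7_PIN_RATE_GBPS = 32.0
gddr_gbs = GDDR7_BUS_WIDTH_BITS * GDDR7_PIN_RATE_GBPS / 8
print(f"one GDDR7 device: {gddr_gbs:.0f} GB/s")            # 128 GB/s
```

An AI accelerator therefore pairs a handful of HBM stacks for multi-TB/s aggregate bandwidth, while consumer graphics cards reach comparable figures only by ganging many GDDR devices across a wide board-level bus.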
