Flash memory

from Wikipedia

A disassembled USB flash drive in 2005. The chip on the left is flash memory. The controller is on the right.

Flash memory is an electronic non-volatile computer memory storage medium that can be electrically erased and reprogrammed. The two main types of flash memory, NOR flash and NAND flash, are named for the NOR and NAND logic gates. Both use the same cell design, consisting of floating-gate MOSFETs. They differ at the circuit level, depending on whether the state of the bit line or word lines is pulled high or low; in NAND flash, the relationship between the bit line and the word lines resembles a NAND gate; in NOR flash, it resembles a NOR gate.

Flash memory, a type of floating-gate memory, was invented by Fujio Masuoka at Toshiba in 1980 and is based on EEPROM technology. Toshiba began marketing flash memory in 1987.[1] EPROMs had to be erased completely before they could be rewritten. NAND flash memory, however, may be erased, written, and read in blocks (or pages), which generally are much smaller than the entire device. NOR flash memory allows a single machine word to be written – to an erased location – or read independently. A flash memory device typically consists of one or more flash memory chips (each holding many flash memory cells), along with a separate flash memory controller chip.

The NAND type is found mainly in memory cards, USB flash drives, solid-state drives (those produced since 2009), feature phones, smartphones, and similar products, for general storage and transfer of data. NAND or NOR flash memory is also often used to store configuration data in digital products, a task previously made possible by EEPROM or battery-powered static RAM. A key disadvantage of flash memory is that it can endure only a relatively small number of write cycles in a specific block.[2]

NOR flash is known for its direct random access capabilities, making it apt for executing code directly. Its architecture allows for individual byte access, facilitating faster read speeds compared to NAND flash. NAND flash memory operates with a different architecture, relying on a serial access approach. This makes NAND suitable for high-density data storage, but less efficient for random access tasks. NAND flash is often employed in scenarios where cost-effective, high-capacity storage is crucial, such as in USB drives, memory cards, and solid-state drives (SSDs).

The primary differentiator lies in their use cases and internal structures. NOR flash is optimal for applications requiring quick access to individual bytes, as in embedded systems for program execution. NAND flash, on the other hand, shines in scenarios demanding cost-effective, high-capacity storage with sequential data access.

Flash memory[3] is used in computers, PDAs, digital audio players, digital cameras, mobile phones, synthesizers, video games, scientific instrumentation, industrial robotics, and medical electronics. Flash memory has a fast read access time but is not as fast as static RAM or ROM. In portable devices, it is preferred to use flash memory because of its mechanical shock resistance, since mechanical drives are more prone to mechanical damage.[4]

Because erase cycles are slow, the large block sizes used in flash memory erasing give it a significant speed advantage over non-flash EEPROM when writing large amounts of data. As of 2019, flash memory costs much less than byte-programmable EEPROM and has become the dominant memory type wherever a system requires a significant amount of non-volatile solid-state storage. EEPROMs, however, are still used in applications that require only small amounts of storage, e.g. in SPD implementations on computer-memory modules.[5][6]

Flash memory packages can use die stacking with through-silicon vias and several dozen layers of 3D TLC NAND cells (per die) simultaneously to achieve capacities of up to 1 tebibyte per package using 16 stacked dies and an integrated flash controller as a separate die inside the package.[7][8][9][10]

History

Background

The origins of flash memory can be traced to the development of the floating-gate MOSFET (FGMOS), also known as the floating-gate transistor.[11][12] The original MOSFET was invented at Bell Labs between 1959 and 1960.[13][14] Dawon Kahng went on to develop a variation, the floating-gate MOSFET, with Taiwanese-American engineer Simon Min Sze at Bell Labs in 1967.[15] They proposed that it could be used as floating-gate memory cells for storing a form of programmable read-only memory (PROM) that is both non-volatile and re-programmable.[15]

Early types of floating-gate memory included EPROM (erasable PROM) and EEPROM (electrically erasable PROM) in the 1970s.[15] However, early floating-gate memory required engineers to build a memory cell for each bit of data, which proved to be cumbersome,[16] slow,[17] and expensive, restricting floating-gate memory to niche applications in the 1970s, such as military equipment and the earliest experimental mobile phones.[11]

Modern EEPROM based on Fowler-Nordheim tunnelling to erase data was invented by Bernward and patented by Siemens in 1974.[18] It was further developed between 1976 and 1978 by Eliyahou Harari at Hughes Aircraft Company, as well as by George Perlegos and others at Intel.[19][20]

Invention and commercialization

Fujio Masuoka invented flash memory at Toshiba in 1980.[16][21][22] The improvement of flash over EEPROM is that flash is programmed in blocks while EEPROM is programmed in bytes. According to Toshiba, the name "flash" was suggested by Masuoka's colleague, Shōji Ariizumi, because the erasure process of the memory contents reminded him of the flash of a camera.[23] Masuoka and colleagues presented the invention of NOR flash in 1984,[24][25] and then NAND flash at the IEEE 1987 International Electron Devices Meeting (IEDM) held in San Francisco.[26]

Toshiba commercially launched NAND flash memory in 1987.[1][15] Intel Corporation introduced the first commercial NOR type flash chip in 1988.[27] NOR-based flash has long erase and write times, but provides full address and data buses, allowing random access to any memory location. This makes it a suitable replacement for older read-only memory (ROM) chips, which are used to store program code that rarely needs to be updated, such as a computer's BIOS or the firmware of set-top boxes. Its endurance may be from as little as 100 erase cycles for an on-chip flash memory,[28] to a more typical 10,000 or 100,000 erase cycles, up to 1,000,000 erase cycles.[29] NOR-based flash was the basis of early flash-based removable media; CompactFlash was originally based on it, although later cards moved to less expensive NAND flash.

NAND flash has reduced erase and write times, and requires less chip area per cell, thus allowing greater storage density and lower cost per bit than NOR flash. However, the I/O interface of NAND flash does not provide a random-access external address bus. Rather, data must be read on a block-wise basis, with typical block sizes of hundreds to thousands of bits. This makes NAND flash unsuitable as a drop-in replacement for program ROM, since most microprocessors and microcontrollers require byte-level random access. In this regard, NAND flash is similar to other secondary data storage devices, such as hard disks and optical media, and is thus highly suitable for use in mass-storage devices, such as memory cards and solid-state drives (SSD). For example, SSDs store data using multiple NAND flash memory chips.

The first NAND-based removable memory card format was SmartMedia, released in 1995. Many others followed, including MultiMediaCard, Secure Digital, Memory Stick, and xD-Picture Card.

Later developments

A new generation of memory card formats, including RS-MMC, miniSD and microSD, features extremely small form factors. For example, the microSD card has an area of just over 1.5 cm², with a thickness of less than 1 mm.

NAND flash has achieved significant levels of memory density as a result of several major technologies that were commercialized during the late 2000s to early 2010s.[30]

NOR flash was the most common type of flash memory sold until 2005, when NAND flash overtook NOR flash in sales.[31]

Multi-level cell (MLC) technology stores more than one bit in each memory cell. NEC demonstrated MLC technology in 1998, with an 80 Mb flash memory chip storing 2 bits per cell.[32] STMicroelectronics also demonstrated MLC in 2000, with a 64 MB NOR flash memory chip.[33] In 2009, Toshiba and SanDisk introduced NAND flash chips with quad-level cell (QLC) technology storing 4 bits per cell and holding a capacity of 64 Gb.[34][35] Samsung Electronics introduced triple-level cell (TLC) technology storing 3 bits per cell, and began mass-producing NAND chips with TLC technology in 2010.[36]

Charge trap flash

Charge trap flash (CTF) technology replaces the polysilicon floating gate, which is sandwiched between a blocking gate oxide above and a tunneling oxide below it, with an electrically insulating silicon nitride layer; the silicon nitride layer traps electrons. In theory, CTF is less prone to electron leakage, providing improved data retention.[37][38][39][40][41][42]

Because CTF replaces the polysilicon with an electrically insulating nitride, it allows for smaller cells and higher endurance (lower degradation or wear). However, electrons can become trapped and accumulate in the nitride, leading to degradation. Leakage is exacerbated at high temperatures since electrons become more excited with increasing temperatures. CTF technology, however, still uses a tunneling oxide and blocking layer, which are the weak points of the technology, since they can still be damaged in the usual ways (the tunnel oxide can be degraded due to extremely high electric fields and the blocking layer due to anode hot hole injection (AHHI)).[43][44]

Degradation or wear of the oxides is the reason why flash memory has limited endurance. Data retention goes down (the potential for data loss increases) with increasing degradation, since the oxides lose their electrically-insulating characteristics as they degrade. The oxides must insulate against electrons to prevent them from leaking, which would cause data loss.

In 1991, NEC researchers, including N. Kodama, K. Oyama and Hiroki Shirai, described a type of flash memory with a charge-trap method.[45] In 1998, Boaz Eitan of Saifun Semiconductors (later acquired by Spansion) patented a flash memory technology named NROM that took advantage of a charge trapping layer to replace the conventional floating gate used in conventional flash memory designs.[46] In 2000, an Advanced Micro Devices (AMD) research team led by Richard M. Fastow, Egyptian engineer Khaled Z. Ahmed and Jordanian engineer Sameer Haddad (who later joined Spansion) demonstrated a charge-trapping mechanism for NOR flash memory cells.[47] CTF was later commercialized by AMD and Fujitsu in 2002.[48] 3D V-NAND (vertical NAND) technology stacks NAND flash memory cells vertically within a chip using 3D charge trap flash (CTF) technology. 3D V-NAND technology was first announced by Toshiba in 2007,[49] and the first device, with 24 layers, was commercialized by Samsung Electronics in 2013.[50][51]

3D integrated circuit technology

3D integrated circuit (3D IC) technology stacks integrated circuit (IC) chips vertically into a single 3D IC package.[30] Toshiba introduced 3D IC technology to NAND flash memory in April 2007, when they debuted a 16 GB eMMC compliant (product number THGAM0G7D8DBAI6, often abbreviated THGAM on consumer websites) embedded NAND flash memory package, which was manufactured with eight stacked 2 GB NAND flash chips.[52] In September 2007, Hynix Semiconductor (now SK Hynix) introduced 24-layer 3D IC technology, with a 16 GB flash memory package that was manufactured with 24 stacked NAND flash chips using a wafer bonding process.[53] Toshiba also used an eight-layer 3D IC for their 32 GB THGBM flash package in 2008.[54] In 2010, Toshiba used a 16-layer 3D IC for their 128 GB THGBM2 flash package, which was manufactured with 16 stacked 8 GB chips.[55] In the 2010s, 3D ICs came into widespread commercial use for NAND flash memory in mobile devices.[30]

In 2016, Micron and Intel introduced a technology known as CMOS Under the Array/CMOS Under Array (CUA), Core over Periphery (COP), Periphery Under Cell (PUC), or Xtacking,[56] in which the control circuitry for the flash memory is placed under or above the flash memory cell array. This has allowed for an increase in the number of planes or sections a flash memory chip has, increasing from two planes to four, without increasing the area dedicated to the control or periphery circuitry. This increases the number of IO operations per flash chip or die, but it also introduces challenges when building capacitors for charge pumps used to write to the flash memory.[57][58][59] Some flash dies have as many as 6 planes.[60]

As of August 2017, microSD cards with a capacity up to 400 GB (400 billion bytes) were available.[61][62] Samsung combined 3D IC chip stacking with its 3D V-NAND and TLC technologies to manufacture its 512 GB KLUFG8R1EM flash memory package with eight stacked 64-layer V-NAND chips.[8] In 2019, Samsung produced a 1024 GB flash package with eight stacked 96-layer V-NAND chips and QLC technology.[63][64]

In 2025, researchers announced experimental success with a device that has a 400-picosecond write time.[65]

Principles of operation

A flash memory cell

Flash memory stores information in an array of memory cells made from floating-gate transistors. In single-level cell (SLC) devices, each cell stores only one bit of information. Multi-level cell (MLC) devices, including triple-level cell (TLC) devices, can store more than one bit per cell.

The floating gate may be conductive (typically polysilicon in most kinds of flash memory) or non-conductive (as in SONOS flash memory).[66]

Floating-gate MOSFET

In flash memory, each memory cell resembles a standard metal–oxide–semiconductor field-effect transistor (MOSFET) except that the transistor has two gates instead of one. The cells can be seen as an electrical switch in which current flows between two terminals (source and drain) and is controlled by a floating gate (FG) and a control gate (CG). The CG is similar to the gate in other MOS transistors, but below this is the FG, which is insulated all around by an oxide layer. The FG is interposed between the CG and the MOSFET channel. Because the FG is electrically isolated by its insulating layer, electrons placed on it are trapped. When the FG is charged with electrons, this charge screens the electric field from the CG, thus increasing the threshold voltage (VT) of the cell. This means that the VT of the cell can be changed between the uncharged FG threshold voltage (VT1) and the higher charged FG threshold voltage (VT2) by changing the FG charge. In order to read a value from the cell, an intermediate voltage (VI) between VT1 and VT2 is applied to the CG. If the channel conducts at VI, the FG must be uncharged (if it were charged, there would not be conduction because VI is less than VT2). If the channel does not conduct at the VI, it indicates that the FG is charged. The binary value of the cell is sensed by determining whether there is current flowing through the transistor when VI is asserted on the CG. In a multi-level cell device, which stores more than one bit per cell, the amount of current flow is sensed (rather than simply its presence or absence), in order to determine more precisely the level of charge on the FG.
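
The read rule described above can be sketched as a small Python model. This is only an illustration: the voltage values and the names `FloatingGateCell` and `read_bit` are assumptions made for the example, not figures from any datasheet, and the mapping of an uncharged cell to 1 follows the single-level convention used later in this article.

```python
from dataclasses import dataclass

# Illustrative voltages in volts (assumed values, not from a datasheet).
VT1 = 1.0  # threshold voltage with an uncharged floating gate
VT2 = 4.0  # threshold voltage with a charged floating gate
VI = 2.5   # intermediate read voltage, chosen so that VT1 < VI < VT2


@dataclass
class FloatingGateCell:
    charged: bool = False  # True when electrons are trapped on the FG

    def threshold(self) -> float:
        # Charge on the FG screens the CG field and raises the threshold.
        return VT2 if self.charged else VT1

    def conducts(self, control_gate_voltage: float) -> bool:
        # The channel conducts only when the CG voltage exceeds the threshold.
        return control_gate_voltage > self.threshold()


def read_bit(cell: FloatingGateCell) -> int:
    # Conduction at VI means the FG is uncharged, which is read as 1.
    return 1 if cell.conducts(VI) else 0
```

A multi-level cell would need several read voltages, or a measurement of the current itself, rather than the single comparison shown here.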

Floating-gate MOSFETs are so named because an electrically insulating tunnel oxide layer separates the floating gate from the silicon, so the gate "floats" above the silicon. The oxide keeps the electrons confined to the floating gate. Degradation or wear (and hence the limited endurance of floating-gate flash memory) occurs because the oxide experiences an extremely high electric field (10 million volts per centimeter). Over time, such high fields can break atomic bonds in the relatively thin oxide, gradually degrading its insulating properties. Electrons can then become trapped in the oxide or leak from the floating gate through it. Because the quantity of electrons on the floating gate represents the stored charge level (in MLC flash, each level is assigned to a different combination of bits), this leakage increases the likelihood of data loss. This is why data retention goes down and the risk of data loss increases with increasing degradation.[67][68][41][69][70]

The silicon oxide in a cell degrades with every erase operation. Electrons trapped in the oxide increase the negative charge in the cell over time and negate some of the control gate voltage. This also makes erasing the cell progressively slower; to maintain the performance and reliability of the NAND chip, the cell must eventually be retired from use.

Endurance also decreases with the number of bits in a cell. With more bits per cell, the number of possible states (each represented by a different voltage level) increases, and the cell becomes more sensitive to the voltages used for programming. Programming voltages may be adjusted to compensate for degradation of the silicon oxide, but because more states leave less space between the voltage levels that define each state, the cell is less tolerant of such adjustments.[71]
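
The shrinking space between voltage levels can be illustrated with simple arithmetic. The sketch below assumes a hypothetical threshold window that is divided evenly between states; real devices use device-specific, generally non-uniform levels, so the numbers are purely illustrative.

```python
def voltage_levels(bits_per_cell: int) -> int:
    # n bits per cell require 2**n distinguishable charge states.
    return 2 ** bits_per_cell


def level_spacing(window_volts: float, bits_per_cell: int) -> float:
    # Evenly spaced states (an assumption) across a fixed threshold window.
    return window_volts / (voltage_levels(bits_per_cell) - 1)


if __name__ == "__main__":
    WINDOW = 6.0  # hypothetical usable threshold window in volts
    for bits in (1, 2, 3, 4):
        spacing = level_spacing(WINDOW, bits)
        print(f"{bits} bit(s): {voltage_levels(bits)} states, {spacing:.2f} V apart")
```

Under these assumptions each added bit roughly halves the spacing, which is why a cell storing more bits tolerates less drift in its threshold voltage.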

Fowler–Nordheim tunneling

The process of moving electrons through the tunnel oxide and into the floating gate, under a high voltage applied to the control gate, is called Fowler–Nordheim tunneling, and it fundamentally changes the characteristics of the cell by increasing the MOSFET's threshold voltage. This, in turn, changes the drain-source current that flows through the transistor for a given gate voltage, which is ultimately used to encode a binary value. The Fowler–Nordheim tunneling effect is reversible, so electrons can be added to or removed from the floating gate, processes traditionally known as writing and erasing.[72]

Internal charge pumps

Despite the need for relatively high programming and erasing voltages, virtually all flash chips today require only a single supply voltage and produce the high voltages that are required using on-chip charge pumps.

Over half the energy used by a 1.8 V NAND flash chip is lost in the charge pump itself. Since boost converters are inherently more efficient than charge pumps, researchers developing low-power SSDs have proposed returning to the dual Vcc/Vpp supply voltages used on all early flash chips, driving the high Vpp voltage for all flash chips in an SSD with a single shared external boost converter.[73][74][75][76][77][78][79][80]

In spacecraft and other high-radiation environments, the on-chip charge pump is the first part of the flash chip to fail, although flash memories will continue to work – in read-only mode – at much higher radiation levels.[81]

NOR flash

NOR flash memory wiring and structure on silicon

In both NOR and NAND flash memories, the cells are arranged in a grid. We can think of the memory as consisting of "words" of a certain number of bits (or cells), with each word being confined to a particular column of the grid, and the bits being in different rows. All the bits of a particular word are linked by a wordline, a conductor connecting to the control gates of all the bits of that word. All the first bits of a certain number of adjacent words (columns) are linked by a bitline, as are all the second bits and so on. The bitlines connect to one of the terminals (source or drain) of the cells. By manipulating the voltages on the wordlines one can read a certain bit by measuring the voltage on the corresponding bitline. The way to do this depends on whether the memory chip is a NOR or a NAND flash.

In NOR flash, each cell has one end connected directly to ground, and the other end connected directly to a bit line. This arrangement is called "NOR flash" because it acts like a NOR gate  – if any of the word lines (connected to the CG of the cells) is brought high, the corresponding storage transistor may act to pull the output bit line low, but this depends on the charge in the floating gate. Since several words are connected by the bit line, the output does not depend on only two (the bitline staying high if neither the first NOR the second wordline is high) but on all (the bitline remaining high if NONE of the wordlines is high). So to read a bit of a certain word, all the wordlines except that of the desired word are put low.
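
The NOR-style read described above can be modelled in a few lines of Python. This is a logical sketch only: the function names are hypothetical, each cell is reduced to a boolean "charged" flag, and analogue effects are ignored.

```python
from typing import Sequence


def nor_bitline_high(wordlines: Sequence[bool], charged: Sequence[bool]) -> bool:
    """Return True if the shared bit line stays high."""
    # A cell pulls the bit line low only if its word line is high
    # and its floating gate is uncharged (so the channel conducts).
    pulled_low = any(wl and not ch for wl, ch in zip(wordlines, charged))
    return not pulled_low


def nor_read(charged: Sequence[bool], index: int) -> int:
    # Raise only the selected word line; all others are held low.
    wordlines = [i == index for i in range(len(charged))]
    # Bit line pulled low means the cell conducts, i.e. it is erased (1).
    return 0 if nor_bitline_high(wordlines, charged) else 1
```

With no word line raised the bit line stays high regardless of the stored data, which is the NOR-gate behaviour that gives the arrangement its name.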

NOR flash continues to be the technology of choice for embedded applications requiring a discrete non-volatile memory device.[citation needed] The low read latencies characteristic of NOR devices allow for both direct code execution and data storage in a single memory product.[82]

Programming

Programming a NOR memory cell (setting it to logical 0), via hot-electron injection
Erasing a NOR memory cell (setting it to logical 1), via quantum tunneling

A single-level NOR flash cell in its default state is logically equivalent to a binary "1" value, because current will flow through the channel under application of an appropriate voltage to the control gate, so that the bitline voltage is pulled down. A NOR flash cell can be programmed, or set to a binary "0" value, by the following procedure:

  • an elevated on-voltage (typically >5 V) is applied to the CG
  • the channel is now turned on, so electrons can flow from the source to the drain (assuming an NMOS transistor)
  • the source-drain current is sufficiently high to cause some high energy electrons to jump through the insulating layer onto the FG, via a process called hot-electron injection.

Erasing

To erase a NOR flash cell (resetting it to the "1" state), a large voltage of the opposite polarity is applied between the CG and source terminal, pulling the electrons off the FG through Fowler–Nordheim tunneling (FN tunneling).[83] This is known as negative gate source erase. Newer NOR memories can erase using negative gate channel erase, which biases the wordline on a NOR memory cell block and the P-well of the memory cell block to allow FN tunneling to be carried out, erasing the cell block. Older memories used source erase, in which a high voltage was applied to the source and then electrons from the FG were moved to the source.[84][85] Modern NOR flash memory chips are divided into erase segments (often called blocks or sectors). The erase operation can be performed only on a block-wise basis; all the cells in an erase segment must be erased together.[86] Programming of NOR cells, however, generally can be performed one byte or word at a time.

NAND flash memory wiring and structure on silicon

NAND flash

NAND flash also uses a grid of floating-gate transistors (see above), but they are connected in a way that resembles a NAND gate: the transistors corresponding to a given bit of several words are connected in series, and the bitline is pulled low if all the word lines are pulled high (above the transistors' VT). To read the bit of a particular word, its wordline is put low and all the other wordlines are put high, and then the bitline will reflect the state of the floating gate of the desired cell. These groups are then connected via some additional transistors to a NOR-style bit line array in the same way that single transistors are linked in NOR flash.

Compared to NOR flash, replacing single transistors with serial-linked groups adds an extra level of addressing. Whereas NOR flash might address memory by page then word, NAND flash might address it by page, word and bit. Bit-level addressing suits bit-serial applications (such as hard disk emulation), which access only one bit at a time. Execute-in-place applications, on the other hand, require every bit in a word to be accessed simultaneously. This requires word-level addressing. In any case, both bit and word addressing modes are possible with either NOR or NAND flash.

To read data, first the desired group is selected (in the same way that a single transistor is selected from a NOR array). Next, most of the word lines are pulled up above VT2, while one of them is pulled up to VI. The series group will conduct (and pull the bit line low) if the selected bit has not been programmed.
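
A minimal Python sketch of this series read follows. The voltages and the pass voltage name `VPASS` are illustrative assumptions; the only requirement taken from the text is that unselected word lines sit above VT2 while the selected one sits at VI.

```python
from typing import Sequence

# Illustrative voltages in volts (assumed values).
VT1, VT2 = 1.0, 4.0  # thresholds for uncharged / charged floating gates
VI = 2.5             # read voltage on the selected word line
VPASS = 5.0          # voltage above VT2 on all other word lines


def string_conducts(charged: Sequence[bool], selected: int) -> bool:
    for i, is_charged in enumerate(charged):
        gate = VI if i == selected else VPASS
        threshold = VT2 if is_charged else VT1
        if gate <= threshold:
            # One non-conducting cell breaks the whole series path.
            return False
    return True


def nand_read(charged: Sequence[bool], selected: int) -> int:
    # A conducting string pulls the bit line low: the selected cell
    # has not been programmed, which is read as 1.
    return 1 if string_conducts(charged, selected) else 0
```

Because every unselected cell is driven above VT2, only the state of the selected cell decides whether the string conducts.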

Despite the additional transistors, the reduction in ground wires and bit lines allows a denser layout and greater storage capacity per chip. (The ground wires and bit lines are actually much wider than the lines in the diagrams.) In addition, NAND flash is typically permitted to contain a certain number of faults (NOR flash, as is used for a BIOS ROM, is expected to be fault-free). Manufacturers try to maximize the amount of usable storage by shrinking the size of the transistors or cells; however, the industry can avoid this and achieve higher storage densities per die by using 3D NAND, which stacks cells on top of each other.

NAND flash cells are read by analysing their response to various voltages.[69]

Writing and erasing

NAND flash uses tunnel injection for writing and tunnel release for erasing. NAND flash memory forms the core of the removable USB storage devices known as USB flash drives, as well as most memory card formats and solid-state drives available today.

The hierarchical structure of NAND flash starts at a cell level which establishes strings, then pages, blocks, planes and ultimately a die. A string is a series of connected NAND cells in which the source of one cell is connected to the drain of the next one. Depending on the NAND technology, a string typically consists of 32 to 128 NAND cells. Strings are organised into pages which are then organised into blocks in which each string is connected to a separate line called a bitline. All cells with the same position in the string are connected through the control gates by a wordline. A plane contains a certain number of blocks that are connected through the same bitline. A flash die consists of one or more planes, and the peripheral circuitry that is needed to perform all the read, write, and erase operations.
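
As a rough illustration of this hierarchy, the sketch below splits a flat page number into plane, block and page indices. The geometry values and the simple plane-major ordering are assumptions made for the example; real devices define their own address layout.

```python
from typing import NamedTuple


class Geometry(NamedTuple):
    planes_per_die: int
    blocks_per_plane: int
    pages_per_block: int


class PageAddress(NamedTuple):
    plane: int
    block: int
    page: int


def locate(flat_page: int, g: Geometry) -> PageAddress:
    pages_per_plane = g.blocks_per_plane * g.pages_per_block
    total_pages = g.planes_per_die * pages_per_plane
    if not 0 <= flat_page < total_pages:
        raise ValueError("page number outside the die")
    # Peel off the hierarchy levels from the outermost inwards.
    plane, rest = divmod(flat_page, pages_per_plane)
    block, page = divmod(rest, g.pages_per_block)
    return PageAddress(plane, block, page)
```

For a hypothetical die with 2 planes, 4 blocks per plane and 8 pages per block, `locate(37, Geometry(2, 4, 8))` lands in plane 1, block 0, page 5.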

The architecture of NAND flash means that data can be read and programmed (written) in pages, typically between 4 KiB and 16 KiB in size, but can only be erased at the level of entire blocks consisting of multiple pages. When a block is erased, all the cells are logically set to 1. Data can only be programmed in one pass to a page in a block that was erased. Programming sets one or more cells from 1 to 0. Any cells that have been set to 0 by programming can only be reset to 1 by erasing the entire block. This means that before new data can be programmed into a page that already contains data, the current contents of the page plus the new data must all be copied to a new, erased page. If a suitable erased page is available, the data can be written to it immediately. If no erased page is available, a block must be erased before copying the data to a page in that block. The old page is then marked as invalid and is available for erasing and reuse.[87] The data physically stored in the flash memory can differ from the operating system's logical block addressing (LBA) view: for example, if the operating system writes 1100 0011 to a flash storage device (such as an SSD), the data actually written to the flash memory may be 0011 1100.
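
The page-program and block-erase rules can be sketched as a toy model. Everything here is hypothetical and heavily simplified: the class names, the four-page block and the first-fit search are assumptions, and reclaiming invalid pages (garbage collection) is left out.

```python
PAGES_PER_BLOCK = 4  # illustrative; real blocks hold many more pages
ERASED, VALID, INVALID = "erased", "valid", "invalid"


class Block:
    def __init__(self):
        self.erase()

    def erase(self):
        # Erasure works only on the whole block and resets every page.
        self.data = [None] * PAGES_PER_BLOCK
        self.state = [ERASED] * PAGES_PER_BLOCK

    def program(self, page, payload):
        # A page can be programmed only once after an erase.
        if self.state[page] != ERASED:
            raise ValueError("page must be erased before programming")
        self.data[page] = payload
        self.state[page] = VALID


class TinyFlash:
    def __init__(self, num_blocks):
        self.blocks = [Block() for _ in range(num_blocks)]
        self.mapping = {}  # logical page -> (block index, page index)

    def _find_erased_page(self):
        for b, block in enumerate(self.blocks):
            for p, state in enumerate(block.state):
                if state == ERASED:
                    return b, p
        raise RuntimeError("no erased page; a block must be reclaimed first")

    def write(self, logical_page, payload):
        # Updates go to a fresh erased page; the old copy becomes invalid.
        b, p = self._find_erased_page()
        self.blocks[b].program(p, payload)
        old = self.mapping.get(logical_page)
        if old is not None:
            old_block, old_page = old
            self.blocks[old_block].state[old_page] = INVALID
        self.mapping[logical_page] = (b, p)

    def read(self, logical_page):
        b, p = self.mapping[logical_page]
        return self.blocks[b].data[p]
```

Writing the same logical page twice leaves the first physical page marked invalid and the new data in the next erased page, which mirrors the copy-then-invalidate sequence described above.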

Vertical NAND

3D NAND continues scaling beyond 2D.

Vertical NAND (V-NAND) or 3D NAND memory stacks memory cells vertically and uses a charge trap flash architecture. The vertical layers allow larger areal bit densities without requiring smaller individual cells.[88] It is also sold under the trademark BiCS Flash, which is a trademark of Kioxia Corporation (formerly Toshiba Memory Corporation). 3D NAND was first announced by Toshiba in 2007.[49] V-NAND was first commercially manufactured by Samsung Electronics in 2013.[50][51][89][90]

Structure

V-NAND uses a charge trap flash geometry (which was commercially introduced in 2002 by AMD and Fujitsu)[48] that stores charge on an embedded silicon nitride film. Such a film is more robust against point defects and can be made thicker to hold larger numbers of electrons. V-NAND wraps a planar charge trap cell into a cylindrical form.[88] As of 2020, 3D NAND flash memories by Micron and Intel instead use floating gates; however, Micron's 3D NAND memories with 128 layers and above use a conventional charge trap structure, due to the dissolution of the partnership between Micron and Intel. Charge trap 3D NAND flash is thinner than floating gate 3D NAND. In floating gate 3D NAND, the memory cells are completely separated from one another, whereas in charge trap 3D NAND, vertical groups of memory cells share the same silicon nitride material.[91]

An individual memory cell is made up of one planar polysilicon layer containing a hole filled by multiple concentric vertical cylinders. The hole's polysilicon surface acts as the gate electrode. The outermost silicon dioxide cylinder acts as the gate dielectric, enclosing a silicon nitride cylinder that stores charge, in turn enclosing a silicon dioxide cylinder as the tunnel dielectric that surrounds a central rod of conducting polysilicon which acts as the conducting channel.[88]

Memory cells in different vertical layers do not interfere with each other, as the charges cannot move vertically through the silicon nitride storage medium, and the electric fields associated with the gates are closely confined within each layer. The vertical collection is electrically identical to the serial-linked groups in which conventional NAND flash memory is configured.[88] There is also string stacking, which builds several 3D NAND memory arrays or "plugs"[92] separately and then stacks them together to create a product with a higher number of 3D NAND layers on a single die. Often, two or three arrays are stacked. The misalignment between plugs is on the order of 10 to 30 nm.[57][93][94]

Construction

Growth of a group of V-NAND cells begins with an alternating stack of conducting (doped) polysilicon layers and insulating silicon dioxide layers.[88]

The next step is to form a cylindrical hole through these layers. In practice, a 128 Gbit V-NAND chip with 24 layers of memory cells requires about 2.9 billion such holes. Next, the hole's inner surface receives multiple coatings, first silicon dioxide, then silicon nitride, then a second layer of silicon dioxide. Finally, the hole is filled with conducting (doped) polysilicon.[88]
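
The hole count quoted above can be checked with a short calculation. The sketch assumes that each hole yields one cell per layer, that each cell stores 2 bits, and that "Gbit" means 2^30 bits; none of these assumptions is stated in the text, but together they give a figure consistent with it.

```python
capacity_bits = 128 * 2**30  # 128 Gbit, taking "G" as 2**30 (assumption)
layers = 24                  # layers of memory cells, from the text
bits_per_cell = 2            # assumption: two bits stored per cell

cells = capacity_bits / bits_per_cell
holes = cells / layers       # assumption: one cell per layer in each hole

print(f"{holes / 1e9:.2f} billion holes")  # prints "2.86 billion holes"
```

Under these assumptions the result rounds to about 2.9 billion holes.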

Performance

As of 2013, V-NAND flash architecture allows read and write operations twice as fast as conventional NAND and can last up to 10 times as long, while consuming 50 percent less power. V-NAND devices offer comparable physical bit density using 10-nm lithography but may be able to increase bit density by up to two orders of magnitude, given V-NAND's use of up to several hundred layers.[88] As of 2020, V-NAND chips with 160 layers were under development by Samsung.[95] As the number of layers increases, the capacity and endurance of flash memory may be increased.

Cost

Minimum bit cost of 3D NAND from non-vertical sidewall. The top opening widens with more layers, counteracting the increase in bit density.

The wafer cost of 3D NAND is comparable with that of scaled-down (32 nm or less) planar NAND flash.[96] With planar NAND scaling stopping at 16 nm, however, the reduction in cost per bit can continue with 3D NAND starting at 16 layers. Because the sidewall of the hole etched through the layers is not perfectly vertical, even a slight deviation leads to a minimum bit cost, i.e., a minimum equivalent design rule (or maximum density), for a given number of layers; the layer number at which this minimum bit cost occurs decreases for smaller hole diameters.[97]

Limitations

Block erasure

One limitation of flash memory is that it can be erased only a block at a time. This generally sets all bits in the block to 1. Starting with a freshly erased block, any location within that block can be programmed. However, once a bit has been set to 0, only by erasing the entire block can it be changed back to 1. In other words, flash memory (specifically NOR flash) offers random-access read and programming operations but does not offer arbitrary random-access rewrite or erase operations. A location can, however, be rewritten as long as the new value's 0 bits are a superset of the over-written values. For example, a nibble value may be erased to 1111, then written as 1110. Successive writes to that nibble can change it to 1010, then 0010, and finally 0000. Essentially, erasure sets all bits to 1, and programming can only clear bits to 0.[98] Some file systems designed for flash devices make use of this rewrite capability, for example YAFFS1, to represent sector metadata. Other flash file systems, such as YAFFS2, never make use of this "rewrite" capability – they do a lot of extra work to meet a "write once rule".
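The erase and program rules described above can be modeled as a bitwise AND. The sketch below is a toy model rather than any real device interface; the class name `FlashBlock` and its methods are hypothetical.

```python
class FlashBlock:
    """Toy model of one erase block: erasing sets every bit to 1,
    programming can only clear bits to 0."""

    def __init__(self, size: int, width: int = 4):
        self.mask = (1 << width) - 1      # e.g. 0b1111 for a nibble
        self.cells = [self.mask] * size   # a fresh block is fully erased

    def erase(self) -> None:
        """Erase the whole block; there is no per-location erase."""
        self.cells = [self.mask] * len(self.cells)

    def program(self, addr: int, value: int) -> int:
        """Program a location. Physically this is an AND: 0 bits stay 0."""
        self.cells[addr] &= value & self.mask
        return self.cells[addr]

    def can_rewrite(self, addr: int, value: int) -> bool:
        """True if `value` can be stored without a 0 -> 1 transition."""
        value &= self.mask
        return (self.cells[addr] & value) == value


if __name__ == "__main__":
    block = FlashBlock(size=1)
    for value in (0b1110, 0b1010, 0b0010, 0b0000):
        print(f"{block.program(0, value):04b}")
```

Each value in the usage loop only clears additional bits, so every write succeeds without an erase; attempting to store 1111 after 1010 would leave the location at 1010 until the whole block is erased.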

Although data structures in flash memory cannot be updated in completely general ways, this rewrite capability allows members to be "removed" by marking them as invalid. This technique may need to be modified for multi-level cell devices, where one memory cell holds more than one bit.

Common flash devices such as USB flash drives and memory cards provide only a block-level interface, or flash translation layer (FTL), which writes to a different cell each time to wear-level the device. This prevents incremental writing within a block; however, it does help keep the device from being prematurely worn out by intensive write patterns.

Data retention

45nm NOR flash memory example of data retention varying with temperatures

Data stored on flash cells is steadily lost due to electron detrapping. The rate of loss increases exponentially as the absolute temperature increases. For example, for a 45 nm NOR flash at 1000 hours, the threshold voltage (Vt) loss at 25 °C is about half that at 90 °C.[99]

Memory wear

Another limitation is that flash memory has a finite number of program–erase cycles (typically written as P/E cycles).[100][101] Micron Technology and Sun Microsystems announced an SLC NAND flash memory chip rated for 1,000,000 P/E cycles on 17 December 2008.[102]

The guaranteed cycle count may apply only to block zero (as is the case with TSOP NAND devices), or to all blocks (as in NOR). This effect is mitigated in some chip firmware or file system drivers by counting the writes and dynamically remapping blocks in order to spread write operations between sectors; this technique is called wear leveling. Another approach is to perform write verification and remapping to spare sectors in case of write failure, a technique called bad block management (BBM). For portable consumer devices, these wear-out management techniques typically extend the life of the flash memory beyond the life of the device itself, and some data loss may be acceptable in these applications. For high-reliability data storage, however, it is not advisable to use flash memory that would have to go through a large number of programming cycles. This limitation also exists for "read-only" applications such as thin clients and routers, which are programmed only once or at most a few times during their lifetimes, due to read disturb (see below).
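Wear leveling by counting erases can be illustrated with a small allocator that always hands out the least-worn free block. This is a minimal sketch of the idea under simplifying assumptions (no static data, no power-loss handling); the class `WearLeveler` is hypothetical and does not describe any particular controller.

```python
class WearLeveler:
    """Hands out the free block with the fewest erases so far."""

    def __init__(self, num_blocks: int):
        self.erase_counts = [0] * num_blocks
        self.free = set(range(num_blocks))

    def allocate(self) -> int:
        """Pick the least-erased free block (ties broken by block number)."""
        if not self.free:
            raise RuntimeError("no free blocks")
        block = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(block)
        return block

    def release(self, block: int) -> None:
        """Erase a block and return it to the free pool."""
        self.erase_counts[block] += 1
        self.free.add(block)


if __name__ == "__main__":
    leveler = WearLeveler(4)
    for _ in range(40):
        leveler.release(leveler.allocate())
    print(leveler.erase_counts)
```

Even though the workload in the usage loop rewrites "the same" logical data 40 times, the erases end up spread evenly over all four physical blocks.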

In December 2012, Taiwanese engineers from Macronix revealed their intention to announce at the 2012 IEEE International Electron Devices Meeting that they had figured out how to improve NAND flash storage read/write cycles from 10,000 to 100 million cycles using a "self-healing" process that used a flash chip with "onboard heaters that could anneal small groups of memory cells."[103] The built-in thermal annealing was to replace the usual erase cycle with a local high temperature process that not only erased the stored charge, but also repaired the electron-induced stress in the chip, giving write cycles of at least 100 million.[104] The result was to be a chip that could be erased and rewritten over and over, even when it should theoretically break down. As promising as Macronix's breakthrough might have been for the mobile industry, however, there were no plans for a commercial product featuring this capability to be released any time in the near future.[105]

Read disturb

The method used to read NAND flash memory can cause nearby cells in the same memory block to change over time (become programmed). This is known as read disturb. The threshold number of reads is generally in the hundreds of thousands of reads between intervening erase operations. If reading continually from one cell, that cell will not fail but rather one of the surrounding cells will on a subsequent read. To avoid the read disturb problem the flash controller will typically count the total number of reads to a block since the last erase. When the count exceeds a target limit, the affected block is copied over to a new block, erased, then released to the block pool. The original block is as good as new after the erase. If the flash controller does not intervene in time, however, a read disturb error will occur with possible data loss if the errors are too numerous to correct with an error-correcting code.[106][107][108]
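The read-counting policy described above can be sketched as follows. The class name, the limit value, and the interface are illustrative assumptions; real controllers choose their own thresholds and bookkeeping.

```python
class ReadDisturbGuard:
    """Counts reads per block since the last erase and flags blocks
    whose contents should be copied elsewhere before errors build up."""

    def __init__(self, read_limit: int):
        self.read_limit = read_limit
        self.reads = {}  # block number -> reads since last erase

    def on_read(self, block: int) -> bool:
        """Record one read; return True once the count exceeds the limit."""
        self.reads[block] = self.reads.get(block, 0) + 1
        return self.reads[block] > self.read_limit

    def on_erase(self, block: int) -> None:
        """An erase makes the block as good as new, so reset its counter."""
        self.reads[block] = 0


if __name__ == "__main__":
    guard = ReadDisturbGuard(read_limit=100_000)  # assumed limit
    if guard.on_read(7):
        print("relocate block 7, then erase it")
```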

X-ray effects

Most flash ICs come in ball grid array (BGA) packages, and even the ones that do not are often mounted on a PCB next to other BGA packages. After PCB assembly, boards with BGA packages are often X-rayed to see if the balls are making proper connections to the proper pad, or if the BGA needs rework. These X-rays can erase programmed bits in a flash chip (convert programmed "0" bits into erased "1" bits). Erased bits ("1" bits) are not affected by X-rays.[109][110]

Some manufacturers are now making X-ray-proof SD[111] and USB[112] memory devices.

Low-level access

The low-level interface to flash memory chips differs from those of other memory types such as DRAM, ROM, and EEPROM, which support bit-alterability (both zero to one and one to zero) and random access via externally accessible address buses.

NOR memory has an external address bus for reading and programming. For NOR memory, reading and programming are random-access, and unlocking and erasing are block-wise. For NAND memory, reading and programming are page-wise, and unlocking and erasing are block-wise.

NOR memories

NOR flash by Intel

Reading from NOR flash is similar to reading from random-access memory, provided the address and data bus are mapped correctly. Because of this, most microprocessors can use NOR flash memory as execute in place (XIP) memory,[113] meaning that programs stored in NOR flash can be executed directly from the NOR flash without needing to be copied into RAM first. NOR flash may be programmed in a random-access manner similar to reading. Programming changes bits from a logical one to a zero. Bits that are already zero are left unchanged. Erasure must happen a block at a time, and resets all the bits in the erased block back to one. Typical block sizes are 64, 128, or 256 KiB.

Bad block management is a relatively new feature in NOR chips. In older NOR devices not supporting bad block management, the software or device driver controlling the memory chip must correct for blocks that wear out, or the device will cease to work reliably.

The specific commands used to lock, unlock, program, or erase NOR memories differ for each manufacturer. To avoid needing unique driver software for every device made, special Common Flash Memory Interface (CFI) commands allow the device to identify itself and its critical operating parameters.

Besides its use as random-access ROM, NOR flash can also be used as a storage device, by taking advantage of random-access programming. Some devices offer read-while-write functionality so that code continues to execute even while a program or erase operation is occurring in the background. For sequential data writes, NOR flash chips typically have slow write speeds, compared with NAND flash.

Typical NOR flash does not need an error correcting code.[114]

NAND memories

NAND flash architecture was introduced by Toshiba in 1989.[115] These memories are accessed much like block devices, such as hard disks. Each block consists of a number of pages. The pages are typically 512,[116] 2,048, or 4,096 bytes in size. Associated with each page are a few bytes (typically 1/32 of the data size) that can be used for storage of an error correcting code (ECC) checksum.

Typical block sizes include:

  • 32 pages of 512+16 bytes each for a block size (effective) of 16 KiB
  • 64 pages of 2,048+64 bytes each for a block size of 128 KiB[117]
  • 64 pages of 4,096+128 bytes each for a block size of 256 KiB[118]
  • 128 pages of 4,096+128 bytes each for a block size of 512 KiB
  • 2048 pages of 16,384+128 bytes each for a block size of 32768 KiB[119]
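The effective block sizes in the list follow directly from the page count and the data bytes per page; the spare bytes are not counted. A short sketch of the arithmetic for the first four geometries (the function name is illustrative):

```python
def block_sizes(pages: int, data_bytes: int, spare_bytes: int) -> tuple[int, int]:
    """Return (effective size in KiB, raw size in bytes including spare area)."""
    effective_kib = pages * data_bytes // 1024
    raw_bytes = pages * (data_bytes + spare_bytes)
    return effective_kib, raw_bytes


# (pages per block, data bytes per page, spare bytes per page)
GEOMETRIES = [(32, 512, 16), (64, 2048, 64), (64, 4096, 128), (128, 4096, 128)]

if __name__ == "__main__":
    for pages, data, spare in GEOMETRIES:
        kib, raw = block_sizes(pages, data, spare)
        print(f"{pages} x ({data}+{spare}) -> {kib} KiB effective, {raw} bytes raw")
```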

Modern NAND flash may have an erase block size between 1 MiB and 128 MiB.[120] While reading and programming are performed on a page basis, erasure can only be performed on a block basis.[121] Since changing a cell from 0 to 1 requires erasing an entire block instead of just modifying some pages, making changes to the data of a block may in reality be a read-erase-write process, where the new data is actually moved to another block. In addition, an NVM Express Zoned Namespaces SSD usually uses the flash block size as the zone size.

NAND devices also require bad block management by the device driver software or by the flash memory controller chip. Some SD cards, for example, include controller circuitry to perform bad block management and wear leveling. When a logical block is accessed by high-level software, it is mapped to a physical block by the device driver or controller. A number of blocks on the flash chip may be set aside for storing mapping tables to deal with bad blocks, or the system may simply check each block at power-up to create a bad block map in RAM. The overall memory capacity gradually shrinks as more blocks are marked as bad.
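The logical-to-physical mapping with spare blocks can be sketched as a small table. This is an illustrative model only; the class `BadBlockTable`, its methods, and the fixed spare pool are assumptions, not a description of any specific driver or controller.

```python
class BadBlockTable:
    """Maps logical blocks to good physical blocks, with a spare pool."""

    def __init__(self, total_blocks: int, factory_bad: set[int], spares: int):
        good = [b for b in range(total_blocks) if b not in factory_bad]
        if spares >= len(good):
            raise ValueError("not enough good blocks")
        usable = len(good) - spares
        self.logical_to_physical = good[:usable]
        self.spare_pool = good[usable:]
        self.bad = set(factory_bad)

    def physical(self, logical: int) -> int:
        """Translate a logical block number to its physical block."""
        return self.logical_to_physical[logical]

    def retire(self, logical: int) -> int:
        """Mark the block behind `logical` as bad and remap it to a spare."""
        if not self.spare_pool:
            raise RuntimeError("no spare blocks left")
        self.bad.add(self.logical_to_physical[logical])
        self.logical_to_physical[logical] = self.spare_pool.pop(0)
        return self.logical_to_physical[logical]


if __name__ == "__main__":
    table = BadBlockTable(total_blocks=8, factory_bad={2}, spares=2)
    print(table.physical(2))  # logical block 2 skips the bad physical block
    print(table.retire(2))    # after a failure it moves to a spare
```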

NAND relies on ECC to compensate for bits that may spontaneously fail during normal device operation. A typical ECC will correct a one-bit error in each 2048 bits (256 bytes) using 22 bits of ECC, or a one-bit error in each 4096 bits (512 bytes) using 24 bits of ECC.[122] If the ECC cannot correct the error during read, it may still detect the error. When doing erase or program operations, the device can detect blocks that fail to program or erase and mark them bad. The data is then written to a different, good block, and the bad block map is updated.
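The 22-bit and 24-bit figures are consistent with a simple Hamming-style scheme that stores two parity bits for every bit of the bit index (2 × 11 for 2048 bits, 2 × 12 for 4096 bits). The sketch below illustrates that idea; it is not the bit layout of any particular vendor's ECC, and the function names are hypothetical.

```python
def ecc_bits(nbytes: int) -> int:
    """Parity bits needed: two per bit of the bit index."""
    return 2 * ((nbytes * 8).bit_length() - 1)


def parity_ecc(data: bytes) -> tuple[int, int]:
    """Return two k-bit parity words for 2**k data bits."""
    nbits = len(data) * 8
    k = nbits.bit_length() - 1
    if nbits != 1 << k:
        raise ValueError("data length in bits must be a power of two")
    mask = nbits - 1
    ones = zeros = 0
    for i in range(nbits):
        if (data[i // 8] >> (i % 8)) & 1:
            ones ^= i            # parity over set bits whose index bit is 1
            zeros ^= ~i & mask   # parity over set bits whose index bit is 0
    return ones, zeros


def correct_single_bit(data: bytearray, stored: tuple[int, int]) -> str:
    """Compare stored and recomputed parity; fix a single flipped data bit."""
    mask = len(data) * 8 - 1
    ones, zeros = parity_ecc(bytes(data))
    d1, d0 = ones ^ stored[0], zeros ^ stored[1]
    if d1 == 0 and d0 == 0:
        return "ok"
    if d1 ^ d0 == mask:          # the two syndromes are complementary
        data[d1 // 8] ^= 1 << (d1 % 8)
        return "corrected"
    return "uncorrectable"       # detected, but not a single data-bit error


if __name__ == "__main__":
    page = bytearray(range(256))
    stored = parity_ecc(bytes(page))
    page[100] ^= 0x10            # simulate one flipped bit
    print(ecc_bits(256), correct_single_bit(page, stored))
```

A single flipped bit at index p changes the first parity word by p and the second by the complement of p, which is why complementary syndromes identify the position directly; two flipped bits produce equal syndromes and are reported as uncorrectable.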

Hamming codes are the most commonly used ECC for SLC NAND flash. Reed–Solomon codes and BCH codes (Bose–Chaudhuri–Hocquenghem codes) are commonly used ECC for MLC NAND flash. Some MLC NAND flash chips internally generate the appropriate BCH error correction codes.[114]

Most NAND devices are shipped from the factory with some bad blocks. These are typically marked according to a specified bad block marking strategy. By allowing some bad blocks, manufacturers achieve far higher yields than would be possible if all blocks had to be verified to be good. This significantly reduces NAND flash costs and only slightly decreases the storage capacity of the parts.

When executing software from NAND memories, virtual memory strategies are often used: memory contents must first be paged or copied into memory-mapped RAM and executed there (leading to the common combination of NAND + RAM). A memory management unit (MMU) in the system is helpful, but this can also be accomplished with overlays. For this reason, some systems will use a combination of NOR and NAND memories, where a smaller NOR memory is used as software ROM and a larger NAND memory is partitioned with a file system for use as a non-volatile data storage area.

NAND sacrifices the random-access and execute-in-place advantages of NOR. NAND is best suited to systems requiring high capacity data storage. It offers higher densities, larger capacities, and lower cost. It has faster erases, sequential writes, and sequential reads.

Standardization

A group called the Open NAND Flash Interface Working Group (ONFI) has developed a standardized low-level interface for NAND flash chips. This allows interoperability between conforming NAND devices from different vendors. The ONFI specification version 1.0[123] was released on 28 December 2006. It specifies:

  • A standard physical interface (pinout) for NAND flash in TSOP-48, WSOP-48, LGA-52, and BGA-63 packages
  • A standard command set for reading, writing, and erasing NAND flash chips
  • A mechanism for self-identification (comparable to the serial presence detection feature of SDRAM memory modules)

The ONFI group is supported by major NAND flash manufacturers, including Hynix, Intel, Micron Technology, and Numonyx, as well as by major manufacturers of devices incorporating NAND flash chips.[124]

Two major flash device manufacturers, Toshiba and Samsung, have chosen to use a NAND flash interface of their own design known as Toggle Mode (and now Toggle). This interface is not pin-to-pin compatible with the ONFI specification. The result is that a product designed for one vendor's devices may not be able to use another vendor's devices.[125]

A group of vendors, including Intel, Dell, and Microsoft, formed a Non-Volatile Memory Host Controller Interface (NVMHCI) Working Group.[126] The goal of the group is to provide standard software and hardware programming interfaces for nonvolatile memory subsystems, including the "flash cache" device connected to the PCI Express bus.

Distinction between NOR and NAND flash

NOR and NAND flash differ in two important ways:

  • The connections of the individual memory cells are different.[127]
  • The interface provided for reading and writing the memory is different; NOR allows random access[128] as it can be either byte-addressable or word-addressable, with words being for example 32 bits long,[129][130][131] while NAND allows only page access.[132]

NOR[133] and NAND flash get their names from the structure of the interconnections between memory cells.[134] In NOR flash, cells are connected in parallel to the bit lines, allowing cells to be read and programmed individually.[135] The parallel connection of cells resembles the parallel connection of transistors in a CMOS NOR gate.[136] In NAND flash, cells are connected in series,[135] resembling a CMOS NAND gate. The series connections consume less space than parallel ones, reducing the cost of NAND flash.[135] It does not, by itself, prevent NAND cells from being read and programmed individually.[citation needed]

Each NOR flash cell is larger than a NAND flash cell (10 F² vs 4 F²), even when using exactly the same semiconductor device fabrication process, so that each transistor, contact, etc. is exactly the same size, because NOR flash cells require a separate metal contact for each cell.[137][138]

Because of the series connection and removal of wordline contacts, a large grid of NAND flash memory cells will occupy perhaps only 60% of the area of equivalent NOR cells[139] (assuming the same CMOS process resolution, for example, 130 nm, 90 nm, or 65 nm). NAND flash's designers realized that the area of a NAND chip, and thus the cost, could be further reduced by removing the external address and data bus circuitry. Instead, external devices could communicate with NAND flash via sequential-accessed command and data registers, which would internally retrieve and output the necessary data. This design choice made random-access of NAND flash memory impossible, but the goal of NAND flash was to replace mechanical hard disks, not to replace ROMs.

The first GSM phones and many feature phones had NOR flash memory, from which processor instructions could be executed directly in an execute-in-place architecture, which allowed for short boot times. With smartphones, NAND flash memory was adopted as it has larger storage capacities and lower costs, but it causes longer boot times because instructions cannot be executed from it directly and must first be copied to RAM before execution.[140]

| Attribute | NAND | NOR |
| Main application | File storage | Code execution |
| Storage capacity | Higher | Lower |
| Cost per bit | Lower | Higher |
| Active power | Lower | Higher |
| Standby power | Higher | Lower |
| Write speed | Faster | Slower |
| Random read speed | Slower | Faster |
| Execute in place[141] (XIP) | No | Yes |
| Reliability | Lower | Higher |
| Required flash memory controller | Usually Yes | No |

Write endurance

The write endurance of SLC floating-gate NOR flash is typically equal to or greater than that of NAND flash, while MLC NOR and NAND flash have similar endurance capabilities. Examples of endurance cycle ratings listed in datasheets for NAND and NOR flash, as well as in storage devices using flash memory, are provided in the table below.[142]

| Type of flash memory | Endurance rating (erases per block) | Example(s) of flash memory or storage device |
| SLC NAND | 50,000–100,000 | Samsung OneNAND KFW4G16Q2M, Toshiba SLC NAND flash chips,[143][144][145][146][147] Transcend SD500, Fujitsu S26361-F3298 |
| MLC NAND | 5,000–10,000 for medium-capacity; 1,000 to 3,000 for high-capacity[148] | Samsung K9G8G08U0M (example for medium-capacity applications), Memblaze PBlaze4,[149] ADATA SU900, Mushkin Reactor |
| TLC NAND | 1,000 | Samsung SSD 840 |
| QLC NAND | Unknown | SanDisk X4 NAND flash SD cards[150][151][152][153] |
| 3D SLC NAND | >100,000 | Samsung Z-NAND[154] |
| 3D MLC NAND | 6,000–40,000 | Samsung SSD 850 PRO, Samsung SSD 845DC PRO,[155][156] Samsung 860 PRO |
| 3D TLC NAND | 1,500–5,000 | Samsung SSD 850 EVO, Samsung SSD 845DC EVO, Crucial MX300,[157][158][159] Memblaze PBlaze5 900, Memblaze PBlaze5 700, Memblaze PBlaze5 910/916, Memblaze PBlaze5 510/516,[160][161][162][163] ADATA SX 8200 PRO (also being sold under "XPG Gammix" branding, model S11 PRO) |
| 3D QLC NAND | 100–1,500 | Samsung SSD 860 QVO SATA, Intel SSD 660p, Micron 5210 ION, Crucial P1, Samsung SSD BM991 NVMe[164][165][166][167][168][169][170][171] |
| 3D PLC NAND | Unknown | In development by SK Hynix (formerly Intel)[172] and Kioxia (formerly Toshiba Memory).[148] |
| SLC (floating-gate) NOR | 100,000–1,000,000 | Numonyx M58BW (endurance rating of 100,000 erases per block); Spansion S29CD016J (endurance rating of 1,000,000 erases per block) |
| MLC (floating-gate) NOR | 100,000 | Numonyx J3 flash |
| 3D SLC NOR | >1,000,000 | |
| 3D MLC NOR | 100,000–1,000,000 | |

However, by applying certain algorithms and design paradigms such as wear leveling and memory over-provisioning, the endurance of a storage system can be tuned to serve specific requirements.[173]

In order to compute the longevity of the NAND flash, one must account for the size of the memory chip, the type of memory (e.g. SLC/MLC/TLC), and use pattern. Industrial NAND and server NAND are in demand due to their capacity, longer endurance and reliability in sensitive environments.
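A common rule-of-thumb estimate combines these factors as capacity × rated P/E cycles ÷ (write amplification × daily host writes). The sketch below applies it; the formula assumes ideal wear leveling, and the write amplification parameter and the example numbers are illustrative assumptions that do not come from the text above.

```python
def estimated_lifetime_years(
    capacity_gb: float,
    pe_cycles: int,
    host_writes_gb_per_day: float,
    write_amplification: float = 1.0,
) -> float:
    """Rough endurance estimate, assuming writes are spread evenly over all blocks."""
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / host_writes_gb_per_day / 365


if __name__ == "__main__":
    # Hypothetical 256 GB device rated for 1,000 cycles, written at 50 GB/day,
    # with an assumed write amplification of 2.
    years = estimated_lifetime_years(256, 1_000, 50, write_amplification=2.0)
    print(f"about {years:.1f} years")
```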

As the number of bits per cell increases, the performance and life of NAND flash may degrade: random read times rise to 100 μs for TLC NAND, which is four times the time required by SLC NAND and twice the time required by MLC NAND.[71]

Flash file systems

Because of the particular characteristics of flash memory, it is best used with either a controller to perform wear leveling and error correction or specifically designed flash file systems, which spread writes over the media and deal with the long erase times of NOR flash blocks. The basic concept behind flash file systems is the following: when the flash store is to be updated, the file system will write a new copy of the changed data to a fresh block, remap the file pointers, then erase the old block later when it has time.
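The update sequence described above (write a new copy, remap the pointer, erase the old block later) can be sketched with a toy store. The class `OutOfPlaceStore` is hypothetical and keeps one file per block for simplicity; real flash file systems are considerably more involved.

```python
class OutOfPlaceStore:
    """Toy out-of-place update scheme: one file occupies one block."""

    def __init__(self, num_blocks: int):
        self.blocks = [None] * num_blocks   # physical block contents
        self.free = list(range(num_blocks))
        self.pointers = {}                  # file name -> physical block
        self.stale = []                     # old copies awaiting erase

    def write(self, name: str, data: bytes) -> None:
        if not self.free:
            self.collect_garbage()          # erase old copies only when needed
        if not self.free:
            raise RuntimeError("store is full")
        new_block = self.free.pop(0)
        self.blocks[new_block] = data       # 1. write the new copy to a fresh block
        old_block = self.pointers.get(name)
        self.pointers[name] = new_block     # 2. remap the file pointer
        if old_block is not None:
            self.stale.append(old_block)    # 3. defer erasing the old block

    def read(self, name: str) -> bytes:
        return self.blocks[self.pointers[name]]

    def collect_garbage(self) -> None:
        """Erase stale blocks and return them to the free list."""
        while self.stale:
            block = self.stale.pop()
            self.blocks[block] = None
            self.free.append(block)


if __name__ == "__main__":
    store = OutOfPlaceStore(3)
    store.write("config", b"v1")
    store.write("config", b"v2")
    print(store.read("config"), store.stale)
```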

In practice, flash file systems are used only for memory technology devices (MTDs), which are embedded flash memories that do not have a controller. Removable flash memory cards, SSDs, eMMC/eUFS chips and USB flash drives have built-in controllers to perform wear leveling and error correction so use of a specific flash file system may not add benefit.

Capacity

Multiple chips are often arrayed or die stacked to achieve higher capacities[174] for use in consumer electronic devices such as multimedia players or GPS units. The capacity scaling (increase) of flash chips used to follow Moore's law because they are manufactured with many of the same integrated-circuit techniques and equipment. Since the introduction of 3D NAND, scaling is no longer necessarily associated with Moore's law since ever smaller transistors (cells) are no longer used.

Consumer flash storage devices typically are advertised with usable sizes expressed as a small integer power of two (2, 4, 8, etc.) and a conventional designation of megabytes (MB) or gigabytes (GB); e.g., 512 MB, 8 GB. This includes SSDs marketed as hard drive replacements, in accordance with traditional hard drives, which use decimal prefixes.[175] Thus, an SSD marked as "64 GB" is at least 64 × 1000³ bytes (64 GB). Most users will have slightly less capacity than this available for their files, due to the space taken by file system metadata and because some operating systems report SSD capacity using binary prefixes, whose units are somewhat larger than the conventional decimal ones.
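The gap between the advertised decimal capacity and the figure reported with binary prefixes is a simple unit conversion, sketched here (the function name is illustrative):

```python
def advertised_gb_to_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes (1000**3 bytes) to binary gibibytes (1024**3 bytes)."""
    return gigabytes * 1000**3 / 1024**3


if __name__ == "__main__":
    print(f"64 GB is about {advertised_gb_to_gib(64):.1f} GiB")
```

For a drive marked "64 GB", this gives roughly 59.6 GiB before any file system overhead is subtracted.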

The flash memory chips inside them are sized in strict binary multiples, but the actual total capacity of the chips is not usable at the drive interface. It is considerably larger than the advertised capacity in order to allow for distribution of writes (wear leveling), for sparing, for error correction codes, and for other metadata needed by the device's internal firmware.

In 2005, Toshiba and SanDisk developed a NAND flash chip capable of storing 1 GB of data using multi-level cell (MLC) technology, capable of storing two bits of data per cell. In September 2005, Samsung Electronics announced that it had developed the world's first 2 GB chip.[176]

In March 2006, Samsung announced flash hard drives with capacity of 4 GB, essentially the same order of magnitude as smaller laptop hard drives, and in September 2006, Samsung announced an 8 GB chip produced using a 40 nm manufacturing process.[177] In January 2008, SanDisk announced availability of their 16 GB MicroSDHC and 32 GB SDHC Plus cards.[178][179]

More recent flash drives (as of 2012) have much greater capacities, holding 64, 128, and 256 GB.[180]

A joint development at Intel and Micron will allow the production of 32-layer 3.5 terabyte (TB) NAND flash sticks and 10 TB standard-sized SSDs. The device includes 5 packages of 16 × 48 GB TLC dies, using a floating gate cell design.[181]

Flash chips continue to be manufactured with capacities under or around 1 MB (e.g. for BIOS-ROMs and embedded applications).

In July 2016, Samsung announced the 4 TB Samsung 850 EVO which utilizes their 256 Gbit 48-layer TLC 3D V-NAND.[182] In August 2016, Samsung announced a 32 TB 2.5-inch SAS SSD based on their 512 Gbit 64-layer TLC 3D V-NAND. Further, Samsung expects to unveil SSDs with up to 100 TB of storage by 2020.[183]

Transfer rates

Flash memory devices are typically much faster at reading than writing.[184] Performance also depends on the quality of storage controllers, which become more critical when devices are partially full.[184] Even when the only change to manufacturing is die-shrink, the absence of an appropriate controller can result in degraded speeds.[185]

Applications

Serial flash

Serial Flash: Silicon Storage Tech SST25VF080B

Serial flash is a small, low-power flash memory that provides only serial access to the data: rather than addressing individual bytes, the user reads or writes large contiguous groups of bytes in the address space serially. Serial Peripheral Interface Bus (SPI) is a typical protocol for accessing the device. When incorporated into an embedded system, serial flash requires fewer wires on the PCB than parallel flash memories, since it transmits and receives data one bit at a time. This may permit a reduction in board space, power consumption, and total system cost.

There are several reasons why a serial device, with fewer external pins than a parallel device, can significantly reduce overall cost:

  • Many ASICs are pad-limited, meaning that the size of the die is constrained by the number of wire bond pads, rather than the complexity and number of gates used for the device logic. Eliminating bond pads thus permits a more compact integrated circuit, on a smaller die; this increases the number of dies that may be fabricated on a wafer, and thus reduces the cost per die.
  • Reducing the number of external pins also reduces assembly and packaging costs. A serial device may be packaged in a smaller and simpler package than a parallel device.
  • Smaller and lower pin-count packages occupy less PCB area.
  • Lower pin-count devices simplify PCB routing.

There are two major SPI flash types. The first type is characterized by small blocks and one internal SRAM block buffer allowing a complete block to be read to the buffer, partially modified, and then written back (for example, the Atmel AT45 DataFlash or the Micron Technology Page Erase NOR Flash). The second type has larger sectors where the smallest sectors typically found in this type of SPI flash are 4 KB, but they can be as large as 64 KB. Since this type of SPI flash lacks an internal SRAM buffer, the complete block must be read out and modified before being written back, making it slow to manage. However, the second type is cheaper than the first and is therefore a good choice when the application is code shadowing.

The two types are not easily exchangeable, since they do not have the same pinout, and the command sets are incompatible.

Most FPGAs are based on SRAM configuration cells and require an external configuration device, often a serial flash chip, to reload the configuration bitstream every power cycle.[186]

Firmware storage

With the increasing speed of modern CPUs, parallel flash devices are often much slower than the memory bus of the computer they are connected to. By contrast, modern SRAM offers access times below 10 ns, while DDR2 SDRAM offers access times below 20 ns. Because of this, it is often desirable to shadow code stored in flash into RAM; that is, the code is copied from flash into RAM before execution, so that the CPU may access it at full speed. Device firmware may be stored in a serial flash chip, and then copied into SDRAM or SRAM when the device is powered up.[187] Using an external serial flash device rather than on-chip flash removes the need for significant process compromise (a manufacturing process that is good for high-speed logic is generally not good for flash and vice versa). Once it is decided to read the firmware in as one big block, it is common to add compression to allow a smaller flash chip to be used. Since 2005, many devices have used serial NOR flash in place of parallel NOR flash for firmware storage. Typical applications for serial NOR flash include storing firmware for hard drives, BIOS, Option ROM of expansion cards, DSL modems, etc.

Flash memory as a replacement for hard drives

An Intel mSATA SSD in 2020

One more recent application for flash memory is as a replacement for hard disks. Flash memory does not have the mechanical limitations and latencies of hard drives, so a solid-state drive (SSD) is attractive in terms of speed, noise, power consumption, and reliability. Flash drives are gaining traction as mobile device secondary storage devices; they are also used as substitutes for hard drives in high-performance desktop computers and some servers with RAID and SAN architectures.

There remain some aspects of flash-based SSDs that make them unattractive. The cost per gigabyte of flash memory remains significantly higher than that of hard disks.[188] Also, flash memory has a finite number of P/E (program/erase) cycles, but this seems to be currently under control since warranties on flash-based SSDs are approaching those of current hard drives.[189] In addition, deleted files on SSDs can remain for an indefinite period of time before being overwritten by fresh data; erasure or shred techniques or software that work well on magnetic hard disk drives have no effect on SSDs, compromising security and forensic examination. However, due to the so-called TRIM command employed by most solid-state drives, which marks the logical block addresses occupied by the deleted file as unused to enable garbage collection, data recovery software is not able to restore files deleted from such drives.

For relational databases or other systems that require ACID transactions, even a modest amount of flash storage can offer vast speedups over arrays of disk drives.[190]

In May 2006, Samsung Electronics announced two flash-memory-based PCs, the Q1-SSD and Q30-SSD, which were expected to become available in June 2006. Both used 32 GB SSDs and were at least initially available only in South Korea.[191] The launch of the Q1-SSD and Q30-SSD was delayed, and they finally shipped in late August 2006.[192]

The first flash-memory-based PC to become available was the Sony Vaio UX90, which was announced for pre-order on 27 June 2006 and began shipping in Japan on 3 July 2006 with a 16 GB flash memory hard drive.[193] In late September 2006, Sony upgraded the flash memory in the Vaio UX90 to 32 GB.[194]

A solid-state drive was offered as an option with the first MacBook Air introduced in 2008, and from 2010 onwards, all models were shipped with an SSD. Starting in late 2011, as part of Intel's Ultrabook initiative, an increasing number of ultra-thin laptops are being shipped with SSDs standard.

There are also hybrid techniques such as hybrid drive and ReadyBoost that attempt to combine the advantages of both technologies, using flash as a high-speed non-volatile cache for files on the disk that are often referenced, but rarely modified, such as application and operating system executable files.

In smartphones, NAND flash products such as eMMC and eUFS are used as file storage devices.

Flash memory as RAM

As of 2012, there were attempts to use flash memory as main computer memory in place of DRAM.[195]

Archival or long-term storage

Floating-gate transistors in the flash storage device hold charge which represents data. This charge gradually leaks over time, leading to an accumulation of logical errors, also known as "bit rot" or "bit fading".[196]

Data retention

It is unclear how long data on flash memory will persist under archival conditions (i.e., benign temperature and humidity with infrequent access with or without prophylactic rewrite). Datasheets of Atmel's flash-based "ATmega" microcontrollers typically promise retention times of 20 years at 85 °C (185 °F) and 100 years at 25 °C (77 °F).[197]

The retention span varies among types and models of flash storage. When supplied with power and idle, the charge of the transistors holding the data is routinely refreshed by the firmware of the flash storage.[196] The ability to retain data varies among flash storage devices due to differences in firmware, data redundancy, and error correction algorithms.[198]

A 2015 article from CMU states, "Today's flash devices, which do not require flash refresh, have a typical retention age of 1 year at room temperature." It adds that retention time decreases exponentially with increasing temperature, a phenomenon that can be modeled by the Arrhenius equation.[199][200]
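The Arrhenius relationship can be sketched numerically. The example below computes the acceleration factor between two temperatures and, as an illustration, the activation energy implied by the ATmega datasheet figures quoted above (100 years at 25 °C versus 20 years at 85 °C). Treating those guaranteed minimums as two points on a single Arrhenius curve is an assumption made here for illustration, not a claim from the datasheets.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def _inverse_temp_gap(t_cool_c: float, t_hot_c: float) -> float:
    """1/T_cool - 1/T_hot with temperatures converted to kelvin."""
    return 1 / (t_cool_c + 273.15) - 1 / (t_hot_c + 273.15)


def acceleration_factor(ea_ev: float, t_cool_c: float, t_hot_c: float) -> float:
    """How many times faster charge loss proceeds at t_hot than at t_cool."""
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * _inverse_temp_gap(t_cool_c, t_hot_c))


def implied_activation_energy(ratio: float, t_cool_c: float, t_hot_c: float) -> float:
    """Activation energy (eV) that yields a given retention-time ratio."""
    return BOLTZMANN_EV_PER_K * math.log(ratio) / _inverse_temp_gap(t_cool_c, t_hot_c)


if __name__ == "__main__":
    ea = implied_activation_energy(100 / 20, 25, 85)
    print(f"implied activation energy: {ea:.2f} eV")
    print(f"acceleration factor 25 C -> 85 C: {acceleration_factor(ea, 25, 85):.1f}")
```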

While flash storage retains data for a longer time if stored at colder temperatures, a higher but not extreme temperature while writing reduces stress and wear on the drive, given that electrons are able to flow more easily, according to Tim Schulte, Pranav Kalavade, and Johnmichael Hands from Intel.[201]

FPGA configuration

Some FPGAs are based on flash configuration cells that are used directly as (programmable) switches to connect internal elements together, using the same kind of floating-gate transistor as the flash data storage cells in data storage devices.[186]

Industry

One source states that, in 2008, the flash memory industry included about US$9.1 billion in production and sales. Other sources put the flash memory market at a size of more than US$20 billion in 2006, accounting for more than eight percent of the overall semiconductor market and more than 34 percent of the total semiconductor memory market.[202] In 2012, the market was estimated at $26.8 billion.[203] It can take up to 10 weeks to produce a flash memory chip.[204]

Manufacturers

The following were the largest NAND flash memory manufacturers, as of the second quarter of 2023.[205]

  1. Samsung Electronics – 31.4%
  2. Kioxia – 20.6%
  3. Western Digital Corporation – 12.6%
  4. SK Hynix – 18.5%
  5. Micron Technology – 12.3%
  6. Others – 8.7%

Notes: Samsung remains the largest NAND flash memory manufacturer as of Q1 2022.[206]

Kioxia was spun out of Toshiba in 2018 and took its current name in 2019.[207]

SK Hynix acquired Intel's NAND business at the end of 2021.[208]

Shipments

Flash memory shipments (est. manufactured units)
Year(s) Discrete flash memory chips Flash memory data capacity (gigabytes) Floating-gate MOSFET memory cells (billions)
1992 26,000,000[209] 3[209] 24[a]
1993 73,000,000[209] 17[209] 139[a]
1994 112,000,000[209] 25[209] 203[a]
1995 235,000,000[209] 38[209] 300[a]
1996 359,000,000[209] 140[209] 1,121[a]
1997 477,200,000+[210] 317+[210] 2,533+[a]
1998 762,195,122[211] 455+[210] 3,642+[a]
1999 12,800,000,000[212] 635+[210] 5,082+[a]
2000–2004 134,217,728,000 (NAND)[213] 1,073,741,824,000 (NAND)[213]
2005–2007 ?
2008 1,226,215,645 (mobile NAND)[214]
2009 1,226,215,645+ (mobile NAND)
2010 7,280,000,000+[b]
2011 8,700,000,000[216]
2012 5,151,515,152 (serial)[217]
2013 ?
2014 ? 59,000,000,000[218] 118,000,000,000+[a]
2015 7,692,307,692 (NAND)[219] 85,000,000,000[220] 170,000,000,000+[a]
2016 ? 100,000,000,000[221] 200,000,000,000+[a]
2017 ? 148,200,000,000[c] 296,400,000,000+[a]
2018 ? 231,640,000,000[d] 463,280,000,000+[a]
2019 ? ? ?
2020 ? ? ?
1992–2020 45,358,454,134+ memory chips 758,057,729,630+ gigabytes 2,321,421,837,044 billion+ cells

In addition to individual flash memory chips, flash memory is also embedded in microcontroller (MCU) chips and system-on-chip (SoC) devices.[225] Flash memory is embedded in ARM chips,[225] which have sold 150 billion units worldwide as of 2019,[226] and in programmable system-on-chip (PSoC) devices, which have sold 1.1 billion units as of 2012.[227] This adds up to at least 151.1 billion MCU and SoC chips with embedded flash memory, in addition to the 45.4 billion known individual flash chip sales as of 2015, totalling at least 196.5 billion chips containing flash memory.

Flash scalability

Due to its relatively simple structure and high demand for higher capacity, NAND flash memory is the most aggressively scaled technology among electronic devices. The heavy competition among the top few manufacturers only adds to the aggressiveness in shrinking the floating-gate MOSFET design rule or process technology node.[107] While the expected shrink timeline is a factor of two every three years per the original version of Moore's law, this has recently been accelerated in the case of NAND flash to a factor of two every two years.

ITRS or company 2010 2011 2012 2013 2014 2015 2016 2017 2018
ITRS Flash Roadmap 2011[228] 32 nm 22 nm 20 nm 18 nm 16 nm
Updated ITRS Flash Roadmap[229] 17 nm 15 nm 14 nm
Samsung[228][229][230]
(Samsung 3D NAND)[229]
35–20 nm[36] 27 nm 21 nm
(MLC, TLC)
19–16 nm
19–10 nm (MLC, TLC)[231]
19–10 nm
V-NAND (24L)
16–10 nm
V-NAND (32L)
16–10 nm 12–10 nm 12–10 nm
Micron, Intel[228][229][230] 34–25 nm 25 nm 20 nm
(MLC + HKMG)
20 nm
(TLC)
16 nm 16 nm
3D NAND
16 nm
3D NAND
12 nm
3D NAND
12 nm
3D NAND
Toshiba, WD (SanDisk)[228][229][230] 43–32 nm
24 nm (Toshiba)[232]
24 nm 19 nm
(MLC, TLC)
15 nm 15 nm
3D NAND
15 nm
3D NAND
12 nm
3D NAND
12 nm
3D NAND
SK Hynix[228][229][230] 46–35 nm 26 nm 20 nm (MLC) 16 nm 16 nm 16 nm 12 nm 12 nm

As the MOSFET feature size of flash memory cells reaches the 15–16 nm minimum limit, further flash density increases will be driven by TLC (3 bits/cell) combined with vertical stacking of NAND memory planes. The decrease in endurance and increase in uncorrectable bit error rates that accompany feature size shrinking can be compensated by improved error correction mechanisms.[233] Even with these advances, it may become impossible to scale flash economically to ever-smaller dimensions, because the number of electrons each cell can hold decreases. Many promising new technologies (such as FeRAM, MRAM, PMC, PCM, ReRAM, and others) are under investigation and development as possible more scalable replacements for flash.[234]

Timeline

Date of introduction Chip name Memory package capacity (Mb, Gb, Tb) Flash type Cell type Layers (or stacks of layers) Manufacturer(s) Process Area Ref
1984 ? ? NOR SLC 1 Toshiba ? ? [24]
1985 ? 256 kb NOR SLC 1 Toshiba 2,000 nm ? [33]
1987 ? ? NAND SLC 1 Toshiba ? ? [1]
1989 ? 1 Mb NOR SLC 1 Seeq, Intel ? ? [33]
4 Mb NAND SLC 1 Toshiba 1,000 nm
1991 ? 16 Mb NOR SLC 1 Mitsubishi 600 nm ? [33]
1993 DD28F032SA 32 Mb NOR SLC 1 Intel ? 280 mm² [235][236]
1994 ? 64 Mb NOR SLC 1 NEC 400 nm ? [33]
1995 ? 16 Mb DINOR SLC 1 Mitsubishi, Hitachi ? ? [33][237]
NAND SLC 1 Toshiba ? ? [238]
32 Mb NAND SLC 1 Hitachi, Samsung, Toshiba ? ? [33]
34 Mb Serial SLC 1 SanDisk
1996 ? 64 Mb NAND SLC 1 Hitachi, Mitsubishi 400 nm ? [33]
QLC 1 NEC
128 Mb NAND SLC 1 Samsung, Hitachi ?
1997 ? 32 Mb NOR SLC 1 Intel, Sharp 400 nm ? [239]
NAND SLC 1 AMD, Fujitsu 350 nm
1999 ? 256 Mb NAND SLC 1 Toshiba 250 nm ? [33]
MLC 1 Hitachi 1
2000 ? 32 Mb NOR SLC 1 Toshiba 250 nm ? [33]
64 Mb NOR QLC 1 STMicroelectronics 180 nm
512 Mb NAND SLC 1 Toshiba ? ? [115]
2001 ? 512 Mb NAND MLC 1 Hitachi ? ? [33]
1 Gibit NAND MLC 1 Samsung
1 Toshiba, SanDisk 160 nm ? [240]
2002 ? 512 Mb NROM MLC 1 Saifun 170 nm ? [33]
2 GB NAND SLC 1 Samsung, Toshiba ? ? [241][242]
2003 ? 128 Mb NOR MLC 1 Intel 130 nm ? [33]
1 GB NAND MLC 1 Hitachi
2004 ? 8 GB NAND SLC 1 Samsung 60 nm ? [241]
2005 ? 16 GB NAND SLC 1 Samsung 50 nm ? [36]
2006 ? 32 GB NAND SLC 1 Samsung 40 nm
Apr-07 THGAM 128 GB Stacked NAND SLC Toshiba 56 nm 252 mm² [52]
Sep-07 ? 128 GB Stacked NAND SLC Hynix ? ? [53]
2008 THGBM 256 GB Stacked NAND SLC Toshiba 43 nm 353 mm² [54]
2009 ? 32 GB NAND TLC Toshiba 32 nm 113 mm² [34]
64 GB NAND QLC Toshiba, SanDisk 43 nm ? [34][35]
2010 ? 64 GB NAND SLC Hynix 20 nm ? [243]
TLC Samsung 20 nm ? [36]
THGBM2 1 Tb Stacked NAND QLC Toshiba 32 nm 374 mm² [55]
2011 KLMCG8GE4A 512 GB Stacked NAND MLC Samsung ? 192 mm² [244]
2013 ? ? NAND SLC SK Hynix 16 nm ? [243]
128 GB V-NAND TLC Samsung 10 nm ?
2015 ? 256 GB V-NAND TLC Samsung ? ? [231]
2017 eUFS 2.1 512 GB V-NAND TLC 8 of 64 Samsung ? ? [8]
768 GB V-NAND QLC Toshiba ? ? [245]
KLUFG8R1EM 4 Tb Stacked V-NAND TLC Samsung ? 150 mm² [8]
2018 ? 1 Tb V-NAND QLC Samsung ? ? [246]
1.33 Tb V-NAND QLC Toshiba ? 158 mm² [247][248]
2019 ? 512 GB V-NAND QLC Samsung ? ? [63][64]
1 Tb V-NAND TLC SK Hynix ? ? [249]
eUFS 2.1 1 Tb Stacked V-NAND[250] QLC 16 of 64 Samsung ? 150 mm² [63][64][251]
2023 eUFS 4.0 8 Tb 3D NAND QLC 232 Micron ? ? [252]

See also

Explanatory notes

References

from Grokipedia
Flash memory is a type of non-volatile semiconductor memory that retains stored data even without power and can be electrically erased and reprogrammed in blocks rather than byte by byte, distinguishing it from earlier EEPROM technologies.[1] It employs floating-gate transistors, where electrical charges trapped in a floating gate within the transistor's insulation layer determine the stored bit value (typically 0 or 1, though multi-level cells store more).[2] This architecture enables high-density storage, fast block-level erasure (hence the "flash" name, inspired by its rapid clearing like a camera flash), and relatively low cost per bit, making it ideal for applications requiring durable, rewritable data retention.[1] The technology was invented by Fujio Masuoka and his team at Toshiba in the early 1980s as part of a secret project to create a more efficient non-volatile memory.[1] Masuoka first demonstrated a NOR-type flash memory prototype at the 1984 IEEE International Electron Devices Meeting. Intel commercialized NOR flash in 1988, while Toshiba introduced NAND flash in 1989, which offered even higher density due to its serial cell arrangement, sparking rapid market growth fueled by shrinking transistor sizes and applications in digital cameras, mobile devices, and solid-state drives (SSDs).[1] By the 2010s, advancements like 3D NAND stacking allowed terabyte-scale capacities while addressing planar scaling limits. 
As of 2025, further innovations like 300+ layer 3D NAND have pushed capacities to petabyte levels in enterprise storage.[3][4] Flash memory exists in two main architectures: NOR flash, which connects cells in parallel for fast random access and direct code execution (execute-in-place, or XIP), suiting it for boot code and embedded systems with densities up to several gigabits (as of 2025); and NAND flash, which arranges cells in series for higher density (up to 2 Tb+), lower cost per gigabyte, and faster sequential writes, but requires controllers for error correction, wear leveling, and bad block management due to its block-oriented operations.[5] Although NAND flash has slower read and write speeds compared to volatile DRAM, its non-volatility and high density make it ideal for mass storage and portable applications.[6] NOR offers higher endurance (up to 100,000 program/erase cycles) and is used in automotive systems, wearables, and industrial PCs, while NAND dominates mass storage in SSDs (including NVMe), USB drives (pendrives), microSD cards, smartphones, tablets, and data centers, with variants like SLC (single-level cell) for reliability, MLC/TLC/QLC for density.[5] Despite limitations like finite endurance and electron leakage over time, flash memory's versatility has made it ubiquitous in consumer electronics, enterprise storage, and even space missions.[1]

History

Early concepts and invention

The development of flash memory was motivated by the limitations of earlier non-volatile memories like electrically erasable programmable read-only memory (EEPROM), which allowed byte-by-byte erasure but suffered from slow erasure times, limited endurance cycles (typically around 10,000 to 1 million per cell), and higher manufacturing costs due to requiring two transistors per bit, making it impractical for large-capacity storage applications.[7] Researchers sought a solution that could erase data in larger blocks simultaneously, enabling faster operations, denser cell structures with a single transistor per bit, and cost-effective scaling for high-density non-volatile memory.[8] Fujio Masuoka, a researcher at Toshiba Corporation in Japan, played a pivotal role in inventing flash memory during the 1980s. In 1984, Masuoka and his team developed the first NOR-type flash memory cell, utilizing a floating-gate structure with triple polysilicon technology that allowed electrical erasure of the entire memory array in a single "flash" operation, named for its rapid erasure akin to a camera flash.[8] This prototype was presented at the IEEE International Electron Devices Meeting (IEDM) in December 1984, where Masuoka detailed a 64-kbit device demonstrating non-volatility, in-system rewritability, and compatibility with existing EPROM fabrication processes.[9] Masuoka filed the original patents for this NOR flash technology, establishing the foundational intellectual property for block-erasable non-volatile memory.[9] Building on the NOR design, Masuoka introduced NAND flash memory in 1987 to achieve even higher densities. 
The NAND structure arranged cells in series, reducing cell area by approximately 30% compared to NOR while enabling ultra-high-density storage suitable for applications beyond 1 Mbit.[10] This innovation was prototyped as a 1-Mbit device and presented at the 1987 IEEE IEDM, highlighting its potential for scalable, low-cost mass storage through serial access and contactless cell arrays.[10] Masuoka also secured patents for the NAND architecture, further advancing the field.[9] Early prototypes of both NOR and NAND flash faced significant technical challenges, including the need for high voltages—around 20 V for programming and erasing via Fowler-Nordheim tunneling—which complicated integration with low-voltage logic circuits and increased power consumption.[11] Additionally, despite the groundbreaking potential, there was initial lack of industry interest; Toshiba provided Masuoka only a modest bonus of a few hundred dollars for his inventions and even attempted to demote him, leading to his resignation in 1994 amid disputes over recognition and royalties.[12][13] These hurdles delayed widespread adoption until subsequent refinements addressed reliability and compatibility issues.

Commercialization and adoption

The commercialization of flash memory began in the late 1980s, marking a pivotal shift toward electrically erasable non-volatile storage suitable for portable electronics. In 1988, Intel introduced the first commercial NOR flash memory chip, a 256-kilobit device that enabled random access and code execution directly from the memory, positioning it as a successor to ultraviolet-erasable EPROMs. This was followed by Toshiba's 1989 release of the first commercial NAND flash chips, starting with a 1-megabit capacity and scaling to 4-megabit by year's end, which emphasized high-density block-based storage for cost-sensitive applications.[1] These initial products were priced at around $20 per 256-kilobit chip, equivalent to roughly $640 per megabyte, reflecting early manufacturing challenges but offering a compelling value over traditional EPROMs due to electrical erasability without specialized equipment.[14] Early adoption in the 1990s focused on consumer and industrial devices where size, power efficiency, and reprogrammability were critical. Flash memory quickly found use in digital cameras, such as the 1995 Casio QV-10, which replaced film with removable flash cards for image storage, and in laptops for BIOS firmware updates, reducing reliance on slower EEPROMs.[15] By the mid-1990s, capacities reached 1-megabit routinely, with pricing dropping to under $10 per chip, enabling broader integration into PDAs and embedded systems.[16] The technology's advantages—faster block erase times (milliseconds versus seconds for byte-level EEPROM operations) and lower cost per bit (due to denser cell structures)—drove its replacement of EPROMs in prototyping and EEPROMs in high-volume storage, with flash achieving up to 10 times the density at half the price by 1995.[17] Key partnerships accelerated standardization and market penetration. 
In 1988, former Intel engineers founded SanDisk, which collaborated with Toshiba to develop flash-based storage solutions, leading to the 1997 formation of the MultiMediaCard (MMC) standard by SanDisk, Siemens, and later Nokia.[18] The MMC, a compact NAND-based card with initial 2-megabyte capacities, targeted mobile phones and early digital audio players, standardizing interfaces for interchangeable storage and boosting adoption in portable devices.[19] By the late 1990s, these efforts had propelled flash memory into mainstream use, with annual revenues surpassing $1 billion and enabling the portable computing revolution.[20]

Technological evolution

Following the commercialization of flash memory in the late 1980s, a key advancement in the early 2000s was the introduction of charge-trap flash (CTF), which replaced the traditional floating-gate structure with discrete charge-trapping sites in a dielectric layer, such as silicon nitride, to improve reliability by reducing charge leakage and enhancing data retention.[21][22] This shift, first implemented by AMD and Fujitsu in 2002, addressed scaling limitations of floating-gate technology below 45 nm, enabling higher densities while maintaining endurance and minimizing defects' impact on performance.[21][23]

To increase storage density without proportionally expanding die size, multi-level cell (MLC) architectures emerged in the 2000s, storing 2 bits per cell by distinguishing four voltage states, which doubled capacity compared to single-level cells (SLC) and gained widespread adoption in consumer devices like SSDs and memory cards.[24] Building on this, triple-level cell (TLC) technology, storing 3 bits per cell with eight voltage levels, was commercialized in the 2010s, starting with Samsung's mass production in 2010, further boosting density for mainstream applications despite trade-offs in write endurance.[25] Quad-level cell (QLC), with 4 bits per cell and 16 states, followed in 2018 through joint efforts by Intel and Micron, enabling terabit-scale chips suitable for archival and read-intensive workloads.[26] Most recently, penta-level cell (PLC) technology, storing 5 bits per cell, was unveiled by SK Hynix in 2024, pushing density limits for high-capacity enterprise storage.[27]

A pivotal architectural shift occurred in 2013 with Samsung's introduction of V-NAND, the first mass-produced 3D NAND flash using vertical stacking of memory cells in a charge-trap structure, which overcame planar scaling barriers by layering cells upward rather than shrinking laterally.[28] This vertical channel design improved efficiency and yield, evolving rapidly to exceed 200
layers by 2025 through innovations like multi-tier stacking, advanced etching techniques, and architectures such as YMTC's Xtacking, which reduces process complexity by separating the fabrication of memory arrays and peripherals for bonding, enabling high layer counts primarily with deep ultraviolet (DUV) lithography and multi-patterning for features like channel holes rather than extreme ultraviolet (EUV).[29][30][31] In 2024–2025, advancements tailored flash for AI workloads included Macronix's compute-in-memory 3D NOR flash, which integrates processing logic within the memory array to accelerate edge AI inference by reducing data movement overhead and enabling direct matrix operations.[27] Complementing this, SanDisk developed High Bandwidth Flash (HBF), a NAND-based solution using wafer bonding to achieve HBM-like read bandwidth while providing 8–16 times the capacity, targeting memory-centric AI systems for large-scale model training and inference.[32] However, manufacturing HBF involves advanced packaging complexities, such as through-silicon vias (TSVs) and hybrid bonding, which can lead to yield issues and higher defect rates due to the precision required for stacking multiple dies. These processes also impose scalability constraints on initial production volumes and may require reallocating fabrication resources, potentially contributing to supply chain pressures similar to those in other high-demand stacked memory technologies.[33][34][35]
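The bits-per-cell figures in this section follow a simple pattern: a cell storing n bits must distinguish 2^n threshold-voltage states. The sketch below tabulates this and, assuming a hypothetical fixed 6 V threshold window (an illustrative value, not taken from any datasheet), shows how the spacing between adjacent states shrinks as bits are added. The function names are likewise illustrative.

```python
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4, "PLC": 5}

# Hypothetical usable threshold-voltage window, chosen only for illustration.
ASSUMED_WINDOW_V = 6.0


def voltage_states(bits_per_cell):
    """Number of distinct threshold-voltage states needed for n bits."""
    return 2 ** bits_per_cell


def state_spacing(bits_per_cell, window_v=ASSUMED_WINDOW_V):
    """Spacing between adjacent state centers if states were spread evenly."""
    return window_v / (voltage_states(bits_per_cell) - 1)


for name, bits in CELL_TYPES.items():
    print(f"{name}: {bits} bit(s), {voltage_states(bits)} states, "
          f"~{state_spacing(bits):.2f} V between states")
```

The shrinking spacing is one way to picture why each added bit per cell trades write endurance and reliability for density.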

Operating principles

Core mechanisms

Flash memory relies on the storage of electrical charge in an isolated layer within a metal-oxide-semiconductor field-effect transistor (MOSFET) to enable non-volatile data retention. The foundational device structure is the floating-gate MOSFET, developed by Dawon Kahng and Simon Sze in 1967, consisting of a control gate, typically made of polysilicon, overlying a conductive floating gate that serves as the charge storage element.[36] This floating gate is electrically isolated, allowing injected charges to modulate the transistor's threshold voltage and represent binary states without continuous power supply.[36]

The isolation of the floating gate is achieved through surrounding oxide layers: a thin tunnel oxide (typically silicon dioxide, ~7–10 nm thick) between the floating gate and the substrate channel, and a thicker blocking oxide or inter-poly dielectric between the floating gate and control gate. These oxide layers provide high potential barriers (approximately 3.1 eV for SiO₂) that trap electrons on the floating gate, ensuring long-term non-volatility by minimizing thermal emission or leakage currents under normal operating conditions.[37]

Fowler-Nordheim tunneling governs the quantum mechanical transport of charges across these oxides, enabling programming and erasure by applying high electric fields (~10 MV/cm) to bend the potential barrier into a triangular shape, allowing electrons to tunnel through without significant impact ionization.[38] The tunneling current density J is described by the Fowler-Nordheim equation:

$$ J = \frac{q^3 E^2}{8 \pi h \phi} \exp\left( -\frac{8 \pi \sqrt{2 m \phi^3}}{3 q h E} \right) $$

where q is the electron charge, E is the electric field strength, ϕ is the work function or barrier height, h is Planck's constant, and m is the electron effective mass.[38]

As an alternative to the continuous conductive floating gate, charge-trap flash (CTF) employs a non-conductive nitride layer (typically Si₃N₄) as the charge storage medium, where electrons are captured in discrete traps rather than delocalized across a conductor.[39] This structure mitigates inter-cell capacitive coupling effects that plague floating-gate devices during scaling, as charge redistribution in a shared floating gate can inadvertently alter neighboring cell thresholds; in CTF, localized trapping confines interference to adjacent oxide regions.[39] Like floating-gate variants, CTF maintains non-volatility through surrounding oxide layers that isolate the nitride trap sites, with similar Fowler-Nordheim mechanisms for charge injection, though the discrete traps enhance reliability in densely packed arrays.[39]
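To give a feel for how steeply the tunneling current depends on the field, the Fowler-Nordheim equation given earlier in this section can be evaluated numerically. The sketch below does so in SI units, converting the 3.1 eV barrier to joules. The effective-mass ratio of 0.42 is an assumed, commonly quoted value for SiO₂ rather than a figure from this article, and the function name is hypothetical.

```python
import math

Q = 1.602176634e-19        # electron charge, C
H = 6.62607015e-34         # Planck's constant, J*s
M_E = 9.1093837015e-31     # free-electron mass, kg


def fn_current_density(e_field, barrier_ev=3.1, mass_ratio=0.42):
    """Fowler-Nordheim current density in A/m^2.

    e_field     -- oxide field in V/m
    barrier_ev  -- barrier height in eV (3.1 eV for SiO2, per the text)
    mass_ratio  -- effective mass / free-electron mass (assumed value)
    """
    phi = barrier_ev * Q                  # barrier height in joules
    m = mass_ratio * M_E                  # effective mass in kg
    prefactor = Q**3 * e_field**2 / (8 * math.pi * H * phi)
    exponent = -8 * math.pi * math.sqrt(2 * m * phi**3) / (3 * Q * H * e_field)
    return prefactor * math.exp(exponent)


for mv_per_cm in (6, 8, 10, 12):
    field = mv_per_cm * 1e8               # 1 MV/cm = 1e8 V/m
    j_a_per_cm2 = fn_current_density(field) * 1e-4   # A/m^2 -> A/cm^2
    print(f"{mv_per_cm} MV/cm: J = {j_a_per_cm2:.3e} A/cm^2")
```

Under these assumptions the current rises by many orders of magnitude between 6 and 12 MV/cm, which is why the oxide conducts usefully only under programming and erase fields.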

Programming and erasing

Programming in flash memory cells typically involves injecting electrons onto the floating gate to store a logical '0', while erasing removes these electrons to reset the cell to a logical '1'. In some early flash memory designs, such as certain NOR-type cells, programming was achieved through hot-carrier injection, where high-energy electrons generated near the drain are accelerated into the floating gate under a positive gate voltage of around 12 V and a drain voltage of 6-7 V. Programming and erasing mechanisms vary by architecture: NOR flash typically uses channel hot electron injection for programming and Fowler-Nordheim (FN) tunneling for erasing, while NAND flash employs FN tunneling for both operations, enabling quantum mechanical tunneling of electrons through a thin oxide layer under high electric fields exceeding 10 MV/cm.[40] These operations require high voltages, typically 15-20 V for programming and up to 20 V for erasing, far exceeding the standard supply voltages of 1.8-5 V in integrated circuits. To generate these voltages internally without external high-voltage supplies, flash memory chips incorporate on-chip charge pump circuits, such as Dickson or cross-coupled types, that boost the low supply voltage through capacitive multiplication and regulation. The erase process involves bulk erasure of multiple cells simultaneously, often an entire block, by applying a high positive voltage (around 15-20 V) to the substrate or source/drain regions while grounding the control gate, facilitating FN tunneling of electrons from the floating gate to the substrate and thereby resetting the cells to the erased '1' state with a low threshold voltage.[41] This collective erasure contrasts with byte-level operations in other memory types and ensures efficient clearing of large data sectors. 
To achieve precise control during programming and prevent over-programming that could lead to threshold voltage overshoot, flash memory employs incremental stepping pulse programming (ISPP), where programming pulses of increasing amplitude (typically stepping by 0.2-0.5 V) are applied iteratively, followed by verification reads to adjust the next pulse until the target threshold voltage is reached. This method, introduced in early NAND flash designs, tightens the distribution of programmed cell threshold voltages, enhancing reliability in multi-level cell applications.
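The ISPP loop described above can be sketched with a deliberately simple cell model. The assumption here, made only for illustration, is that after each pulse the threshold voltage settles at the pulse amplitude minus a per-cell offset, so fast and slow cells differ only in that offset. All names and voltage values are hypothetical, chosen to fall within the ranges quoted in the text.

```python
def ispp_program(cell_offset_v, target_vth=2.0, start_vpgm=15.0,
                 step_v=0.3, max_pulses=30, erased_vth=-2.0):
    """Program one cell with incremental step pulses and verify reads.

    Toy cell model (an assumption): after a pulse of amplitude vpgm the
    threshold settles at vpgm - cell_offset_v, and never decreases.
    Returns (final_vth, pulses_used).
    """
    vth = erased_vth
    for pulse in range(max_pulses):
        vpgm = start_vpgm + pulse * step_v      # amplitude rises each pulse
        vth = max(vth, vpgm - cell_offset_v)    # apply the program pulse
        if vth >= target_vth:                   # verify read
            return vth, pulse + 1
    raise RuntimeError("cell failed to reach target within max_pulses")


# Fast, typical, and slow cells (hypothetical offsets) all land just above target.
for offset in (14.0, 15.5, 17.0):
    vth, pulses = ispp_program(offset)
    print(f"offset {offset:.1f} V: Vth = {vth:.2f} V after {pulses} pulses")
```

In this model every cell stops within one step of the target regardless of how quickly it programs, which is the distribution-tightening effect the text attributes to ISPP. A smaller step narrows the spread at the cost of more pulses.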

Architectural variations

Flash memory architectures vary primarily between NOR and NAND types, each optimized for different access patterns and density requirements. In NOR flash, memory cells are arranged in parallel rows, with one end of each cell connected to a source line and the other directly to a bit line, mimicking a NOR gate structure. This parallel organization enables random access akin to RAM, where address lines map the entire memory range for short read times. As a result, NOR is particularly suitable for executing code directly from the memory without needing to load it into a separate RAM.[42] NAND flash, by contrast, connects multiple memory cells—typically 32 to 128—in series to form strings, which are then grouped into pages and blocks for organized storage. This serial string configuration achieves higher density, with a unit cell area approximately 60% smaller than NOR's due to reduced wiring overhead. Access operations in NAND are page-based, typically involving 2KB pages plus spare areas for error correction, making it efficient for sequential read and write patterns but less ideal for random access.[43] To overcome planar scaling limits, 3D integration has become prominent in NAND architectures, stacking multiple layers of memory cells vertically to boost capacity while maintaining cost efficiency. Modern 3D NAND architectures typically employ charge trap flash (CTF) for the charge storage layer to enable better scaling and minimize cell-to-cell interference compared to traditional floating-gate designs.[44] In these designs, vertical channels run through the stacked layers, surrounded by gate-all-around structures for control. 
Bit Cost Scalable (BiCS) technology, developed by Toshiba, exemplifies this approach with vertically stacked gates—including lower and upper select gates alongside control gates—formed around polycrystalline silicon channels in a gate-first process.[45] Specifically, in vertical NAND, fabrication involves etching channel holes through the entire stack of layers using plasma techniques, then filling these holes with polysilicon to create the conductive channel path essential for charge transport.[46] This vertical orientation allows for hundreds of layers in modern implementations, significantly enhancing bit density over traditional 2D layouts.[45]

Flash memory types

NOR flash

NOR flash memory employs a parallel array structure where memory cells are connected such that each cell's drain is tied to a shared bit line, and sources are connected to a common source line, enabling individual access to bytes or words for random read and write operations.[47] This configuration, often referred to as a NOR-type array, contrasts with series-connected architectures by allowing direct addressing without the need for block-level operations, which supports efficient code execution directly from the memory.[48] Programming in NOR flash is achieved through channel hot electron (CHE) injection, where high voltages on the control gate and drain accelerate electrons from the channel into the floating gate, raising the threshold voltage to store a logic '0'.[49] Erasing occurs via Fowler-Nordheim (FN) tunneling, in which electrons are removed from the floating gate to the substrate under a high negative bias on the control gate, lowering the threshold voltage for a logic '1' state; this process typically affects sectors or blocks simultaneously.[49] These mechanisms ensure reliable non-volatile storage but require careful voltage management to avoid over-programming. NOR flash typically offers an endurance of up to 100,000 program/erase (P/E) cycles per cell, providing robust longevity for applications demanding frequent updates, with densities reaching 1-2 Gb in commercial devices.[5] Its key advantage lies in execute-in-place (XIP) capability, facilitated by fast random-access reads and the ability to perform byte/word writes, allowing microcontrollers to run code directly from the flash without loading into RAM, thus reducing system costs and boot times in embedded environments.[50]

NAND flash

NAND flash is the predominant form of flash memory in modern storage, valued for its high density, low cost per bit, and scalability. It serves as the primary storage medium in a range of consumer and enterprise products, including solid-state drives (SSDs, including NVMe-based models), USB pendrives (flash drives), microSD cards, and internal storage in smartphones and tablets.[51][52] NAND flash memory utilizes a distinctive string-based architecture to achieve high storage density. In this design, a NAND string comprises 32 to 128 memory cells connected in series, forming a compact vertical or horizontal chain that minimizes interconnects and maximizes efficiency.[53] At each end of the string, select transistors—typically a string select transistor (SST) and a ground select transistor (GST)—are integrated to isolate the string during operations and connect it to the bit line and source line, respectively.[54] This serial arrangement, as detailed in core architectural variations, enables efficient sharing of control lines across multiple strings, contributing to the overall scalability of NAND arrays.[55] Programming in NAND flash occurs at the page level, where data is written simultaneously across all cells in a row, with typical page sizes ranging from 4 to 16 KB including spare areas for metadata.[55] Erasure, however, is a block-level operation that resets an entire group of pages—usually 128 to 512 KB in size—to a uniform erased state, as individual cell erasure is not feasible due to the shared substrate in the string structure.[55] These granularities optimize for sequential access patterns, distinguishing NAND from other flash types by prioritizing bulk operations over fine-grained updates. To boost throughput, contemporary NAND controllers leverage multi-plane operations, partitioning each die into independent planes that can execute concurrent reads, programs, or erases without interference. 
This parallelism, often supporting 2 to 4 planes per die, can multiply effective bandwidth by allowing interleaved commands across planes. Integrated error correction further enhances reliability, with low-density parity-check (LDPC) codes becoming standard for correcting raw bit error rates that increase with shrinking cell sizes and multi-bit storage.[56] LDPC's iterative decoding provides superior performance over earlier BCH codes, enabling sustained operation in high-density environments.[56] NAND flash maintains density leadership through advancements in 3D stacking, where cells are layered vertically in a charge-trap architecture to overcome planar scaling limits.[57] By 2025, this has enabled single-die capacities up to 2 Tb via over 300-layer stacks, supporting quad-level cell (QLC) technology for cost-effective mass storage.[57] Such vertical integration not only amplifies bit density but also improves endurance and speed compared to two-dimensional predecessors.[57]
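The page-program and block-erase granularities described above can be modeled in a few lines. This is a toy sketch: the class name and the tiny page and block sizes are hypothetical, and real devices use pages of several kilobytes, apply error correction, and generally do not allow a page to be programmed twice between erases. It follows the convention stated earlier that erased cells read as 1 and programming can only change bits to 0.

```python
PAGE_SIZE = 16          # bytes per page (tiny, for illustration)
PAGES_PER_BLOCK = 4


class NandBlock:
    """Toy NAND block: program by page, erase by whole block."""

    def __init__(self):
        self.pages = [bytearray(b"\xff" * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]

    def program_page(self, page_index, data):
        """Programming can only clear bits (1 -> 0), so new data is ANDed in."""
        if len(data) != PAGE_SIZE:
            raise ValueError("data must fill exactly one page")
        page = self.pages[page_index]
        for i, byte in enumerate(data):
            page[i] &= byte

    def erase(self):
        """Erase resets every page in the block to all ones."""
        for page in self.pages:
            page[:] = b"\xff" * PAGE_SIZE

    def read_page(self, page_index):
        return bytes(self.pages[page_index])


block = NandBlock()
block.program_page(0, b"\x0f" * PAGE_SIZE)
block.program_page(0, b"\xf0" * PAGE_SIZE)   # overwrite without erase
print(block.read_page(0).hex())              # bits only cleared, never set
block.erase()
print(block.read_page(0).hex())
```

The second write does not store 0xF0. It leaves the page with every bit cleared, which illustrates why updating data in place requires erasing the whole block first, and why controllers redirect writes to fresh pages instead.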

Advanced and emerging variants

Flash memory has evolved beyond basic single-level cell (SLC) configurations to include multi-level cell (MLC) variants that store multiple bits per cell, enabling higher storage density at the cost of reduced endurance and reliability. SLC stores 1 bit per cell and offers high endurance, typically supporting 50,000 to 100,000 write/erase cycles, making it suitable for applications requiring frequent updates.[3] In contrast, MLC (2 bits/cell), TLC (3 bits/cell), QLC (4 bits/cell), and emerging PLC (5 bits/cell) architectures increase density by distinguishing more voltage states, but they exhibit progressively lower endurance (often dropping to 1,000–3,000 cycles for TLC and below 1,000 for QLC) due to increased susceptibility to read/write disturbances and charge retention issues.[58] These trade-offs prioritize capacity for consumer storage while necessitating advanced error correction to maintain reliability.[3]

Advancements in 3D NOR flash address density limitations of planar designs, with Macronix pioneering a 3D NOR architecture that stacks memory layers vertically to achieve higher capacities and faster read speeds. Debuted at electronica 2024, this technology reduces reliance on DRAM by integrating compute-in-memory capabilities, enabling efficient AI inference at the edge through in-situ processing that minimizes data movement.[59][27] The 3D structure supports up to 32 layers initially, improving performance for embedded AI tasks while maintaining NOR's random access advantages.[60]

High Bandwidth Flash (HBF), a NAND-based variant developed by SanDisk, targets AI workloads by delivering DRAM-like speeds in a denser, non-volatile package to overcome memory bandwidth bottlenecks. Announced in 2025, HBF leverages advanced wafer bonding and BiCS NAND stacking to provide 8 to 16 times the capacity of High Bandwidth Memory (HBM) while matching its read bandwidth, enabling larger AI models to reside directly on GPUs.[32] In collaboration with SK Hynix for standardization, initial samples are slated for late 2026, with prototypes demonstrated at Flash Memory Summit 2025 focusing on AI inference acceleration.[61][62]

Flash evolutions toward persistent memory interfaces draw inspiration from technologies like Optane, with 3D stackable architectures enabling storage-class memory (SCM) roles that bridge DRAM speed and NAND capacity. Macronix's 3D AND-type flash, for instance, supports fast-read SCM operations in high-density configurations, facilitating byte-addressable persistence for data-intensive computing without full DRAM replacement.[63] While hybrids with MRAM or FeRAM explore enhanced endurance, flash-centric variants emphasize scalable, cost-effective persistence for AI and edge applications.[27]
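The density-versus-endurance trade-off among SLC, MLC, TLC, QLC, and PLC follows from simple arithmetic: storing n bits requires 2^n distinguishable threshold-voltage states within the same voltage window. The following Python sketch illustrates this. The 6.0 V window is a hypothetical round number chosen for illustration, not a value from any datasheet, and real devices do not space their states perfectly evenly.

```python
# Illustrative only: how bits per cell shrink the margin between voltage states.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4, "PLC": 5}
WINDOW_VOLTS = 6.0  # hypothetical usable threshold-voltage window (assumption)


def voltage_states(bits_per_cell: int) -> int:
    """Number of distinct threshold-voltage states needed to store n bits."""
    return 2 ** bits_per_cell


def state_spacing(bits_per_cell: int, window: float = WINDOW_VOLTS) -> float:
    """Average gap between adjacent states if they share the window evenly."""
    return window / (voltage_states(bits_per_cell) - 1)


if __name__ == "__main__":
    for name, bits in CELL_TYPES.items():
        spacing_mv = state_spacing(bits) * 1000
        print(f"{name}: {voltage_states(bits):2d} states, "
              f"about {spacing_mv:.0f} mV between adjacent states")
```

Each added bit roughly halves the spacing between states, which is one way to see why small charge shifts from wear or leakage cause errors sooner in QLC and PLC than in SLC.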

Physical and performance characteristics

Capacity and scalability

Flash memory's capacity has advanced significantly through innovations in cell density and vertical stacking, enabling terabyte-scale storage in compact forms. Modern NAND flash commonly employs multi-level cell (MLC) technologies, storing multiple bits per cell to boost density without proportionally increasing physical size. For instance, triple-level cell (TLC) configurations store three bits per cell, while quad-level cell (QLC) achieves four bits per cell, with QLC accounting for over 20% of the PC market in 2025.[64][65] Experimental demonstrations have even reached seven bits per cell in 3D flash prototypes, hinting at further density gains.[66]

Vertical scaling via 3D architectures further amplifies capacity by stacking memory cells in layers, with leading manufacturers producing over 200 layers by 2025. Samsung's eighth-generation V-NAND, for example, utilizes 236 layers.[67] Similarly, SK Hynix's 238-layer TLC process supports high-volume production, while Micron has mass-produced 232-layer NAND.[67][68] By late 2025, advancements like Samsung's tenth-generation V-NAND exceeding 400 layers have entered mass production, enabling even higher densities.[69] This vertical stacking approach, unlike the extreme horizontal scaling required for logic chips, relies on deep ultraviolet (DUV) lithography with multi-patterning and on etching to form critical features such as channel holes, allowing high layer counts without EUV for core array fabrication. Innovations such as YMTC's Xtacking architecture, which fabricates the memory array and peripheral circuitry separately before bonding, reduce process complexity and support stacks exceeding 200 layers.[70][71][31] These multi-hundred-layer stacks enable SSDs with capacities exceeding 8 TB in standard form factors, driven by the shift from planar to vertical channel structures.

However, scaling to higher densities introduces physical challenges that limit further improvements. Cell-to-cell interference, where programming one cell affects neighboring ones, persists as a key issue, though 3D NAND reduces it by about 40% compared to planar designs due to greater physical separation.[72] In tall 3D stacks exceeding 200 layers, string current reduction becomes prominent, as the elongated channel paths increase resistance and diminish drive current, complicating read and write operations.[73][74] Material engineering, such as optimized dielectrics and channel materials, is employed to mitigate these effects and sustain reliability.[75]

Looking ahead, projections indicate continued capacity expansion, with petabyte-scale SSDs entering production by 2030 through layer counts surpassing 1,000.[76] Enterprise SSD market shipments are forecasted to reach 1,078 exabytes annually by 2030, fueled by AI and data center demands.[77] Emerging use of extreme ultraviolet (EUV) lithography supports sub-10 nm nodes for peripheral circuitry and finer z-pitch scaling below 50 nm in advanced 3D NAND generations.[78][79] These advancements have driven down costs, with 3D stacking contributing to a long-term decline in cost per bit relative to prior generations.[80] Overall, NAND flash density has increased over a million-fold since its inception, primarily through bit-per-cell multiplication and layer stacking.[81]

Speed and endurance

Flash memory's performance is characterized by its read and write speeds, which vary significantly between NOR and NAND architectures, as well as endurance limits defined by program/erase (P/E) cycles. NOR flash excels in random read operations, achieving transfer rates up to 400 MB/s through direct memory access and fast sensing mechanisms, making it suitable for code execution in embedded systems. In contrast, NAND flash prioritizes sequential throughput, with modern NVMe-based SSDs delivering over 10 GB/s in sequential reads, as demonstrated by enterprise drives like the Micron 7600 series reaching 12 GB/s. These speeds are enabled by parallel data paths and high-bandwidth interfaces, though random reads in NAND are typically slower due to its block-oriented structure. NAND flash is a non-volatile memory that retains data without power, but it exhibits significantly higher access latency and slower random read and write performance compared to volatile DRAM, which is optimized for low-latency random access in main memory applications. As a result, NAND flash is ideal for mass storage and portable devices, where high density and non-volatility outweigh the need for DRAM-like speeds.[82]

Write endurance in flash memory is constrained by the number of P/E cycles a cell can withstand before degradation, with single-level cell (SLC) NAND offering up to 100,000 cycles for high-reliability applications.[83] Multi-level variants trade endurance for density: triple-level cell (TLC) sustains around 3,000 cycles, while quad-level cell (QLC) drops below 1,000 cycles, limiting its use in write-intensive scenarios.[84] To mitigate uneven wear, wear-leveling algorithms distribute writes across cells, extending overall device lifespan by balancing usage.[85]

Read speeds in flash are fundamentally limited by sensing amplifiers, which detect small voltage differences in cells and typically operate in the range of 50-100 ns per access, bottlenecking parallel operations in dense arrays.[86] Performance optimizations, such as SLC caching in TLC NAND drives, temporarily map writes to pseudo-SLC regions for faster initial throughput (up to several GB/s) before folding data to native TLC, as seen in Micron's Adaptive Write Technology.[87] In enterprise benchmarks by 2025, SSDs like the Phison Pascari X200P achieve over 3 million random read IOPS, highlighting optimizations in controller design and NAND stacking for high-concurrency workloads.[88]
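The effect of wear leveling on endurance can be illustrated with a small simulation. The Python sketch below is a deliberately idealized model, not any vendor's algorithm: the function name `simulate`, the 90/10 hot-spot workload, and the "always pick the least-worn block" policy are all assumptions made for illustration.

```python
import random
from typing import List


def simulate(num_blocks: int, writes: int, wear_level: bool, seed: int = 0) -> List[int]:
    """Return per-block erase counts after a skewed write workload."""
    rng = random.Random(seed)
    erase_counts = [0] * num_blocks
    hot = max(1, num_blocks // 10)  # the "hot" 10% of logical blocks
    for _ in range(writes):
        # Assumed workload: 90% of writes hit the hot logical blocks.
        if rng.random() < 0.9:
            logical = rng.randrange(hot)
        else:
            logical = rng.randrange(num_blocks)
        if wear_level:
            # Idealized wear leveling: remap the write to the least-worn block.
            physical = min(range(num_blocks), key=erase_counts.__getitem__)
        else:
            # No remapping: the logical block is also the physical block.
            physical = logical
        erase_counts[physical] += 1
    return erase_counts


if __name__ == "__main__":
    naive = simulate(100, 10_000, wear_level=False)
    leveled = simulate(100, 10_000, wear_level=True)
    print("max erase count without wear leveling:", max(naive))
    print("max erase count with wear leveling:   ", max(leveled))
```

Because a device fails once its most-worn blocks exceed their rated P/E cycles, lowering the maximum erase count, rather than the average, is what extends lifespan in this model.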

Limitations and reliability

One fundamental limitation of flash memory is the block erasure requirement, which necessitates erasing an entire block of cells before reprogramming any portion of it, as individual bits or pages cannot be directly overwritten. This constraint arises from the physics of charge storage in floating-gate or charge-trap structures, where erasing involves applying a high voltage to remove electrons collectively from the block. As a result, operations like garbage collection in flash-based storage systems lead to write amplification, where significantly more data is written to the medium than the user intends, increasing overhead and wear.[89][90]

Data retention in flash memory is another key constraint. Manufacturers commonly specify retention periods of around 10 years at room temperature for commercial devices under low-wear conditions, but these are not worst-case guarantees. For unpowered storage after reaching rated endurance, JEDEC standards (JESD218) require a minimum of 1 year at 30°C for consumer-grade devices and 3 months at 40°C for enterprise-grade devices. Higher-quality NAND (certain TLC variants, up to about 3 years) or enterprise-grade parts (potentially up to 10 years under favorable conditions) may retain data longer, though actual duration varies with usage, temperature, and wear. Retention is limited because NAND flash gradually leaks charge through the tunnel oxide when unpowered. The primary mechanism involves thermal emission of electrons from the storage layer, exacerbated by stress-induced defects that create leakage paths, leading to threshold voltage shifts and potential data errors over time. High temperatures accelerate this process exponentially, following Arrhenius-like behavior observed in accelerated bake tests across multiple technology nodes.[91][92][93][94]

Memory wear manifests primarily through progressive degradation of the tunnel oxide layer during repeated program/erase cycles, culminating in irreversible breakdown that traps excessive charge or creates conductive paths, thereby limiting the device's lifespan. This oxide wear is driven by phenomena such as anode hole injection and local field enhancements at the silicon-oxide interface, which accumulate defects and reduce the insulating properties over cycles typically ranging from thousands to hundreds of thousands, depending on the architecture. Additionally, read disturb effects arise from repeated read operations on a cell or page, where the pass voltage applied to unselected cells in the array causes electron injection or hole trapping, gradually shifting their threshold voltages toward erroneous states.[90][95]

In dense memory arrays, program and erase disturbs further compromise reliability, as high voltages applied to target cells inadvertently affect neighboring cells through coupling or leakage currents, leading to unintended threshold voltage alterations. For instance, during programming, adjacent cells may experience charge gain via substrate injection, while erase operations can induce soft breakdowns in nearby oxides. Exposure to X-ray radiation introduces additional risks by generating trapped charges in the gate dielectrics or oxide layers, resulting in charge loss or gain that degrades read margins, particularly evident after high-dose exposures or during subsequent retention periods. These limitations collectively underscore the need for careful management of operational stresses to maintain flash memory integrity.[96][97][98]

For long-term archival storage, SSDs are generally more reliable than USB flash drives. Both use NAND flash memory, which leaks charge when unpowered, but SSDs benefit from superior error correction, advanced wear-leveling, higher-quality controllers, and better overall build quality. USB flash drives often employ cheaper NAND and simpler controllers, making them more prone to failure and data corruption over extended periods. However, neither is ideal for true long-term archival spanning decades; periodic powering on to refresh data or alternatives such as hard disk drives (HDDs) are recommended.[91]
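The Arrhenius-like temperature dependence of data retention mentioned in this section can be made concrete with the standard acceleration-factor formula, AF = exp[(Ea/k)(1/T_use − 1/T_stress)], with temperatures in kelvin. The Python sketch below is a minimal illustration. The default activation energy of 1.1 eV is an assumption chosen for the example; the appropriate value depends on the failure mechanism and technology node.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_use_c: float, t_stress_c: float,
                        activation_energy_ev: float = 1.1) -> float:
    """Arrhenius acceleration factor between two temperatures given in Celsius.

    A result of X means charge loss proceeds about X times faster at the
    stress temperature than at the use temperature. The 1.1 eV default is an
    illustrative assumption, not a universal constant for flash.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1 / t_use_k - 1 / t_stress_k)
    return math.exp(exponent)


if __name__ == "__main__":
    for stress in (40, 55, 85):
        af = acceleration_factor(30, stress)
        print(f"30 C -> {stress} C: retention loss about {af:.1f}x faster")
```

Under the assumed 1.1 eV, the factor between 30°C and 40°C comes out a little under 4, which is in the same range as the ratio between the 1-year and 3-month retention figures quoted above; with a different activation energy the factor would change.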

Applications

Embedded and firmware uses

Flash memory plays a crucial role in embedded systems and firmware applications, where non-volatility, low power consumption, and compact form factors are essential for boot processes and code execution in resource-constrained devices. Serial NOR flash, typically interfaced via SPI, is extensively used for storing BIOS and UEFI firmware in personal computers, safeguarding critical settings like UEFI variables and preventing rollback attacks through features such as replay-protected memory block (RPMC). These devices offer densities ranging from 512 Kbit to 512 Mbit in standard configurations, with stacked variants like SpiStack enabling up to 512 MB or more by combining multiple dies for code storage needs.[99][100][101] This architecture supports execute-in-place capabilities, allowing direct code execution from the flash without RAM transfer.

In smartphones and portable devices, embedded standards such as eMMC and UFS provide integrated NAND flash solutions for operating system boot and application storage. eMMC, or embedded MultiMediaCard, functions as a managed NAND interface that simplifies integration and delivers reliable performance for mobile mass storage.[102] UFS has emerged as the preferred successor, with UFS 4.0 offering sequential read speeds of up to 4.2 GB/s and write speeds up to 2.8 GB/s, facilitating rapid data access in high-end smartphones as of 2025.[103][104]

Field-programmable gate arrays (FPGAs) rely on flash memory to store configuration bitstreams, which are loaded into the FPGA's volatile SRAM at power-on or boot to define the device's logic functionality. SPI NOR flash is commonly selected for this purpose due to its fast random access and compatibility with FPGA configuration modes, ensuring reliable reconfiguration without persistent external programming.[105][106]

Automotive and industrial embedded systems demand flash memory qualified under AEC-Q100 standards to endure extreme conditions, including temperatures from -40°C to +125°C and vibrations. AEC-Q100-compliant NOR flash, such as Infineon's SEMPER series, incorporates error correction and cyclic redundancy checks for enhanced reliability in safety-critical applications like engine control units and industrial controllers.[107] Similarly, managed NAND solutions like Micron's UFS meet these qualifications, supporting robust operation in vehicle infotainment and sensor systems.[108][109]

Storage and computing integration

Solid-state drives (SSDs) based on flash memory have become the primary storage solution in modern computing systems, largely replacing traditional hard disk drives (HDDs) due to their superior speed, reliability, and energy efficiency. The NVMe protocol, optimized for flash storage, has evolved significantly to leverage high-bandwidth interfaces like PCIe 5.0, enabling sequential read and write speeds exceeding 14 GB/s in contemporary implementations.[110] By 2025, consumer-grade SSDs such as the Samsung 9100 Pro offer capacities of up to 8 TB, facilitating seamless integration into personal computers, laptops, and workstations, with operating system boot and application loading that are orders of magnitude faster than on HDDs.[111] Hybrid storage systems combine SSDs with remaining HDD tiers for cost-effective tiering, where flash handles frequent access patterns while HDDs manage bulk data, optimizing overall system performance in desktops and servers.[112]

Specialized flash file systems address the unique constraints of NAND flash, such as limited write cycles and block-level operations, to ensure efficient storage integration. The Flash-Friendly File System (F2FS), developed by Samsung, employs multi-head logging and hot/cold data separation to implement wear-leveling, distributing writes evenly across flash blocks to extend device lifespan, while relying on the underlying flash translation layer (FTL) for bad block detection and management during garbage collection.[113] Similarly, Yet Another Flash File System (YAFFS), designed for embedded NAND environments, achieves wear-leveling through garbage collection that relocates valid data and erases dirty blocks, with bad blocks explicitly marked using spare area bytes during formatting and scanning to prevent data corruption.[114] These file systems enable direct flash access in computing setups, minimizing overhead from traditional file systems ill-suited for flash's erase-before-write mechanism, which is handled at the block level as detailed in reliability discussions.[113]

Flash memory has also been explored as a persistent alternative to volatile RAM, bridging the gap between DRAM speed and non-volatile storage. Intel's Optane, utilizing 3D XPoint technology, served as a byte-addressable persistent memory module that accelerated data-intensive workloads by providing DRAM-like latency with data retention across power cycles, but production was phased out by September 2022 due to market challenges.[115] Post-Optane research continues in hybrid persistent memory architectures, including combinations of embedded DRAM (eDRAM) with flash for caching and persistence, aiming to sustain low-latency access in memory hierarchies without full DRAM replacement.[116]

In data centers, all-flash arrays (AFAs) dominate high-performance storage by eliminating HDD bottlenecks, delivering latencies under 100 μs for random reads compared to 5-10 ms on HDDs, which significantly reduces tail latency in cloud services and virtualized environments.[112] This shift enables scalable computing integration, where AFAs support NVMe-over-Fabrics for disaggregated storage, enhancing throughput for big data analytics and AI training while lowering power consumption relative to hybrid HDD setups.[117]
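The garbage collection performed by flash file systems and translation layers, in which valid data is relocated before a block is erased, is what produces write amplification. The Python sketch below shows the arithmetic for a single reclaimed block under a greedy victim-selection policy. The function names and the greedy policy are illustrative assumptions, not the actual F2FS or YAFFS implementation.

```python
from typing import List


def pick_victim(valid_pages: List[int]) -> int:
    """Greedy policy (assumed): choose the block with the fewest valid pages."""
    return min(range(len(valid_pages)), key=valid_pages.__getitem__)


def write_amplification(valid_in_victim: int, pages_per_block: int) -> float:
    """Flash pages written per host page when reclaiming one victim block.

    Erasing the victim frees `pages_per_block` pages, but `valid_in_victim`
    of them are consumed by relocated data, so only the remainder is
    available for new host writes.
    """
    free_for_host = pages_per_block - valid_in_victim
    if free_for_host <= 0:
        raise ValueError("victim block has no invalid pages to reclaim")
    return pages_per_block / free_for_host


if __name__ == "__main__":
    blocks = [60, 16, 48, 32]  # valid pages per block, out of 64 (made-up data)
    victim = pick_victim(blocks)
    waf = write_amplification(blocks[victim], pages_per_block=64)
    print(f"victim block {victim}, write amplification {waf:.2f}")
```

The fuller the victim block is with still-valid data, the more copying each erase requires, which is why hot/cold data separation and spare capacity both help reduce wear.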

Specialized and future roles

Flash memory is increasingly adapted for archival storage applications, particularly through high-retention quad-level cell (QLC) variants designed for cold data that requires long-term preservation with minimal access. These QLC cells, storing four bits per cell in 3D NAND structures, achieve data retention of up to 1 year at 55°C for lightly used cells when combined with advanced error correction codes (ECC) such as low-density parity-check (LDPC) algorithms to mitigate retention-induced bit errors. For instance, predictive models for error bit placement in 3D QLC NAND enable optimized data layout in archival systems, ensuring reliability for backup and long-term digital preservation by compensating for charge leakage over time.[118][119][120]

In AI and edge computing, high-bandwidth flash (HBF) emerges as a specialized variant that facilitates in-memory processing by providing NAND-based memory with bandwidth approaching high-bandwidth memory (HBM), up to 64 GB/s, while offering 8 to 16 times the capacity for storing large AI models directly on-device. HBF architectures, developed by companies like SanDisk, enable mixture-of-experts AI inference at the edge by parallelizing access to multiple 3D NAND arrays, reducing latency for tasks like real-time image recognition in smartphones. Complementing this, compute-in-memory (CIM) implementations using NOR flash minimize data movement between storage and processors by performing matrix-vector multiplications within the memory array. They leverage split-gate NOR cells for low-power analog computations, storing weights in non-volatile form and achieving up to 2.7 times better energy efficiency in deep neural network inference compared to traditional von Neumann architectures.[121][122][123][124][125]

For Internet of Things (IoT) devices and wearables, ultra-low power flash variants prioritize extended battery life through optimized serial NOR architectures with deep power-down currents as low as 7 nA and active currents under 4 mA, enabling always-on functionality in energy-constrained environments like sensors and fitness trackers. These include Macronix's MX25R series, which reduces power consumption by 60% over standard NOR flash via efficient read/write operations at 1.65 V to 3.6 V, supporting firmware storage and data logging in medical wearables. Additionally, SD Express cards, leveraging PCIe and NVMe protocols over the SD interface, provide high-speed portable storage up to 2 TB with read speeds exceeding 985 MB/s, ideal for high-resolution video capture in portable IoT cameras and AR glasses without compromising form factor.[126][127][128][129]

Flash memory is also utilized in space missions, where radiation-hardened variants withstand cosmic rays and extreme environments in satellites and probes, providing reliable non-volatile storage for telemetry data and onboard computing.[130]

Looking ahead, flash memory is poised for integration with quantum-resistant encryption to safeguard data against future quantum computing threats, incorporating post-quantum cryptography (PQC) algorithms like Kyber directly into storage controllers for secure key encapsulation in embedded systems. By 2030, flash-based disaggregated memory pools using Compute Express Link (CXL) interfaces are expected to enable scalable, shared memory architectures in data centers, allowing dynamic allocation of NAND resources across multiple compute nodes to support AI workloads, with projections indicating significant CXL adoption in memory systems. This evolution builds on emerging penta-level cell (PLC) variants, which store five bits per cell for higher density in such pooled systems.[131][132][133][134][135]
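The compute-in-memory approach described in this section, in which matrix-vector multiplication happens inside the memory array, can be sketched numerically. In the idealized Python model below, each cell's stored weight acts as a conductance, inputs are applied as row voltages, and each column's output current is the sum of voltage times conductance. The function names, the 16-level quantization, and the absence of noise or nonlinearity are all simplifying assumptions; this is not a model of any specific split-gate NOR device.

```python
from typing import List


def quantize(weight: float, levels: int = 16, w_max: float = 1.0) -> float:
    """Snap a weight in [0, w_max] to one of `levels` evenly spaced cell states."""
    clipped = min(max(weight, 0.0), w_max)
    step = w_max / (levels - 1)
    return round(clipped / step) * step


def bitline_currents(voltages: List[float],
                     conductances: List[List[float]]) -> List[float]:
    """Column current j is the sum over rows i of voltages[i] * conductances[i][j]."""
    columns = len(conductances[0])
    return [sum(v * row[j] for v, row in zip(voltages, conductances))
            for j in range(columns)]


if __name__ == "__main__":
    weights = [[0.50, 0.10], [0.25, 0.90]]  # made-up 2x2 weight matrix
    stored = [[quantize(w) for w in row] for row in weights]
    inputs = [1.0, 2.0]  # row voltages, arbitrary units
    print("stored weights:", stored)
    print("column outputs:", bitline_currents(inputs, stored))
```

Because the weights never leave the array, only the input vector and the summed outputs cross the memory boundary, which is the source of the energy savings claimed for such designs; the quantization step hints at the precision cost of storing weights as a limited number of cell states.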

Industry overview

Key manufacturers

The NAND flash market is an oligopoly dominated by major players including Samsung, SK Hynix, Micron, Kioxia, and SanDisk, with Samsung and SK Hynix as leaders.[136] Samsung Electronics is the leading manufacturer of NAND flash memory, renowned for pioneering 3D V-NAND technology that stacks memory cells vertically to increase density and capacity. In 2025, Samsung commands approximately 31% of the global NAND flash market share, driven by its advancements in high-layer-count 3D NAND and broad portfolio spanning consumer to enterprise applications.[137]

SK Hynix, another dominant player, has established itself as a pioneer in PLC (penta-level cell) NAND, enabling five bits per cell for enhanced storage efficiency, and maintains a strong presence in enterprise-grade SSDs optimized for data centers and AI workloads. The company bolstered its NAND capabilities through the acquisition of Intel's NAND and SSD business in 2021, with the transaction fully completed by early 2025, allowing SK Hynix to integrate Intel's technology and expand its production footprint. In 2025, SK Hynix holds about 18% of the NAND market, with its SK Group affiliates reaching a 21% revenue share in the second quarter.[137][138][139][140]

Micron Technology, SanDisk, and Kioxia form a critical tier of NAND producers, often collaborating on technology development to advance layer counts and cell technologies like QLC (quad-level cell), which stores four bits per cell for cost-effective high-capacity storage. These firms have jointly pushed QLC adoption in 3D NAND, with Micron achieving first production of 200+ layer QLC in 2024 for client and data center use, while Kioxia and SanDisk introduced 218-layer BiCS FLASH supporting both TLC and QLC configurations. Emerging collaborations, such as those involving China's Yangtze Memory Technologies Corp. (YMTC), highlight international efforts to scale production, with YMTC partnering on advanced bonding techniques for next-generation NAND. In the NAND market, these players collectively hold significant shares, with Micron, SanDisk, and Kioxia each holding around 10-15% in 2025.[141][142][143][144]

Among other notable manufacturers, Macronix International leads in NOR flash memory through its innovative SPI NOR and 3D NOR technologies tailored for embedded applications requiring fast random access. SanDisk, which became independent in early 2025 following a corporate spin-off from Western Digital, drives innovations in high-bandwidth flash (HBF), a NAND-based architecture designed to rival HBM for AI inference with superior capacity and edge computing suitability, including collaborations for standardization with partners like SK Hynix.[145][61][146]

Flash memory, particularly NAND flash, is the dominant type of read-only memory (ROM) produced today by volume, driven by demand in solid-state drives (SSDs), USB drives, smartphones, tablets, memory cards, embedded systems, Internet of Things (IoT) devices, artificial intelligence (AI), and cloud storage. Advantages include low cost per bit, large capacities from hundreds of gigabytes to terabytes, and scalability enabled by 3D NAND technology. Other ROM types, such as mask ROM for fixed-data applications like low-cost microcontrollers and EEPROM for small-scale configurations, are used in niche areas, but NAND flash leads in global production volume.[147][148]

The prices of memory cards, which rely on NAND flash as their core storage component, are primarily determined by the supply and demand of NAND flash chips. NAND flash prices have fluctuated significantly in recent years. In 2023, average contract prices declined sharply by approximately 20-40% throughout the year due to oversupply and weak demand. In 2024, prices rebounded significantly as manufacturers cut production and demand recovered, particularly from smartphones, PCs, and AI-related applications, with contract prices rising by 10-30% in several quarters. In 2025, prices continued to face upward pressure, with increases of 5-15% or more, driven by supply constraints and growing demand from AI data centers, enterprise storage, and consumer electronics. Recent increases in NAND flash prices have been attributed to strong demand from AI data centers, capacity tightening by suppliers including Samsung, SK Hynix, and Micron, who prioritize allocation to high-profit enterprise products over consumer segments, and shortages in wafer supply.[149][150][151]

The global NAND flash market reached approximately $65 billion in 2025, fueled primarily by surging demand from AI applications and data centers, which accounted for a significant portion of enterprise solid-state drive (SSD) deployments. Annual bit shipments for NAND flash exceeded 3,200 exabits in 2024 and are projected to grow by 8-10% in 2025, reflecting robust inventory replenishment in consumer electronics and AI server builds. This expansion underscores the sector's scale, with total memory revenues, including NAND, approaching $200 billion for the year.[152][153][68]

Pricing dynamics in 2025 showed notable volatility for NAND, with contract and spot prices surging significantly year-over-year amid supply constraints from production cuts (such as SK Hynix and Micron reducing output by around 10% in the second half) and heightened AI-driven procurement. Prices more than doubled from mid-2025 levels, with TLC 1-terabit NAND rising from $4.80 in July to $10.70 in November, leading to overall year-over-year increases exceeding 100% by late 2025. In contrast, NOR flash prices remained relatively stable through much of 2025, though late-quarter pressures from cost escalations and supply tightness led to modest upward adjustments of 5-10%.[154][155][156][157][158]

These SSD price surges were largely attributed to enterprise demand for massive, high-speed NAND storage in AI setups, causing sharp rises in NAND wafer prices. As supply was reallocated to prioritize this enterprise demand, consumer SSD prices doubled in some cases. Major suppliers implemented production cuts to sustain elevated pricing amid the ongoing demand-supply imbalance, with NAND flash prices surging by 246% from the first quarter of 2025.[159][160][161]

The rise in flash memory prices during periods of high AI demand stems from the rapid escalation in procurement for AI data centers, which outstrips available NAND production capacity, leading to inventory depletion and significant price surges. Hyperscalers have a strong incentive to negotiate multi-year price-stability deals with NAND flash suppliers, as fluctuations disrupt cost forecasting and capital planning for their massive data centers, thereby affecting supplier margins in the tech industry.[162] This supply-demand imbalance has driven NAND prices higher, with forecasts indicating continued increases into 2026 as AI workloads expand, potentially raising contract prices by 33-38% in the first quarter alone. Such dynamics highlight the sector's vulnerability to shifts in AI-related consumption, where enterprise priorities redirect resources away from consumer markets. AI has emerged as the dominant demand driver, replacing traditional consumer electronics such as phones and PCs as the growth engine of the storage industry; as AI absorbs production capacity, prices rise across the board, forming what has been described as a structural bull market.[163][164][165][166]

In early 2026, amid persistent supply tightness and robust demand particularly from AI applications, major suppliers Samsung and SK Hynix shifted toward shorter-term memory supply contracts, including those with post-settlement pricing clauses, reflecting a return of pricing power to suppliers. This adjustment enabled record-high NAND flash operating margins in the 40-50% range for the first half of 2026.[167][168][169]

Furthermore, the pursuit of advanced stacked NAND technologies, such as High Bandwidth Flash (HBF), introduces additional manufacturing and supply chain challenges. HBF, which involves vertically stacking multiple 3D NAND dies using techniques like wafer bonding, faces complexities in interconnection and scalability that may limit initial production volumes and require reallocation of fabrication resources, potentially exacerbating shortages akin to those in high-demand memory components. As of late 2025, HBF remains at least two years from commercial availability due to these constraints.[33][170]

Key growth drivers include escalating AI workloads necessitating high-performance storage solutions, such as those integrating advanced layer stacking for enhanced density, contributing to a projected 25-30% compound annual growth rate (CAGR) for the enterprise flash segment through 2030. This trajectory is supported by innovations in AI-optimized SSDs, which are expected to capture a larger share of data center investments. However, challenges persist, including US-China trade tensions that have imposed export controls on critical technologies, severely impacting Chinese firm YMTC's access to advanced equipment and market participation. Additionally, sustainability concerns loom large, as semiconductor fabrication facilities consume vast amounts of energy, often equivalent to small cities, prompting industry-wide efforts to reduce emissions and improve eco-friendly manufacturing processes.[171][172][173][174]

References
