NVM Express
| NVM Express | |
|---|---|
| Non-Volatile Memory Host Controller Interface Specification | |
| Abbreviation | NVMe |
| Status | Published |
| Year started | 2011 |
| Latest version | 2.3 (July 31, 2025) |
| Organization | NVM Express, Inc. (since 2014); NVM Express Work Group (before 2014) |
| Website | nvmexpress.org |
NVM Express (NVMe) or Non-Volatile Memory Host Controller Interface Specification (NVMHCIS) is an open, logical-device interface specification for accessing a computer's non-volatile storage media usually attached via the PCI Express bus. The initial NVM stands for non-volatile memory, which is often NAND flash memory that comes in several physical form factors, including solid-state drives (SSDs), PCIe add-in cards, and M.2 cards, the successor to mSATA cards. NVM Express, as a logical-device interface, has been designed to capitalize on the low latency and internal parallelism of solid-state storage devices.[1]
Architecturally, the logic for NVMe is physically stored within and executed by the NVMe controller chip that is physically co-located with the storage media, usually an SSD. Version changes for NVMe, e.g., 1.3 to 1.4, are incorporated within the storage media, and do not affect PCIe-compatible components such as motherboards and CPUs.[2]
By its design, NVM Express allows host hardware and software to fully exploit the parallelism possible in modern SSDs. As a result, NVM Express reduces I/O overhead and brings various performance improvements relative to previous logical-device interfaces, including support for multiple long command queues and reduced latency. Previous interface protocols such as AHCI were developed for use with far slower hard disk drives (HDDs), where a very lengthy delay (relative to CPU operations) exists between a request and data transfer, where data speeds are much slower than RAM speeds, and where disk rotation and seek time give rise to further optimization requirements.
NVM Express devices are chiefly available in the miniature M.2 form factor, while standard-sized PCI Express expansion cards[3] and 2.5-inch form-factor devices that provide a four-lane PCI Express interface through the U.2 connector (formerly known as SFF-8639) are also available.[4][5]
Specifications
Specifications for NVMe released to date include:[6]
- 1.0e (January 2013)
- 1.1b (July 2014), which adds standardized command sets for better compatibility across different NVMe devices, a Management Interface that provides standardized tools for managing NVMe devices, simplifying administration, and Transport Specifications that define how NVMe commands are carried over various physical interfaces, enhancing interoperability.[7]
- 1.2 (November 2014)
- 1.2a (October 2015)
- 1.2b (June 2016)
- 1.2.1 (June 2016), which introduces the following new features over version 1.1b: multi-queue support for multiple I/O queues, enhancing data throughput and performance; Namespace Management, which allows dynamic creation, deletion, and resizing of namespaces, providing greater flexibility; and Endurance Management, which monitors and manages SSD wear levels, optimizing performance and extending drive life.[8]
- 1.3 (May 2017)
- 1.3a (October 2017)
- 1.3b (May 2018)
- 1.3c (May 2018)
- 1.3d (March 2019), which since version 1.2.1 adds Namespace Sharing, allowing multiple hosts to access a single namespace and facilitating shared storage environments; Namespace Reservation, which provides mechanisms for hosts to reserve namespaces, preventing conflicts and ensuring data integrity; and Namespace Priority, which sets priority levels for different namespaces, optimizing performance for critical workloads.[9][10]
- 1.4 (June 2019)
- 1.4a (March 2020)
- 1.4b (September 2020)
- 1.4c (June 2021), which has the following new features compared to 1.3d: I/O Determinism, which ensures consistent latency and performance by isolating workloads; Namespace Write Protect, which prevents data corruption or unauthorized modification; Persistent Event Log, which stores event logs in non-volatile memory to aid diagnostics and troubleshooting; and the Verify command, which checks the integrity of data.[11][12]
- 2.0 (May 2021)[13]
- 2.0a (July 2021)
- 2.0b (January 2022)
- 2.0c (October 2022)
- 2.0d (January 2024),[14] which, compared to 1.4c, introduces Zoned Namespaces (ZNS), organizing data into zones for efficient write operations, reducing write amplification and improving SSD longevity; Key Value (KV), for efficient storage and retrieval of key-value pairs directly on the NVMe device, bypassing traditional file systems; and Endurance Group Management, which manages groups of SSDs based on their endurance, optimizing usage and extending lifespan.[15][14][16]
- 2.0e (July 2024)
- 2.1 (August 2024),[17] which introduces Live Migration for maintaining service availability during migration, Key Per I/O for applying encryption keys at a per-operation level, NVMe-MI High Availability Out-of-Band Management for managing NVMe devices outside of regular data paths, and NVMe Network Boot / UEFI for booting NVMe devices over a network.[18]
- 2.2 (March 2025)
- 2.3 (August 2025)
Background
Historically, most SSDs used buses such as SATA,[19] SAS,[20][21] or Fibre Channel for interfacing with the rest of a computer system. Since SSDs became available in mass markets, SATA has been the most typical way of connecting SSDs in personal computers; however, SATA was designed primarily for interfacing with mechanical hard disk drives (HDDs), and it became increasingly inadequate for SSDs, whose speeds improved over time.[22] For example, within about five years of mainstream mass-market adoption (2005–2010), many SSDs were already held back by the comparatively slow data rates designed around hard drives; unlike hard disk drives, some SSDs are limited by the maximum throughput of SATA.
High-end SSDs had been made using the PCI Express bus before NVMe, but they used non-standard interfaces, for example a SAS-to-PCIe bridge[23] or emulation of a hardware RAID controller.[24] By standardizing the SSD interface, operating systems need only one common device driver to work with all SSDs adhering to the specification, and each SSD manufacturer does not have to design its own interface drivers. This is similar to how USB mass storage devices follow the USB mass-storage device class specification and work with all computers, with no per-device drivers needed.[25]
NVM Express devices are also used as the building blocks of burst buffer storage in many leading supercomputers, such as Fugaku, Summit and Sierra.[26][27]
History
The first details of a new standard for accessing non-volatile memory emerged at the Intel Developer Forum 2007, when NVMHCI was shown as the host-side protocol of a proposed architectural design that had the Open NAND Flash Interface Working Group (ONFI) on the memory (flash) chip side.[28] An NVMHCI working group led by Intel was formed that year. The NVMHCI 1.0 specification was completed in April 2008 and released on Intel's web site.[29][30][31]
Technical work on NVMe began in the second half of 2009.[32] The NVMe specifications were developed by the NVM Express Workgroup, which consists of more than 90 companies; Amber Huffman of Intel was the working group's chair. Version 1.0 of the specification was released on 1 March 2011,[33] while version 1.1 of the specification was released on 11 October 2012.[34] Major features added in version 1.1 are multi-path I/O (with namespace sharing) and arbitrary-length scatter-gather I/O. It is expected that future revisions will significantly enhance namespace management.[32] Because of its feature focus, NVMe 1.1 was initially called "Enterprise NVMHCI".[35] An update for the base NVMe specification, called version 1.0e, was released in January 2013.[36] In June 2011, a Promoter Group led by seven companies was formed.
The first commercially available NVMe chipsets were released by Integrated Device Technology (89HF16P04AG3 and 89HF32P08AG3) in August 2012.[37][38] The first NVMe drive, Samsung's XS1715 enterprise drive, was announced in July 2013; according to Samsung, this drive supported 3 GB/s read speeds, six times faster than their previous enterprise offerings.[39] The LSI SandForce SF3700 controller family, released in November 2013, also supports NVMe.[40][41] A Kingston HyperX "prosumer" product using this controller was showcased at the Consumer Electronics Show 2014 and promised similar performance.[42][43] In June 2014, Intel announced their first NVM Express products, the Intel SSD data center family that interfaces with the host through PCI Express bus, which includes the DC P3700 series, the DC P3600 series, and the DC P3500 series.[44] As of November 2014, NVMe drives are commercially available.
In March 2014, the group incorporated to become NVM Express, Inc., which as of November 2014 consists of more than 65 companies from across the industry. NVM Express specifications are owned and maintained by NVM Express, Inc., which also promotes industry awareness of NVM Express as an industry-wide standard. NVM Express, Inc. is directed by a thirteen-member board of directors selected from the Promoter Group, which includes Cisco, Dell, EMC, HGST, Intel, Micron, Microsoft, NetApp, Oracle, PMC, Samsung, SanDisk and Seagate.[45]
In September 2016, the CompactFlash Association announced that it would be releasing a new memory card specification, CFexpress, which uses NVMe.
The NVMe Host Memory Buffer (HMB) feature was added in version 1.2 of the NVMe specification.[46] HMB allows an SSD to use part of the host's DRAM, which can improve I/O performance for DRAM-less SSDs.[47] For example, the SSD controller can use the HMB to cache the FTL table, which can improve I/O performance.[48] NVMe 2.0 added the optional Zoned Namespaces (ZNS) and Key-Value (KV) features, and support for rotating media such as hard disk drives. ZNS and KV allow data to be mapped directly to its physical location in flash memory so that it can be accessed directly on the SSD.[49] ZNS and KV can also decrease write amplification of flash media.
Form factors
There are many form factors of NVMe solid-state drives, such as AIC, U.2, U.3 and M.2.
AIC (add-in card)
Almost all early NVMe solid-state drives were HHHL (half height, half length) or FHHL (full height, half length) PCI Express cards with a PCIe 2.0 or 3.0 interface. An HHHL NVMe solid-state drive card is easy to insert into a PCIe slot of a server.
SATA Express, U.2 and U.3 (SFF-8639)
SATA Express allows the use of two PCI Express 2.0 or 3.0 lanes and two SATA 3.0 (6 Gbit/s) ports through the same host-side SATA Express connector (but not both at the same time). SATA Express supports NVMe as the logical device interface for attached PCI Express storage devices. It is electrically compatible with MultiLink SAS, so a backplane can support both at the same time.
U.2, formerly known as SFF-8639, uses the same physical port as SATA Express but allows up to four PCI Express lanes. Available servers can combine up to 48 U.2 NVMe solid-state drives.[50]
U.3 (SFF-TA-1001) builds on the U.2 specification and uses the same SFF-8639 connector. Unlike U.2, a single "tri-mode" (PCIe/SATA/SAS) backplane receptacle can handle all three types of connection, with the controller automatically detecting which is in use, whereas U.2 requires separate controllers for SATA/SAS and NVMe. U.3 devices are required to be backwards-compatible with U.2 hosts, but U.2 drives are not compatible with U.3 hosts.[51][52]
M.2
M.2, formerly known as the Next Generation Form Factor (NGFF), is a compact form factor widely used for NVMe solid-state drives. Interfaces provided through the M.2 connector for NVMe drives are PCI Express 3.0 or higher (up to four lanes).
EDSFF
CFexpress
CFexpress is an NVMe-based removable memory card. Three form factors are specified, each with a different physical size and one, two or four PCIe lanes.
NVMe-oF
NVM Express over Fabrics (NVMe-oF) is the concept of using a transport protocol over a network to connect remote NVMe devices, in contrast to regular NVMe, where physical NVMe devices are connected to the PCIe bus either directly or through a PCIe switch. In August 2017, a standard for using NVMe over Fibre Channel (FC) was submitted by the standards organization International Committee for Information Technology Standards (INCITS); this combination is often referred to as FC-NVMe or sometimes NVMe/FC.[53]
As of May 2021, supported NVMe transport protocols are:
- FC, FC-NVMe[53][54]
- TCP, NVMe/TCP[55]
- Ethernet, RoCE v1/v2 (RDMA over converged Ethernet)[56]
- InfiniBand, NVMe over InfiniBand or NVMe/IB[57]
The standard for NVMe over Fabrics was published by NVM Express, Inc. in 2016.[58][59]
The following software implements the NVMe-oF protocol:
- Linux NVMe-oF initiator and target.[60] RoCE transport was supported initially, and with Linux kernel 5.x, native support for TCP was added.[61]
- Storage Performance Development Kit (SPDK) NVMe-oF initiator and target drivers.[62] Both RoCE and TCP transports are supported.[63][64]
- StarWind NVMe-oF initiator[65] and target for Linux and Microsoft Windows, supporting RoCE, TCP and Fibre Channel transports.[66]
- Lightbits Labs NVMe over TCP target[67] for various Linux distributions[68] & public clouds.
- Bloombase StoreSafe Intelligent Storage Firewall supports NVMe over RoCE, TCP, and Fibre Channel for transparent storage security protection.
- NetApp ONTAP supports iSCSI and NVMe over TCP[69] targets.
- Simplyblock storage platform with NVMe over Fabrics support.[70]
Comparison with AHCI
The Advanced Host Controller Interface (AHCI) has the benefit of wide software compatibility, but has the downside of not delivering optimal performance when used with SSDs connected via the PCI Express bus. As a logical-device interface, AHCI was developed when the purpose of a host bus adapter (HBA) in a system was to connect the CPU/memory subsystem with a much slower storage subsystem based on rotating magnetic media. As a result, AHCI introduces certain inefficiencies when used with SSD devices, which behave much more like RAM than like spinning media.[71]
The NVMe device interface has been designed from the ground up, capitalizing on the lower latency and parallelism of PCI Express SSDs, and complementing the parallelism of contemporary CPUs, platforms and applications. At a high level, the basic advantages of NVMe over AHCI relate to its ability to exploit parallelism in host hardware and software, manifested by the differences in command queue depths, efficiency of interrupt processing, the number of uncacheable register accesses, etc., resulting in various performance improvements.[71][72]: 17–18
The table below summarizes high-level differences between the NVMe and AHCI logical-device interfaces.
| | AHCI | NVMe |
|---|---|---|
| Maximum queue depth | One command queue; up to 32 commands per queue | Up to 65535 queues;[73] up to 65536 commands per queue |
| Uncacheable register accesses (2000 cycles each) | Up to six per non-queued command; up to nine per queued command | Up to two per command |
| Interrupts | A single interrupt | Up to 2048 MSI-X interrupts |
| Parallelism and multiple threads | Requires synchronization lock to issue a command | No locking |
| Efficiency for 4 KB commands | Command parameters require two serialized host DRAM fetches | Command parameters fetched in one 64-byte fetch |
| Data transmission | Usually half-duplex | Full-duplex |
| Host Memory Buffer (HMB) | No | Yes |
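To put the queue-depth row in perspective, a quick back-of-the-envelope calculation of the maximum number of outstanding commands each interface can track, using the figures from the table above:

```latex
\begin{aligned}
\text{AHCI:} \quad & 1 \times 32 = 32 \text{ outstanding commands} \\
\text{NVMe:} \quad & 65{,}535 \times 65{,}536 = 4{,}294{,}901{,}760 \approx 4.3 \times 10^{9} \text{ outstanding commands}
\end{aligned}
```

In practice, devices and drivers implement far fewer queues, but the protocol ceiling illustrates why NVMe scales so much better across many CPU cores.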
Operating system support
- ChromeOS
- On February 24, 2015, support for booting from NVM Express devices was added to ChromeOS.[75][76]
- DragonFly BSD
- The first release of DragonFly BSD with NVMe support is version 4.6.[77]
- FreeBSD
- Intel sponsored an NVM Express driver for FreeBSD's head and stable/9 branches.[78][79] The nvd(4) and nvme(4) drivers have been included in the GENERIC kernel configuration by default since FreeBSD version 10.2 in 2015.[80]
- Genode
- Support for consumer-grade NVMe was added to the Genode framework as part of the 18.05[81] release.
- iOS
- With the release of the iPhone 6S and 6S Plus, Apple introduced the first mobile deployment of NVMe over PCIe in smartphones.[85] Apple followed with the first-generation iPad Pro and the first-generation iPhone SE, which also use NVMe over PCIe.[86]
- Linux
- Intel published an NVM Express driver for Linux on 3 March 2011,[87][88][89] which was merged into the Linux kernel mainline on 18 January 2012 and released as part of version 3.3 of the Linux kernel on 19 March 2012.[90] The Linux kernel has supported the NVMe Host Memory Buffer[91] since version 4.13.1,[92] with a default maximum size of 128 MB.[93] Support for NVMe Zoned Namespaces was added in kernel version 5.9.
- macOS
- Apple introduced software support for NVM Express in Yosemite 10.10.3. The NVMe hardware interface was introduced in the 2016 MacBook and MacBook Pro.[94]
- NetBSD
- NetBSD added support for NVMe in NetBSD 8.0.[95] The implementation is derived from OpenBSD 6.0.
- OpenBSD
- Development work to support NVMe in OpenBSD was started in April 2014 by a senior developer formerly responsible for USB 2.0 and AHCI support.[96] Support for NVMe was enabled in the OpenBSD 6.0 release.[97]
- OS/2
- Arca Noae provides an NVMe driver for ArcaOS, as of April 2021. The driver requires advanced interrupts as provided by the ACPI PSD running in advanced interrupt mode (mode 2), and thus also requires the SMP kernel.[98]
- VMware
- Intel has provided an NVMe driver for VMware,[100] which is included in vSphere 6.0 and later builds, supporting various NVMe devices.[101] As of vSphere 6 update 1, VMware's VSAN software-defined storage subsystem also supports NVMe devices.[102]
- Windows
- Microsoft added native support for NVMe to Windows 8.1 and Windows Server 2012 R2.[72][103] Native drivers for Windows 7 and Windows Server 2008 R2 have been added in updates.[104] Many vendors have released their own Windows drivers for their devices as well. There are also manually customized installer files available to install a specific vendor's driver to any NVMe card, such as using a Samsung NVMe driver with a non-Samsung NVMe device, which may be needed for additional features, performance, and stability.[105]
- Support for NVMe HMB was added in Windows 10 Anniversary Update (Version 1607) in 2016.[46] In Microsoft Windows from Windows 10 1607 to Windows 11 23H2, the maximum HMB size is 64 MB. Windows 11 24H2 updates the maximum HMB size to 1/64 of system RAM.[106]
- Support for NVMe ZNS and KV was added in Windows 10 version 21H2 and Windows 11 in 2021.[107] The OpenFabrics Alliance maintains an open-source NVMe Windows driver for Windows 7/8/8.1 and Windows Server 2008R2/2012/2012R2, developed from baseline code submitted by several promoter companies in the NVMe workgroup, specifically IDT, Intel, and LSI.[108] The current release is 1.5 from December 2016.[109] The Windows built-in NVMe driver does not support hardware acceleration; hardware acceleration requires vendor drivers.[110]
Software support
[edit]Management tools
nvmecontrol
The nvmecontrol tool is used to control an NVMe disk from the command line on FreeBSD. It was added in FreeBSD 9.2.[113]
nvme-cli
nvme-cli provides NVM Express user-space management tooling for Linux.[114]
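As a hedged illustration of what such tooling does under the hood, the sketch below issues an Identify Controller admin command through the Linux NVMe passthrough ioctl, roughly what `nvme id-ctrl /dev/nvme0` performs. The device path /dev/nvme0 is an assumption, and running it normally requires root privileges.

```c
/* Minimal sketch: issue an NVMe Identify Controller admin command via the
 * Linux NVMe passthrough ioctl. Device path is assumed; error handling is
 * intentionally minimal. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nvme_ioctl.h>

int main(void)
{
    int fd = open("/dev/nvme0", O_RDONLY);    /* NVMe controller character device */
    if (fd < 0) { perror("open /dev/nvme0"); return 1; }

    unsigned char data[4096] = { 0 };         /* Identify data structure is 4 KiB */

    struct nvme_admin_cmd cmd;
    memset(&cmd, 0, sizeof(cmd));
    cmd.opcode   = 0x06;                      /* Identify */
    cmd.addr     = (uint64_t)(uintptr_t)data; /* destination buffer */
    cmd.data_len = sizeof(data);
    cmd.cdw10    = 1;                         /* CNS = 1: Identify Controller data structure */

    if (ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd) < 0) {
        perror("NVME_IOCTL_ADMIN_CMD");
        close(fd);
        return 1;
    }

    /* Bytes 4..23 hold the serial number and bytes 24..63 the model number,
     * both as space-padded ASCII. */
    printf("Model : %.40s\n", (char *)&data[24]);
    printf("Serial: %.20s\n", (char *)&data[4]);
    close(fd);
    return 0;
}
```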
References
[edit]- ^ "NVM Express". NVM Express, Inc. Archived from the original on 2019-12-05. Retrieved 2017-01-24.
NVMe is designed from the ground up to deliver high bandwidth and low latency storage access for current and future NVM technologies.
- ^ Tallis, Billy (June 14, 2019). "NVMe 1.4 Specification Published: Further Optimizing Performance and Reliability". AnandTech. Archived from the original on 2021-01-27.
- ^ Drew Riley (2014-08-13). "Intel SSD DC P3700 800GB and 1.6TB Review: The Future of Storage". Tom's Hardware. Retrieved 2014-11-21.
- ^ "Intel Solid-State Drive DC P3600 Series" (PDF). Intel. 2015. pp. 18, 20–22. Archived from the original (PDF) on Oct 28, 2015. Retrieved 2015-04-11.
- ^ Paul Alcorn (2015-06-05). "SFFWG Renames PCIe SSD SFF-8639 Connector To U.2". Tom's Hardware. Retrieved 2015-06-09.
- ^ NVMe Specifications
- ^ "Specifications - NVM Express". 10 January 2020. Archived from the original on 25 July 2024. Retrieved 10 July 2024.
- ^ "NVM Express Releases 1.2 Specification - NVM Express". 12 November 2014. Archived from the original on 25 July 2024. Retrieved 23 November 2024.
- ^ "WEBCAST: NVME 1.3 – LEARN WHat's NEW - NVM Express". 30 June 2017. Archived from the original on 13 April 2024. Retrieved 23 November 2024.
- ^ "Changes in NVMe Revision 1.3 - NVM Express". May 2017.
- ^ "Answering Your Questions: NVMe™ 1.4 Features and Compliance: Everything You Need to Know - NVM Express". 16 October 2019. Archived from the original on 14 July 2024. Retrieved 23 November 2024.
- ^ "NVMe 1.4 Features and Compliance: Everything You Need to Know - NVM Express". 2 October 2019.
- ^ "NVM Express Announces the Rearchitected NVMe 2.0 Library of Specifications" (Press release). Beaverton, Oregon, USA: NVM Express, Inc. June 3, 2021. Archived from the original on 2023-01-18. Retrieved 2024-03-31.
- ^ a b "NVM Express Base Specification 2.0d" (PDF). nvmexpress.org. NVM Express, Inc. January 11, 2024. Archived (PDF) from the original on 2024-03-26. Retrieved 2024-03-26.
- ^ "Everything You Need to Know About the NVMe 2.0 Specifications and New Technical Proposals - NVM Express". 3 June 2021.
- ^ "Everything You Need to Know About the NVMe® 2.0 Specifications and New Technical Proposals".
- ^ "NVM Express® Base Specification, Revision 2.1" (PDF). nvmexpress.org. NVM Express, Inc. August 5, 2024. Retrieved 2024-08-10.
- ^ "Everything You Need to Know: An Essential Overview of NVM Express® 2.1 Base Specification and New Key Features". 6 August 2024. Archived from the original on 13 September 2024. Retrieved 23 November 2024.
- ^ "AnandTech Forums: Technology, Hardware, Software, and Deals". www.anandtech.com. Archived from the original on June 6, 2014. Retrieved 2025-08-01.
- ^ "FMS 2012: HGST Unveils Worlds First 12Gb/s SAS Enterprise SSD :: TweakTown USA Edition". Archived from the original on 2012-08-28. Retrieved 2025-08-01.
- ^ "STEC s840 Enterprise SSD Review - StorageReview.com". www.storagereview.com. Retrieved 2025-08-01.
- ^ Walker, Don H. "A Comparison of NVMe and AHCI" (PDF). 31 July 2012. SATA-IO. Archived from the original (PDF) on 12 February 2019. Retrieved 3 July 2013.
- ^ "AnandTech Forums: Technology, Hardware, Software, and Deals". www.anandtech.com. Archived from the original on June 6, 2014. Retrieved 2025-08-01.
- ^ "ASUS ROG RAIDR Express 240GB PCIe SSD Review". 6 December 2013.
- ^ "NVM Express Explained" (PDF). nvmexpress.org. 9 April 2014. Archived (PDF) from the original on 24 August 2015. Retrieved 21 March 2015.
- ^ "Using LC's Sierra Systems". hpc.llnl.gov. Retrieved 2020-06-25.
- ^ "SummitDev User Guide". olcf.ornl.gov. Archived from the original on 2020-08-06. Retrieved 2020-06-25.
- ^ "Speeding up Flash... in a flash". The Inquirer. 2007-10-13. Archived from the original on September 18, 2009. Retrieved 2014-01-11.
- ^ "Extending the NVMHCI Standard to Enterprise" (PDF). Santa Clara, CA USA: Flash Memory Summit. August 2009. Archived from the original (PDF) on 2017-06-17.
- ^ "Flash new standard tips up". The Inquirer. 2008-04-16. Archived from the original on January 11, 2014. Retrieved 2014-01-11.
- ^ Amber Huffman (August 2008). "NVMHCI: The Optimized Interface for Caches and SSDs" (PDF). Santa Clara, CA USA: Flash Memory Summit. Archived (PDF) from the original on 2016-03-04. Retrieved 2014-01-11.
- ^ a b Peter Onufryk (2013). "What's New in NVMe 1.1 and Future Directions" (PDF). Santa Clara, CA USA: Flash Memory Summit.
- ^ "New Promoter Group Formed to Advance NVM Express" (PDF). Press release. June 1, 2011. Archived (PDF) from the original on December 30, 2013. Retrieved September 18, 2013.
- ^ Amber Huffman, ed. (October 11, 2012). "NVM Express Revision 1.1" (PDF). Specification. Retrieved September 18, 2013.
- ^ David A. Deming (2013-06-08). "PCIe-based Storage" (PDF). snia.org. Archived from the original (PDF) on 2013-09-20. Retrieved 2014-01-12.
- ^ Amber Huffman, ed. (January 23, 2013). "NVM Express Revision 1.0e" (PDF). Specification. Retrieved September 18, 2013.
- ^ "IDT releases two NVMe PCI-Express SSD controllers". The Inquirer. 2012-08-21. Archived from the original on August 24, 2012. Retrieved 2014-01-11.
- ^ "IDT Shows Off The First NVMe PCIe SSD Processor and Reference Design - FMS 2012 Update". The SSD Review. 2012-08-24. Archived from the original on 2016-01-01. Retrieved 2014-01-11.
- ^ "Samsung Announces Industry's First 2.5-inch NVMe SSD | StorageReview.com - Storage Reviews". StorageReview.com. 2013-07-18. Archived from the original on 2014-01-10. Retrieved 2014-01-11.
- ^ "LSI SF3700 SandForce Flash Controller Line Unveiled | StorageReview.com - Storage Reviews". StorageReview.com. 2013-11-18. Archived from the original on 2014-01-11. Retrieved 2014-01-11.
- ^ "LSI Introduces Blazing Fast SF3700 Series SSD Controller, Supports Both PCIe and SATA 6 Gbps". hothardware.com. Archived from the original on 5 March 2016. Retrieved 21 March 2015.
- ^ Jane McEntegart (7 January 2014). "Kingston Unveils First PCIe SSD: 1800 MB/s Read Speeds". Tom's Hardware. Retrieved 21 March 2015.
- ^ "Kingston HyperX Predator PCI Express SSD Unveiled With LSI SandForce SF3700 PCIe Flash Controller". hothardware.com. Archived from the original on 28 May 2016. Retrieved 21 March 2015.
- ^ "Intel® Solid-State Drive Data Center Family for PCIe*". Intel. Retrieved 21 March 2015.
- ^ "NVM Express Organization History". NVM Express. Archived from the original on 23 November 2015. Retrieved 23 December 2015.
- ^ a b Tallis, Billy (June 14, 2018). "The Toshiba RC100 SSD Review: Tiny Drive In A Big Market". AnandTech. Archived from the original on June 14, 2018. Retrieved 2024-03-30.
- ^ Kim, Kyusik; Kim, Taeseok (2020). "HMB in DRAM-less NVMe SSDS: Their usage and effects on performance". PLOS ONE. 15 (3) e0229645. Bibcode:2020PLoSO..1529645K. doi:10.1371/journal.pone.0229645. PMC 7051071. PMID 32119705.
- ^ Kim, Kyusik; Kim, Seongmin; Kim, Taeseok (2020-06-24). "HMB-I/O: Fast Track for Handling Urgent I/Os in Nonvolatile Memory Express Solid-State Drives". Applied Sciences. 10 (12): 4341. doi:10.3390/app10124341. ISSN 2076-3417.
- ^ "NVMe Gets Refactored". 30 June 2021. Archived from the original on 27 February 2024. Retrieved 27 February 2024.
- ^ "All-Flash NVME Servers for Advanced Computing Supermicro". Supermicro. Retrieved 2022-07-22.
- ^ Siebenmann, Chris. "U.2, U.3, and other server NVMe drive connector types (in mid 2022)". Retrieved 2025-01-22.
- ^ McRobert, Kyle. "What you need to know about U.3". Quarch Technology. Retrieved 2025-01-22.
- ^ a b "NVMe over Fibre Channel (NVMe over FC) or FC-NVMe standard". Tech Target. January 1, 2018. Retrieved May 26, 2021.
- ^ "FC-NVMe rev 1.14 (T11/16-020vB)" (PDF). INCITS. April 19, 2017. Archived from the original (PDF) on April 10, 2022. Retrieved May 26, 2021.
- ^ "NVMe-oF Specification". NVMexpress. 15 April 2020. Retrieved May 26, 2021.
- ^ "Supplement to InfiniBandTMArchitecture Specification Volume 1 Release 1.2.1". Infiniband. September 2, 2014. Archived from the original on March 9, 2016. Retrieved May 26, 2021.
- ^ "What is NVMe-oF?". Storage Review. June 27, 2020. Retrieved May 26, 2021.
- ^ "NVM Express over Fabrics Revision 1.0" (PDF). NVM Express, Inc. 5 June 2016. Archived (PDF) from the original on 30 January 2019. Retrieved 24 April 2018.
- ^ Woolf, David (February 9, 2018). "What NVMe over Fabrics Means for Data Storage". Archived from the original on April 14, 2018. Retrieved April 24, 2018.
- ^ Hellwig, Christoph (July 17, 2016). "NVMe Over Fabrics Support in Linux" (PDF). Archived (PDF) from the original on April 14, 2018. Retrieved April 24, 2018.
- ^ Petros Koutoupis (June 10, 2019). "Data in a Flash, Part III: NVMe over Fabrics Using TCP". Linux Journal. Archived from the original on April 27, 2021. Retrieved May 26, 2021.
- ^ Stern, Jonathan (7 June 2016). "Announcing the SPDK NVMf Target".
- ^ "SPDKNVMe-oFRDMA (Target & Initiator) Performance Report" (PDF). SPDK. February 1, 2021. Retrieved May 26, 2021.
- ^ "SPDKNVMe-oFTCP (Target & Initiator) Performance Report" (PDF). SPDK. February 1, 2020. Archived (PDF) from the original on May 25, 2021. Retrieved May 26, 2021.
- ^ "Hands On with StarWind NVMe-oF Initiator for Windows". StorageReview. October 6, 2021. Archived from the original on October 7, 2021. Retrieved October 6, 2021.
- ^ "StarWind SAN & NAS over Fibre Channel". StorageReview. July 20, 2022. Archived from the original on July 20, 2022. Retrieved July 20, 2022.
- ^ "Intel planning big Lightbits NVMe/TCP storage push". Blocks & Files. June 9, 2022. Archived from the original on July 6, 2022. Retrieved June 9, 2022.
- ^ "LightBits Super SSD brings NVMe on vanilla Ethernet". ComputerWeekly. April 29, 2021. Retrieved April 29, 2021.
- ^ "Announcing NVMe/TCP for ONTAP". www.netapp.com. Archived from the original on 2024-07-17. Retrieved 2025-01-23.
- ^ Schmidt, Michael (2024-05-22). "How We Built Our Distributed Data Placement Algorithm". simplyblock. Retrieved 2025-01-23.
- ^ a b c Dave Landsman (2013-08-09). "AHCI and NVMe as Interfaces for SATA Express Devices – Overview" (PDF). SATA-IO. Archived (PDF) from the original on 2013-10-05. Retrieved 2013-10-02.
- ^ a b Andy Herron (2013). "Advancements in Storage and File Systems in Windows 8.1" (PDF). snia.org. Archived from the original (PDF) on 2014-01-10. Retrieved 2014-01-11.
- ^ Amber Huffman (March 9, 2020). "NVM Express Base Specification Revision 1.4a" (PDF). Specification. section 1.4 Theory of Operation, p. 7. Archived (PDF) from the original on December 13, 2023. Retrieved May 16, 2020.
- ^ Werner Fischer; Georg Schönberger (2015-06-01). "Linux Storage Stack Diagram". Thomas-Krenn.AG. Archived from the original on 2019-06-29. Retrieved 2015-06-08.
- ^ "ChromeOS adds boot support for NVM Express". NVM Express. 24 February 2015. Retrieved 21 March 2015.
- ^ Akers, Jason B. (Jan 22, 2015). "4f503189f7339c667b045ab80a949964ecbaf93e - chromiumos/platform/depthcharge". Git at Google. Archived from the original on 23 August 2017. Retrieved 21 March 2015.
- ^ "release46". DragonFly BSD. Archived from the original on 2016-09-04. Retrieved 2016-09-08.
- ^ "Log of /head/sys/dev/nvme". FreeBSD source tree. The FreeBSD Project. Archived from the original on 29 May 2013. Retrieved 16 October 2012.
- ^ "Log of /stable/9/sys/dev/nvme". FreeBSD source tree. The FreeBSD Project. Archived from the original on 16 February 2018. Retrieved 3 July 2013.
- ^ "FreeBSD 10.2-RELEASE Release Notes". The FreeBSD Project. Archived from the original on 18 June 2017. Retrieved 5 August 2015.
- ^ "Release notes for the Genode OS Framework 18.05". genode.org.
- ^ "#9910 NVMe devices support". dev.haiku-os.org. Archived from the original on 2016-08-06. Retrieved 2019-04-18.
- ^ "NVMe Driver Now Available - Haiku Project". www.haiku-os.org. Retrieved 2016-07-28.
- ^ "4053 Add NVME Driver Support to Illumos". github.com. Archived from the original on 2017-05-10. Retrieved 2016-05-23.
- ^ Ho, Joshua (September 28, 2015). "iPhone 6s and iPhone 6s Plus Preliminary Results". AnandTech. Archived from the original on 2016-05-26. Retrieved 2016-06-01.
- ^ Chester, Brandon (May 16, 2016). "The iPhone SE Review". AnandTech. Archived from the original on May 20, 2016.
- ^ Matthew Wilcox (2011-03-03). "NVM Express driver". LWN.net. Archived from the original on 2012-07-17. Retrieved 2013-11-05.
- ^ Keith Busch (2013-08-12). "Linux NVMe Driver" (PDF). flashmemorysummit.com. Archived (PDF) from the original on 2013-11-05. Retrieved 2013-11-05.
- ^ "IDF13 Hands-on Lab: Compiling the NVM Express Linux Open Source Driver and SSD Linux Benchmarks and Optimizations" (PDF). activeevents.com. 2013. Archived from the original (PDF) on 2014-01-11. Retrieved 2014-01-11.
- ^ "Merge git://git.infradead.org/users/willy/linux-nvme". kernel.org. 2012-01-18. Retrieved 2013-11-05.
- ^ Kim, K.; Kim, T. (2020). "HMB in DRAM-less NVMe SSDs: Their usage and effects on performance". PLOS ONE. 15 (3) e0229645. Bibcode:2020PLoSO..1529645K. doi:10.1371/journal.pone.0229645. PMC 7051071. PMID 32119705.
- ^ "Linux 4.13 has been released on Sun, 3 Sep 2017". Archived from the original on 29 October 2017. Retrieved 16 October 2021.
- ^ "Pci.c « host « nvme « drivers - kernel/Git/Stable/Linux.git - Linux kernel stable tree". Archived from the original on 2021-10-16. Retrieved 2021-10-16.
- ^ "Faster 'NVM Express' SSD Interface Arrives on Retina MacBook and OS X 10.10.3". macrumors.com. 11 April 2015. Archived from the original on 23 August 2017. Retrieved 11 April 2015.
- ^ "nvme -- Non-Volatile Memory Host Controller Interface". NetBSD manual pages. 2021-05-16. Retrieved 2021-05-16.
- ^ David Gwynne (2014-04-16). "non volatile memory express controller (/sys/dev/ic/nvme.c)". BSD Cross Reference. Archived from the original on 2014-04-28. Retrieved 2014-04-27.
- ^ David Gwynne (2016-04-14). "man 4 nvme". OpenBSD man page. Archived from the original on 2016-08-21. Retrieved 2016-08-07.
- ^ "NVME". Arca Noae wiki. Arca Noae, LLC. 2021-04-03. Retrieved 2021-06-08.
- ^ "nvme(7D)". Oracle. Archived from the original on 2015-12-09. Retrieved 2014-12-02.
- ^ "Intel Solid-State for NVMe Drivers". intel.com. 2015-09-25. Archived from the original on 2016-03-25. Retrieved 2016-03-17.
- ^ "VMware Compatibility Guide for NVMe devices". vmware.com. Archived from the original on 2016-03-25. Retrieved 2016-03-17.
- ^ "VSAN Now Supporting NVMe Devices". vmware.com. 2015-11-11. Archived from the original on 2016-03-25. Retrieved 2016-03-17.
- ^ "Windows 8.1 to support hybrid disks and adds native NVMe driver". Myce.com. 2013-09-06. Archived from the original on 2014-01-10. Retrieved 2014-01-11.
- ^ "Update to support NVM Express by using native drivers in Windows 7 or Windows Server 2008 R2". Microsoft. 2014-11-13. Archived from the original on 2014-11-29. Retrieved 2014-11-17.
- ^ "Recommended AHCI/RAID and NVMe Drivers". 10 May 2013. Archived from the original on 24 February 2021. Retrieved 19 February 2021.
- ^ "NVM Express (NVMe) Innovations in Windows" (PDF). Archived from the original (PDF) on 2024-09-05.
- ^ lorihollasch (2023-08-09). "NVMe Feature and Extended Capability Support - Windows drivers". learn.microsoft.com. Retrieved 2024-04-11.
- ^ "Windows NVM Express". Project web site. Archived from the original on June 12, 2013. Retrieved September 18, 2013.
- ^ "Nvmewin - Revision 157: /Releases". Archived from the original on 2017-05-10. Retrieved 2016-08-13.
- ^ lorihollasch. "StorNVMe Command Set Support - Windows drivers". learn.microsoft.com. Retrieved 2025-04-11.
- ^ "ChangeLog/1.6". qemu.org. Archived from the original on 29 September 2018. Retrieved 21 March 2015.
- ^ "Download EDK II from". SourceForge.net. Archived from the original on 2013-12-31. Retrieved 2014-01-11.
- ^ "NVM Express control utility". The FreeBSD Project. 2018-03-12. Retrieved 2019-07-12.
- ^ "GitHub - linux-nvme/nvme-cli: NVMe management command line interface". linux-nvme. 2019-03-26. Retrieved 2019-03-27.
External links
- NVM Express Specifications
- CompactFlash Association
- LFCS: Preparing Linux for nonvolatile memory devices, LWN.net, April 19, 2013, by Jonathan Corbet
- Multipathing PCI Express Storage, Linux Foundation, March 12, 2015, by Keith Busch
- NVMe, NVMe-oF and RDMA for network engineers, August 2020, by Jerome Tissieres
NVM Express
Fundamentals
Overview
NVM Express (NVMe) is an open logical device interface and command set that enables host software to communicate with non-volatile memory subsystems, such as solid-state drives (SSDs), across multiple transports including PCI Express (PCIe), RDMA over Converged Ethernet (RoCE), Fibre Channel (FC), and TCP/IP.[1] Designed specifically for the performance characteristics of non-volatile memory media like NAND flash, NVMe optimizes access by minimizing protocol overhead and maximizing parallelism, allowing systems to achieve low-latency operations under 10 microseconds end-to-end.[2] Unlike legacy block storage protocols such as AHCI over SATA, which were originally developed for rotational hard disk drives and impose higher latency due to complex command processing and limited queuing, NVMe streamlines the datapath to reduce CPU overhead and enable higher throughput and IOPS for SSDs.[6]

At its core, an NVMe implementation consists of a host controller that manages the interface between the host system and the storage device, namespaces that represent logical partitions of the storage capacity for organization and access control, and paired submission/completion queues for handling I/O commands efficiently.[2] The submission queues allow the host to send commands to the controller, while completion queues return status updates, supporting asynchronous processing without the need for polling in many cases. This architecture leverages the inherent low latency and high internal parallelism of modern SSDs, enabling massive scalability in multi-core environments.[7]

A key feature of NVMe is its support for up to 65,535 I/O queues (plus one administrative queue) with up to 65,536 commands per queue, far exceeding the single queue and 32-command limit of AHCI, to facilitate parallel command execution across numerous processor cores and threads.[8] This queue depth and multiplicity reduce bottlenecks, allowing NVMe to fully utilize the bandwidth of PCIe interfaces, such as up to 4 GB/s with PCIe Gen3 x4 lanes, and extend to networked fabrics for enterprise-scale storage.[2][7]
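The fixed entry sizes used by these queues (64-byte submission entries, 16-byte completion entries) can be pictured with a simplified C layout. The field grouping below is an illustrative sketch, not the normative register-level definition from the specification.

```c
/* Simplified sketch of NVMe queue entries; field grouping is illustrative. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct nvme_sq_entry {            /* 64-byte Submission Queue Entry */
    uint8_t  opcode;              /* command opcode (e.g. Read, Write, Identify) */
    uint8_t  flags;               /* fused operation / PRP vs SGL selection */
    uint16_t command_id;          /* matched against the completion entry */
    uint32_t nsid;                /* namespace identifier */
    uint64_t reserved;
    uint64_t metadata_ptr;
    uint64_t data_ptr[2];         /* PRP entries or an SGL descriptor */
    uint32_t cdw10_15[6];         /* command-specific dwords */
};

struct nvme_cq_entry {            /* 16-byte Completion Queue Entry */
    uint32_t command_specific;    /* command-specific result */
    uint32_t reserved;
    uint16_t sq_head;             /* controller's view of the SQ head pointer */
    uint16_t sq_id;               /* submission queue that originated the command */
    uint16_t command_id;          /* identifies the completed command */
    uint16_t status;              /* phase tag (bit 0) and status field */
};

int main(void)
{
    /* The protocol fixes these sizes at 64 and 16 bytes respectively. */
    assert(sizeof(struct nvme_sq_entry) == 64);
    assert(sizeof(struct nvme_cq_entry) == 16);
    printf("SQE = %zu bytes, CQE = %zu bytes\n",
           sizeof(struct nvme_sq_entry), sizeof(struct nvme_cq_entry));
    return 0;
}
```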
Background and Motivation

The evolution of storage interfaces prior to NVM Express (NVMe) was dominated by protocols like the Advanced Host Controller Interface (AHCI) and Serial ATA (SATA), which were engineered primarily for hard disk drives (HDDs) with their mechanical, serial nature. These HDD-centric designs imposed serial command processing and significant overhead, rendering them inefficient for solid-state drives (SSDs) that demand low latency and massive parallelism.[9][10]

SSDs leverage a high degree of internal parallelism through multiple independent NAND flash channels connected to numerous flash dies, enabling thousands of concurrent read and write operations to maximize throughput. However, pre-NVMe SSDs connected via AHCI were constrained to roughly one queue with a depth of 32 commands, creating a severe bottleneck that prevented full utilization of PCIe bandwidth and stifled the devices' inherent capabilities.[11][10]

The primary motivation for NVMe was to develop a PCIe-optimized protocol that eliminates legacy bottlenecks, allowing SSDs to operate at their full potential by shifting from serial to parallel command processing with support for up to 64,000 queues and 64,000 commands per queue. This design enables efficient exploitation of PCIe's high bandwidth while delivering the low-latency performance required for both enterprise data centers and consumer applications.[9][1]

History and Development
Formation of the Consortium
The NVM Express Promoter Group was established on June 1, 2011, by leading technology companies to develop and promote an open standard for non-volatile memory (NVM) storage devices over the PCI Express (PCIe) interface, addressing the need for optimized communication between host software and solid-state drives (SSDs).[12] The initial promoter members included Cisco, Dell, EMC, Intel, LSI, Micron, Oracle, Samsung, and SanDisk, with seven companies, namely Cisco, Dell, EMC, Integrated Device Technology (IDT), Intel, NetApp, and Oracle, holding permanent seats on the 13-member board to guide the group's efforts.[13] This formation built on prior work from the NVMHCI Work Group, aiming to enable scalable, high-performance storage solutions through collaborative specification development.[12]

In 2014, the original NVM Express Work Group was formally incorporated as the non-profit organization NVM Express, Inc., in Delaware, transitioning from an informal promoter structure to a dedicated consortium responsible for managing and advancing the NVMe specifications.[5] Today, the consortium comprises over 100 member companies, ranging from semiconductor manufacturers to system integrators, organized into specialized work groups focused on specification development, compliance testing, and marketing initiatives to ensure broad industry adoption.[1] The promoter group, now including entities like Advanced Micro Devices, Google, Hewlett Packard Enterprise, Meta, and Microsoft, provides strategic direction through its board.[14]

The University of New Hampshire InterOperability Laboratory (UNH-IOL) has played a pivotal role in the consortium's formation and ongoing operations since 2011, when early NVMe contributors engaged the lab to develop interoperability testing frameworks.[15] UNH-IOL supports conformance programs by creating test plans, software tools, and hosting plugfest events that verify NVMe solutions for quality and compatibility, fostering ecosystem-wide interoperability without endorsing specific products.[16] This collaboration has been essential for validating specifications and accelerating market readiness.[17]

The consortium's scope is deliberately limited to defining protocols for host software communication with NVM subsystems, emphasizing logical command sets, queues, and data transfer mechanisms across various transports, while excluding physical layer specifications that are handled by standards bodies like PCI-SIG.[18] This focus ensures NVMe remains a transport-agnostic standard optimized for low-latency, parallel access to non-volatile memory.[1]

Specification Releases and Milestones
The NVM Express (NVMe) specification began with its initial release, version 1.0, on March 1, 2011, establishing a streamlined protocol optimized for PCI Express (PCIe)-based solid-state drives (SSDs) to overcome the limitations of legacy interfaces like AHCI. This foundational specification defined the core command set, queueing model, and low-latency operations tailored for non-volatile memory, enabling up to 64,000 queues with 64,000 commands per queue for parallel processing.[18] Version 1.1, released on October 11, 2012, introduced advanced power management features, including Autonomous Power State Transition (APST) to allow devices to dynamically adjust power states for energy efficiency without host intervention, and support for multiple power states to balance performance and consumption in client systems.

Subsequent updates in this era focused on enhancing reliability and scalability. NVMe 1.2, published on November 3, 2014, added namespace management, enabling a single controller to create and manage multiple namespaces (virtual storage partitions) as independent logical units, which facilitated multi-tenant environments and improved resource allocation in shared storage setups.[2] The specification evolved further to address networked storage needs with NVMe 1.3, ratified on May 1, 2017, which incorporated enhancements for NVMe over Fabrics (NVMe-oF) integration, including directive support for stream identification and sanitize commands to improve data security and performance in distributed systems. Building on this, NVMe 1.4, released on June 10, 2019, expanded device capabilities with features like non-operational power states for deeper idle modes and improved error reporting, laying groundwork for broader ecosystem adoption.

A major architectural shift occurred with NVMe 2.0 on June 3, 2021, which restructured the specifications into a modular family of 11 documents for easier development and maintenance, while introducing support for zoned namespaces (ZNS) to optimize write efficiency by organizing storage into sequential zones, reducing overhead in flash-based media. All versions maintain backward compatibility, ensuring newer devices function seamlessly with prior host implementations.[19][20]

Key milestones in NVMe adoption include the introduction of consumer-grade PCIe SSDs in 2014, such as early M.2 form factor drives, which brought high-speed storage to personal computing and accelerated mainstream integration in laptops and desktops. By 2015, enterprise adoption surged with the deployment of NVMe in data centers, driven by hyperscalers seeking low-latency performance for virtualization and big data workloads, marking a shift from SAS/SATA dominance in server environments. Since 2023, the NVMe consortium has adopted an annual Engineering Change Notice (ECN) process to incrementally add features, with 13 ratified ECNs that year focusing on scalability and reliability.
Notable among recent advancements is Technical Proposal 4159 (TP4159), ratified in 2024, which defines PCIe infrastructure for live migration, enabling seamless controller handoff in virtualized setups to minimize downtime during maintenance or load balancing.[21]

In 2025, the NVMe 2.3 specifications, released on August 5, updated all 11 core documents with emphases on sustainability and power configuration, including Power Limit Config for administrator-defined maximum power draw to optimize energy use in dense deployments, and enhanced reporting for carbon footprint tracking to support eco-friendly data center operations. These updates underscore NVMe's ongoing evolution toward efficient, modular storage solutions across client, enterprise, and cloud applications.[4][22]

Technical Specifications
Protocol Architecture
The NVMe protocol architecture is structured in layers to facilitate efficient communication between host software and non-volatile memory storage devices, primarily over the PCIe interface. At the base level, the transport layer, such as NVMe over PCIe, handles the physical and link-layer delivery of commands and data across the PCIe bus, mapping NVMe operations to PCIe memory-mapped I/O registers and supporting high-speed data transfer without the overhead of legacy protocols.[23] The controller layer manages administrative and I/O operations through dedicated queues, while the NVM subsystem encompasses one or more controllers, namespaces (logical storage partitions), and the underlying non-volatile memory media, enabling scalable access to storage resources.[24]

In the operational flow, the host submits commands to submission queues (SQs) in main memory, which the controller polls or is notified of via updates to doorbell registers, dedicated hardware registers that signal the arrival of new commands without requiring constant polling. The controller processes these commands, executes I/O operations on the NVM, and posts completion entries to associated completion queues (CQs) in host memory, notifying the host through efficient mechanisms to minimize latency. This paired queue model supports parallel processing, with the host managing queue arbitration and the controller handling execution.[24]

Key features of the architecture include asymmetric queue pairs, where multiple SQs can associate with a single CQ to optimize resource use and reduce interrupt overhead; MSI-X interrupts, which enable vectored interrupts for precise completion notifications, significantly lowering CPU utilization compared to legacy interrupt schemes; and support for multipath I/O, allowing redundant paths to controllers for enhanced reliability and performance in enterprise environments. Error handling is integrated through asynchronous event mechanisms, where the controller reports status changes, errors, or health issues directly to the host via dedicated admin commands, ensuring robust operation without disrupting ongoing I/O.[24][23]
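The doorbell-driven flow described above can be sketched as a small, self-contained toy model. Queue depth, command fields, and the single-threaded "controller" below are invented for illustration and stand in for what real hardware and drivers do; queue-full handling and interrupts are omitted.

```c
/* Toy model (not driver code): host rings an SQ tail doorbell, a simulated
 * controller consumes the submissions and posts completions whose phase tag
 * tells the host which CQ entries are new. */
#include <stdint.h>
#include <stdio.h>

#define QDEPTH 8                              /* illustrative queue depth */

struct cmd { uint16_t cid; uint8_t opcode; };
struct cpl { uint16_t cid; uint16_t status_phase; };    /* bit 0 = phase tag */

static struct cmd sq[QDEPTH];                 /* submission queue (host-written)       */
static struct cpl cq[QDEPTH];                 /* completion queue (controller-written) */
static uint16_t sq_tail, sq_head;             /* host advances tail; controller advances head */
static uint16_t cq_tail, cq_head;             /* controller advances tail; host advances head */
static uint8_t  ctrl_phase = 1;               /* phase the controller writes this pass */
static uint8_t  host_phase = 1;               /* phase the host expects for new entries */

static void host_submit(uint16_t cid, uint8_t opcode)
{
    sq[sq_tail] = (struct cmd){ .cid = cid, .opcode = opcode };
    sq_tail = (uint16_t)((sq_tail + 1) % QDEPTH);   /* writing the new tail = ringing the doorbell */
    printf("host: submitted cid=%u, SQ tail doorbell -> %u\n", cid, sq_tail);
}

static void controller_process(void)
{
    while (sq_head != sq_tail) {              /* consume every entry the doorbell announced */
        struct cmd c = sq[sq_head];
        sq_head = (uint16_t)((sq_head + 1) % QDEPTH);
        cq[cq_tail] = (struct cpl){ .cid = c.cid, .status_phase = ctrl_phase }; /* success + phase */
        cq_tail = (uint16_t)((cq_tail + 1) % QDEPTH);
        if (cq_tail == 0)                     /* phase flips on each full pass through the CQ */
            ctrl_phase ^= 1;
    }
}

static void host_reap_completions(void)
{
    while ((cq[cq_head].status_phase & 1) == host_phase) {  /* phase match => new entry */
        printf("host: completion for cid=%u\n", cq[cq_head].cid);
        cq_head = (uint16_t)((cq_head + 1) % QDEPTH);       /* real hosts write this to the CQ head doorbell */
        if (cq_head == 0)
            host_phase ^= 1;
    }
}

int main(void)
{
    host_submit(1, 0x02);                     /* e.g. a Read command  */
    host_submit(2, 0x01);                     /* e.g. a Write command */
    controller_process();                     /* controller side, triggered by the doorbell write */
    host_reap_completions();                  /* host side, normally driven by an MSI-X interrupt */
    return 0;
}
```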
Command Set and Queues

The NVMe protocol defines a streamlined command set divided into administrative (Admin) and input/output (I/O) categories, enabling efficient management and data transfer operations on non-volatile memory devices. Admin commands are essential for controller initialization, configuration, and maintenance, submitted exclusively to a dedicated Admin Submission Queue (SQ) and processed by the controller before I/O operations can commence. Examples include the Identify command, which retrieves detailed information about the controller, namespaces, and supported features; the Set Features command, used to configure controller parameters such as interrupt coalescing or power management; the Get Log Page command, for retrieving operational logs like error or health status; and the Abort command, to cancel pending I/O submissions.[24]

In contrast, I/O commands handle data access within namespaces and are submitted to I/O SQs, supporting high-volume workloads with minimal overhead. Core examples encompass the Read command for retrieving logical block data, the Write command for storing data to specified logical blocks, and the Flush command, which ensures that buffered data and metadata in volatile cache are committed to non-volatile media, guaranteeing persistence across power loss.[25] Additional optional I/O commands, such as Compare for data verification or Write Uncorrectable for intentional error injection in testing, extend functionality while maintaining a lean core set of just three mandatory commands to reduce protocol complexity.[24]

NVMe's queue mechanics leverage paired Submission Queues and Completion Queues (CQs) to facilitate asynchronous command processing, with queues implemented as circular buffers in host memory for low-latency access. Each queue pair consists of an SQ where the host enqueues 64-byte command entries (including opcode, namespace ID, data pointers, and metadata) and a corresponding CQ where the controller posts 16-byte completion entries (indicating status, error codes, and command identifiers). A single mandatory Admin queue pair handles all Admin commands, while up to 65,535 I/O queue pairs can be created via the Create I/O Submission Queue and Create I/O Completion Queue Admin commands, each supporting up to 65,536 entries to accommodate deep command pipelines.[24] The host advances the SQ tail doorbell register to notify the controller of new submissions, and the controller updates the CQ head after processing, with phase tags toggling to signal new entries without polling the entire queue. Multiple SQs may share a single CQ to optimize resource use, and all queues are identified by unique queue IDs assigned during creation.[24]

To maximize parallelism, NVMe permits out-of-order command execution and completion within and across queues, decoupling submission order from processing sequence to exploit non-volatile memory's low latency and parallelism. The controller processes commands from SQs based on internal arbitration, returning completions to the associated CQ with a unique command identifier (CID) that allows the host to match and reorder results if needed, without enforcing strict in-order delivery. This design supports multi-threaded environments by distributing workloads across queues, one per CPU core or thread, reducing contention compared to legacy single-queue protocols.
Queue priorities further enhance this by classifying I/O SQs into 4 priority classes (Urgent, High, Medium, and Low) via the 2-bit QPRIO field in the Create I/O Submission Queue command, using Weighted Round Robin with Urgent Priority Class arbitration, where the Urgent class has strict priority over the other three classes, which are serviced proportionally based on weights from 0 to 255.[24] Queue IDs serve as the basis for this prioritization, enabling fine-grained control over latency-sensitive versus throughput-oriented traffic.

The aggregate queue depth in NVMe, calculated as the product of the number of queues and entries per queue (up to 65,535 queues × 65,536 entries), yields a theoretical maximum of over 4 billion outstanding commands, facilitating terabit-scale throughput in high-performance computing and data center environments by saturating PCIe bandwidth with minimal host intervention.[24] This depth, combined with efficient doorbell mechanisms and interrupt moderation, ensures scalable I/O submission rates exceeding millions of operations per second on modern controllers.[24]
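As a sketch of how the queue-creation parameters above are carried, the helper below packs the command dwords of a Create I/O Submission Queue command, placing the 2-bit QPRIO class next to the physically-contiguous flag and the associated completion queue ID. The bit positions follow the NVMe base specification's description of CDW10 and CDW11, but the helper itself is illustrative and omits actually submitting the command to a device.

```c
/* Sketch: packing CDW10/CDW11 for a Create I/O Submission Queue admin command. */
#include <stdint.h>
#include <stdio.h>

enum nvme_qprio { NVME_QPRIO_URGENT = 0, NVME_QPRIO_HIGH = 1,
                  NVME_QPRIO_MEDIUM = 2, NVME_QPRIO_LOW  = 3 };

/* CDW10: bits 31:16 = queue size minus one (0's based), bits 15:0 = queue ID */
static uint32_t create_iosq_cdw10(uint16_t qid, uint32_t entries)
{
    return ((entries - 1) << 16) | qid;
}

/* CDW11: bits 31:16 = completion queue ID, bits 2:1 = QPRIO, bit 0 = physically contiguous */
static uint32_t create_iosq_cdw11(uint16_t cqid, enum nvme_qprio prio, int contiguous)
{
    return ((uint32_t)cqid << 16) | ((uint32_t)prio << 1) | (contiguous ? 1u : 0u);
}

int main(void)
{
    /* Example: queue 3 with 1024 entries, bound to CQ 3, high priority, contiguous memory. */
    printf("cdw10 = 0x%08x\n", create_iosq_cdw10(3, 1024));
    printf("cdw11 = 0x%08x\n", create_iosq_cdw11(3, NVME_QPRIO_HIGH, 1));
    return 0;
}
```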
Physical Interfaces
Add-in Cards and Consumer Form Factors
Add-in cards (AIC) represent one of the primary physical implementations for NVMe in consumer and desktop environments, typically taking the form of half-height, half-length (HHHL) or full-height, half-length (FHHL) PCIe cards that plug directly into available PCIe slots on motherboards.[2] These cards support NVMe SSDs over PCIe interfaces, commonly utilizing x4 lanes for single-drive configurations, though multi-drive AICs can leverage x8 or higher lane widths to accommodate multiple M.2 slots or U.3 connectors for enhanced storage capacity in high-performance consumer builds like gaming PCs.[26] Early NVMe AICs were designed around PCIe 3.0 x4, providing sequential read/write speeds up to approximately 3.5 GB/s, while modern variants support PCIe 4.0 x4 for doubled bandwidth, reaching up to 7 GB/s, and as of 2025, PCIe 5.0 x4 enables up to 14 GB/s in consumer applications.[27]

The M.2 form factor offers a compact, versatile connector widely adopted in consumer laptops, ultrabooks, and compact desktops, enabling NVMe SSDs to interface directly with the system's PCIe bus without additional adapters.[2] M.2 slots use keyed connectors, with the B-key supporting PCIe x2 (up to ~2 GB/s) or SATA for legacy compatibility, and the M-key enabling full PCIe x4 operation for NVMe, which is essential for high-speed storage in mobile devices.[28] M.2 NVMe drives commonly leverage PCIe 3.0 x4 for practical speeds of up to 3.5 GB/s or PCIe 4.0 x4 for up to 7 GB/s, and as of 2025, PCIe 5.0 x4 supports up to 14 GB/s, allowing consumer systems to achieve rapid boot times and application loading without the bulk of traditional 2.5-inch drives.[29]

CFexpress extends NVMe capabilities into portable consumer devices like digital cameras and camcorders, providing an SD card-like form factor that uses PCIe and NVMe protocols for high-speed data transfer in burst photography and 8K video recording.[30] Available in Type A (x1 PCIe lanes) and Type B (x2 lanes) variants, CFexpress Type B cards support PCIe Gen 4 x2 with NVMe 1.4 in the CFexpress 4.0 specification (announced 2023), delivering read speeds up to approximately 3.5 GB/s and write speeds up to 3 GB/s; earlier CFexpress 2.0 versions used PCIe Gen 3 x2 with NVMe 1.3 for up to 1.7 GB/s read and 1.5 GB/s write, while maintaining compatibility with existing camera slots through adapters for M.2 NVMe modules.[31] This form factor prioritizes durability and thermal management for field use, with capacities scaling to several terabytes in consumer-grade implementations.[32]

SATA Express serves as a transitional connector in some consumer motherboards, bridging legacy SATA interfaces with NVMe over PCIe for backward compatibility while enabling higher performance in mixed-storage setups.[33] Defined to use two PCIe 3.0 lanes (up to approximately 1 GB/s per lane, total 2 GB/s) alongside dual SATA 3.0 ports, it allows NVMe devices to operate at PCIe speeds when connected, or fall back to AHCI/SATA mode for older drives, though adoption has been limited in favor of direct M.2 slots.[34] This design facilitates upgrades in consumer PCs without requiring full PCIe slot usage, supporting NVMe protocol for sequential speeds approaching 2 GB/s in compatible configurations.[35]
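The headline transfer rates quoted above follow from the PCIe per-lane signalling rates and 128b/130b encoding; as a rough check (real drives land somewhat below these link ceilings because of packet and protocol overhead):

```latex
\begin{aligned}
\text{PCIe 3.0 x4:} \quad & 4 \times 8\ \text{GT/s} \times \tfrac{128}{130} \approx 31.5\ \text{Gbit/s} \approx 3.9\ \text{GB/s} \\
\text{PCIe 4.0 x4:} \quad & 4 \times 16\ \text{GT/s} \times \tfrac{128}{130} \approx 63.0\ \text{Gbit/s} \approx 7.9\ \text{GB/s} \\
\text{PCIe 5.0 x4:} \quad & 4 \times 32\ \text{GT/s} \times \tfrac{128}{130} \approx 126.0\ \text{Gbit/s} \approx 15.8\ \text{GB/s}
\end{aligned}
```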
Enterprise and Specialized Form Factors

Enterprise and specialized form factors for NVMe emphasize durability, high density, and seamless integration in server environments, enabling scalable storage solutions with enhanced reliability for data centers. These designs prioritize hot-swappability, redundancy, and optimized thermal management to support mission-critical workloads, contrasting with consumer-oriented compact interfaces by focusing on rack-scale deployment and serviceability.[36]

The U.2 form factor, defined by the SFF-8639 connector specification, is a 2.5-inch hot-swappable drive widely adopted in enterprise servers and storage arrays. It supports PCIe interfaces for NVMe, while maintaining backward compatibility with SAS and SATA protocols through the same connector, allowing flexible upgrades without hardware changes. The design accommodates heights up to 15 mm, which facilitates greater 3D NAND stacking for higher capacities (often exceeding 30 TB per drive) while preserving compatibility with standard 7 mm and 9.5 mm server bays. Additionally, U.2 enables dual-port configurations, providing redundancy via two independent PCIe x2 paths for failover in high-availability setups, reducing downtime in clustered environments. U.3 extends this with additional interface detection pins to enable tri-mode support (SAS, SATA, PCIe/NVMe), while the connector handles up to 25 W for more demanding NVMe SSDs without external power cables. As of 2025, both support PCIe 5.0 and early PCIe 6.0 implementations.[37][36][38][39]

EDSFF (Enterprise and Data Center Standard Form Factor) introduces tray-based designs optimized for dense, airflow-efficient data center deployments, addressing limitations of traditional 2.5-inch drives in hyperscale environments. The E1.S variant, a compact 110 mm x 32 mm module, fits vertically in 1U servers as a high-performance alternative to M.2, supporting up to 70 W power delivery and PCIe x4 lanes for NVMe SSDs with superior thermal dissipation through integrated heat sinks. E1.L extends this to 314 mm length for maximum capacity in 1U storage nodes, enabling up to 60 TB per tray while consolidating multiple drives per slot to boost rack density. The E3.S form factor, at 112 mm x 76 mm, serves as a direct U.2 replacement in 2U servers, offering horizontal or vertical orientation with enhanced signal integrity for PCIe 5.0 and, as of 2025, PCIe 6.0 in NVMe evolutions, thus improving serviceability and cooling in multi-drive configurations. These tray systems reduce operational costs by simplifying hot-plug operations and optimizing front-to-back airflow in high-density racks. As of 2025, EDSFF supports emerging PCIe 6.0 SSDs for data center applications.[40][41]

In specialized applications, OCP NIC 3.0 integrates NVMe storage directly into open compute network interface cards, facilitating composable infrastructure where compute, storage, and networking resources are dynamically pooled and allocated. This small form factor adapter supports PCIe Gen5 x16 lanes and NVMe SSD modules, such as dual M.2 drives, enabling disaggregated storage access over fabrics for cloud-scale efficiency without dedicated drive bays. By embedding NVMe capabilities in NIC slots, it enhances scalability in OCP-compliant servers, allowing seamless resource orchestration in AI and big data workloads.[42][43][44]

NVMe over Fabrics
NVMe over Fabrics
Core Concepts
NVMe over Fabrics (NVMe-oF) is a protocol specification that extends the base NVMe interface to operate over network fabrics beyond PCIe, enabling hosts to access non-volatile memory subsystems in disaggregated storage environments.[45] The extension keeps the core NVMe command set and queueing model while adapting it for remote communication, allowing block storage devices to be shared across a network without protocol translation layers.[46]

Central to NVMe-oF are capsules, which encapsulate NVMe commands, responses, and optional data or scatter-gather lists for transmission over the fabric.[45] Discovery services, provided by dedicated discovery controllers within NVM subsystems, allow hosts to retrieve discovery log pages that list available subsystems and their transport-specific addresses.[46] Hosts reach the discovery service through a well-known NVMe Qualified Name (NQN), nqn.2014-08.org.nvmexpress.discovery, and then connect to the remote controllers listed in the returned log pages.[47]

The specification delivers unified NVMe semantics for both local and remote storage access, preserving the efficiency of NVMe's submission and completion queues across network boundaries.[47] This approach reduces latency compared with traditional protocols such as iSCSI or Fibre Channel, with a design goal of adding no more than about 10 microseconds of overhead over native NVMe devices in optimized implementations.[47] NVMe-oF 1.0, released on June 5, 2016, standardized an RDMA transport binding, facilitating block storage over Ethernet fabrics with direct data placement and no intermediate protocol translation; a TCP transport binding followed in 2019.[45][48]
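As a rough illustration of this discovery flow, the following Python sketch shells out to the Linux nvme-cli utility (discussed later in this article) to query a discovery controller over TCP. The target address and port are placeholders chosen for the example, and relying on nvme-cli rather than issuing fabric commands directly is itself an assumption of this sketch.

```python
# Minimal sketch: ask a remote discovery controller which NVM subsystems
# it exposes, using the nvme-cli utility over the TCP transport.
# The target address and port below are placeholders for the example.
import subprocess

DISCOVERY_ADDR = "192.0.2.10"   # example address (TEST-NET-1), not a real target
DISCOVERY_PORT = "8009"         # port commonly used for NVMe/TCP discovery

def discover_subsystems() -> str:
    """Return the raw discovery log as printed by `nvme discover`."""
    # `nvme discover` contacts the discovery controller at the given address,
    # using the well-known NQN nqn.2014-08.org.nvmexpress.discovery.
    result = subprocess.run(
        ["nvme", "discover", "-t", "tcp",
         "-a", DISCOVERY_ADDR, "-s", DISCOVERY_PORT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(discover_subsystems())
```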
Supported Transports and Applications
NVMe over Fabrics supports several network transports for remote access to NVMe storage devices, each optimized for different fabric types and performance requirements. The Fibre Channel transport, known as FC-NVMe, maps NVMe capsules onto Fibre Channel frames, leveraging existing FC infrastructure in high-reliability enterprise environments.[46] For RDMA-based fabrics, NVMe-oF uses RoCE (RDMA over Converged Ethernet), iWARP (Internet Wide Area RDMA Protocol), and InfiniBand, which provide low-latency direct memory access over Ethernet or specialized networks, minimizing CPU overhead in data center deployments.[49] Additionally, the TCP transport (NVMe/TCP) operates over standard Ethernet, offering a cost-effective option that does not require specialized hardware such as RDMA-capable NICs.[50]

These transports serve scenarios that demand scalable, low-latency storage. In cloud storage environments, NVMe-oF facilitates disaggregated architectures in which compute and storage resources are scaled independently, supporting multi-tenant workloads with consistent performance across distributed systems.[51] Hyper-converged infrastructure (HCI) benefits from NVMe-oF's ability to unify compute, storage, and networking in software-defined clusters, enabling efficient resource pooling and workload mobility in virtualized data centers. For AI workloads, NVMe-oF delivers the high-throughput, low-latency remote access essential for training large models, where rapid data ingestion from shared storage pools accelerates GPU-intensive processing.[52]

Key features across these transports include asymmetric I/O, where host and controller capabilities can differ to optimize network efficiency; multipathing for fault-tolerant path redundancy; and security mechanisms such as DH-HMAC-CHAP authentication, a Diffie-Hellman-augmented challenge-handshake protocol.[46] NVMe/TCP version 1.0, ratified in 2019, enables deployment over 100GbE and faster Ethernet fabrics, while the 2025 Revision 1.2 update introduces rapid path-failure recovery to improve resilience in dynamic networks.[53]
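Continuing the earlier discovery sketch, a host would then connect to one of the subsystems reported in the discovery log. The snippet below is again a hedged example built on nvme-cli over standard Ethernet; the subsystem NQN, address, and port are hypothetical values standing in for what a real discovery log would return.

```python
# Sketch: connect to an NVMe/TCP subsystem reported by discovery.
# The NQN, address, and port are placeholders for values taken from a
# discovery log; they are not real endpoints.
import subprocess

SUBSYS_NQN = "nqn.2014-08.org.example:storage-pool-1"  # hypothetical subsystem NQN
TARGET_ADDR = "192.0.2.10"
TARGET_PORT = "4420"   # port commonly used for NVMe-oF I/O controllers

def connect_tcp(nqn: str, addr: str, port: str) -> None:
    """Attach the remote namespaces; they appear as local /dev/nvmeXnY devices."""
    subprocess.run(
        ["nvme", "connect", "-t", "tcp", "-n", nqn, "-a", addr, "-s", port],
        check=True,
    )

def disconnect(nqn: str) -> None:
    """Tear the association back down when the storage is no longer needed."""
    subprocess.run(["nvme", "disconnect", "-n", nqn], check=True)

if __name__ == "__main__":
    connect_tcp(SUBSYS_NQN, TARGET_ADDR, TARGET_PORT)
```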
Comparisons with Legacy Protocols
Versus AHCI and SATA
The Advanced Host Controller Interface (AHCI), designed primarily for SATA-connected hard disk drives, imposes several limitations when used with solid-state drives (SSDs). It supports only a single command queue per port, with a maximum depth of 32 commands, leading to serialized processing that bottlenecks parallelism for high-speed storage devices.[10] Additionally, AHCI requires up to nine register read/write operations per command issue and completion cycle, resulting in high CPU overhead and increased latency, particularly under the heavy workloads typical of SSDs.[10] These constraints make AHCI inefficient at exploiting non-volatile memory, as it was not designed for the low-latency characteristics of flash-based storage.

In contrast, NVM Express (NVMe) addresses these shortcomings through its native design for PCI Express (PCIe)-connected SSDs, supporting up to 65,535 I/O queues, each up to 65,536 commands deep, for massive parallelism.[10] This queue structure, combined with streamlined command processing that requires only two register writes per cycle, significantly reduces overhead and latency, often completing commands two to three times faster than AHCI.[6] NVMe's direct PCIe integration eliminates intermediate translation layers, allowing SSDs to operate closer to their hardware limits without the serial bottlenecks of SATA/AHCI.

Performance metrics highlight these differences starkly. NVMe SSDs routinely deliver over 500,000 random 4K IOPS in read/write operations, far surpassing AHCI/SATA SSDs, which are typically limited to around 100,000 IOPS by the interface.[54] Sequential throughput also benefits: NVMe reaches multi-gigabyte-per-second speeds over PCIe lanes, while AHCI/SATA caps out at approximately 600 MB/s. Regarding power efficiency, NVMe provides finer-grained power management, with up to 32 power states per device, enabling lower idle and active power consumption for equivalent workloads than AHCI's coarser SATA power states, which incur additional overhead from polling and interrupts.[10][55]

Another key distinction lies in logical partitioning. AHCI relies on SATA port multipliers to connect multiple devices behind a single host port, which introduces shared bandwidth and added latency across devices.[10] NVMe instead uses namespaces to create multiple independent logical partitions within a single physical device, supporting parallel access without the multiplexing overhead of port multipliers.[10] This makes NVMe better suited to virtualized environments that require isolated storage volumes.

Real-world benefits of PCIe Gen4 NVMe SSDs over SATA III SSDs include dramatically faster file transfers, such as a 50 GB file completing in under 10 seconds on NVMe versus over a minute on SATA; game load times reduced by 30–50% (e.g., from about 25 seconds to around 10 seconds with technologies such as Microsoft DirectStorage); system responsiveness improved by 35–45% in multitasking and application launches; and boot times of under 10 seconds for Windows versus 20–30 seconds on SATA. These advantages are particularly noticeable with modern games and direct-storage technologies, producing a snappier overall system experience.[56][57]
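The gap in outstanding-command capacity can be made concrete with a little arithmetic, sketched in Python below using the queue limits quoted above; the ratio is a protocol ceiling, not a measurement of any particular drive.

```python
# Back-of-the-envelope comparison of how many commands can be in flight
# at once under AHCI versus NVMe, using the protocol limits cited above.

ahci_queues, ahci_depth = 1, 32            # one queue per port, 32 commands deep
nvme_queues, nvme_depth = 65_535, 65_536   # maximum I/O queues and queue depth

ahci_outstanding = ahci_queues * ahci_depth
nvme_outstanding = nvme_queues * nvme_depth

print(f"AHCI: {ahci_outstanding} outstanding commands")      # 32
print(f"NVMe: {nvme_outstanding:,} outstanding commands")    # 4,294,901,760
print(f"ratio: ~{nvme_outstanding // ahci_outstanding:,}x")

# In practice drivers allocate far fewer queues (often one per CPU core),
# but the protocol ceiling shows why NVMe scales with core count and SSD
# parallelism where AHCI serializes.
```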
Versus SCSI and Other Standards
NVM Express (NVMe) differs fundamentally from SCSI protocols, such as those used in Serial Attached SCSI (SAS) and Fibre Channel (FC), in its command-queuing mechanism and overall architecture. SCSI employs tagged command queuing, supporting up to 256 tags per logical unit number (LUN), which limits parallelism to a single queue per device with moderate depth.[8] In contrast, NVMe utilizes lightweight submission and completion queues, enabling up to 65,535 queues per controller, each with a depth of up to 65,536 commands, facilitating massive parallelism tailored to flash storage. This design reduces protocol stack depth and overhead, particularly for small I/O operations, where SCSI's more complex command processing and LUN-based addressing introduce higher latency and CPU utilization than NVMe's streamlined approach.[58]

Compared to Ethernet-based iSCSI, which encapsulates SCSI commands over TCP/IP, NVMe (especially in its over-fabrics extensions) avoids translation layers that map SCSI semantics to NVMe operations, eliminating unnecessary overhead and enabling direct, efficient access to non-volatile memory.[59] iSCSI's reliance on SCSI's block-oriented model adds latency from protocol encapsulation and processing, whereas NVMe natively supports low-latency flash I/O without such intermediaries.[60]

NVMe offers distinct advantages in enterprise and hyperscale environments, including lower latency optimized for flash media, with low-microsecond access times (under 10 μs) versus SCSI's higher overhead, and superior scalability for parallel access across hundreds of drives.[58] It integrates with zoned storage through the Zoned Namespace (ZNS) command set, reducing write amplification and improving endurance for large-scale flash deployments, unlike SCSI's Zoned Block Commands (ZBC), which are less optimized for NVMe's queue architecture.[61] Compared with emerging standards such as Compute Express Link (CXL), which emphasizes memory semantics for coherent, cache-line access to persistent memory, NVMe focuses on block storage semantics with explicit I/O commands, though NVMe-over-CXL hybrids bridge the two for optimized data movement in disaggregated systems.[62]
Implementation and Support
Operating System Integration
The Linux kernel has included native support for NVM Express (NVMe) devices since version 3.3, released in March 2012, via the integrated nvme driver module.[63] The kernel's NVMe driver framework, comprising the core nvme module for local PCIe devices plus additional transport drivers for NVMe over Fabrics (NVMe-oF), provides high-performance I/O queues and administrative commands directly from the kernel.[64] As of 2025, recent kernel releases such as version 6.13 have incorporated enhancements for NVMe 2.0 and later specifications, including improved power-limit configuration to cap device power draw and expanded zoned namespace (ZNS) capabilities for sequential-write-optimized storage, with initial ZNS support dating back to kernel 5.9.[65][66][22]
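As a small illustration of how the Linux driver exposes devices, the sketch below walks the sysfs hierarchy populated by the nvme module. The attribute names shown (model, firmware_rev) are those commonly exported under /sys/class/nvme/, but availability can vary by kernel version, so treat this as an assumption-laden example rather than a complete interface description.

```python
# Enumerate NVMe controllers and namespaces as exposed by the Linux nvme
# driver through sysfs. Attribute names are the commonly exported ones;
# availability can vary by kernel version.
from pathlib import Path

SYSFS_NVME = Path("/sys/class/nvme")

def read_attr(path: Path) -> str:
    """Read a single sysfs attribute, returning 'unknown' if it is absent."""
    try:
        return path.read_text().strip()
    except OSError:
        return "unknown"

for ctrl in sorted(SYSFS_NVME.glob("nvme*")):
    model = read_attr(ctrl / "model")
    firmware = read_attr(ctrl / "firmware_rev")
    print(f"{ctrl.name}: model={model!r} firmware={firmware!r}")
    # Namespaces appear as nvme<X>n<Y> subdirectories of the controller
    # and correspond to /dev/nvme<X>n<Y> block devices.
    for ns in sorted(ctrl.glob(f"{ctrl.name}n*")):
        print(f"  namespace: /dev/{ns.name}")
```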
Microsoft's Windows operating systems utilize the StorNVMe driver for NVMe integration, introduced in Windows 8.1 and Windows Server 2012 R2.[67] This inbox driver handles NVMe command sets for local SSDs, with boot support added in the 8.1 release.[68] As of Windows Server 2025, native support for NVMe-oF has been added, including transports like TCP (with RDMA planned in updates) for networked storage in enterprise environments.[69] Later versions, including Windows 10 version 1903 and Windows 11, have refined features such as namespace management and error handling.[70]
FreeBSD provides kernel-level NVMe support through the nvme(4) driver, which initializes controllers, manages per-CPU I/O queue pairs, and exposes namespaces as block devices for high-throughput operations. This driver integrates with the CAM subsystem for SCSI-like compatibility while leveraging NVMe's native parallelism.[71]
macOS offers limited native NVMe support, primarily for Apple's own SSDs in Mac hardware; broader compatibility with non-Apple NVMe drives has required third-party kernel extensions to address sector-size and power-state issues.[72]
In mobile and embedded contexts, iOS integrates NVMe as the underlying protocol for internal storage in iPhone and iPad devices, using custom PCIe-based controllers for optimized flash access. Android supports embedded NVMe in select high-end or specialized devices, though Universal Flash Storage (UFS) remains predominant; where NVMe is implemented, kernel drivers handle it for faster I/O in automotive and tablet variants.
Software Drivers and Tools
Software drivers and tools for NVMe enable efficient deployment, management, and administration of NVMe devices, often operating in user space to bypass kernel overhead for performance-critical applications or to provide command-line interfaces for diagnostics and configuration. These components include libraries for command construction and execution, as well as utilities for tasks such as device identification, health monitoring, and firmware management. They are essential for developers integrating NVMe into custom storage stacks and for administrators maintaining SSD fleets in enterprise environments.[73]

Key user-space drivers facilitate direct NVMe access without kernel intervention. The Storage Performance Development Kit (SPDK) provides a polled-mode, asynchronous, lockless NVMe driver that enables zero-copy data transfers to and from NVMe SSDs, supporting both local PCIe devices and remote NVMe over Fabrics (NVMe-oF) connections. This driver is embedded in applications for high-throughput scenarios, such as NVMe-oF target implementations, and includes a full user-space block stack for building scalable storage solutions.[73][74] For low-level NAND access, the Open-Channel SSD Interface Specification extends the NVMe protocol to allow host-managed flash translation layers on Open-Channel SSDs, where the host directly controls geometry-aware operations such as block allocation and wear leveling. This approach enables optimized data placement and reduces SSD controller overhead, with supporting drivers such as LightNVM providing the interface in Linux environments for custom flash management.[75][76]

Management tools offer platform-specific utilities for NVMe administration. On Linux, nvme-cli serves as a comprehensive command-line interface for NVMe devices, supporting operations such as controller and namespace identification (nvme id-ctrl and nvme id-ns), device resets (nvme reset), and NVMe-oF discovery of remote targets. It is built on the libnvme library, which supplies C type definitions for NVMe structures and enumerations, helper functions for command construction and decoding, and utilities for scanning and managing devices, including support for authentication via OpenSSL and Python bindings.[77][78][79]
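A typical administrative task with these tools is pulling controller identification data. The Python sketch below wraps the nvme id-ctrl command mentioned above and requests JSON output; the JSON field names used here (mn, sn, fr for model, serial, and firmware, following the NVMe Identify Controller structure) are read defensively, since output formatting can differ between nvme-cli releases.

```python
# Sketch: read Identify Controller data through nvme-cli's JSON output.
# Field names follow the NVMe Identify structure (mn, sn, fr); they are
# read with .get() in case a given nvme-cli version names them differently.
import json
import subprocess

def identify_controller(dev: str = "/dev/nvme0") -> dict:
    """Run `nvme id-ctrl` against a device and return the parsed JSON."""
    out = subprocess.run(
        ["nvme", "id-ctrl", dev, "-o", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)

if __name__ == "__main__":
    info = identify_controller()
    print("model:   ", str(info.get("mn", "")).strip())
    print("serial:  ", str(info.get("sn", "")).strip())
    print("firmware:", str(info.get("fr", "")).strip())
```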
In FreeBSD, nvmecontrol provides analogous functionality, allowing users to list controllers and namespaces (nvmecontrol devlist), retrieve identification data (nvmecontrol identify), perform namespace management (creation, attachment, and deletion via nvmecontrol ns), and run performance tests (nvmecontrol perftest) with configurable parameters like queue depth and I/O size. Both nvme-cli and nvmecontrol access log pages for error reporting and vendor-specific extensions.[80]
These tools incorporate essential features for ongoing NVMe maintenance. Firmware updates are handled through commands like nvme fw-download and nvme fw-commit in nvme-cli, which support downloading images to controller slots and activating them immediately or on reset, ensuring compatibility with multi-slot firmware designs. SMART monitoring is available via nvme smart-log, which reports attributes such as temperature, power-on hours, media errors, and endurance metrics like percentage used, aiding in predictive failure analysis. Multipath configuration is facilitated by NVMe-oF support in nvme-cli, enabling discovery and connection to redundant paths for fault-tolerant setups. Additionally, nvme-cli incorporates support for 2025 Engineering Change Notices (ECNs), including configurable device personality mechanisms that allow secure host modifications to NVM subsystem configurations for streamlined inventory management.[77][81][4]
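For routine health monitoring, the same pattern applies to the SMART/health log described above. The sketch below polls nvme smart-log in JSON form and flags a drive that is running hot or nearing its rated endurance; the JSON key names and the alert thresholds are illustrative assumptions rather than values mandated by the specification or guaranteed by every nvme-cli version.

```python
# Sketch: basic SMART/health monitoring through nvme-cli's JSON output.
# Key names (temperature, percent_used, media_errors) and the thresholds
# below are illustrative; check your nvme-cli version's actual output.
import json
import subprocess

TEMP_LIMIT_C = 70        # example alert threshold, not from the specification
WEAR_LIMIT_PCT = 90      # example endurance threshold

def smart_log(dev: str = "/dev/nvme0") -> dict:
    """Run `nvme smart-log` against a device and return the parsed JSON."""
    out = subprocess.run(
        ["nvme", "smart-log", dev, "-o", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)

def check_health(dev: str = "/dev/nvme0") -> None:
    log = smart_log(dev)
    temp = log.get("temperature", 0)
    # The underlying log page reports temperature in kelvin; convert when the
    # value looks like kelvin rather than an already-converted Celsius figure.
    temp_c = temp - 273 if temp > 200 else temp
    wear = log.get("percent_used", 0)
    errors = log.get("media_errors", 0)
    print(f"{dev}: {temp_c} °C, {wear}% of rated endurance used, "
          f"{errors} media errors")
    if temp_c > TEMP_LIMIT_C or wear > WEAR_LIMIT_PCT:
        print("warning: drive is running hot or nearing end of life")

if __name__ == "__main__":
    check_health()
```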
