Block-level storage
from Wikipedia

Block-level storage is a concept in cloud-hosted data persistence where cloud services emulate the behaviour of a traditional block device, such as a physical hard drive.[1]

Storage in such services is organised as blocks. This emulates the behaviour of traditional disk or tape storage through storage virtualization. Each block is identified by an arbitrary, assigned identifier by which it may be stored and retrieved, but this identifier has no inherent meaning in terms of files or documents. A file system must be applied on top of the block-level storage to map 'files' onto a sequence of blocks.

Amazon EBS (elastic block store) is an example of a cloud block store.[2] Cloud block-level storage will usually offer facilities such as replication for reliability, or backup services.[3]

Block-level storage is in contrast to an object store or 'bucket store', such as Amazon S3 (simple storage service), or to a database. These operate at a higher level of abstraction and are able to work with entities such as files, documents, images, videos or database records.[4]

Instance stores are another form of cloud-hosted block-level storage. These are provided as part of an 'instance', such as an Amazon EC2 (elastic compute cloud) service.[5] As EC2 instances are primarily provided as compute resources rather than storage resources, their storage is less robust: its contents will be lost if the cloud instance is stopped.[6] Because these stores are part of the instance's virtual server, they offer higher performance and bandwidth to the instance. They are best used for temporary storage such as caching or temporary files, with persistent storage held on a different type of server.

At one time, block-level storage was provided by storage area networks (SANs), while NAS provided file-level storage.[7] With the shift from on-premises hosting to cloud services, this distinction has blurred.[8] Even block storage is now delivered by distinct storage servers (much as NAS is), rather than by the arrays of bare disks used previously.

from Grokipedia
Block-level storage, also known as block storage, is a data storage architecture that organizes information into fixed-size blocks, each assigned a unique identifier for direct access and management. It is commonly deployed in storage area networks (SANs), cloud environments, and virtualized systems to enable high-performance operations. This approach treats storage as raw volumes presented to servers or applications, allowing the operating system or software to handle file systems independently on top of the blocks, which facilitates efficient read/write operations without the overhead of hierarchical file structures.

In block-level storage, data is divided into equally sized blocks (typically ranging from a few kilobytes to several megabytes), stored independently across hardware or virtualized environments, and retrieved via a unique identifier or address system that reassembles them as needed. This low-level abstraction provides hardware independence, supporting multiple operating systems and enabling scaling by adding volumes dynamically, making it ideal for demanding workloads such as databases, virtual machines, and containerized applications that require low latency and high input/output operations per second (IOPS).

Key advantages of block-level storage include superior performance due to multiple access paths and direct block-level I/O, flexibility in partitioning across environments, and reliability when combined with technologies like RAID for redundancy, though it lacks extensive built-in metadata support, which must instead be managed at the application layer. Compared to file-level storage (NAS), which uses hierarchical directories and is better for shared file access but slower due to single-path dependencies, block storage offers faster retrieval for structured data but is less suited for collaborative environments. In contrast to object storage, which stores data as discrete objects with rich, customizable metadata for unstructured files like backups or media, block storage excels in transactional scenarios but can be more expensive and less scalable for massive, infrequently accessed datasets.

Block-level storage has become foundational in modern cloud computing, powering services like Block Storage as a Service (BaaS) in public clouds from providers such as AWS, Azure, and Google Cloud, as well as enterprise solutions integrated with virtualization and orchestration platforms. Despite its strengths, challenges include higher costs relative to object storage and the need for additional layers to handle redundancy or metadata, positioning it as a preferred choice for performance-critical infrastructure in data centers and hybrid clouds.

Fundamentals

Definition

Block-level storage is a method of storing data in fixed-size contiguous blocks, typically ranging from 512 bytes to 4 KB or larger, where the data is treated as raw storage without any inherent structure. Each block operates independently and is assigned a unique identifier, allowing the storage system to manage and retrieve data efficiently at a low level close to the hardware. In this storage model, data is accessed through arbitrary byte offsets that map to specific blocks, enabling direct read and write operations without navigating a hierarchical file structure. Practical organization of these blocks into files and directories, however, requires an overlay file system, such as NTFS for Windows or ext4 for Linux, which interprets the raw blocks and provides a user-friendly abstraction. Blocks in block-level storage represent the smallest addressable unit of data, emulating the sectors found on physical disk devices. For instance, a hard disk drive (HDD) or solid-state drive (SSD) presents its capacity as a linear sequence of such blocks to the operating system.
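
As a minimal illustration of this model, the sketch below (Python, assuming a Linux system, a hypothetical device path /dev/sdb, and a 4 KiB block size) maps a byte offset to its logical block and reads that block directly; any notion of files is left to a file system layered above.

# Minimal sketch: mapping a byte offset to a block and reading that block
# from a raw block device. Assumes a Linux device path (here /dev/sdb,
# hypothetical) and a 4 KiB logical block size; reading /dev requires privileges.
import os

BLOCK_SIZE = 4096          # assumed logical block size in bytes
DEVICE = "/dev/sdb"        # hypothetical raw block device

def read_block(offset_bytes: int) -> bytes:
    """Read the whole block that contains the given byte offset."""
    lba = offset_bytes // BLOCK_SIZE          # logical block address
    fd = os.open(DEVICE, os.O_RDONLY)
    try:
        os.lseek(fd, lba * BLOCK_SIZE, os.SEEK_SET)
        return os.read(fd, BLOCK_SIZE)        # raw bytes, no file semantics
    finally:
        os.close(fd)

if __name__ == "__main__":
    block = read_block(1_000_000)             # byte 1,000,000 falls in LBA 244
    print(len(block), "bytes read")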

Key Characteristics

Block-level storage utilizes fixed-size blocks, commonly 512 bytes to 64 KB depending on the system, which enable efficient access to data portions independent of file boundaries. This structure supports low-latency read and write operations by allowing direct manipulation of individual blocks without needing to traverse hierarchical file structures. A core feature is the logical block addressing (LBA) scheme, which assigns a unique sequential identifier to each block for precise, direct access across the storage medium. This addressing method abstracts the physical layout of the storage device, facilitating reliable data retrieval and modification at the block level.

File systems built atop block-level storage exhibit statefulness in their operational model, necessitating the maintenance of metadata to preserve data consistency, especially in scenarios involving interruptions such as system crashes. These file systems commonly employ journaling techniques to log metadata changes, ensuring atomic updates and rapid recovery without full rescans. Performance scalability in block-level storage is constrained by key metrics such as input/output operations per second (IOPS) for random access and throughput for data transfer rates, both of which are significantly influenced by block size selection. Smaller blocks maximize IOPS for fine-grained operations, while larger blocks minimize overhead in sequential workloads, enhancing overall throughput efficiency. File systems provide a higher-level abstraction by mapping these addressable blocks into organized files and directories for user applications.
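
The coupling between block size, IOPS, and throughput can be made concrete with a small calculation (Python, illustrative numbers only): throughput is roughly IOPS multiplied by block size, so the same device budget yields very different profiles at 4 KiB versus 64 KiB blocks.

# Illustrative calculation (assumed numbers): how block size couples IOPS
# and throughput, since throughput = IOPS * block_size.
def throughput_mb_s(iops: int, block_size_bytes: int) -> float:
    return iops * block_size_bytes / 1e6

# A device sustaining 100,000 IOPS:
print(throughput_mb_s(100_000, 4096))     # 4 KiB blocks  -> ~409.6 MB/s
print(throughput_mb_s(100_000, 65536))    # 64 KiB blocks -> ~6553.6 MB/s

# Conversely, at a fixed 1 GB/s of bandwidth, larger blocks mean fewer operations:
print(1e9 // 4096)    # ~244,140 blocks/s at 4 KiB
print(1e9 // 65536)   # ~15,258  blocks/s at 64 KiB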

Comparisons with Other Storage Types

Versus File-Level Storage

Block-level storage operates at the raw data layer, where data is stored and accessed in fixed-size blocks with unique identifiers, providing low-level control without inherent abstraction. In contrast, file-level storage organizes data into hierarchical structures of files and directories, allowing direct access via file paths and abstracting away the underlying block management through a file system. This abstraction in file-level storage is facilitated by protocols such as NFS for Unix/Linux environments or SMB for Windows systems, enabling seamless file sharing over networks.

Regarding access granularity, block-level storage treats data as undifferentiated blocks, requiring users or applications to mount a file system (such as ext4 or NTFS) on the storage volume to perform file-level operations like reading or writing specific files. File-level storage, however, natively manages file naming, permissions, and sharing mechanisms at the protocol level, simplifying access without the need for additional layers. This difference means block-level storage offers greater flexibility in how blocks are addressed and assembled but demands more configuration for end-user file interactions.

Historically, block-level storage has been implemented through storage area networks (SANs), which provide dedicated, high-speed networks for block access and emerged in the 1990s to support enterprise data centers requiring raw performance. File-level storage, on the other hand, is typically delivered via network-attached storage (NAS) appliances, which integrate file system services and became popular for collaborative environments in the same era, prioritizing ease of use over low-level control.

In terms of performance trade-offs, block-level storage delivers lower latency and higher throughput for workloads involving frequent random reads and writes, such as databases or virtual machines, due to its direct block addressing and multiple concurrent data paths. File-level storage excels in simplicity for shared file access scenarios, such as document sharing, but can introduce overhead from file system metadata management, making it less efficient for latency-sensitive applications.
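
The access-model difference can be sketched as follows (Python, Linux assumed; the device path and NFS mount point are hypothetical): the block volume is addressed by offset, while the file share is addressed by path through whatever protocol backs the mount.

# Sketch of the access-model difference (hypothetical paths, Linux assumed).
import os

# Block-level: the volume is addressed by offset; any file layout is up to
# a file system mounted on top of it. Reading /dev requires privileges.
fd = os.open("/dev/sdb", os.O_RDONLY)       # raw volume
os.lseek(fd, 8 * 4096, os.SEEK_SET)         # jump straight to block 8
raw = os.read(fd, 4096)
os.close(fd)

# File-level: the NAS export is addressed by path; naming, permissions and
# locking are handled by the file protocol (e.g. an NFS mount).
with open("/mnt/nfs_share/report.txt", "rb") as f:   # hypothetical mount
    data = f.read()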

Versus Object Storage

Block-level storage organizes data into fixed-size blocks arranged in a flat array, enabling direct positional access to any block via unique identifiers, which allows for efficient random read and write operations. In contrast, object storage treats data as discrete, immutable objects, each comprising the data itself, a unique identifier (such as a key), and associated rich metadata, with access typically facilitated through HTTP-based protocols like REST APIs. This structural difference means block-level storage supports low-level, byte-addressable operations suitable for applications requiring frequent modifications, while object storage emphasizes holistic object retrieval and is less suited for in-place edits.

Scalability models diverge significantly between the two paradigms. Block-level storage primarily scales vertically by expanding the capacity or performance of individual volumes, which is effective for structured workloads but can introduce bottlenecks in highly distributed environments. Object storage, however, excels in horizontal scaling, distributing objects across numerous nodes in a cluster to handle exabyte-scale datasets without a central point of failure, making it ideal for massive, infrequently accessed archives. For instance, fixed block sizes in block-level systems, typically ranging from 512 bytes to 4 MB depending on the system, facilitate this vertical growth but limit seamless expansion compared to object storage's flexible, metadata-driven partitioning.

At the software layer, block-level storage requires an overlying file system or database to impose hierarchical organization and manage access semantics, providing a raw, unstructured foundation for operating systems and applications. Object storage operates in a schemaless manner, bypassing traditional file hierarchies in favor of a flat namespace where objects are addressed directly by ID, which suits unstructured data such as media files, logs, or backups that benefit from embedded metadata for search and retrieval. This makes object storage particularly advantageous for modern distributed systems handling diverse, non-relational data without the overhead of file system maintenance.

Practical examples illustrate these distinctions in cloud environments. Amazon Elastic Block Store (EBS) exemplifies block-level storage, offering persistent volumes attachable to EC2 instances for database hosting, where blocks can be updated partially to support transactional integrity. Conversely, Amazon Simple Storage Service (S3) represents object storage, storing files as immutable objects that cannot be partially updated (instead requiring full object replacement), which prioritizes durability and global accessibility over fine-grained modifications.
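
A short sketch of the update semantics (Python; the device path, bucket, key, and local file are placeholders, and boto3 usage is assumed): a block volume can be patched in place at a single block, whereas an S3 object is replaced wholesale.

# Sketch contrasting in-place block updates with whole-object replacement.
import os
import boto3

# Block storage: rewrite just the 4 KiB block at LBA 42, leaving the rest
# of the volume untouched (device path hypothetical, privileges required).
fd = os.open("/dev/xvdf", os.O_WRONLY)
os.lseek(fd, 42 * 4096, os.SEEK_SET)
os.write(fd, b"\x00" * 4096)
os.close(fd)

# Object storage: S3 objects are immutable, so "editing" means uploading a
# complete replacement object under the same key.
s3 = boto3.client("s3")
with open("report-v2.pdf", "rb") as f:                # placeholder file
    s3.put_object(Bucket="example-bucket", Key="reports/report.pdf", Body=f)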

Technical Implementation

Block Devices and Access Protocols

Block devices serve as logical abstractions in operating systems that represent storage hardware accessible in fixed-size blocks, allowing applications to perform read and write operations as if interacting with a raw disk. In Linux, for instance, these are typically exposed as device files such as /dev/sda, which emulates a SCSI or ATA disk and supports commands for block-level I/O through the kernel's block layer.

Key protocols enable the transport of these block-level commands across various interfaces. The iSCSI protocol, standardized by the IETF, provides IP-based access to block storage over Ethernet networks by encapsulating SCSI commands within TCP/IP packets, allowing initiators to connect to remote targets for seamless block I/O. Fibre Channel, developed as a high-speed serial protocol for storage area networks (SANs), facilitates low-latency, high-throughput block access in fabric topologies, supporting distances up to hundreds of kilometers via fiber optics and enabling switched interconnects for multiple hosts and storage arrays. NVMe, an optimized protocol from the NVM Express consortium, delivers low-latency access to SSDs over the PCIe bus, leveraging parallel command queues and reducing overhead compared to legacy protocols like SCSI, with support for up to 65,535 queues and 64,000 commands per queue.

Access to block devices occurs through direct or networked methods. Direct Attached Storage (DAS) connects storage directly to a host via local interfaces like SATA, SAS, or PCIe, providing high-performance, low-latency block access without network intermediaries, ideal for single-host environments. In networked setups, such as SANs, access is mediated by Logical Unit Numbers (LUNs), where LUN masking at the storage controller or host bus level restricts visibility and access to specific LUNs, ensuring secure isolation for virtualized hosts by mapping only authorized initiators to target volumes. Modern extensions like NVMe over Fabrics (NVMe-oF) extend NVMe's efficiency to remote block access over network fabrics, supporting transports such as RDMA to minimize CPU involvement and latency, enabling disaggregated storage pools with performance approaching local PCIe attachments.
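
On a Linux host, the kernel exposes basic block-device geometry through sysfs, which the following unprivileged Python sketch uses to list each device's capacity and logical block size (the /sys/block/<dev>/size counter is reported in 512-byte sectors).

# Sketch: inspecting Linux block devices via sysfs (no special privileges needed).
import os

for dev in sorted(os.listdir("/sys/block")):
    try:
        with open(f"/sys/block/{dev}/size") as f:
            sectors = int(f.read())                    # 512-byte sectors
        with open(f"/sys/block/{dev}/queue/logical_block_size") as f:
            lbs = int(f.read())                        # bytes per logical block
    except OSError:
        continue
    capacity_gb = sectors * 512 / 1e9
    print(f"{dev}: {capacity_gb:.1f} GB, logical block size {lbs} bytes")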

Management and Operations

Volume management in block-level storage involves tools that abstract physical devices into flexible logical units, enabling dynamic allocation and reconfiguration. The Logical Volume Manager (LVM) in Linux systems serves as a primary tool for this purpose, allowing administrators to create and resize volumes, stripe data across multiple devices for performance, and mirror volumes for redundancy without disrupting ongoing operations. LVM achieves this by layering logical volumes over physical extents, which can be adjusted online to accommodate growing data needs or hardware changes.

Data operations in block-level storage facilitate efficient handling of volumes through mechanisms like snapshots, cloning, and thin provisioning. Snapshots create point-in-time copies using copy-on-write (CoW), where unchanged data blocks remain shared between the original volume and the snapshot, and only modified blocks are duplicated to minimize storage overhead. Cloning produces a writable duplicate of a volume, often leveraging snapshot technology in systems like LVM to enable rapid replication for testing or development without full data copying upfront. Thin provisioning allocates storage on demand, presenting larger virtual volumes to applications while consuming physical space only as data is written, thus optimizing resource utilization in environments with variable workloads.

Error handling at the block level commonly integrates RAID configurations to provide fault tolerance and data protection. RAID 0 employs striping to distribute data across drives for enhanced performance but offers no redundancy, making it suitable for non-critical, high-speed applications. In contrast, RAID 5 uses distributed parity across multiple drives to enable recovery from a single drive failure, balancing capacity, performance, and reliability by calculating parity blocks to reconstruct lost data. These levels are implemented via software tools like mdadm in Linux, which manage array assembly, monitoring, and rebuilding to maintain data integrity during hardware faults.

Backup strategies for block-level storage emphasize capturing raw volumes to ensure comprehensive preservation and rapid recovery. Block-level backups copy the entire volume at the block level, bypassing file systems to include all data, metadata, and boot configurations, which supports bare-metal restores that rebuild systems from scratch on new hardware. This approach accelerates recovery by restoring volumes directly, enabling operating systems and applications to resume operation without additional reconfiguration, particularly in disaster scenarios. Tools like dd in Linux or commercial solutions facilitate these raw backups, ensuring fidelity to the original storage state.
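
The parity idea behind RAID 5 reduces to XOR across the data blocks of a stripe, as the toy Python sketch below shows: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors.

# Minimal sketch of the RAID 5 idea: parity is the XOR of the data blocks
# in a stripe, so any one missing block can be rebuilt from the others.
def xor_blocks(*blocks: bytes) -> bytes:
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, b in enumerate(blk):
            out[i] ^= b
    return bytes(out)

d0 = b"\x11" * 4
d1 = b"\x22" * 4
d2 = b"\x33" * 4
parity = xor_blocks(d0, d1, d2)          # written to the parity position

# Drive holding d1 fails: reconstruct it from the surviving blocks + parity.
rebuilt = xor_blocks(d0, d2, parity)
assert rebuilt == d1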

Applications and Use Cases

In On-Premises Environments

In on-premises environments, block-level storage is widely deployed for hosting high-performance databases, where low-latency access and high IOPS are essential for transactional and data-intensive workloads. These systems treat storage as raw blocks, enabling direct, efficient I/O operations that support the demanding requirements of online transaction processing (OLTP) applications. For instance, such databases leverage block storage to achieve up to 99.9999% availability and significantly faster backups, reducing project delivery times by up to 30%. Large deployments likewise benefit from block storage's ability to handle concurrent reads and writes with minimal overhead, ensuring consistent performance in enterprise settings.

Virtualization platforms such as VMware vSphere further exemplify block-level storage's role in on-premises setups, where it functions as the foundation for virtual machine (VM) disks. Virtual disks, often provisioned as VMDKs on VMFS datastores, provide scalable block access for VMs running database instances, allowing multiple virtualized servers to share underlying physical storage resources without performance degradation. Best practices recommend using paravirtualized adapters and eager-zeroed thick provisioning for I/O-intensive workloads to optimize throughput and reduce latency. This configuration supports database consolidation, enabling efficient resource utilization across physical hosts in data centers.

Enterprise on-premises infrastructures commonly utilize storage area network (SAN) arrays from vendors such as Dell EMC and NetApp to deliver shared block storage accessible by multiple servers via protocols like Fibre Channel or iSCSI. These SAN systems centralize block-level data in a pooled environment, facilitating high-availability clustering and seamless failover for mission-critical applications. NetApp's ONTAP-based SAN solutions, for example, guarantee 100% data availability and support unified management for block workloads, ensuring operational continuity.

To enhance performance, on-premises block storage often incorporates SSD caching tiers to accelerate access to frequently used "hot" data, while hybrid arrays blend HDDs for cost-effective capacity with flash storage for speed, balancing performance demands and storage economics in SAN deployments. Security is bolstered through block-level encryption, such as dm-crypt with LUKS in Linux environments, which protects data at rest on entire block devices using strong ciphers like AES-XTS. Redundancy is typically achieved via RAID configurations within these arrays to mitigate hardware failures.

In Cloud and Virtualized Systems

In cloud environments, block-level storage is commonly provided as persistent volumes that can be dynamically attached to virtual machines (VMs), enabling scalable and flexible data management. Amazon Elastic Block Store (EBS) offers durable block storage volumes that attach to EC2 instances, functioning like physical hard drives, with support for elastic volume modifications to increase capacity or adjust performance without downtime. Similarly, Google Cloud Persistent Disk provides high-performance block storage for Compute Engine VMs, allowing dynamic resizing and integration with various instance types for workloads requiring low-latency access. Azure Managed Disks serve as block-level volumes for Azure VMs, supporting scalability up to 50,000 disks per subscription per region and offering performance tiers like Premium SSD for IO-intensive applications.

In virtualized systems, block-level storage integrates seamlessly at the hypervisor level to abstract physical resources for guest operating systems. For instance, VMware ESXi uses VMDK (Virtual Machine Disk) files to represent virtual block devices, storing data on underlying block storage while supporting features like snapshots and thin provisioning for efficient resource utilization in virtualized data centers. Cloud providers enhance this with high-availability options, such as EBS volumes that automatically replicate data across multiple servers within an Availability Zone (AZ), with up to 99.999% durability for high-endurance types like io2 Block Express, or Google Cloud's Regional Persistent Disks, which synchronously replicate data across two zones in a region to withstand zonal failures. Azure's zone-redundant storage (ZRS) for managed disks synchronously replicates across three AZs, achieving 99.9999999999% (12 nines) durability and enabling recovery from zone outages by force-detaching and reattaching disks.

Key features include automated backups and ephemeral options for performance-sensitive tasks. EBS snapshots create point-in-time, incremental backups stored durably in Amazon S3, which can be automated via Amazon Data Lifecycle Manager with retention policies spanning years. Google Persistent Disk supports snapshots for backups, while Azure Managed Disks integrate with Azure Backup for automated protection. For ephemeral high-performance needs, AWS EC2 instance stores provide temporary block storage physically attached to the host, ideal for caches or scratch data but lost upon instance stop or termination, in contrast with persistent volumes like EBS.

Modern trends emphasize dynamic provisioning in containerized environments, where Kubernetes uses Container Storage Interface (CSI) drivers to automatically create and manage block storage volumes. A PersistentVolumeClaim (PVC) triggers the CSI driver to provision a raw block device based on a StorageClass, supporting volume expansion and attachment to pods for stateful applications. This enables elastic scaling in orchestrators like Google Kubernetes Engine or Azure Kubernetes Service, where CSI integrates with underlying cloud block storage for seamless, on-demand allocation.
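
As an illustration of programmatic provisioning, the hedged Python sketch below uses boto3 to create an EBS volume and attach it to an instance (the region, zone, instance ID, and device name are placeholders); once attached, the guest OS sees it as an ordinary block device.

# Sketch: provisioning and attaching a cloud block volume with boto3 (AWS EBS).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")    # placeholder region

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # placeholder zone
    Size=100,                        # GiB
    VolumeType="gp3",
)
vol_id = volume["VolumeId"]

# Wait until the volume is available, then attach it to an instance, where
# it appears to the guest OS as an ordinary block device.
ec2.get_waiter("volume_available").wait(VolumeIds=[vol_id])
ec2.attach_volume(VolumeId=vol_id,
                  InstanceId="i-0123456789abcdef0",    # placeholder instance
                  Device="/dev/sdf")                   # placeholder device name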

Advantages and Disadvantages

Benefits

Block-level storage provides high performance through its low-level access mechanism, which delivers superior input/output operations per second (IOPS) and throughput, making it particularly suitable for transactional workloads such as databases that require rapid read and write operations. This direct block-level interface minimizes overhead compared to higher-level abstractions, enabling low latency for applications demanding consistent and fast data access.

A key advantage is its flexibility, allowing entire storage volumes to be easily migrated between different systems or environments without significant reconfiguration, as blocks can be rerouted simply by updating the destination path. Furthermore, block-level storage supports a wide range of file systems and operating systems without being locked into specific protocols, facilitating seamless integration across diverse infrastructures.

Efficiency is enhanced by features like thin provisioning, which allocates storage on demand rather than reserving it upfront, thereby optimizing resource utilization and reducing waste in dynamic environments. Snapshots further contribute to efficiency by creating point-in-time copies of volumes with minimal additional storage overhead, as they reference existing data blocks, which minimizes disruption during backups and supports quick recovery operations.

Block-level storage offers strong compatibility by emulating physical disk devices, which simplifies the integration of legacy applications in virtualized or cloud environments, treating volumes as standard block devices accessible via protocols like iSCSI or Fibre Channel. This disk-like behavior ensures broad compatibility with existing hardware and software stacks. Additionally, optimizing block sizes to match specific workloads can further improve performance by aligning data transfer units with application requirements, though this requires careful configuration.
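
The thin-provisioning benefit can be illustrated with a toy Python model in which a large virtual volume consumes physical space only for blocks that have actually been written; real implementations add metadata, alignment, and space reclamation, which this sketch omits.

# Toy sketch of thin provisioning: a "500 GB" virtual volume that only
# consumes physical space for blocks that have actually been written.
class ThinVolume:
    def __init__(self, virtual_blocks: int, block_size: int = 4096):
        self.block_size = block_size
        self.virtual_blocks = virtual_blocks
        self.allocated = {}                   # LBA -> data, grown on demand

    def write(self, lba: int, data: bytes):
        self.allocated[lba] = data            # physical allocation happens here

    def read(self, lba: int) -> bytes:
        return self.allocated.get(lba, b"\x00" * self.block_size)

    def physical_usage_bytes(self) -> int:
        return len(self.allocated) * self.block_size

vol = ThinVolume(virtual_blocks=500 * 10**9 // 4096)
vol.write(0, b"boot" + b"\x00" * 4092)
print(vol.physical_usage_bytes())             # 4096 bytes used of a 500 GB volume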

Limitations

Block-level storage imposes significant management overhead, as it requires administrators to handle separate file systems for tasks such as formatting, mounting, and recovery, often necessitating specialized expertise. Unlike more abstracted storage types, this direct involvement can increase operational complexity and resource demands, particularly in environments without dedicated storage teams.

Scalability presents challenges for block-level storage, as it is tightly coupled to individual servers or hosts, making it difficult to share volumes across distributed systems without implementing complex protocols like iSCSI or Fibre Channel. This architecture limits its suitability for massive distributed workloads, where object storage offers greater horizontal scaling through simpler distribution mechanisms.

In cloud environments, block-level storage often incurs higher costs due to provisioned-capacity billing models, where users pay for allocated storage regardless of actual usage, potentially leading to over-provisioning and inefficient resource utilization. For instance, services like Amazon EBS charge per gigabyte provisioned per month, contrasting with the usage-based pricing of object storage that aligns costs more closely with consumed data.

Raw block exposure in block-level storage can pose security risks, as direct access may allow bypassing file system checks and lead to data damage if not properly managed. Additionally, the limited built-in metadata (typically just unique block identifiers) restricts native support for granular access control, requiring overlying systems to enforce security policies and increasing the potential for misconfiguration vulnerabilities.
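
The billing difference can be shown with a small illustrative calculation (Python; the per-gigabyte prices are assumed for the example, not quoted rates): provisioned block capacity is billed whether or not it is used, while object storage bills only the bytes actually stored.

# Illustrative cost comparison with hypothetical prices: block storage is
# billed on provisioned capacity, object storage on bytes actually stored.
BLOCK_PRICE_PER_GB_MONTH = 0.08      # assumed block-volume rate
OBJECT_PRICE_PER_GB_MONTH = 0.023    # assumed object-storage rate

provisioned_gb = 500                 # volume size reserved up front
actually_used_gb = 150               # data really written

block_cost = provisioned_gb * BLOCK_PRICE_PER_GB_MONTH       # $40.00/month
object_cost = actually_used_gb * OBJECT_PRICE_PER_GB_MONTH   # ~$3.45/month
print(f"block: ${block_cost:.2f}/mo  object: ${object_cost:.2f}/mo")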

History and Evolution

Origins in Early Computing

The origins of block-level storage trace back to the mid-20th century, when computing systems relied primarily on sequential-access media like magnetic tape for data storage. The introduction of random-access disk drives revolutionized this paradigm by enabling direct access to fixed-size data blocks, allowing for more efficient data management and retrieval. In 1956, IBM unveiled the 305 Random Access Method of Accounting and Control (RAMAC) system, which incorporated the IBM 350 Disk Storage Unit, the world's first commercial hard disk drive. This device featured 50 rotating platters, each 24 inches in diameter, capable of storing up to 5 million characters (approximately 3.75 MB) in fixed blocks of 100 characters each, facilitating random access times of about 600 milliseconds compared to the sequential nature of tapes.

By the 1970s, advancements in disk technology further solidified the block as the fundamental unit of storage on hard disk drives (HDDs). IBM's 3340 drive, introduced in 1973 and codenamed "Winchester" after the rifle model, pioneered a sealed head-disk assembly with low-mass read/write heads that landed on lubricated platters only when spun down, improving reliability and density while maintaining block-based access. This design influenced subsequent HDDs, establishing standardized block sizes typically ranging from 256 to 512 bytes per sector. In 1980, Seagate Technology released the ST-506, the first 5.25-inch HDD, with a 5 MB capacity; its ST-506 interface, a parallel protocol for connecting block-oriented disk drives to controllers, became an industry standard supporting block-level read and write operations.

Operating systems of the era integrated these hardware innovations by abstracting disks as block devices, treating them uniformly as files for simplified management. In the 1970s, the UNIX operating system, developed at Bell Labs, exemplified this approach by representing disks through special files in the /dev directory, such as /dev/sd0 for block devices, allowing applications to read and write data in fixed blocks via system calls like read() and write(). UNIX file systems, including precursors to the Unix File System (UFS), mapped logical file structures onto these physical blocks using inodes (data structures that indexed block addresses), enabling efficient allocation and access without regard to the underlying hardware details. This uniform interface laid the groundwork for portable file system implementations across diverse disk technologies.

Standardization efforts in the 1980s formalized block-level interactions between hosts and storage peripherals. The American National Standards Institute (ANSI) approved the Small Computer System Interface (SCSI) standard, X3.131-1986, which defined a bus protocol and command set for block-oriented devices like HDDs, including operations such as READ(6) and WRITE(6) to transfer data in 512-byte blocks. This standard promoted interoperability, allowing multiple vendors' disks to function seamlessly in systems ranging from workstations to servers, and it emphasized error correction and command queuing for reliable block access.

Modern Developments

In the late 1990s and early 2000s, the rise of storage area networks (SANs) marked a significant evolution in block-level storage, primarily driven by the adoption of Fibre Channel technology. Developed under the American National Standards Institute (ANSI) in the early 1990s, Fibre Channel provided a high-speed serial interface initially capable of transferring data at speeds up to 1 Gbps, enabling dedicated storage networks separate from local area networks (LANs). This architecture allowed multiple servers to access shared block storage devices with low latency and high reliability, addressing the limitations of direct-attached storage in enterprise environments. By the mid-2000s, Fibre Channel SANs had become the standard for mission-critical applications, supporting topologies like arbitrated loops and switched fabrics to scale connectivity. Complementing this growth, the iSCSI protocol, pioneered by IBM and Cisco in 1998, further democratized block-level access by encapsulating SCSI commands over standard TCP/IP networks. This innovation eliminated the need for specialized hardware, allowing organizations to leverage existing Ethernet infrastructure for remote block storage at lower cost, with initial implementations supporting speeds up to 1 Gbps. Ratified as an Internet Engineering Task Force (IETF) standard in 2004 (RFC 3720), iSCSI facilitated the integration of block storage into IP-based SANs, broadening adoption in small to medium enterprises and paving the way for software-defined storage solutions.

The late 2000s and 2010s witnessed a pivotal shift toward cloud-based block storage, exemplified by Amazon Web Services (AWS) launching Elastic Block Store (EBS) on August 20, 2008, which provided persistent, resizable block-level volumes attachable to EC2 instances for high-performance workloads. EBS volumes offered features like snapshots for backups and elastic resizing, with throughput scaling to gigabytes per second by the mid-2010s. While AWS introduced Multi-AZ deployments for services like RDS in 2010 to enable synchronous replication across AZs for high availability, EBS achieves 99.999% durability through automatic replication within a single AZ, supplemented by asynchronous snapshots and cross-region replication for broader resilience and disaster recovery. This cloud transition enabled dynamic provisioning and pay-as-you-go models, fundamentally altering block storage management in distributed systems.

During the 2000s, the adoption of Serial ATA (SATA) in 2003 and Serial Attached SCSI (SAS) in 2004 bridged the transition from HDDs to solid-state drives (SSDs), enabling higher-speed block access and scalability in enterprise environments. The advent of SSDs and the Non-Volatile Memory Express (NVMe) protocol in 2011 revolutionized block storage performance by optimizing the interface for flash-based media, reducing latency to microseconds and increasing IOPS by orders of magnitude compared to traditional hard disk drives. The NVMe 1.0 specification, developed by an industry consortium led by Intel, introduced up to 64,000 queues with 64,000 commands each, leveraging PCIe lanes for parallel processing and eliminating SCSI overhead. This shift enabled SSDs to deliver sustained throughputs exceeding 3 GB/s in enterprise arrays, transforming block storage for latency-sensitive applications like databases and virtualization. Building on this, NVMe over Fabrics (NVMe-oF), whose development began in 2014 under the NVM Express organization, extended NVMe's efficiency to networked environments over Ethernet, Fibre Channel, and InfiniBand, achieving near-local performance with sub-millisecond latencies in disaggregated setups.

By 2024 and into 2025, block-level storage has been increasingly integrated with AI workloads, emphasizing ultra-low-latency solutions like NVMe-oF to handle the massive, real-time data demands of training and inference. Hyperscale data centers, operated by providers such as AWS, have adopted disaggregated storage architectures in which block volumes are pooled and dynamically allocated across compute nodes via protocols like NVMe-oF over RDMA, reducing costs and improving scalability for AI clusters processing petabytes of data. This trend, driven by the data demands of AI, has projected the global block storage market to reach USD 77.26 billion by 2032, with innovations in zoned namespace (ZNS) SSDs further optimizing endurance and throughput for hyperscale environments.
